Category: Embedded
2014-12-28 22:35:35
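There are already plenty of documents online describing how WebKit loads resources. After reading them I felt I only half understood, and whatever I did absorb faded within a few days. So I decided to walk through the resource loading flow against the source code myself to make it stick.
First, a look at MemoryCache
MemoryCache maintains the list of all cached resources. It is a hash table whose key is the resource URL and whose value is a CachedResource*. If you know Memcached (used for server-side caching), you can borrow its key-value model to understand MemoryCache here; conceptually the two work the same way.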
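As a rough illustration of that idea (not WebKit's actual implementation), the cache can be pictured as a hash map from a URL string to a resource object. The names ToyResource and ToyMemoryCache are hypothetical and exist only for this sketch.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for CachedResource; it only carries a URL and a body.
struct ToyResource {
    std::string url;
    std::string data;
};

// Hypothetical stand-in for MemoryCache: a hash table keyed by URL.
class ToyMemoryCache {
public:
    // Returns the cached resource for the URL, or nullptr on a miss.
    ToyResource* resourceForURL(const std::string& url) const {
        auto it = m_resources.find(url);
        return it == m_resources.end() ? nullptr : it->second.get();
    }

    // Stores the resource under its own URL, replacing any previous entry.
    void add(std::unique_ptr<ToyResource> resource) {
        std::string key = resource->url; // copy the key before moving the owner
        m_resources[key] = std::move(resource);
    }

    // Drops the entry for the URL; returns true if something was removed.
    bool remove(const std::string& url) { return m_resources.erase(url) > 0; }

private:
    std::unordered_map<std::string, std::unique_ptr<ToyResource>> m_resources;
};
```

A caller would first ask resourceForURL() for a URL and only fall back to a real load when it gets nullptr back.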
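The resources that HTML supports fall mainly into the following types:
1. HTML pages
2. Font files
3. Images
4. CSS Shaders
5. Video, audio, and subtitles
6. Scripts
7. CSS style sheets
8. XSL style sheets
9. SVG: used to draw 2D SVG graphics
WebKit has a class for each of these 9 resource types, and all of them derive from CachedResource.
The main CachedResource subclasses that represent concrete resources are: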
CachedRawResource
CachedFont
CachedImage
CachedShader
CachedTextTrack
CachedScript
CachedCSSStyleSheet
CachedXSLStyleSheet
CachedSVGDocument
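During resource loading, the first class that outside callers deal with is CachedResourceLoader. It checks MemoryCache first; if the resource is found there, it is returned directly. Otherwise it builds a ResourceRequest and has ResourceLoader fetch the resource from the network.
WebKit has three kinds of loaders:
1. ResourceLoader: fetches resources from the network stack or the local disk
2. CachedResourceLoader: fetches resources from MemoryCache
3. Specialized loaders such as FontLoader, LinkLoader, and ImageLoader, which live in an Element and serve as the interface to the user of that element
The complete process by which ImageLoader loads an image:
When an HTMLImageElement is created, an ImageLoader instance is created along with it to load the image. Let's start from updateFromElement() and see how this ImageLoader instance, m_imageLoader, is used: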
void ImageLoader::updateFromElement()
{
Document* document = m_element->document();
CachedResourceHandle<CachedImage> newImage = 0;
if (!attr.isNull() && !stripLeadingAndTrailingHTMLSpaces(attr).isEmpty()) {
CachedResourceRequest request(ResourceRequest(document->completeURL(sourceURI(attr))));
……
if (m_loadManually) {
……
} else {
// Get the CachedResourceLoader from the document and call its requestImage method; requestImage is shown below
newImage = document->cachedResourceLoader()->requestImage(request);
}
// If we do not have an image here, it means that a cross-site
// violation occurred, or that the image was blocked via Content
// Security Policy, or the page is being dismissed. Trigger an
// error event if the page is not being dismissed.
if (!newImage && !pageIsBeingDismissed(document)) {
m_failedLoadURL = attr;
m_hasPendingErrorEvent = true;
errorEventSender().dispatchEventSoon(this);
} else
clearFailedLoadURL();
} else if (!attr.isNull()) {
……
}
}
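Most of the code is omitted; we only look at how the image resource is requested. Every resource request in CachedResourceLoader eventually ends up in requestResource:
CachedResourceHandle<CachedImage> CachedResourceLoader::requestImage(CachedResourceRequest& request)
{
……
return static_cast<CachedImage*>(requestResource(CachedResource::ImageResource, request).get());
}
CachedResourceHandle<CachedResource> CachedResourceLoader::requestResource(CachedResource::Type type, CachedResourceRequest& request)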
{
KURL url = request.resourceRequest().url();
……
// Look up MemoryCache
resource = memoryCache()->resourceForRequest(request.resourceRequest());
// Decide what to do next
const RevalidationPolicy policy = determineRevalidationPolicy(type, request.mutableResourceRequest(), request.forPreload(), resource.get(), request.defer());
switch (policy) {
case Reload:
memoryCache()->remove(resource.get());
// Fall through
case Load:
resource = loadResource(type, request, request.charset());
break;
case Revalidate:
resource = revalidateResource(request, resource.get());
break;
case Use:
if (!shouldContinueAfterNotifyingLoadedFromMemoryCache(resource.get()))
return 0;
memoryCache()->resourceAccessed(resource.get());
break;
}
if (!resource)
return 0;
if (!request.forPreload() || policy != Use)
resource->setLoadPriority(request.priority());
if ((policy != Use || resource->stillNeedsLoad()) && CachedResourceRequest::NoDefer == request.defer()) {
// Load the resource
resource->load(this, request.options());
// We don't support immediate loads, but we do support immediate failure.
if (resource->errorOccurred()) {
if (resource->inCache())
memoryCache()->remove(resource.get());
return 0;
}
}
……
return resource;
}
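The four-way switch above can be condensed into a small self-contained sketch. This is a simplification under stated assumptions: WebKit's determineRevalidationPolicy() weighs many more conditions, while this toy version only looks at whether an entry exists, whether a reload is forced, and whether the entry has expired. ToyPolicy, ToyEntry, ToyCache, and the "fresh:" body format are all hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified counterparts of the four policies in the switch above.
enum class ToyPolicy { Use, Revalidate, Reload, Load };

struct ToyEntry {
    std::string body;
    bool expired;
};

typedef std::map<std::string, ToyEntry> ToyCache;

// Assumption: only three inputs decide the policy in this toy version.
ToyPolicy determinePolicy(const ToyEntry* existing, bool forceReload) {
    if (!existing)
        return ToyPolicy::Load;
    if (forceReload)
        return ToyPolicy::Reload;
    return existing->expired ? ToyPolicy::Revalidate : ToyPolicy::Use;
}

// Pretends to fetch the URL from the network and stores the result in the cache.
void loadFromNetwork(ToyCache& cache, const std::string& url, int& networkLoads) {
    ++networkLoads;
    ToyEntry fresh = { "fresh:" + url, false };
    cache[url] = fresh;
}

// Returns the body for the URL, touching the "network" only when the policy requires it.
std::string requestResource(ToyCache& cache, const std::string& url, bool forceReload, int& networkLoads) {
    ToyCache::iterator it = cache.find(url);
    const ToyEntry* existing = (it == cache.end()) ? nullptr : &it->second;
    switch (determinePolicy(existing, forceReload)) {
    case ToyPolicy::Reload:
        cache.erase(url); // evict the stale entry, then load as if it were never cached
        loadFromNetwork(cache, url, networkLoads);
        break;
    case ToyPolicy::Load:
        loadFromNetwork(cache, url, networkLoads);
        break;
    case ToyPolicy::Revalidate:
        ++networkLoads;             // one conditional request to the server
        cache[url].expired = false; // assume the server answered "not modified"
        break;
    case ToyPolicy::Use:
        break; // served from the cache, no network access
    }
    return cache[url].body;
}
```

The counter networkLoads makes the difference between the branches visible: only Use leaves it unchanged.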
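For CachedResourceLoader::requestResource(…), let's set aside the case where the resource is already in MemoryCache and only follow the path where it is requested from the network stack. Loading eventually reaches resource->load(this, request.options());, and the function called here should be CachedImage::load(…). Note the two arguments: the first is this, i.e. the CachedResourceLoader pointer; the second carries some options that we can ignore for now.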
void CachedImage::load(CachedResourceLoader* cachedResourceLoader, const ResourceLoaderOptions& options)
{
if (!cachedResourceLoader || cachedResourceLoader->autoLoadImages())
CachedResource::load(cachedResourceLoader, options);
else
setLoading(false);
}
This in turn calls CachedResource::load(…). In that function I only care about its longest line:
m_loader = platformStrategies()->loaderStrategy()->resourceLoadScheduler()->scheduleSubresourceLoad(cachedResourceLoader->frame(), this, request, request.priority(), options);
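Searching the whole code base turns up two implementations of scheduleSubresourceLoad: one in WebCore and one in the WebProcess of WebKit2. I am not analyzing WebKit2's multi-process model here, so let's go straight to the WebCore implementation:
PassRefPtr<SubresourceLoader> ResourceLoadScheduler::scheduleSubresourceLoad(Frame* frame, CachedResource* resource, const ResourceRequest& request, ResourceLoadPriority priority, const ResourceLoaderOptions& options)
{
RefPtr<SubresourceLoader> loader = SubresourceLoader::create(frame, resource, request, options);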
if (loader)
scheduleLoad(loader.get(), priority);
return loader.release();
}
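This function is simple: it creates the loader, schedules it, and is done. In effect it puts the resource request into a queue and then decides whether to serve it immediately or leave it waiting in line. Here is scheduleLoad: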
void ResourceLoadScheduler::scheduleLoad(ResourceLoader* resourceLoader, ResourceLoadPriority priority)
{
……
HostInformation* host = hostForURL(resourceLoader->url(), CreateIfNotFound);
bool hadRequests = host->hasRequests();
host->schedule(resourceLoader, priority);
if (priority > ResourceLoadPriorityLow || !resourceLoader->url().protocolIsInHTTPFamily() || (priority == ResourceLoadPriorityLow && !hadRequests)) {
// Try to request important resources immediately.
servePendingRequests(host, priority);
return;
}
notifyDidScheduleResourceRequest(resourceLoader);
// Handle asynchronously so early low priority requests don't
// get scheduled before later high priority ones.
scheduleServePendingRequests();
}
Whether the request is served immediately or a little later, it eventually reaches ResourceLoadScheduler::servePendingRequests:
void ResourceLoadScheduler::servePendingRequests(HostInformation*
host, ResourceLoadPriority minimumPriority)
{
LOG(ResourceLoading, "ResourceLoadScheduler::servePendingRequests HostInformation.m_name='%s'", host->name().latin1().data());
for (int priority = ResourceLoadPriorityHighest; priority >= minimumPriority; --priority) {
HostInformation::RequestQueue& requestsPending = host->requestsPending(ResourceLoadPriority(priority));
while (!requestsPending.isEmpty()) {
RefPtr<ResourceLoader> resourceLoader = requestsPending.first();
……
resourceLoader->start();
}
}
}
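The scheduling idea (one pending queue per priority, served from the highest priority down to a minimum) can be sketched as follows. ToyHost and ToyPriority are hypothetical names, and the per-host connection limit that the elided part of the real loop may enforce is deliberately left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical priority levels, lowest to highest.
enum ToyPriority { VeryLow = 0, Low, Medium, High, VeryHigh, PriorityCount };

// Hypothetical stand-in for HostInformation: one FIFO queue per priority level.
class ToyHost {
public:
    void schedule(const std::string& url, ToyPriority priority) {
        m_pending[priority].push_back(url);
    }

    bool hasRequests() const {
        for (std::size_t p = 0; p < m_pending.size(); ++p)
            if (!m_pending[p].empty())
                return true;
        return false;
    }

    // Starts every pending request whose priority is at least minimumPriority,
    // highest priority first, and returns the URLs in the order they were started.
    std::vector<std::string> servePendingRequests(ToyPriority minimumPriority) {
        std::vector<std::string> started;
        for (int p = VeryHigh; p >= minimumPriority; --p) {
            std::deque<std::string>& queue = m_pending[p];
            while (!queue.empty()) {
                started.push_back(queue.front());
                queue.pop_front();
            }
        }
        return started;
    }

private:
    std::array<std::deque<std::string>, PriorityCount> m_pending;
};
```

Requests below the minimum stay queued, which is how a later, more important request can overtake an earlier low-priority one.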
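------------------------------- From here on, the resource is actually requested from the network. -------------------------------------------
ResourceLoader is the class that ResourceLoadScheduler uses directly; it connects ResourceLoadScheduler with ResourceHandle. ResourceLoadScheduler dispatches resource request jobs in priority order, and each job is wrapped in a ResourceLoader instance. When a ResourceLoader is scheduled, its ResourceLoader::start() method is called, and ResourceLoader::start() creates a ResourceHandle, as in the following code: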
void ResourceLoader::start()
{
ASSERT(!m_handle);
ASSERT(!m_request.isNull());
ASSERT(m_deferredRequest.isNull());
#if ENABLE(WEB_ARCHIVE) || ENABLE(MHTML)
if (m_documentLoader->scheduleArchiveLoad(this, m_request))
return;
#endif
if (m_documentLoader->applicationCacheHost()->maybeLoadResource(this, m_request, m_request.url()))
return;
if (m_defersLoading) {
m_deferredRequest = m_request;
return;
}
// The ResourceHandle instance is created here
if (!m_reachedTerminalState)
m_handle = ResourceHandle::create(m_frame->loader()->networkingContext(), m_request, this, m_defersLoading, m_options.sniffContent == SniffContent);
}
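A ResourceHandle starts running as soon as it is created. Here is the function that creates it:
PassRefPtr<ResourceHandle> ResourceHandle::create(NetworkingContext* context, const ResourceRequest& request, ResourceHandleClient* client, bool defersLoading, bool shouldContentSniff)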
{
BuiltinResourceHandleConstructorMap::iterator protocolMapItem = builtinResourceHandleConstructorMap().find(request.url().protocol());
if (protocolMapItem != builtinResourceHandleConstructorMap().end())
return protocolMapItem->value(request, client);
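RefPtr<ResourceHandle> newHandle(adoptRef(new ResourceHandle(context, request, client, defersLoading, shouldContentSniff)));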
if (newHandle->d->m_scheduledFailureType != NoFailure)
return newHandle.release();
// newHandle's start() method is called here
if (newHandle->start())
return newHandle.release();
return 0;
}
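The first half of ResourceHandle::create() looks up the URL's protocol in builtinResourceHandleConstructorMap() and, on a hit, lets the registered constructor build the handle instead of the default path. A minimal sketch of that dispatch pattern, assuming a plain map from scheme to factory function; ToyHandle, the "toy" scheme, and the kind strings are hypothetical:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical handle type; "kind" records which path created it.
struct ToyHandle {
    std::string kind;
    std::string url;
};

typedef std::function<std::shared_ptr<ToyHandle>(const std::string&)> ToyHandleConstructor;

// Scheme -> constructor registry, created on first use.
std::map<std::string, ToyHandleConstructor>& builtinConstructors() {
    static std::map<std::string, ToyHandleConstructor> constructors;
    return constructors;
}

// Returns the text before the first ':' or an empty string if there is none.
std::string schemeOf(const std::string& url) {
    std::string::size_type colon = url.find(':');
    return colon == std::string::npos ? std::string() : url.substr(0, colon);
}

// Registered schemes get their own constructor; everything else takes the default path.
std::shared_ptr<ToyHandle> createHandle(const std::string& url) {
    std::map<std::string, ToyHandleConstructor>::iterator it = builtinConstructors().find(schemeOf(url));
    if (it != builtinConstructors().end())
        return it->second(url);
    return std::make_shared<ToyHandle>(ToyHandle{"network", url});
}
```

Registering a constructor is a plain map assignment, for example builtinConstructors()["toy"] = someFactory; any URL whose scheme has no entry falls through to the default handle.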
Now straight to ResourceHandle's start method, which is implemented in ResourceHandleQt.cpp:
bool ResourceHandle::start()
{
printf("ResourceHandle::start\n");
// If NetworkingContext is invalid then we are no longer attached to a Page,
// this must be an attempted load from an unload event handler, so let's just block it.
if (d->m_context && !d->m_context->isValid())
return false;
if (!d->m_user.isEmpty() || !d->m_pass.isEmpty()) {
// If credentials were specified for this request, add them to the url,
// so that they will be passed to QNetworkRequest.
KURL urlWithCredentials(firstRequest().url());
urlWithCredentials.setUser(d->m_user);
urlWithCredentials.setPass(d->m_pass);
d->m_firstRequest.setURL(urlWithCredentials);
}
ResourceHandleInternal *d = getInternal();
// The next line creates the QNetworkReplyHandler instance and passes AsynchronousLoad (asynchronous loading)
d->m_job = new QNetworkReplyHandler(this, QNetworkReplyHandler::AsynchronousLoad, d->m_defersLoading);
return true;
}
Here is the QNetworkReplyHandler constructor:
QNetworkReplyHandler::QNetworkReplyHandler(ResourceHandle* handle, LoadType loadType, bool deferred)
: QObject(0)
, m_resourceHandle(handle)
, m_loadType(loadType)
, m_redirectionTries(gMaxRedirections)
, m_queue(this, deferred)
{
const ResourceRequest &r = m_resourceHandle->firstRequest();
if (r.httpMethod() == "GET")
m_method = QNetworkAccessManager::GetOperation;
else if (r.httpMethod() == "HEAD")
m_method = QNetworkAccessManager::HeadOperation;
else if (r.httpMethod() == "POST")
m_method = QNetworkAccessManager::PostOperation;
else if (r.httpMethod() == "PUT")
m_method = QNetworkAccessManager::PutOperation;
else if (r.httpMethod() == "DELETE")
m_method = QNetworkAccessManager::DeleteOperation;
else
m_method = QNetworkAccessManager::CustomOperation;
m_request = r.toNetworkRequest(m_resourceHandle->getInternal()->m_context.get());
// Note this statement: the argument is a function
m_queue.push(&QNetworkReplyHandler::start);
}
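I will not dig into what m_queue is for now. Since the argument is a function, that function is bound to be called at some point, so I will simply assume it is called right after it enters the queue and jump ahead to it: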
void QNetworkReplyHandler::start()
{
printf("QNetworkReplyHandler::start\n");
ResourceHandleInternal* d = m_resourceHandle->getInternal();
if (!d || !d->m_context)
return;
// ### The request is sent here
QNetworkReply* reply = sendNetworkRequest(d->m_context->networkAccessManager(), d->m_firstRequest);
if (!reply)
return;
// ### QNetworkReplyWrapper registers the data-ready callbacks
m_replyWrapper = adoptPtr(new QNetworkReplyWrapper(&m_queue, reply, m_resourceHandle->shouldContentSniff() && d->m_context->mimeSniffingEnabled(), this));
if (m_loadType == SynchronousLoad) {
m_replyWrapper->synchronousLoad();
// If supported, a synchronous request will be finished at this point, no need to hook up the signals.
// In the synchronous case the load completes here and we return
return;
}
// The asynchronous case reaches this point: start a timer first and, if needed, register an upload progress callback
double timeoutInSeconds = d->m_firstRequest.timeoutInterval();
if (timeoutInSeconds > 0 && timeoutInSeconds < (INT_MAX / 1000))
m_timeoutTimer.start(timeoutInSeconds * 1000, this);
if (m_resourceHandle->firstRequest().reportUploadProgress())
connect(m_replyWrapper->reply(), SIGNAL(uploadProgress(qint64, qint64)), this, SLOT(uploadProgress(qint64, qint64)));
}
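The m_queue.push(&QNetworkReplyHandler::start) call in the constructor hands a pointer to a member function to a queue that invokes it later. The sketch below shows one way such a queue could behave; that queued calls run immediately unless the queue is deferred is an assumption of this sketch, not something verified against the Qt port. ToyCallQueue and ToyReplyHandler are hypothetical names.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

class ToyReplyHandler;

// Hypothetical call queue: stores member-function pointers and invokes them
// on the handler once the queue is not deferred.
class ToyCallQueue {
public:
    typedef void (ToyReplyHandler::*Method)();

    ToyCallQueue(ToyReplyHandler* handler, bool deferred)
        : m_handler(handler), m_deferred(deferred) {}

    void push(Method method) {
        m_methods.push_back(method);
        if (!m_deferred)
            flush();
    }

    void setDeferred(bool deferred) {
        m_deferred = deferred;
        if (!m_deferred)
            flush();
    }

private:
    void flush(); // defined below, once ToyReplyHandler is a complete type

    ToyReplyHandler* m_handler;
    bool m_deferred;
    std::deque<Method> m_methods;
};

class ToyReplyHandler {
public:
    explicit ToyReplyHandler(bool deferred) : m_queue(this, deferred) {
        m_queue.push(&ToyReplyHandler::start); // same shape as the constructor above
    }

    void start() { log.push_back("start"); }
    void resume() { m_queue.setDeferred(false); }

    std::vector<std::string> log; // records which methods actually ran

private:
    ToyCallQueue m_queue;
};

// Runs queued methods in FIFO order until the queue is empty or deferred again.
inline void ToyCallQueue::flush() {
    while (!m_deferred && !m_methods.empty()) {
        Method method = m_methods.front();
        m_methods.pop_front();
        (m_handler->*method)();
    }
}
```

With deferred set to false the pushed start() runs during construction; with deferred set to true it waits until resume() is called.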
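What exactly does sendNetworkRequest do? I do not know, because I am not familiar with the QNetworkAccessManager class, so I will skip the details and just assume that it sends the HTTP request and returns right away (the request is asynchronous).
So the request has been sent; how do we get the data? After sendNetworkRequest() returns, a QNetworkReplyWrapper instance is created, and the QNetworkReplyWrapper class exists specifically to handle the reply data. It is enough to know that it handles the reply; I will not study its internals.
After its long journey the data finally arrives, and void QNetworkReplyHandler::forwardData() is called, which tells us data is available. Once all the data has been received, void QNetworkReplyHandler::finish() is called. Here is forwardData():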
void QNetworkReplyHandler::forwardData()
{
ASSERT(m_replyWrapper && m_replyWrapper->reply() && !wasAborted() && !m_replyWrapper->wasRedirected());
printf("QNetworkReplyHandler::forwardData\n");
ResourceHandleClient* client = m_resourceHandle->client();
if (!client)
return;
qint64 bytesAvailable = m_replyWrapper->reply()->bytesAvailable();
char* buffer = new char[8128 + 1]; // smaller than 8192 to fit within 8k including overhead.
while (bytesAvailable > 0 && !m_queue.deferSignals()) {
buffer[8128] = 0x00;
qint64 readSize = m_replyWrapper->reply()->read(buffer, 8128);
if (readSize <= 0)
break;
bytesAvailable -= readSize;
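// I added debug output here; it prints the content of the HTML document we requested
buffer[readSize] = 0x00; // terminate at the bytes actually read so "%s" does not print stale data
printf("bytesAvailable = %lld, readSize %lld\n", static_cast<long long>(bytesAvailable), static_cast<long long>(readSize));
printf("%s\n", buffer);
printf("didReceiveData %lld\n", static_cast<long long>(readSize));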
// FIXME:
// -1 means we do not provide any data about transfer size to inspector so it would use
// Content-Length headers or content size to show transfer size.
// The data is reported here. Who is this client? Go back and look at void ResourceLoader::start()
client->didReceiveData(m_resourceHandle, buffer, readSize, -1);
}
delete[] buffer;
if (bytesAvailable > 0)
m_queue.requeue(&QNetworkReplyHandler::forwardData);
}
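Stripped of the Qt types, forwardData() is a chunked read loop: read at most one buffer's worth (8128 bytes above), hand exactly the bytes read to the client, and repeat until nothing is left. A minimal sketch with hypothetical ToyReply and ToyClient types standing in for QNetworkReply and ResourceHandleClient; the deferral and requeue logic is omitted:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical client that records every chunk it is given.
struct ToyClient {
    std::vector<std::string> chunks;
    void didReceiveData(const char* data, std::size_t length) {
        chunks.push_back(std::string(data, length));
    }
};

// Hypothetical reply object backed by an in-memory string.
class ToyReply {
public:
    explicit ToyReply(const std::string& body) : m_body(body), m_offset(0) {}

    std::size_t bytesAvailable() const { return m_body.size() - m_offset; }

    // Copies up to maxSize unread bytes into buffer and returns how many were copied.
    std::size_t read(char* buffer, std::size_t maxSize) {
        std::size_t copied = m_body.copy(buffer, maxSize, m_offset);
        m_offset += copied;
        return copied;
    }

private:
    std::string m_body;
    std::size_t m_offset;
};

// Drains the reply in chunks of at most chunkSize bytes and forwards each chunk.
void forwardData(ToyReply& reply, ToyClient& client, std::size_t chunkSize) {
    std::vector<char> buffer(chunkSize);
    while (reply.bytesAvailable() > 0) {
        std::size_t readSize = reply.read(buffer.data(), buffer.size());
        if (readSize == 0)
            break; // nothing could be read (for example chunkSize == 0)
        client.didReceiveData(buffer.data(), readSize);
    }
}
```

Passing the length along with the pointer is what lets the client treat the buffer as raw bytes rather than as a null-terminated string.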
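In void ResourceLoader::start(), the third argument passed to ResourceHandle::create when creating the ResourceHandle instance is this, and that argument becomes the client of the ResourceHandle created here. Now that we know who the client is, let's look at its didReceiveData() method. ResourceLoader has 2 didReceiveData() methods, but both end up calling didReceiveDataOrBuffer():
void
ResourceLoader::didReceiveDataOrBuffer(const char* data, int length,
PassRefPtr<SharedBuffer> prpBuffer, long long encodedDataLength, DataPayloadType dataPayloadType)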
{
// This method should only get data+length *OR* a SharedBuffer.
ASSERT(!prpBuffer || (!data && !length));
// Protect this in this delegate method since the additional processing can do
// anything including possibly derefing this; one example of this is Radar 3266216.
RefPtr<ResourceLoader> protector(this);
RefPtr<SharedBuffer> buffer = prpBuffer;
// First store the data in the buffer
addDataOrBuffer(data, length, buffer.get(), dataPayloadType);
// FIXME: If we get a resource with more than 2B bytes, this code won't do the right thing.
// However, with today's computers and networking speeds, this won't happen in practice.
// Could be an issue with a giant local file.
// Deliver the data
if (m_options.sendLoadCallbacks == SendCallbacks && m_frame)
frameLoader()->notifier()->didReceiveData(this,
buffer ? buffer->data() : data, buffer ? buffer->size() : length,
static_cast<int>(encodedDataLength));
}
How is the data delivered upward?
frameLoader()->notifier()->didReceiveData(…)
frameLoader()->notifier() returns a ResourceLoadNotifier, so let's find ResourceLoadNotifier::didReceiveData(…):
void ResourceLoadNotifier::didReceiveData(ResourceLoader* loader, const char* data, int dataLength, int encodedDataLength)
{ // Update progress
if (Page* page = m_frame->page())
page->progress()->incrementProgress(loader->identifier(), data, dataLength);
// Keep passing the data up
dispatchDidReceiveData(loader->documentLoader(), loader->identifier(), data, dataLength, encodedDataLength);
}
We are still inside ResourceLoadNotifier, so keep following and look at ResourceLoadNotifier::dispatchDidReceiveData:
void ResourceLoadNotifier::dispatchDidReceiveData(DocumentLoader* loader, unsigned long identifier, const char* data, int dataLength, int encodedDataLength)
{
m_frame->loader()->client()->dispatchDidReceiveContentLength(loader, identifier, dataLength);
InspectorInstrumentation::didReceiveData(m_frame, identifier, data, dataLength, encodedDataLength);
}
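InspectorInstrumentation::didReceiveData can be ignored. At this point something looked odd to me: why does m_frame->loader()->client()->dispatchDidReceiveContentLength(loader, identifier, dataLength); pass only the length and not the data argument? Is this not where the data is really reported? I checked void FrameLoaderClientQt::dispatchDidReceiveContentLength and found that it is an empty implementation, so the data is indeed not reported here. Then where is it reported? I must have gone wrong somewhere earlier in the trace.
Verification showed that when QNetworkReplyHandler::forwardData(…) calls client->didReceiveData(…), the client is not a plain ResourceLoader but a SubresourceLoader. SubresourceLoader is a subclass of ResourceLoader. ResourceLoader is the base class: it implements most of the common functionality but not the delivery of data to the resource, so that delivery is done in SubresourceLoader::didReceiveDataOrBuffer(…). That function calls the parent's ResourceLoader::didReceiveDataOrBuffer(…) because storing the data, reporting progress, and the inspector-related work are all implemented in the parent class, and the subclass does not need to care about them.
void
SubresourceLoader::didReceiveDataOrBuffer(const char* data, int length,
PassRefPtr<SharedBuffer> prpBuffer, long long encodedDataLength, DataPayloadType dataPayloadType)
{ // 1. Check the arguments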
……
printf("SubresourceLoader::didReceiveDataOrBuffer\n");
// Reference the object in this method since the additional processing can do
// anything including removing the last reference to this object; one example of this is 3266216.
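RefPtr<SubresourceLoader> protect(this);
RefPtr<SharedBuffer> buffer = prpBuffer;
// 2. Call the parent class's didReceiveDataOrBuffer
ResourceLoader::didReceiveDataOrBuffer(data, length, buffer, encodedDataLength, dataPayloadType);
// 3. Deliver the data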
if (!m_loadingMultipartContent) {
if (ResourceBuffer* resourceData = this->resourceData())
m_resource->addDataBuffer(resourceData);
else
m_resource->addData(buffer ? buffer->data() : data, buffer ? buffer->size() : length);
}
}
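The reason the trace went astray is ordinary virtual dispatch: the handle only holds a client pointer typed as the base interface, so reading the base class's method misses the subclass override that actually runs. A minimal sketch of that relationship, with hypothetical ToyHandleClient, ToyResourceLoader, and ToySubresourceLoader types and made-up log messages:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical counterpart of the client interface that the handle calls into.
struct ToyHandleClient {
    virtual ~ToyHandleClient() {}
    virtual void didReceiveData(const std::string& data) = 0;
};

// Base loader: buffers the data and does the shared bookkeeping.
class ToyResourceLoader : public ToyHandleClient {
public:
    void didReceiveData(const std::string& data) override { didReceiveDataOrBuffer(data); }

    std::vector<std::string> log; // records which class handled what, in order

protected:
    virtual void didReceiveDataOrBuffer(const std::string& data) {
        m_buffer += data;
        log.push_back("base: buffered " + std::to_string(m_buffer.size()) + " bytes");
    }

    std::string m_buffer;
};

// Subclass: runs the shared work first, then delivers the data to the resource.
class ToySubresourceLoader : public ToyResourceLoader {
public:
    std::string delivered; // stands in for m_resource->addData(...)

protected:
    void didReceiveDataOrBuffer(const std::string& data) override {
        ToyResourceLoader::didReceiveDataOrBuffer(data); // step 2 above
        delivered += data;                               // step 3 above
        log.push_back("sub: delivered to resource");
    }
};
```

Calling didReceiveData() through a ToyHandleClient* that points at a ToySubresourceLoader runs the base bookkeeping and then the subclass delivery, which mirrors the order of steps 2 and 3 in the excerpt.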
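Now we can keep following the data upward. The last few lines call m_resource->addDataBuffer / addData. m_resource is a pointer to CachedResource, which is a base class, so the instance here must be one of its subclasses; which one depends on the resource being requested. Because the page I tested with is a blank page, the subclass should be CachedRawResource, so the call should land in CachedRawResource::addDataBuffer(…).
Let's lay out the chain of calls going upward from there:
CachedRawResource::addDataBuffer
CachedRawResource::notifyClientsDataWasReceived
{
while (CachedRawResourceClient* c = w.next())
c->dataReceived(this, data, length);
}
// c here is the DocumentLoader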
DocumentLoader::dataReceived
DocumentLoader::commitLoad
FrameLoaderClientQt::committedLoad
DocumentLoader::commitData
DocumentWriter::addData
DecodedDataDocumentParser::appendBytes
HTMLDocumentParser::append
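From HTMLDocumentParser::append onward we are in the parsing stage.
Oops. I realize my analysis has drifted: I set out to analyze how ImageLoader loads image data, but when it came to the data being reported I analyzed it as an HTML page. The train of thought got a bit muddled, but that is fine: they are all resources and the request process is much the same. When I have time I will analyze how an Image is displayed and walk through its data reporting path as well.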
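The top of the call chain above, where the resource walks its clients and calls dataReceived() on each, is a plain observer pattern. A minimal sketch, assuming a simple vector of client pointers instead of WebKit's client walker; ToyRawResource, ToyRawResourceClient, and ToyDocumentLoader are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical counterpart of CachedRawResourceClient.
struct ToyRawResourceClient {
    virtual ~ToyRawResourceClient() {}
    virtual void dataReceived(const char* data, std::size_t length) = 0;
};

// Hypothetical counterpart of CachedRawResource: stores data and notifies clients.
class ToyRawResource {
public:
    void addClient(ToyRawResourceClient* client) { m_clients.push_back(client); }

    // Appends the bytes to the resource and forwards them to every registered client.
    void addData(const char* data, std::size_t length) {
        m_data.append(data, length);
        for (std::size_t i = 0; i < m_clients.size(); ++i)
            m_clients[i]->dataReceived(data, length);
    }

    const std::string& data() const { return m_data; }

private:
    std::vector<ToyRawResourceClient*> m_clients; // not owned
    std::string m_data;
};

// Hypothetical counterpart of DocumentLoader: collects what would go to the parser.
class ToyDocumentLoader : public ToyRawResourceClient {
public:
    void dataReceived(const char* data, std::size_t length) override {
        m_parserInput.append(data, length);
    }

    const std::string& parserInput() const { return m_parserInput; }

private:
    std::string m_parserInput;
};
```

Each addData() call reaches the loader right away, so the parser input grows chunk by chunk rather than only once the load has finished.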