Category: Embedded

2014-12-28 22:35:35

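There are already plenty of documents online describing WebKit's resource loading process, but after reading them I was left only half understanding it, and whatever I did take in was forgotten within a few days. So I decided to walk through the resource loading flow against the source code myself to make it stick.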

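First, a look at MemoryCache

MemoryCache maintains the list of all cached resources. It is a hash table whose key is the resource URL and whose value is a CachedResource*. If you know Memcached (used for server-side caching), you can understand MemoryCache through the same concepts; it is the same idea.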

 

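The resources that HTML supports mainly fall into the following types:

1. HTML pages

2. Font files

3. Images

4. CSS Shaders

5. Video, audio and subtitles

6. Scripts

7. CSS style sheets

8. XSL style sheets

9. SVG: used to draw SVG 2D graphics

WebKit has a class for each of these 9 resource types, and all of them derive from CachedResource.

CachedResource has the following main subclasses, each representing a concrete kind of resource: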

CachedRawResource

CachedFont

CachedImage

CachedShader

CachedTextTrack

CachedScript

CachedCSSStyleSheet

CachedXSLStyleSheet

CachedSVGDocument

 

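During resource loading, the first class the outside world deals with is CachedResourceLoader. It checks the MemoryCache first and returns the resource directly if it is found; otherwise it builds a ResourceRequest and then calls ResourceLoader to fetch the resource from the network.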
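To make the cache-first flow concrete, here is a minimal sketch of it. None of these types are WebKit's: SketchCachedResource, SketchCachedResourceLoader and the Fetcher callback are hypothetical stand-ins for CachedResource, CachedResourceLoader and ResourceLoader, and the cache is a plain map keyed by URL.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for CachedResource: just a URL and its bytes.
struct SketchCachedResource {
    std::string url;
    std::string data;
};

// Hypothetical stand-in for CachedResourceLoader.
class SketchCachedResourceLoader {
public:
    // Plays the role of ResourceLoader: turns a URL into bytes.
    using Fetcher = std::function<std::string(const std::string&)>;

    explicit SketchCachedResourceLoader(Fetcher fetcher)
        : m_fetch(std::move(fetcher)) {}

    std::shared_ptr<SketchCachedResource> requestResource(const std::string& url) {
        // 1. Look in the cache (key: URL, value: resource).
        auto it = m_cache.find(url);
        if (it != m_cache.end())
            return it->second;

        // 2. Cache miss: fetch the resource and remember it.
        auto resource = std::make_shared<SketchCachedResource>();
        resource->url = url;
        resource->data = m_fetch(url);
        ++m_fetchCount;
        m_cache[url] = resource;
        return resource;
    }

    int fetchCount() const { return m_fetchCount; }

private:
    Fetcher m_fetch;
    std::map<std::string, std::shared_ptr<SketchCachedResource>> m_cache;
    int m_fetchCount = 0;
};
```

Requesting the same URL twice through one SketchCachedResourceLoader returns the same object and calls the fetcher only once, which is the behaviour the paragraph above describes.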

 

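The three kinds of Loader in WebKit

1. ResourceLoader: fetches resources from the network stack or the local disk

2. CachedResourceLoader: fetches resources from the MemoryCache

3. Specialized loaders such as FontLoader, LinkLoader and ImageLoader, which live in an Element and are what users of the resource deal with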

 

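The whole process of ImageLoader loading an image:

When an HTMLImageElement is created, an ImageLoader instance is created along with it to load the image. Starting from updateFromElement(), let's see how this ImageLoader instance, m_imageLoader, is used: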

void ImageLoader::updateFromElement()

{

    Document* document = m_element->document();

    CachedResourceHandle<CachedImage> newImage = 0;

    if (!attr.isNull() && !stripLeadingAndTrailingHTMLSpaces(attr).isEmpty()) {

        CachedResourceRequest request(ResourceRequest(document->completeURL(sourceURI(attr))));

        ……

 

        if (m_loadManually) {

            ……

        } else {

            // Get the CachedResourceLoader from the document and call its requestImage() method; requestImage() is examined next.

            newImage = document->cachedResourceLoader()->requestImage(request);

        }

 

        // If we do not have an image here, it means that a cross-site

        // violation occurred, or that the image was blocked via Content

        // Security Policy, or the page is being dismissed. Trigger an

        // error event if the page is not being dismissed.

        if (!newImage && !pageIsBeingDismissed(document)) {

            m_failedLoadURL = attr;

            m_hasPendingErrorEvent = true;

            errorEventSender().dispatchEventSoon(this);

        } else

            clearFailedLoadURL();

    } else if (!attr.isNull()) {

        ……

    }

}

Most of the code is omitted; we only look at how the image resource is requested. Every resource request in CachedResourceLoader eventually ends up in requestResource.

CachedResourceHandle<CachedImage> CachedResourceLoader::requestImage(CachedResourceRequest& request)

{

    ……

    return static_cast<CachedImage*>(requestResource(CachedResource::ImageResource, request).get());

}

 

CachedResourceHandle<CachedResource> CachedResourceLoader::requestResource(CachedResource::Type type, CachedResourceRequest& request)

{

    KURL url = request.resourceRequest().url();

    ……

    // Look up the MemoryCache

    resource = memoryCache()->resourceForRequest(request.resourceRequest());

    // Decide what to do next

    const RevalidationPolicy policy = determineRevalidationPolicy(type, request.mutableResourceRequest(), request.forPreload(), resource.get(), request.defer());

    switch (policy) {

    case Reload:

        memoryCache()->remove(resource.get());

        // Fall through

    case Load:

        resource = loadResource(type, request, request.charset());

        break;

    case Revalidate:

        resource = revalidateResource(request, resource.get());

        break;

    case Use:

        if (!shouldContinueAfterNotifyingLoadedFromMemoryCache(resource.get()))

            return 0;

        memoryCache()->resourceAccessed(resource.get());

        break;

    }

 

    if (!resource)

        return 0;

 

    if (!request.forPreload() || policy != Use)

        resource->setLoadPriority(request.priority());

 

    if ((policy != Use || resource->stillNeedsLoad()) && CachedResourceRequest::NoDefer == request.defer()) {

        // Load the resource

        resource->load(this, request.options());

                  

        // We don't support immediate loads, but we do support immediate failure.

        if (resource->errorOccurred()) {

            if (resource->inCache())

                memoryCache()->remove(resource.get());

            return 0;

        }

    }

    ……

    return resource;

}
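The switch on the revalidation policy is the heart of requestResource, so here is a small sketch of just that dispatch. SketchPolicy and actionsForPolicy are hypothetical names; the action strings only label what the corresponding branch in the listing above does, including the Reload case falling through into Load.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical mirror of the four policies seen in the switch above.
enum class SketchPolicy { Use, Revalidate, Reload, Load };

// Returns, in order, the actions the switch would take for a policy.
std::vector<std::string> actionsForPolicy(SketchPolicy policy) {
    std::vector<std::string> actions;
    switch (policy) {
    case SketchPolicy::Reload:
        actions.push_back("remove-from-cache");
        // Fall through: once evicted, a reload is an ordinary load.
    case SketchPolicy::Load:
        actions.push_back("load");
        break;
    case SketchPolicy::Revalidate:
        actions.push_back("revalidate");
        break;
    case SketchPolicy::Use:
        actions.push_back("use-cached");
        break;
    }
    return actions;
}
```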

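In CachedResourceLoader::requestResource(…), set aside the case where the resource is already in the MemoryCache and look only at how a request goes through the network stack. Loading eventually reaches resource->load(this, request.options());, and what gets called here should be CachedImage::load(…). Note the two arguments: the first is this, i.e. the CachedResourceLoader pointer; the second is a set of options that can be ignored for now.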

void CachedImage::load(CachedResourceLoader* cachedResourceLoader, const ResourceLoaderOptions& options)

{

    if (!cachedResourceLoader || cachedResourceLoader->autoLoadImages())

        CachedResource::load(cachedResourceLoader, options);

    else

        setLoading(false);

}

This in turn calls CachedResource::load(…). In that function I only care about the longest line of code:

m_loader = platformStrategies()->loaderStrategy()->resourceLoadScheduler()->scheduleSubresourceLoad(cachedResourceLoader->frame(), this, request, request.priority(), options);

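Searching the whole code base turns up two implementations of scheduleSubresourceLoad: one in WebCore, the other in WebKit2's WebProcess. I am not analyzing WebKit2's multi-process model for now, so let's go straight to the WebCore implementation: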

PassRefPtr<SubresourceLoader> ResourceLoadScheduler::scheduleSubresourceLoad(Frame* frame, CachedResource* resource, const ResourceRequest& request, ResourceLoadPriority priority, const ResourceLoaderOptions& options)

{

    RefPtr<SubresourceLoader> loader = SubresourceLoader::create(frame, resource, request, options);

    if (loader)

        scheduleLoad(loader.get(), priority);

    return loader.release();

}

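This function is simple: it creates the loader, schedules it, and is done. In effect it puts the resource request into a queue and then decides whether to dispatch it immediately or let it wait in line. Here is scheduleLoad: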

void ResourceLoadScheduler::scheduleLoad(ResourceLoader* resourceLoader, ResourceLoadPriority priority)

{

    ……

    HostInformation* host = hostForURL(resourceLoader->url(), CreateIfNotFound);   

    bool hadRequests = host->hasRequests();

    host->schedule(resourceLoader, priority);

 

    if (priority > ResourceLoadPriorityLow || !resourceLoader->url().protocolIsInHTTPFamily() || (priority == ResourceLoadPriorityLow && !hadRequests)) {

        // Try to request important resources immediately.

        servePendingRequests(host, priority);

        return;

    }

 

    notifyDidScheduleResourceRequest(resourceLoader);

 

    // Handle asynchronously so early low priority requests don't

    // get scheduled before later high priority ones.

    scheduleServePendingRequests();

}

Whether it is dispatched immediately or a little later, it always ends up in ResourceLoadScheduler::servePendingRequests:
void ResourceLoadScheduler::servePendingRequests(HostInformation* host, ResourceLoadPriority minimumPriority)

{

    LOG(ResourceLoading, "ResourceLoadScheduler::servePendingRequests HostInformation.m_name='%s'", host->name().latin1().data());

 

    for (int priority = ResourceLoadPriorityHighest; priority >= minimumPriority; --priority) {

        HostInformation::RequestQueue& requestsPending = host->requestsPending(ResourceLoadPriority(priority));

 

        while (!requestsPending.isEmpty()) {

            RefPtr<ResourceLoader> resourceLoader = requestsPending.first();

            ……

            resourceLoader->start();

        }

    }

}
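The scheduling idea in scheduleLoad and servePendingRequests (one queue per priority, drained from the highest priority down to a minimum) can be sketched as follows. SketchHostQueue and the SketchPriority values are hypothetical, requests are just names, and the sketch leaves out whatever the elided part of the real loop does before starting a loader.

```cpp
#include <array>
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical priority levels, lowest to highest.
enum SketchPriority { VeryLow = 0, Low, Medium, High, Highest, PriorityCount };

// Hypothetical stand-in for HostInformation: one pending queue per priority.
class SketchHostQueue {
public:
    void schedule(const std::string& name, SketchPriority priority) {
        m_pending[priority].push_back(name);
    }

    bool hasRequests() const {
        for (const auto& queue : m_pending) {
            if (!queue.empty())
                return true;
        }
        return false;
    }

    // Starts every pending request whose priority is at least minimumPriority,
    // highest priority first, and returns the names in start order.
    std::vector<std::string> servePendingRequests(SketchPriority minimumPriority) {
        std::vector<std::string> started;
        for (int priority = Highest; priority >= minimumPriority; --priority) {
            std::deque<std::string>& queue = m_pending[priority];
            while (!queue.empty()) {
                started.push_back(queue.front()); // stands in for resourceLoader->start()
                queue.pop_front();
            }
        }
        return started;
    }

private:
    std::array<std::deque<std::string>, PriorityCount> m_pending;
};
```

Serving with a minimum of Medium starts only the Medium-and-above requests and leaves the low-priority ones queued for a later pass, which is why early low-priority requests do not get ahead of later high-priority ones.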

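------------------------------- From here on, the resource is actually requested from the network. -------------------------------------------

ResourceLoader is the class that ResourceLoadScheduler uses directly; it ties ResourceLoadScheduler and ResourceHandle together. ResourceLoadScheduler dispatches resource request jobs according to their priority, and each job is wrapped in a ResourceLoader instance. When a ResourceLoader is dispatched, its ResourceLoader::start() method is called, and ResourceLoader::start() creates the ResourceHandle, as in the following code: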

void ResourceLoader::start()

{

    ASSERT(!m_handle);

    ASSERT(!m_request.isNull());

    ASSERT(m_deferredRequest.isNull());

 

#if ENABLE(WEB_ARCHIVE) || ENABLE(MHTML)

    if (m_documentLoader->scheduleArchiveLoad(this, m_request))

        return;

#endif

 

    if (m_documentLoader->applicationCacheHost()->maybeLoadResource(this, m_request, m_request.url()))

        return;

 

    if (m_defersLoading) {

        m_deferredRequest = m_request;

        return;

    }


    // The ResourceHandle instance is created here

    if (!m_reachedTerminalState)

        m_handle = ResourceHandle::create(m_frame->loader()->networkingContext(), m_request, this, m_defersLoading, m_options.sniffContent == SniffContent);

}

A ResourceHandle starts running as soon as it is created. Here is the ResourceHandle creation function:

PassRefPtr<ResourceHandle> ResourceHandle::create(NetworkingContext* context, const ResourceRequest& request, ResourceHandleClient* client, bool defersLoading, bool shouldContentSniff)

{

    BuiltinResourceHandleConstructorMap::iterator protocolMapItem = builtinResourceHandleConstructorMap().find(request.url().protocol());

 

    if (protocolMapItem != builtinResourceHandleConstructorMap().end())

        return protocolMapItem->value(request, client);

 

    RefPtr<ResourceHandle> newHandle(adoptRef(new ResourceHandle(context, request, client, defersLoading, shouldContentSniff)));

 

    if (newHandle->d->m_scheduledFailureType != NoFailure)

        return newHandle.release();

    // newHandle's start() method is called here

    if (newHandle->start())

        return newHandle.release();

 

    return 0;

}

 

Now the ResourceHandle start method itself, which is implemented in ResourceHandleQt.cpp:

bool ResourceHandle::start()

{

    printf("ResourceHandle::start\n");

    // If NetworkingContext is invalid then we are no longer attached to a Page,

    // this must be an attempted load from an unload event handler, so let's just block it.

    if (d->m_context && !d->m_context->isValid())

        return false;

 

    if (!d->m_user.isEmpty() || !d->m_pass.isEmpty()) {

        // If credentials were specified for this request, add them to the url,

        // so that they will be passed to QNetworkRequest.

        KURL urlWithCredentials(firstRequest().url());

        urlWithCredentials.setUser(d->m_user);

        urlWithCredentials.setPass(d->m_pass);

        d->m_firstRequest.setURL(urlWithCredentials);

    }

 

    ResourceHandleInternal *d = getInternal();

    // The next line creates the QNetworkReplyHandler instance and passes AsynchronousLoad (asynchronous loading)

    d->m_job = new QNetworkReplyHandler(this, QNetworkReplyHandler::AsynchronousLoad, d->m_defersLoading);

    return true;

}

Now the QNetworkReplyHandler constructor:

QNetworkReplyHandler::QNetworkReplyHandler(ResourceHandle* handle, LoadType loadType, bool deferred)

    : QObject(0)

    , m_resourceHandle(handle)

    , m_loadType(loadType)

    , m_redirectionTries(gMaxRedirections)

    , m_queue(this, deferred)

{

    const ResourceRequest &r = m_resourceHandle->firstRequest();

 

    if (r.httpMethod() == "GET")

        m_method = QNetworkAccessManager::GetOperation;

    else if (r.httpMethod() == "HEAD")

        m_method = QNetworkAccessManager::HeadOperation;

    else if (r.httpMethod() == "POST")

        m_method = QNetworkAccessManager::PostOperation;

    else if (r.httpMethod() == "PUT")

        m_method = QNetworkAccessManager::PutOperation;

    else if (r.httpMethod() == "DELETE")

        m_method = QNetworkAccessManager::DeleteOperation;

    else

        m_method = QNetworkAccessManager::CustomOperation;

 

    m_request = r.toNetworkRequest(m_resourceHandle->getInternal()->m_context.get());

    // Note this line: the argument passed is a function

    m_queue.push(&QNetworkReplyHandler::start);

}

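I won't look into what m_queue is for now. Since the argument passed here is a function, that function is bound to be called at some point, so I'll simply assume it is called right after it enters the queue and jump ahead to it: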

void QNetworkReplyHandler::start()

{

    printf("QNetworkReplyHandler::start\n");

    ResourceHandleInternal* d = m_resourceHandle->getInternal();

    if (!d || !d->m_context)

        return;

    // ### The request is sent here

    QNetworkReply* reply = sendNetworkRequest(d->m_context->networkAccessManager(), d->m_firstRequest);

    if (!reply)

        return;

    // ### The QNetworkReplyWrapper created here registers the data-ready callbacks

    m_replyWrapper = adoptPtr(new QNetworkReplyWrapper(&m_queue, reply, m_resourceHandle->shouldContentSniff() && d->m_context->mimeSniffingEnabled(), this));

 

    if (m_loadType == SynchronousLoad) {

        m_replyWrapper->synchronousLoad();

        // If supported, a synchronous request will be finished at this point, no need to hook up the signals.

        // In the synchronous case the load completes right here and we return

        return;

    }

    // The asynchronous case reaches here: start a timeout timer first, then register an upload-progress callback if needed

    double timeoutInSeconds = d->m_firstRequest.timeoutInterval();

    if (timeoutInSeconds > 0 && timeoutInSeconds < (INT_MAX / 1000))

        m_timeoutTimer.start(timeoutInSeconds * 1000, this);

 

    if (m_resourceHandle->firstRequest().reportUploadProgress())

        connect(m_replyWrapper->reply(), SIGNAL(uploadProgress(qint64, qint64)), this, SLOT(uploadProgress(qint64, qint64)));

}
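The m_queue.push(&QNetworkReplyHandler::start) call in the constructor passes a pointer to a member function, to be invoked later. I have not read the real queue class, so the following is only a sketch of how such a queue could work under my own assumptions: SketchCallQueue and SketchHandler are hypothetical, and the only rule modeled is "run queued calls as soon as the handler is not deferred".

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

class SketchHandler; // defined below

// Hypothetical call queue: stores pointers to SketchHandler member functions
// and invokes them in order once deferral is lifted.
class SketchCallQueue {
public:
    typedef void (SketchHandler::*EnqueuedCall)();

    SketchCallQueue(SketchHandler* handler, bool deferred)
        : m_handler(handler), m_deferred(deferred) {}

    void push(EnqueuedCall call) {
        m_calls.push_back(call);
        flush();
    }

    void setDeferred(bool deferred) {
        m_deferred = deferred;
        flush();
    }

private:
    void flush(); // defined after SketchHandler is a complete type

    SketchHandler* m_handler;
    bool m_deferred;
    std::deque<EnqueuedCall> m_calls;
};

// Hypothetical handler that, like the constructor above, enqueues its own
// start() instead of calling it directly.
class SketchHandler {
public:
    explicit SketchHandler(bool deferred) : m_queue(this, deferred) {
        m_queue.push(&SketchHandler::start);
    }

    void start() { m_log.push_back("start"); }
    void setDeferred(bool deferred) { m_queue.setDeferred(deferred); }
    const std::vector<std::string>& log() const { return m_log; }

private:
    std::vector<std::string> m_log; // declared before m_queue so it is constructed first
    SketchCallQueue m_queue;
};

inline void SketchCallQueue::flush() {
    assert(m_handler);
    while (!m_deferred && !m_calls.empty()) {
        EnqueuedCall call = m_calls.front();
        m_calls.pop_front();
        (m_handler->*call)(); // invoke the queued member function
    }
}
```

With deferred set to false, start() runs inside the constructor's push; with deferred set to true, it runs only after setDeferred(false). That matches the simplifying assumption made above that the queued function gets called shortly after being pushed.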

 

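What exactly does sendNetworkRequest do? I don't know, because I'm not familiar with the QNetworkAccessManager class, so I'll skip the details and simply assume that it sends the HTTP request and returns immediately (since the request is asynchronous).

So the request has gone out; how does the data come back? Right after the call to sendNetworkRequest(), a QNetworkReplyWrapper instance is created, and QNetworkReplyWrapper is the class dedicated to handling the reply. Knowing that it handles the reply is enough here; its internals are not examined.

After a long journey the data finally arrives: void QNetworkReplyHandler::forwardData() gets called, which tells us data has come in, and once all the data has been received, void QNetworkReplyHandler::finish() is called. Here is forwardData():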

void QNetworkReplyHandler::forwardData()

{

    ASSERT(m_replyWrapper && m_replyWrapper->reply() && !wasAborted() && !m_replyWrapper->wasRedirected());

 

    printf("QNetworkReplyHandler::forwardData\n");

    ResourceHandleClient* client = m_resourceHandle->client();

    if (!client)

        return;

 

    qint64 bytesAvailable = m_replyWrapper->reply()->bytesAvailable();

    char* buffer = new char[8128 + 1]; // smaller than 8192 to fit within 8k including overhead.

    while (bytesAvailable > 0 && !m_queue.deferSignals()) {

        buffer[8128] = 0x00;

        qint64 readSize = m_replyWrapper->reply()->read(buffer, 8128);

        if (readSize <= 0)

            break;

        bytesAvailable -= readSize;

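        // I added prints here; the output is the content of the HTML document we requested.

        printf("bytesAvailable = %lld, readSize %lld\n", static_cast<long long>(bytesAvailable), static_cast<long long>(readSize));

        buffer[readSize] = 0x00; // terminate at the bytes actually read before printing as a string

        printf("%s\n", buffer);

        printf("didReceiveData %lld\n", static_cast<long long>(readSize));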

        // FIXME:

        // -1 means we do not provide any data about transfer size to inspector so it would use

        // Content-Length headers or content size to show transfer size.

        // The data is reported here. Who is the client? Go back and look at void ResourceLoader::start()

        client->didReceiveData(m_resourceHandle, buffer, readSize, -1);

    }

    delete[] buffer;

    if (bytesAvailable > 0)

        m_queue.requeue(&QNetworkReplyHandler::forwardData);

}
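Stripped of the Qt types, forwardData is a chunked read loop: read up to a fixed number of bytes, hand exactly that many bytes to the client, repeat until nothing is left. A minimal sketch of that loop, where SketchReply and SketchClient are hypothetical stand-ins for QNetworkReply and ResourceHandleClient and the deferral check is left out:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical data sink playing the role of ResourceHandleClient.
struct SketchClient {
    std::string received;
    int calls = 0;

    void didReceiveData(const char* data, std::size_t length) {
        received.append(data, length);
        ++calls;
    }
};

// Hypothetical byte source playing the role of QNetworkReply.
class SketchReply {
public:
    explicit SketchReply(std::string body) : m_body(std::move(body)) {}

    std::size_t bytesAvailable() const { return m_body.size() - m_offset; }

    // Copies at most maxSize bytes into out and returns how many were copied.
    std::size_t read(char* out, std::size_t maxSize) {
        std::size_t n = std::min(maxSize, bytesAvailable());
        std::memcpy(out, m_body.data() + m_offset, n);
        m_offset += n;
        return n;
    }

private:
    std::string m_body;
    std::size_t m_offset = 0;
};

// Reads the reply in chunks of at most chunkSize bytes and forwards each
// chunk, with its exact length, to the client.
void forwardData(SketchReply& reply, SketchClient& client, std::size_t chunkSize) {
    assert(chunkSize > 0);
    std::vector<char> buffer(chunkSize);
    while (reply.bytesAvailable() > 0) {
        std::size_t readSize = reply.read(buffer.data(), buffer.size());
        if (readSize == 0)
            break;
        client.didReceiveData(buffer.data(), readSize);
    }
}
```

The point to notice is that the client always receives a pointer plus a length, never a NUL-terminated string, so anything that prints the buffer with %s has to terminate it at the number of bytes actually read.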

In void ResourceLoader::start(), the third argument passed to ResourceHandle::create when creating the ResourceHandle instance is this, and that argument becomes the client of the ResourceHandle being created. Now that we know who the client is, let's look at its didReceiveData() method. ResourceLoader has two didReceiveData() methods, but both end up calling didReceiveDataOrBuffer():

void ResourceLoader::didReceiveDataOrBuffer(const char* data, int length, PassRefPtr<SharedBuffer> prpBuffer, long long encodedDataLength, DataPayloadType dataPayloadType)

{

    // This method should only get data+length *OR* a SharedBuffer.

    ASSERT(!prpBuffer || (!data && !length));

 

    // Protect this in this delegate method since the additional processing can do

    // anything including possibly derefing this; one example of this is Radar 3266216.

    RefPtr<ResourceLoader> protector(this);

    RefPtr<SharedBuffer> buffer = prpBuffer;

    // Store the data in the buffer first

    addDataOrBuffer(data, length, buffer.get(), dataPayloadType);

   

    // FIXME: If we get a resource with more than 2B bytes, this code won't do the right thing.

    // However, with today's computers and networking speeds, this won't happen in practice.

    // Could be an issue with a giant local file.

    // Pass the data on

    if (m_options.sendLoadCallbacks == SendCallbacks && m_frame)

        frameLoader()->notifier()->didReceiveData(this, buffer ? buffer->data() : data, buffer ? buffer->size() : length, static_cast<int>(encodedDataLength));

}

How is the data passed up?

frameLoader()->notifier()->didReceiveData(…)

frameLoader()->notifier() returns a ResourceLoadNotifier, so let's find ResourceLoadNotifier::didReceiveData(…):

 

void ResourceLoadNotifier::didReceiveData(ResourceLoader* loader, const char* data, int dataLength, int encodedDataLength)

{   // Update the progress

    if (Page* page = m_frame->page())

        page->progress()->incrementProgress(loader->identifier(), data, dataLength);

    // Keep passing the data up

    dispatchDidReceiveData(loader->documentLoader(), loader->identifier(), data, dataLength, encodedDataLength);

}

We are still inside ResourceLoadNotifier, so keep following, into ResourceLoadNotifier::dispatchDidReceiveData:

void ResourceLoadNotifier::dispatchDidReceiveData(DocumentLoader* loader, unsigned long identifier, const char* data, int dataLength, int encodedDataLength)

{

    m_frame->loader()->client()->dispatchDidReceiveContentLength(loader, identifier, dataLength);

    InspectorInstrumentation::didReceiveData(m_frame, identifier, data, dataLength, encodedDataLength);

}

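InspectorInstrumentation::didReceiveData can be ignored. At this point something looked odd to me: why does m_frame->loader()->client()->dispatchDidReceiveContentLength(loader, identifier, dataLength); pass no data argument and report only the length? Is this not where the data is really delivered? I looked at void FrameLoaderClientQt::dispatchDidReceiveContentLength and found that it is an empty implementation, so the data is indeed not delivered here. Then where is it delivered? I must have taken a wrong turn somewhere above.

Verification showed that when QNetworkReplyHandler::forwardData(…) calls client->didReceiveData(…), the client is not a plain ResourceLoader but a SubresourceLoader. SubresourceLoader is a subclass of ResourceLoader. ResourceLoader is only a base class: it implements most of the common functionality but does not deliver the data. The delivery is done in SubresourceLoader::didReceiveDataOrBuffer(…), which calls the parent's ResourceLoader::didReceiveDataOrBuffer(…) because storing the data, reporting progress, and the inspector-related work are all implemented in the parent class, so the subclass does not have to care about them.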

void SubresourceLoader::didReceiveDataOrBuffer(const char* data, int length, PassRefPtr<SharedBuffer> prpBuffer, long long encodedDataLength, DataPayloadType dataPayloadType)

{    // 1. Check the arguments

    ……

    printf("SubresourceLoader::didReceiveDataOrBuffer\n");

    // Reference the object in this method since the additional processing can do

    // anything including removing the last reference to this object; one example of this is 3266216.

    RefPtr<SubresourceLoader> protect(this);

    RefPtr<SharedBuffer> buffer = prpBuffer;

    // 2. Call the parent class's didReceiveDataOrBuffer

    ResourceLoader::didReceiveDataOrBuffer(data, length, buffer, encodedDataLength, dataPayloadType);

    // 3. Deliver the data

    if (!m_loadingMultipartContent) {

        if (ResourceBuffer* resourceData = this->resourceData())

            m_resource->addDataBuffer(resourceData);

        else

            m_resource->addData(buffer ? buffer->data() : data, buffer ? buffer->size() : length);

    }

}
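The reason the first trace went astray is ordinary virtual dispatch: the ResourceHandle holds a base-class pointer, but the object behind it is the subclass. A small sketch of that pattern, with hypothetical SketchResourceLoader and SketchSubresourceLoader classes in which the base does only bookkeeping and the override calls the base first and then delivers the data:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical base class: shared bookkeeping only, no delivery.
class SketchResourceLoader {
public:
    virtual ~SketchResourceLoader() {}

    virtual void didReceiveData(const char* data, std::size_t length) {
        (void)data;            // the base only counts bytes in this sketch
        m_bytesSeen += length; // stands in for buffering and progress reporting
    }

    std::size_t bytesSeen() const { return m_bytesSeen; }

private:
    std::size_t m_bytesSeen = 0;
};

// Hypothetical subclass: reuses the base bookkeeping, then delivers the data.
class SketchSubresourceLoader : public SketchResourceLoader {
public:
    void didReceiveData(const char* data, std::size_t length) override {
        SketchResourceLoader::didReceiveData(data, length); // shared work first
        m_delivered.append(data, length);                    // then hand the bytes on
    }

    const std::string& delivered() const { return m_delivered; }

private:
    std::string m_delivered;
};
```

Calling didReceiveData through a SketchResourceLoader* that points at a SketchSubresourceLoader runs the override, so reading only the base-class method, as I did at first, misses the delivery step.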

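Now we can keep following the data upward. The last few lines call m_resource->addDataBuffer / addData. m_resource is a pointer to CachedResource, and CachedResource is a base class, so the actual instance must be one of its subclasses; which one depends on the resource being requested. The page I tested with is a blank page, so the subclass should be CachedRawResource, and the call should land in CachedRawResource::addDataBuffer(…).

Here is the chain of calls going upward from there: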

CachedRawResource::addDataBuffer

    CachedRawResource::notifyClientsDataWasReceived

    {

        while (CachedRawResourceClient* c = w.next())

            c->dataReceived(this, data, length);

    }

    // Here c is the DocumentLoader

    DocumentLoader::dataReceived

        DocumentLoader::commitLoad

            FrameLoaderClientQt::committedLoad

                DocumentLoader::commitData

                    DocumentWriter::addData

                        DecodedDataDocumentParser::appendBytes

                            HTMLDocumentParser::append

From HTMLDocumentParser::append onward we are in the parsing part.
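The first hop of that chain, where the resource walks its clients and calls dataReceived on each, is a plain observer pattern. A minimal sketch, with hypothetical types standing in for CachedRawResource, CachedRawResourceClient and DocumentLoader, and a string standing in for the input that would eventually reach the parser:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical client interface, in the role of CachedRawResourceClient.
struct SketchRawResourceClient {
    virtual ~SketchRawResourceClient() {}
    virtual void dataReceived(const char* data, std::size_t length) = 0;
};

// Hypothetical resource, in the role of CachedRawResource.
class SketchRawResource {
public:
    void addClient(SketchRawResourceClient* client) { m_clients.push_back(client); }

    // Forwards each incoming chunk to every registered client.
    void addData(const char* data, std::size_t length) {
        for (SketchRawResourceClient* client : m_clients)
            client->dataReceived(data, length);
    }

private:
    std::vector<SketchRawResourceClient*> m_clients;
};

// Hypothetical client, in the role of DocumentLoader: it accumulates the
// bytes that would be handed on toward the parser.
struct SketchDocumentLoader : SketchRawResourceClient {
    std::string parserInput;

    void dataReceived(const char* data, std::size_t length) override {
        parserInput.append(data, length);
    }
};
```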

 

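Oops. I just realized my analysis drifted off course: I started out analyzing how ImageLoader loads image data, but when it came to data delivery I analyzed it as an HTML page. My train of thought got a bit muddled, but it doesn't matter much: these are all resources and the request process is largely the same. When I have time I'll analyze how an Image gets displayed and walk through the data delivery for it as well.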
