1) An embedded GPU core executes multiple threads, but all threads run the same set of instructions, each operating on different data.
In other words, every thread executes the same code, only on different data.
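A minimal Python sketch of this execution model, with the GPU's parallel threads simulated by an ordinary loop. The names `kernel` and `run_threads` are hypothetical and chosen only for illustration.

```python
def kernel(thread_id, data):
    """The single program that every thread runs; only thread_id differs."""
    return data[thread_id] * 2.0 + 1.0


def run_threads(data):
    # A real GPU would run these threads in parallel; the loop only simulates that.
    return [kernel(thread_id, data) for thread_id in range(len(data))]


result = run_threads([0.0, 1.0, 2.0, 3.0])
```

Each simulated thread executes identical instructions and differs only in which element it reads.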
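2) Graphics processing on an embedded GPU
2.1 Memory transfer bandwidth
2.2 GPUs are better suited to floating-point operations such as a Gaussian filter
2.3 Shader instruction count and number of rendering passes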
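5. Performance optimization
5.1 Floating-point precision
highp: single-precision 32-bit floating-point value
mediump: half-precision floating-point value (16-bit), range [-65520, 65520]
lowp: range [-2.0, 2.0], precision 1/256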
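To get a feel for what the lower precisions cost, here is a rough Python sketch that rounds a value the way each qualifier might store it. It assumes mediump behaves like an IEEE 754 half-precision float and lowp like a fixed-point value with a step of 1/256; real hardware may differ. The helper names `to_mediump` and `to_lowp` are hypothetical.

```python
import struct


def to_mediump(x):
    """Round x to the nearest IEEE 754 half-precision (16-bit) value (assumed model)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_lowp(x):
    """Clamp x to [-2.0, 2.0], then round to a multiple of 1/256 (assumed model)."""
    clamped = max(-2.0, min(2.0, x))
    return round(clamped * 256.0) / 256.0


value = 0.1
errors = {
    "mediump": abs(to_mediump(value) - value),
    "lowp": abs(to_lowp(value) - value),
}
```

Under these assumptions the rounding error for 0.1 is noticeably larger at lowp than at mediump, which is the trade-off to weigh when choosing a qualifier.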
5.2 Loop unrolling
Use vector operations
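The idea can be sketched in Python as follows: an 8-tap weighted sum written first as a scalar loop, then unrolled into two 4-component dot products. The function `dot4` is a hypothetical stand-in for a GPU vector instruction, and the tap count of 8 is an assumption made for the example.

```python
def weighted_sum_loop(samples, weights):
    """Scalar version: one multiply-add per loop iteration."""
    total = 0.0
    for i in range(len(samples)):
        total += samples[i] * weights[i]
    return total


def dot4(a, b):
    """Stand-in for a 4-component dot product (one vector instruction on a GPU)."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]


def weighted_sum_unrolled(samples, weights):
    """Unrolled version for exactly 8 taps: two vec4 dot products and no loop."""
    return dot4(samples[0:4], weights[0:4]) + dot4(samples[4:8], weights[4:8])


samples = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
weights = [0.5] * 8
```

Both versions compute the same sum; the unrolled one removes the loop counter and maps naturally onto vector hardware.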
5.3 Branching
Branching severely degrades performance
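One common way to avoid a branch is to replace it with arithmetic selection. The Python sketch below mirrors the GLSL built-ins `step` and `mix`; the thresholding functions are hypothetical examples, and whether the branch-free form is actually faster depends on the hardware and the shader compiler.

```python
def step(edge, x):
    """Mirror of GLSL step(): 0.0 if x < edge, otherwise 1.0."""
    return 0.0 if x < edge else 1.0


def mix(a, b, t):
    """Mirror of GLSL mix(): linear blend a * (1 - t) + b * t."""
    return a * (1.0 - t) + b * t


def threshold_branch(x, edge, low, high):
    """Branching version, written as an if/else in the shader."""
    if x < edge:
        return low
    return high


def threshold_select(x, edge, low, high):
    """Branch-free version: the selection is done with arithmetic."""
    return mix(low, high, step(edge, x))
```

In a shader, `step` and `mix` are built-in functions, so the second form contains no conditional jump in the shader source.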
5.4 Load sharing between vertex and fragment shaders
By moving the calculations to the vertex shader and directly using the texture coordinates it computes, the fragment shader avoids the dependent texture read.
Output from the vertex shader is declared with the varying modifier; it is first interpolated by the rasterizer and then fed into the fragment shader.
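The following Python sketch simulates this along a single edge between two vertices. It assumes a hypothetical constant texel offset (`OFFSET`) and a simple linear rasterizer (`lerp`); all function names are illustrative only. Because adding an offset is a linear operation, computing it per vertex and interpolating gives the same coordinate as computing it per fragment.

```python
OFFSET = 0.125  # hypothetical offset to a neighbouring texel


def lerp(a, b, t):
    """Linear interpolation, standing in for the rasterizer."""
    return a * (1.0 - t) + b * t


def vertex_shader(tex_coord):
    """Compute the offset coordinate once per vertex; this is the varying."""
    return tex_coord + OFFSET


def fragment_dependent(base_coord):
    """Costly path: the fragment shader derives the coordinate itself."""
    return base_coord + OFFSET


def fragment_shared(varying_coord):
    """Cheap path: the interpolated varying is used as-is for the texture read."""
    return varying_coord


left, right = 0.0, 1.0
pairs = []
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    base = lerp(left, right, t)
    varying = lerp(vertex_shader(left), vertex_shader(right), t)
    pairs.append((fragment_dependent(base), fragment_shared(varying)))
```

Each pair holds the coordinate from the dependent path and from the shared path, so the two can be compared fragment by fragment.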