This article explains what the ReadBuffer_common function in PostgreSQL does. It first introduces the BufferDesc and BufferTag data structures, then walks through the function's source code, and finally traces two test scenarios with gdb.

BufferDesc
Shared descriptor (state) data for a single shared buffer.
/*
 * Flags for buffer descriptors
 *
 * Note: TAG_VALID essentially means that there is a buffer hashtable
 * entry associated with the buffer's tag.
 */
#define BM_LOCKED               (1U << 22)  /* buffer header is locked */
#define BM_DIRTY                (1U << 23)  /* data needs writing */
#define BM_VALID                (1U << 24)  /* data is valid */
#define BM_TAG_VALID            (1U << 25)  /* tag is assigned */
#define BM_IO_IN_PROGRESS       (1U << 26)  /* read or write in progress */
#define BM_IO_ERROR             (1U << 27)  /* previous I/O failed */
#define BM_JUST_DIRTIED         (1U << 28)  /* dirtied since write started */
#define BM_PIN_COUNT_WAITER     (1U << 29)  /* have waiter for sole pin */
#define BM_CHECKPOINT_NEEDED    (1U << 30)  /* must write for checkpoint */
#define BM_PERMANENT            (1U << 31)  /* permanent buffer (not unlogged,
                                             * or init fork) */
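
The flag bits above share one 32-bit state word with the reference count and the usage count. The following is a minimal sketch of how such a word can be decoded. The flag values are copied from the listing above; the refcount and usage-count masks are an assumption modelled on the 18/4/10 bit split used in PostgreSQL's buf_internals.h and are not shown in this article. The helper names (`state_refcount` and so on) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Flag bits, copied from the listing above. */
#define BM_LOCKED            (1U << 22)
#define BM_DIRTY             (1U << 23)
#define BM_VALID             (1U << 24)
#define BM_TAG_VALID         (1U << 25)
#define BM_IO_IN_PROGRESS    (1U << 26)
#define BM_PERMANENT         (1U << 31)

/*
 * ASSUMPTION: layout of the remaining bits (18 bits of refcount, then
 * 4 bits of usage count), modelled on PostgreSQL's buf_internals.h.
 */
#define BUF_REFCOUNT_MASK    ((1U << 18) - 1)
#define BUF_USAGECOUNT_MASK  0x003C0000U
#define BUF_USAGECOUNT_SHIFT 18
#define BUF_FLAG_MASK        0xFFC00000U

/* Hypothetical helpers: extract each field from a state word. */
static uint32_t state_refcount(uint32_t s)
{
    return s & BUF_REFCOUNT_MASK;
}

static uint32_t state_usagecount(uint32_t s)
{
    return (s & BUF_USAGECOUNT_MASK) >> BUF_USAGECOUNT_SHIFT;
}

static uint32_t state_flags(uint32_t s)
{
    return s & BUF_FLAG_MASK;
}

/* Print a short human-readable summary of a state word. */
static void describe_state(uint32_t s)
{
    printf("refcount=%u usagecount=%u%s%s%s%s\n",
           (unsigned) state_refcount(s),
           (unsigned) state_usagecount(s),
           (s & BM_VALID) ? " VALID" : "",
           (s & BM_TAG_VALID) ? " TAG_VALID" : "",
           (s & BM_IO_IN_PROGRESS) ? " IO_IN_PROGRESS" : "",
           (s & BM_PERMANENT) ? " PERMANENT" : "");
}
```

Under that assumed layout, the value `state = {value = 2248409089}` printed in the gdb session later in this article (0x86040001) decodes to a refcount of 1, a usage count of 1, and the flags BM_PERMANENT, BM_TAG_VALID and BM_IO_IN_PROGRESS. BM_VALID is not yet set, which matches a buffer that has been allocated but whose block has not been read in.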
/*
 *  BufferDesc -- shared descriptor/state data for a single shared buffer.
 *
 * Note: Buffer header lock (BM_LOCKED flag) must be held to examine or change
 * the tag, state or wait_backend_pid fields.  In general, buffer header lock
 * is a spinlock which is combined with flags, refcount and usagecount into
 * single atomic variable.  This layout allow us to do some operations in a
 * single atomic operation, without actually acquiring and releasing spinlock;
 * for instance, increase or decrease refcount.  buf_id field never changes
 * after initialization, so does not need locking.  freeNext is protected by
 * the buffer_strategy_lock not buffer header lock.  The LWLock can take care
 * of itself.  The buffer header lock is *not* used to control access to the
 * data in the buffer!
 *
 * It's assumed that nobody changes the state field while buffer header lock
 * is held.  Thus buffer header lock holder can do complex updates of the
 * state variable in single write, simultaneously with lock release (cleaning
 * BM_LOCKED flag).  On the other hand, updating of state without holding
 * buffer header lock is restricted to CAS, which insure that BM_LOCKED flag
 * is not set.  Atomic increment/decrement, OR/AND etc. are not allowed.
 *
 * An exception is that if we have the buffer pinned, its tag can't change
 * underneath us, so we can examine the tag without locking the buffer header.
 * Also, in places we do one-time reads of the flags without bothering to
 * lock the buffer header; this is generally for situations where we don't
 * expect the flag bit being tested to be changing.
 *
 * We can't physically remove items from a disk page if another backend has
 * the buffer pinned.  Hence, a backend may need to wait for all other pins
 * to go away.  This is signaled by storing its own PID into
 * wait_backend_pid and setting flag bit BM_PIN_COUNT_WAITER.  At present,
 * there can be only one such waiter per buffer.
 *
 * We use this same struct for local buffer headers, but the locks are not
 * used and not all of the flag bits are useful either. To avoid unnecessary
 * overhead, manipulations of the state field should be done without actual
 * atomic operations (i.e. only pg_atomic_read_u32() and
 * pg_atomic_unlocked_write_u32()).
 *
 * Be careful to avoid increasing the size of the struct when adding or
 * reordering members.  Keeping it below 64 bytes (the most common CPU
 * cache line size) is fairly important for performance.
 */
typedef struct BufferDesc
{
    BufferTag   tag;                /* ID of page contained in buffer */
    int         buf_id;             /* buffer's index number (from 0) */

    /* state of the tag, containing flags, refcount and usagecount */
    pg_atomic_uint32 state;

    int         wait_backend_pid;   /* backend PID of pin-count waiter */
    int         freeNext;           /* link in freelist chain */
    LWLock      content_lock;       /* to lock access to buffer contents */
} BufferDesc;

BufferTag
A buffer tag identifies which disk block a buffer contains.
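
The source comment for BufferTag notes that the struct is used as a hash key, so any padding bytes must be zeroed. A minimal sketch of why that matters follows, using hypothetical stand-in types (`DemoRelFileNode`, `DemoBufferTag`) rather than the real PostgreSQL definitions: when a key is hashed or compared byte for byte, two tags that name the same block must also be identical in every byte, which is easiest to guarantee by clearing the whole struct before filling in the fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for RelFileNode (field names as seen in the gdb output). */
typedef struct
{
    uint32_t spcNode;   /* tablespace */
    uint32_t dbNode;    /* database */
    uint32_t relNode;   /* relation */
} DemoRelFileNode;

/* Hypothetical stand-in for BufferTag. */
typedef struct
{
    DemoRelFileNode rnode;  /* physical relation identifier */
    int32_t  forkNum;       /* fork number */
    uint32_t blockNum;      /* block number relative to start of relation */
} DemoBufferTag;

/* Initialise a tag; the memset clears any padding the compiler may add. */
static void demo_init_tag(DemoBufferTag *tag, DemoRelFileNode rnode,
                          int32_t forkNum, uint32_t blockNum)
{
    memset(tag, 0, sizeof(*tag));
    tag->rnode = rnode;
    tag->forkNum = forkNum;
    tag->blockNum = blockNum;
}

/* Byte-wise comparison, as a hash table keyed on raw bytes would do. */
static bool demo_tags_equal(const DemoBufferTag *a, const DemoBufferTag *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```

This illustrates the constraint only; it makes no claim about how PostgreSQL itself initialises or compares tags.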
/*
 * Buffer tag identifies which disk block the buffer contains.
 *
 * Note: the BufferTag data must be sufficient to determine where to write the
 * block, without reference to pg_class or pg_tablespace entries.  It's
 * possible that the backend flushing the buffer doesn't even believe the
 * relation is visible yet (its xact may have started before the xact that
 * created the rel).  The storage manager must be able to cope anyway.
 *
 * Note: if there's any pad bytes in the struct, INIT_BUFFERTAG will have
 * to be fixed to zero them, since this struct is used as a hash key.
 */
typedef struct buftag
{
    RelFileNode rnode;          /* physical relation identifier */
    ForkNumber  forkNum;
    BlockNumber blockNum;       /* blknum relative to begin of reln */
} BufferTag;

ReadBuffer_common holds the logic shared by all ReadBuffer variants. It works as follows:
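1. Initialize variables and evaluate the basic conditions (is the relation being extended, isExtend? is this a temp table, isLocalBuf?)
2. For a temp table, call LocalBufferAlloc to get the buffer descriptor; otherwise call BufferAlloc.
Either call also sets the flag that says whether the block was found in the cache (variable found).
3. If the block was found in the cache
3.1 If the relation is not being extended, update statistics, lock the buffer if the mode requires it, and return
3.2 If the relation is being extended, get the block
3.2.1 If PageIsNew returns false, raise an error
3.2.2 For a local buffer (temp table), just clear BM_VALID in the flags
3.2.3 For a shared buffer, clear BM_VALID and call StartBufferIO, then continue with step 4.1
4. If the block was not found in the cache, get the block
4.1 When extending, zero-fill the buffer and call smgrextend to extend the relation
4.2 For an ordinary read
4.2.1 If the mode is RBM_ZERO_AND_LOCK or RBM_ZERO_AND_CLEANUP_LOCK, zero-fill the buffer
4.2.2 Otherwise read the block through smgr (the storage manager), track the I/O time if requested, and check for garbage data
5. The relation has been extended or the block has been read
5.1 Lock the buffer if the mode requires it
5.2 For a temp table, adjust the flags; otherwise set BM_VALID, terminate the I/O, and wake up any waiting processes
5.3 Update statistics
5.4 Return the buffer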
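
The branching in the steps above can be sketched as a small decision function. This is a toy model for illustration only, not PostgreSQL code: the `Demo*` enums and both functions are hypothetical names, and the model deliberately ignores statistics, error handling and the actual I/O.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the read modes mentioned in the steps above. */
typedef enum
{
    DEMO_RBM_NORMAL,
    DEMO_RBM_ZERO_AND_LOCK,
    DEMO_RBM_ZERO_AND_CLEANUP_LOCK,
    DEMO_RBM_ZERO_ON_ERROR,
    DEMO_RBM_NORMAL_NO_LOG
} DemoMode;

/* What the function ends up doing with the buffer contents. */
typedef enum
{
    DEMO_RETURN_CACHED,     /* step 3.1: cache hit, return immediately */
    DEMO_EXTEND_RELATION,   /* step 4.1: zero-fill and extend (also reached from 3.2) */
    DEMO_ZERO_FILL,         /* step 4.2.1: caller will overwrite the page */
    DEMO_READ_FROM_DISK     /* step 4.2.2: read through the storage manager */
} DemoAction;

/* Choose the action from the cache-lookup result and the request type. */
static DemoAction demo_choose_action(bool found, bool is_extend, DemoMode mode)
{
    if (found && !is_extend)
        return DEMO_RETURN_CACHED;
    if (is_extend)
        return DEMO_EXTEND_RELATION;
    if (mode == DEMO_RBM_ZERO_AND_LOCK || mode == DEMO_RBM_ZERO_AND_CLEANUP_LOCK)
        return DEMO_ZERO_FILL;
    return DEMO_READ_FROM_DISK;
}

/* Steps 3.1 and 5.1: only shared buffers in a zero-and-lock mode come back locked. */
static bool demo_locked_on_return(DemoMode mode, bool is_local)
{
    return !is_local &&
           (mode == DEMO_RBM_ZERO_AND_LOCK || mode == DEMO_RBM_ZERO_AND_CLEANUP_LOCK);
}
```

In terms of the two gdb scenarios traced later in this article, the first (found is false, mode is RBM_NORMAL) corresponds to `DEMO_READ_FROM_DISK` and the second (found is true) to `DEMO_RETURN_CACHED`.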
/*
 * ReadBuffer_common -- common logic for all ReadBuffer variants
 *
 * *hit is set to true if the request was satisfied from shared buffer cache.
 */
static Buffer
ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
                  BlockNumber blockNum, ReadBufferMode mode,
                  BufferAccessStrategy strategy, bool *hit)
{
    BufferDesc *bufHdr;         // buffer descriptor
    Block       bufBlock;       // the corresponding block
    bool        found;          // found in the buffer pool?
    bool        isExtend;       // extending the relation?
    bool        isLocalBuf = SmgrIsTemp(smgr);  // local buffer (temp table)?

    *hit = false;

    /* Make sure we will have room to remember the buffer pin */
    ResourceOwnerEnlargeBuffers(CurrentResourceOwner);

    // P_NEW means the relation has to be extended
    isExtend = (blockNum == P_NEW);

    TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
                                       smgr->smgr_rnode.node.spcNode,
                                       smgr->smgr_rnode.node.dbNode,
                                       smgr->smgr_rnode.node.relNode,
                                       smgr->smgr_rnode.backend,
                                       isExtend);

    /* Substitute proper block number if caller asked for P_NEW */
    if (isExtend)
        blockNum = smgrnblocks(smgr, forkNum);

    if (isLocalBuf)
    {
        // local buffer (temp table)
        bufHdr = LocalBufferAlloc(smgr, forkNum, blockNum, &found);
        if (found)
            pgBufferUsage.local_blks_hit++;
        else if (isExtend)
            pgBufferUsage.local_blks_written++;
        else if (mode == RBM_NORMAL || mode == RBM_NORMAL_NO_LOG ||
                 mode == RBM_ZERO_ON_ERROR)
            pgBufferUsage.local_blks_read++;
    }
    else
    {
        // shared buffer (not a temp table)
        /*
         * lookup the buffer.  IO_IN_PROGRESS is set if the requested block is
         * not currently in memory.
         */
        bufHdr = BufferAlloc(smgr, relpersistence, forkNum, blockNum,
                             strategy, &found);
        if (found)
            pgBufferUsage.shared_blks_hit++;        // cache hit
        else if (isExtend)
            pgBufferUsage.shared_blks_written++;    // new block
        else if (mode == RBM_NORMAL || mode == RBM_NORMAL_NO_LOG ||
                 mode == RBM_ZERO_ON_ERROR)
            pgBufferUsage.shared_blks_read++;       // block read from storage
    }

    /* At this point we do NOT hold any locks. */

    /* if it was already in the buffer pool, we're done */
    if (found)
    {
        if (!isExtend)
        {
            /* Just need to update stats before we exit */
            *hit = true;
            VacuumPageHit++;

            if (VacuumCostActive)
                VacuumCostBalance += VacuumCostPageHit;

            TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
                                              smgr->smgr_rnode.node.spcNode,
                                              smgr->smgr_rnode.node.dbNode,
                                              smgr->smgr_rnode.node.relNode,
                                              smgr->smgr_rnode.backend,
                                              isExtend,
                                              found);

            /*
             * In RBM_ZERO_AND_LOCK mode the caller expects the page to be
             * locked on return.
             */
            if (!isLocalBuf)
            {
                if (mode == RBM_ZERO_AND_LOCK)
                    LWLockAcquire(BufferDescriptorGetContentLock(bufHdr),
                                  LW_EXCLUSIVE);
                else if (mode == RBM_ZERO_AND_CLEANUP_LOCK)
                    LockBufferForCleanup(BufferDescriptorGetBuffer(bufHdr));
            }

            // map the descriptor to its buffer number and return it:
            // #define BufferDescriptorGetBuffer(bdesc) ((bdesc)->buf_id + 1)
            return BufferDescriptorGetBuffer(bufHdr);
        }

        /*
         * We get here only in the corner case where we are trying to extend
         * the relation but we found a pre-existing buffer marked BM_VALID.
         * This can happen because mdread doesn't complain about reads beyond
         * EOF (when zero_damaged_pages is ON) and so a previous attempt to
         * read a block beyond EOF could have left a "valid" zero-filled
         * buffer.  Unfortunately, we have also seen this case occurring
         * because of buggy Linux kernels that sometimes return an
         * lseek(SEEK_END) result that doesn't account for a recent write. In
         * that situation, the pre-existing buffer would contain valid data
         * that we don't want to overwrite.  Since the legitimate case should
         * always have left a zero-filled buffer, complain if not PageIsNew.
         */
        bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : BufHdrGetBlock(bufHdr);
        if (!PageIsNew((Page) bufBlock))
            ereport(ERROR,
                    (errmsg("unexpected data beyond EOF in block %u of relation %s",
                            blockNum, relpath(smgr->smgr_rnode, forkNum)),
                     errhint("This has been seen to occur with buggy kernels; consider updating your system.")));

        /*
         * We *must* do smgrextend before succeeding, else the page will not
         * be reserved by the kernel, and the next P_NEW call will decide to
         * return the same page.  Clear the BM_VALID bit, do the StartBufferIO
         * call that BufferAlloc didn't, and proceed.
         */
        if (isLocalBuf)
        {
            /* Only need to adjust flags */
            uint32      buf_state = pg_atomic_read_u32(&bufHdr->state);

            Assert(buf_state & BM_VALID);
            buf_state &= ~BM_VALID;
            pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
        }
        else
        {
            /*
             * Loop to handle the very small possibility that someone re-sets
             * BM_VALID between our clearing it and StartBufferIO inspecting
             * it.
             */
            do
            {
                uint32      buf_state = LockBufHdr(bufHdr);

                Assert(buf_state & BM_VALID);
                buf_state &= ~BM_VALID;
                UnlockBufHdr(bufHdr, buf_state);
            } while (!StartBufferIO(bufHdr, true));
        }
    }

    // ------------- the block was not found in the pool, or the relation is being extended
    /*
     * if we have gotten to this point, we have allocated a buffer for the
     * page but its contents are not yet valid.  IO_IN_PROGRESS is set for it,
     * if it's a shared buffer.
     *
     * Note: if smgrextend fails, we will end up with a buffer that is
     * allocated but not marked BM_VALID.  P_NEW will still select the same
     * block number (because the relation didn't get any longer on disk) and
     * so future attempts to extend the relation will find the same buffer (if
     * it's not been recycled) but come right back here to try smgrextend
     * again.
     */
    Assert(!(pg_atomic_read_u32(&bufHdr->state) & BM_VALID));   /* spinlock not needed */

    bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : BufHdrGetBlock(bufHdr);

    if (isExtend)
    {
        // -------- extending the relation
        /* new buffers are zero-filled */
        MemSet((char *) bufBlock, 0, BLCKSZ);
        /* don't set checksum for all-zero page */
        smgrextend(smgr, forkNum, blockNum, (char *) bufBlock, false);

        /*
         * NB: we're *not* doing a ScheduleBufferTagForWriteback here;
         * although we're essentially performing a write. At least on linux
         * doing so defeats the 'delayed allocation' mechanism, leading to
         * increased file fragmentation.
         */
    }
    else
    {
        // -------- ordinary block
        /*
         * Read in the page, unless the caller intends to overwrite it and
         * just wants us to allocate a buffer.
         */
        if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
            MemSet((char *) bufBlock, 0, BLCKSZ);
        else
        {
            instr_time  io_start,   // start time and elapsed time of the I/O
                        io_time;

            if (track_io_timing)
                INSTR_TIME_SET_CURRENT(io_start);

            // read the block through smgr (the storage manager)
            smgrread(smgr, forkNum, blockNum, (char *) bufBlock);

            if (track_io_timing)
            {
                INSTR_TIME_SET_CURRENT(io_time);
                INSTR_TIME_SUBTRACT(io_time, io_start);
                pgstat_count_buffer_read_time(INSTR_TIME_GET_MICROSEC(io_time));
                INSTR_TIME_ADD(pgBufferUsage.blk_read_time, io_time);
            }

            /* check for garbage data */
            if (!PageIsVerified((Page) bufBlock, blockNum))
            {
                // the page failed verification
                if (mode == RBM_ZERO_ON_ERROR || zero_damaged_pages)
                {
                    // warn and zero the page
                    ereport(WARNING,
                            (errcode(ERRCODE_DATA_CORRUPTED),
                             errmsg("invalid page in block %u of relation %s; zeroing out page",
                                    blockNum,
                                    relpath(smgr->smgr_rnode, forkNum))));
                    MemSet((char *) bufBlock, 0, BLCKSZ);
                }
                else
                    // raise an error
                    ereport(ERROR,
                            (errcode(ERRCODE_DATA_CORRUPTED),
                             errmsg("invalid page in block %u of relation %s",
                                    blockNum,
                                    relpath(smgr->smgr_rnode, forkNum))));
            }
        }
    }

    // --------- the relation has been extended or the block has been read
    /*
     * In RBM_ZERO_AND_LOCK mode, grab the buffer content lock before marking
     * the page as valid, to make sure that no other backend sees the zeroed
     * page before the caller has had a chance to initialize it.
     *
     * Since no-one else can be looking at the page contents yet, there is no
     * difference between an exclusive lock and a cleanup-strength lock. (Note
     * that we cannot use LockBuffer() or LockBufferForCleanup() here, because
     * they assert that the buffer is already valid.)
     */
    if ((mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK) &&
        !isLocalBuf)
    {
        LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_EXCLUSIVE);
    }

    if (isLocalBuf)
    {
        /* Only need to adjust flags */
        uint32      buf_state = pg_atomic_read_u32(&bufHdr->state);

        buf_state |= BM_VALID;
        pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
    }
    else
    {
        /* Set BM_VALID, terminate IO, and wake up any waiters */
        TerminateBufferIO(bufHdr, false, BM_VALID);
    }

    // update statistics
    VacuumPageMiss++;
    if (VacuumCostActive)
        VacuumCostBalance += VacuumCostPageMiss;

    TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
                                      smgr->smgr_rnode.node.spcNode,
                                      smgr->smgr_rnode.node.dbNode,
                                      smgr->smgr_rnode.node.relNode,
                                      smgr->smgr_rnode.backend,
                                      isExtend,
                                      found);

    // #define BufferDescriptorGetBuffer(bdesc) ((bdesc)->buf_id + 1)
    return BufferDescriptorGetBuffer(bufHdr);
}

Test scenario 1: the block is not in the buffer pool
Script:
16:42:48 (xdb@[local]:5432)testdb=# select * from t1 limit 10;
Start gdb and set a breakpoint:
(gdb) b ReadBuffer_common
Breakpoint 1 at 0x876e28: file bufmgr.c, line 711.
(gdb) c
Continuing.
Breakpoint 1, ReadBuffer_common (smgr=0x2b7cce0, relpersistence=112 'p', forkNum=MAIN_FORKNUM, blockNum=0, mode=RBM_NORMAL, strategy=0x0, hit=0x7ffc7761dfab) at bufmgr.c:711
711 bool isLocalBuf = SmgrIsTemp(smgr);
(gdb)
1. Initialize variables and evaluate the basic conditions (is the relation being extended, isExtend? is this a temp table, isLocalBuf?)
(gdb) n
713 *hit = false;
(gdb)
716 ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
(gdb)
718 isExtend = (blockNum == P_NEW);
(gdb)
720 TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
(gdb)
728 if (isExtend)
(gdb)
731 if (isLocalBuf)
(gdb)
745 bufHdr = BufferAlloc(smgr, relpersistence, forkNum, blockNum,
(gdb)
2. Call BufferAlloc to get the buffer descriptor
(gdb)
747 if (found)
(gdb) p *bufHdr
$1 = {tag = {rnode = {spcNode = 1663, dbNode = 16402, relNode = 51439}, forkNum = MAIN_FORKNUM, blockNum = 0},
buf_id = 108, state = {value = 2248409089}, wait_backend_pid = 0, freeNext = -2, content_lock = {tranche = 54, state = {
value = 536870912}, waiters = {head = 2147483647, tail = 2147483647}}}
(gdb) p found
$2 = false
(gdb)
(gdb) n
750 pgBufferUsage.shared_blks_read++; --> update statistics
(gdb)
4. The block was not found in the cache, so get the block
756 if (found)
(gdb)
856 Assert(!(pg_atomic_read_u32(&bufHdr->state) & BM_VALID)); /* spinlock not needed */
(gdb)
858 bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : BufHdrGetBlock(bufHdr);
(gdb)
860 if (isExtend)
(gdb) p bufBlock
$4 = (Block) 0x7fe8c240e380
4.2 For an ordinary read
4.2.1 If the mode is RBM_ZERO_AND_LOCK or RBM_ZERO_AND_CLEANUP_LOCK, zero-fill the buffer
4.2.2 Otherwise read the block through smgr (the storage manager), track the I/O time if requested, and check for garbage data
(gdb) p mode
$5 = RBM_NORMAL
(gdb)
(gdb) n
880 if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
(gdb)
887 if (track_io_timing)
(gdb)
890 smgrread(smgr, forkNum, blockNum, (char *) bufBlock);
(gdb)
892 if (track_io_timing)
(gdb) p *smgr
$6 = {smgr_rnode = {node = {spcNode = 1663, dbNode = 16402, relNode = 51439}, backend = -1}, smgr_owner = 0x7fe8ee2bc7b8,
smgr_targblock = 4294967295, smgr_fsm_nblocks = 4294967295, smgr_vm_nblocks = 4294967295, smgr_which = 0,
md_num_open_segs = {1, 0, 0, 0}, md_seg_fds = {0x2b0dd78, 0x0, 0x0, 0x0}, next_unowned_reln = 0x0}
(gdb) p forkNum
$7 = MAIN_FORKNUM
(gdb) p blockNum
$8 = 0
(gdb) p (char *) bufBlock
$9 = 0x7fe8c240e380 "\001"
(gdb)
5. The relation has been extended or the block has been read
5.1 Lock the buffer if the mode requires it
5.2 For a temp table, adjust the flags; otherwise set BM_VALID, terminate the I/O, and wake up any waiting processes
(gdb) n
901 if (!PageIsVerified((Page) bufBlock, blockNum))
(gdb)
932 if ((mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK) &&
(gdb) n
938 if (isLocalBuf)
(gdb)
949 TerminateBufferIO(bufHdr, false, BM_VALID);
(gdb)
5.3 Update statistics
5.4 Return the buffer
(gdb)
952 VacuumPageMiss++;
(gdb)
953 if (VacuumCostActive)
(gdb)
956 TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
(gdb)
964 return BufferDescriptorGetBuffer(bufHdr);
(gdb)
965 }
(gdb)
The returned buf is 109, that is, buf_id 108 plus 1:
(gdb)
ReadBufferExtended (reln=0x7fe8ee2bc7a8, forkNum=MAIN_FORKNUM, blockNum=0, mode=RBM_NORMAL, strategy=0x0) at bufmgr.c:666
666 if (hit)
(gdb)
668 return buf;
(gdb) p buf
$10 = 109
(gdb)
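
The relationship between buf_id and the returned buffer number follows from the BufferDescriptorGetBuffer macro quoted in the source listing. A minimal sketch for shared buffers, with hypothetical names (`DemoBuffer`, `demo_buffer_from_buf_id`, `demo_buf_id_from_buffer`); the reading that 0 is reserved as the invalid buffer number is an assumption based on PostgreSQL's InvalidBuffer definition, which this article does not show:

```c
#include <assert.h>

/* Hypothetical stand-in for PostgreSQL's Buffer type (an int). */
typedef int DemoBuffer;

/* ASSUMPTION: 0 is reserved to mean "no buffer", so numbering starts at 1. */
#define DEMO_INVALID_BUFFER 0

/* Same arithmetic as BufferDescriptorGetBuffer: buf_id + 1. */
static DemoBuffer demo_buffer_from_buf_id(int buf_id)
{
    return buf_id + 1;
}

/* Inverse mapping: buffer number back to the 0-based descriptor index. */
static int demo_buf_id_from_buffer(DemoBuffer buffer)
{
    return buffer - 1;
}
```

With the values from this session, a descriptor whose buf_id is 108 maps to buffer 109.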
Test scenario 2: the block is already in the buffer pool
Run the same SQL statement again; this time the block has already been read into a buffer.
(gdb) del
Delete all breakpoints? (y or n) y
(gdb) c
Continuing.
^C
Program received signal SIGINT, Interrupt.
0x00007fe8ec448903 in __epoll_wait_nocancel () at ../sysdeps/unix/syscall-template.S:81
81 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) b ReadBuffer_common
Breakpoint 2 at 0x876e28: file bufmgr.c, line 711.
(gdb)
The found variable is true:
...
(gdb)
745 bufHdr = BufferAlloc(smgr, relpersistence, forkNum, blockNum,
(gdb)
747 if (found)
(gdb) p found
$11 = true
(gdb)
(gdb) n
748 pgBufferUsage.shared_blks_hit++;
(gdb)
Execution then enters the corresponding branch:
3. If the block was found in the cache
3.1 If the relation is not being extended, update statistics, lock the buffer if the mode requires it, and return
3.2 If the relation is being extended, get the block
3.2.1 If PageIsNew returns false, raise an error
3.2.2 For a local buffer (temp table), just clear BM_VALID in the flags
3.2.3 For a shared buffer, clear BM_VALID and call StartBufferIO
(gdb)
756 if (found)
(gdb)
758 if (!isExtend)
(gdb)
761 *hit = true;
(gdb)
762 VacuumPageHit++;
(gdb)
764 if (VacuumCostActive)
(gdb)
767 TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
(gdb)
779 if (!isLocalBuf)
(gdb)
781 if (mode == RBM_ZERO_AND_LOCK)
(gdb)
784 else if (mode == RBM_ZERO_AND_CLEANUP_LOCK)
(gdb)
788 return BufferDescriptorGetBuffer(bufHdr);
(gdb)
965 }
(gdb)