Page Refcount
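- page::_mapcount is initialized to -1
- However, page_mapcount() initially returns 0, because the function adds 1 to the stored value
- "Incrementing mapcount does not increment refcount": this is wrong. Adding a mapping always takes a reference as well.
- The folio_get calls in mm/memory.c are the evidence.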
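To make the -1 offset concrete, here is a minimal user-space sketch. The names `toy_page`, `toy_page_init`, `toy_page_mapcount` and `toy_map_page` are hypothetical and are not kernel APIs; the model only follows the claims in the list above (stored value starts at -1, every mapping also takes a reference).

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for struct page; NOT the kernel type. */
struct toy_page {
	atomic_int _mapcount;	/* stored as "number of mappings - 1" */
	atomic_int _refcount;
};

static void toy_page_init(struct toy_page *p)
{
	atomic_init(&p->_mapcount, -1);	/* no mapping yet */
	atomic_init(&p->_refcount, 1);	/* reference held by the allocator's caller */
}

/* Mirrors the idea of page_mapcount(): stored value plus one. */
static int toy_page_mapcount(struct toy_page *p)
{
	return atomic_load(&p->_mapcount) + 1;
}

/* Mapping a page: take a reference, then record the mapping. */
static void toy_map_page(struct toy_page *p)
{
	assert(atomic_load(&p->_refcount) > 0);	/* a free page cannot be mapped */
	atomic_fetch_add(&p->_refcount, 1);
	atomic_fetch_add(&p->_mapcount, 1);
}
```

Calling `toy_map_page()` once on a fresh `toy_page` moves the reported mapcount from 0 to 1 and the refcount from 1 to 2, which is the invariant the later checks in these notes rely on.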
refcount
Where mapcount is incremented
On an idle 13900K machine, both of these functions are called very frequently:
- folio_add_new_anon_rmap
- page_add_file_rmap
@[
page_add_file_rmap+5
do_set_pte+460
finish_fault+545
do_fault+798
__handle_mm_fault+1618
handle_mm_fault+341
do_user_addr_fault+561
exc_page_fault+109
asm_exc_page_fault+38
]: 3200
@[
folio_add_new_anon_rmap+5
do_anonymous_page+755
__handle_mm_fault+2090
handle_mm_fault+341
do_user_addr_fault+342
exc_page_fault+109
asm_exc_page_fault+38
]: 2601
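In fact, as far as I can tell, a freshly faulted anon private page has its raw _mapcount set straight to 0 (one mapping) rather than incremented, so the increment is visible in page_add_file_rmap but not in folio_add_new_anon_rmap.
How to understand the counting in can_split_folio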
/* Racy check whether the huge page can be split */
bool can_split_folio(struct folio *folio, int *pextra_pins)
{
int extra_pins;
/* Additional pins from page cache */
if (folio_test_anon(folio))
extra_pins = folio_test_swapcache(folio) ?
folio_nr_pages(folio) : 0;
else
extra_pins = folio_nr_pages(folio);
if (pextra_pins)
*pextra_pins = extra_pins;
return folio_mapcount(folio) == folio_ref_count(folio) - extra_pins - 1;
}
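In effect this requires mapcount to equal refcount once the extra pins are subtracted. I first attributed the remaining -1 to refcount being initialized to 1; it reads more naturally as the single reference held by the caller that wants to split.
- But why is the split only allowed when mapcount equals refcount?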
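One way to reason about the question above is to model the check in user space. The sketch below uses hypothetical names (`toy_folio`, `toy_extra_pins`, `toy_can_split`) and plain integers instead of the real folio accessors. My reading, which is an assumption and not something the kernel source states in these words, is that any reference beyond the mappings, the cache pins and the caller's own pin belongs to someone unknown, so the split must be refused.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the counters can_split_folio() looks at. */
struct toy_folio {
	int nr_pages;
	int mapcount;
	int refcount;
	bool anon;
	bool swapcache;
};

/* Pins held by the page cache or swap cache: one per page in the folio. */
static int toy_extra_pins(const struct toy_folio *f)
{
	assert(f->nr_pages > 0);
	if (f->anon)
		return f->swapcache ? f->nr_pages : 0;
	return f->nr_pages;
}

/* Same arithmetic as the excerpt: the trailing 1 is the caller's pin. */
static bool toy_can_split(const struct toy_folio *f)
{
	return f->mapcount == f->refcount - toy_extra_pins(f) - 1;
}
```

With an anon folio mapped once and pinned only by the caller (refcount 2), the check passes; one more unexplained reference (refcount 3) makes it fail.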
try_to_unmap_one contains similar logic
/*
* The only page refs must be one from isolation
* plus the rmap(s) (dropped by discard:).
*/
if (ref_count == 1 + map_count &&
!folio_test_dirty(folio)) {
/* Invalidate as we cleared the pte */
mmu_notifier_invalidate_range(mm,
address, address + PAGE_SIZE);
dec_mm_counter(mm, MM_ANONPAGES);
goto discard;
}
So this suggests that mapping a ...
How to understand is_page_cache_freeable
static inline int is_page_cache_freeable(struct folio *folio)
{
/*
* A freeable page cache folio is referenced only by the caller
* that isolated the folio, the page cache and optional filesystem
* private data at folio->private.
*/
return folio_ref_count(folio) - folio_test_private(folio) ==
1 + folio_nr_pages(folio);
}
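- Because private data takes an extra reference
- Why the extra 1? I first guessed it came from being on the LRU list, but according to the comment above it is the reference held by the caller that isolated the folio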
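The arithmetic can be checked with a small user-space model. `toy_is_page_cache_freeable` is a hypothetical name; it takes the three inputs as plain values instead of reading them from a folio.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Expected references on a freeable page cache folio:
 *   1         from the caller that isolated it
 *   nr_pages  from the page cache
 *   1         more if filesystem private data is attached
 */
static bool toy_is_page_cache_freeable(int refcount, int nr_pages, bool has_private)
{
	assert(nr_pages > 0);
	return refcount - (has_private ? 1 : 0) == 1 + nr_pages;
}
```

For a single-page folio, a refcount of 2 without private data or 3 with private data is freeable; a refcount of 3 without private data means somebody else still holds a reference.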
Adding a folio to the swap cache must take references
Note add_to_swap_cache:
folio_ref_add(folio, nr);
folio_set_swapcache(folio);
- When is this reference dropped?
- Why does adding a folio to the swap cache need extra references?
Call sites of folio_ref_add and page_ref_add
Adding a folio to the page cache always goes through one of:
- shmem_add_to_page_cache
- add_to_swap_cache
- __filemap_add_folio
- folio_migrate_mapping
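Does folio_test_anon refer to anon private pages?
Yes; see the comment above the definition of PAGE_MAPPING_ANON.
How to inspect the refcount of various common kinds of pages
- This should be possible through the relevant /proc interfaces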
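As a starting point, here is a sketch of the helpers needed to walk from a virtual address to its entry in /proc/kpagecount. It assumes the documented /proc/self/pagemap layout (bit 63 = present, bits 0-54 = PFN, one 64-bit entry per page). The macro and function names are local to this sketch. Two caveats: the PFN field reads as zero without CAP_SYS_ADMIN, and /proc/kpagecount reports how many times a page is mapped, not its _refcount, so it only covers the mapcount side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local names for the documented pagemap bit layout. */
#define TOY_PM_PFN_MASK	((UINT64_C(1) << 55) - 1)	/* bits 0-54 */
#define TOY_PM_PRESENT	(UINT64_C(1) << 63)

static bool toy_pagemap_present(uint64_t entry)
{
	return (entry & TOY_PM_PRESENT) != 0;
}

/* PFN is only meaningful when the page is present in RAM. */
static uint64_t toy_pagemap_pfn(uint64_t entry)
{
	return toy_pagemap_present(entry) ? (entry & TOY_PM_PFN_MASK) : 0;
}

/* Byte offset of the entry for vaddr inside /proc/<pid>/pagemap. */
static uint64_t toy_pagemap_offset(uint64_t vaddr, uint64_t page_size)
{
	assert(page_size != 0);
	return (vaddr / page_size) * sizeof(uint64_t);
}

/* Byte offset of the entry for pfn inside /proc/kpagecount. */
static uint64_t toy_kpagecount_offset(uint64_t pfn)
{
	return pfn * sizeof(uint64_t);
}
```

A full tool would `pread()` 8 bytes at `toy_pagemap_offset(addr, sysconf(_SC_PAGESIZE))` from /proc/self/pagemap, decode the PFN, then `pread()` 8 bytes at `toy_kpagecount_offset(pfn)` from /proc/kpagecount as root.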
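Constructing a case where mapcount is far greater than refcount
How a folio's refcount is computed
- shmem_add_to_page_cache adds the number of pages contained in the folio in one step, rather than counting page by page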
- folio_ref_add(folio, nr);
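The "one reference per page, added in bulk" rule can be modeled as below. All names are hypothetical. The matching drop on removal is my assumption about where the references go away again (I believe delete_from_swap_cache does a folio_ref_sub of the same size, but I have not traced it in these notes).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical folio with only the fields this sketch needs. */
struct toy_swap_folio {
	int nr_pages;
	int refcount;
	bool swapcache;
};

/* Mirrors folio_ref_add(folio, nr) followed by folio_set_swapcache(folio). */
static void toy_add_to_swap_cache(struct toy_swap_folio *f)
{
	assert(!f->swapcache);
	f->refcount += f->nr_pages;
	f->swapcache = true;
}

/* Assumed counterpart: drop the same nr_pages references on removal. */
static void toy_delete_from_swap_cache(struct toy_swap_folio *f)
{
	assert(f->swapcache);
	f->swapcache = false;
	f->refcount -= f->nr_pages;
}
```

A 4-page folio that enters with refcount 1 sits at 5 while in the swap cache, which is exactly the `extra_pins` term that can_split_folio subtracts.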
Can a page be mapped into userspace and be part of the page cache at the same time?
page ref
What is the relationship between _refcount and _mapcount?
[^27] : Situations where _count can exceed _mapcount include pages mapped for DMA and pages mapped into the kernel’s address space with a function like get_user_pages().
Locking a page into memory with mlock() will also increase _count.
The relative value of these two counters is important;
if _count equals _mapcount, the page can be reclaimed by locating and removing all of the page table entries.
But if _count is greater than _mapcount, the page is “pinned” and cannot be reclaimed until the extra references are removed.
- So every time, we have to increase _count and _mapcount in sync? That's ugly; there must be something not yet uncovered!
- Investigate page_ref_sub: why does swap use this mechanism?
/*
* Methods to modify the page usage count.
*
* What counts for a page usage:
* - cache mapping (page->mapping)
* - private data (page->private)
* - page mapped in a task's page tables, each mapping
* is counted separately
*
* Also, many kernel routines increase the page count before a critical
* routine so they can be sure the page doesn't go away from under them.
*/
/*
* Drop a ref, return true if the refcount fell to zero (the page has no users)
*/
static inline int put_page_testzero(struct page *page)
{
VM_BUG_ON_PAGE(page_ref_count(page) == 0, page);
return page_ref_dec_and_test(page);
}
/*
* Try to grab a ref unless the page has a refcount of zero, return false if
* that is the case.
* This can be called when MMU is off so it must not access
* any of the virtual mappings.
*/
static inline int get_page_unless_zero(struct page *page)
{
return page_ref_add_unless(page, 1, 0);
}
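The two helpers above can be reproduced in user space with C11 atomics. This is a sketch with hypothetical `toy_*` names; the compare-and-swap loop is my own stand-in for page_ref_add_unless, not a copy of the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Add nr to *ref unless it currently equals u; return true if added. */
static bool toy_ref_add_unless(atomic_int *ref, int nr, int u)
{
	int old = atomic_load(ref);

	do {
		if (old == u)
			return false;
		/* On failure, old is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(ref, &old, old + nr));
	return true;
}

/* Like get_page_unless_zero(): never resurrect a page that is being freed. */
static bool toy_get_unless_zero(atomic_int *ref)
{
	return toy_ref_add_unless(ref, 1, 0);
}

/* Like put_page_testzero(): true when the last reference was dropped. */
static bool toy_put_testzero(atomic_int *ref)
{
	assert(atomic_load(ref) != 0);
	return atomic_fetch_sub(ref, 1) == 1;
}
```

The point of the "unless zero" variant is the speculative lookup case: once the count has reached zero the page is on its way to the allocator, so a late reader must fail instead of bumping the count back to 1.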
- Understand this function and its references
static bool is_refcount_suitable(struct page *page)
{
	int expected_refcount;

	expected_refcount = total_mapcount(page);
	if (PageSwapCache(page))
		expected_refcount += compound_nr(page);

	return page_count(page) == expected_refcount;
}
- put_page: harder than expected
_mapcount lives inside a union; it is only meaningful while the page is mapped to userspace
It turns out that THP reference counting is a topic of its own
https://lwn.net/Articles/619738/
[ ] Analyze: when the kernel decides to swap a page out, it first checks mapcount and refcount
- shrink_folio_list: this is where it is finally handled
It should be this part:
/*
* If the folio has buffers, try to free the buffer
* mappings associated with this folio. If we succeed
* we try to free the folio as well.
*
* We do this even if the folio is dirty.
* filemap_release_folio() does not perform I/O, but it
* is possible for a folio to have the dirty flag set,
* but it is actually clean (all its buffers are clean).
* This happens if the buffers were written out directly,
* with submit_bh(). ext3 will do this, as well as
* the blockdev mapping. filemap_release_folio() will
* discover that cleanness and will drop the buffers
* and mark the folio clean - it can be freed.
*
* Rarely, folios can have buffers and no ->mapping.
* These are the folios which were not successfully
* invalidated in truncate_cleanup_folio(). We try to
* drop those buffers here and if that worked, and the
* folio is no longer mapped into process address space
* (refcount == 1) it can be freed. Otherwise, leave
* the folio on the LRU so it is swappable.
*/
if (folio_has_private(folio)) {
if (!filemap_release_folio(folio, sc->gfp_mask))
goto activate_locked;
if (!mapping && folio_ref_count(folio) == 1) {
folio_unlock(folio);
if (folio_put_testzero(folio))
goto free_it;
else {
/*
* rare race with speculative reference.
* the speculative reference will free
* this folio shortly, so we may
* increment nr_reclaimed here (and
* leave it off the LRU).
*/
nr_reclaimed += nr_pages;
continue;
}
}
}
if (folio_test_anon(folio) && !folio_test_swapbacked(folio)) {
/* follow __remove_mapping for reference */
if (!folio_ref_freeze(folio, 1))
goto keep_locked;
/*
* The folio has only one reference left, which is
* from the isolation. After the caller puts the
* folio back on the lru and drops the reference, the
* folio will be freed anyway. It doesn't matter
* which lru it goes on. So we don't bother checking
* the dirty flag here.
*/
count_vm_events(PGLAZYFREED, nr_pages);
count_memcg_folio_events(folio, PGLAZYFREED, nr_pages);
} else if (!mapping || !__remove_mapping(mapping, folio, true,
sc->target_mem_cgroup))
goto keep_locked;
But this is not entirely right; my understanding of folio_has_private is still lacking.
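The folio_ref_freeze(folio, 1) call in the excerpt above is the piece that ties reclaim to the refcount: it succeeds only if the count is exactly the expected value and, in the same atomic step, sets it to 0 so nobody can take a new speculative reference. A user-space sketch with hypothetical `toy_*` names, assuming a plain compare-and-swap is an adequate model:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Succeeds only if *ref == expected, and then leaves *ref at 0. */
static bool toy_ref_freeze(atomic_int *ref, int expected)
{
	int old = expected;

	return atomic_compare_exchange_strong(ref, &old, 0);
}

/* Give the references back after a successful freeze. */
static void toy_ref_unfreeze(atomic_int *ref, int count)
{
	assert(atomic_load(ref) == 0);	/* only valid on a frozen count */
	atomic_store(ref, count);
}
```

If the count is 2 when reclaim expects 1, the freeze fails and the count is left untouched, which corresponds to the `goto keep_locked` path.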
Reposting any article from this site to CSDN will be pursued as copyright infringement; any other use is fine.