virtio-iommu
Without any extra parameters:
[ 0.458729] iommu: Default domain type: Translated
[ 0.459021] iommu: DMA domain TLB invalidation policy: lazy mode
After adding virtio-iommu:
[ 1.799559] iommu: Default domain type: Translated
[ 1.803487] iommu: DMA domain TLB invalidation policy: lazy mode
[ 2.554608] pci 0000:00:00.0: Adding to iommu group 0
[ 2.554817] pci 0000:00:01.0: Adding to iommu group 1
[ 2.555020] pci 0000:00:02.0: Adding to iommu group 2
[ 2.555223] pci 0000:00:03.0: Adding to iommu group 3
[ 2.555421] pci 0000:00:04.0: Adding to iommu group 4
[ 2.555620] pci 0000:00:05.0: Adding to iommu group 5
[ 2.555825] pci 0000:00:06.0: Adding to iommu group 6
[ 2.556026] pci 0000:00:07.0: Adding to iommu group 7
[ 2.556230] pci 0000:00:08.0: Adding to iommu group 8
[ 2.556431] pci 0000:00:09.0: Adding to iommu group 9
[ 2.556639] pci 0000:00:1f.0: Adding to iommu group 10
[ 2.556844] pci 0000:00:1f.2: Adding to iommu group 10
[ 2.557049] pci 0000:00:1f.3: Adding to iommu group 10
[ 2.557252] pci 0000:01:01.0: Adding to iommu group 3
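In the log above, 0000:00:1f.0, 1f.2 and 1f.3 all land in group 10, and 0000:00:03.0 shares group 3 with 0000:01:01.0. A small sketch for pulling that grouping out of dmesg text; the helper name `groups_from_dmesg` is made up for illustration, and the sample lines are a subset of the log above.

```python
import re
from collections import defaultdict

# Subset of the dmesg lines shown above.
LOG = """\
[ 2.555223] pci 0000:00:03.0: Adding to iommu group 3
[ 2.556639] pci 0000:00:1f.0: Adding to iommu group 10
[ 2.556844] pci 0000:00:1f.2: Adding to iommu group 10
[ 2.557049] pci 0000:00:1f.3: Adding to iommu group 10
[ 2.557252] pci 0000:01:01.0: Adding to iommu group 3
"""

PATTERN = re.compile(r"(\S+): Adding to iommu group (\d+)")


def groups_from_dmesg(text):
    """Return {group number: [device addresses]} parsed from dmesg text."""
    groups = defaultdict(list)
    for line in text.splitlines():
        match = PATTERN.search(line)
        if match:
            groups[int(match.group(2))].append(match.group(1))
    return dict(groups)


groups = groups_from_dmesg(LOG)
# Groups with more than one device are the ones that must move together.
shared = {g: devs for g, devs in groups.items() if len(devs) > 1}
print(shared)
```

On a real guest the same function can be fed the output of `dmesg`.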
[ ] Are vIOMMU and virtio-iommu the same thing?
- QEMU's explanation: https://wiki.qemu.org/Features/VT-d
https://www.usenix.org/legacy/event/atc11/tech/final_files/Amit.pdf
https://www.youtube.com/watch?v=KlBgB4br1HM
https://www.youtube.com/watch?v=7aZAsanbKwI
Without any extra parameters:
iommu_iova_to_phys is never called at all.
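Expose an Intel IOMMU to the guest (QEMU's intel-iommu device is an emulated VT-d, not the host IOMMU passed through):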
-device intel-iommu,intremap=on,caching-mode=on \
https://gist.github.com/mcastelino/08f6e49f2faba295eb690a3a8ee44c70
With this option, the backtrace is:
#0 iommu_iova_to_phys (domain=0xffff8880059bcb90, iova=4289724416) at drivers/iommu/iommu.c:2280
#1 0xffffffff819583b2 in iommu_dma_unmap_page (dev=0xffff8880056e80c8, dma_handle=4289724416, size=131072, dir=DMA_FROM_DEVICE, attrs=0) at drivers/iommu/dma-iommu.c:1045
#2 0xffffffff81a253c5 in nvme_pci_unmap_rq (req=0xffff8880116b0000) at drivers/nvme/host/pci.c:975
#3 nvme_complete_batch (fn=<optimized out>, iob=0xffffc900001d4df0, iob@entry=0xffffc900001d4dc8) at drivers/nvme/host/nvme.h:732
#4 nvme_pci_complete_batch (iob=iob@entry=0xffffc900001d4df0) at drivers/nvme/host/pci.c:986
#5 0xffffffff81a263e2 in nvme_irq (irq=<optimized out>, data=<optimized out>) at drivers/nvme/host/pci.c:1087
$3 = (phys_addr_t (*)(struct iommu_domain *,
dma_addr_t)) 0xffffffff8193ca60 <intel_iommu_iova_to_phys>
Actually emulating virtio-iommu:
arg_machine+=" -device virtio-iommu-pci"
$ p domain->ops->iova_to_phys
$1 = (phys_addr_t (*)(struct iommu_domain *, dma_addr_t)) 0xffffffff8195be70 <viommu_iova_to_phys>
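Both `intel_iommu_iova_to_phys` and `viommu_iova_to_phys` answer the same question: given an IOVA in a domain, which physical address does it map to? A toy sketch of that lookup, assuming non-overlapping mappings. The class `ToyDomain` and the physical address 0x12340000 are hypothetical; the IOVA 4289724416 (0xffb00000) and the size 131072 come from the backtrace above. This is not the kernel's data structure.

```python
import bisect


class ToyDomain:
    """Toy IOVA -> physical address table (illustration only)."""

    def __init__(self):
        self._starts = []   # sorted IOVA start addresses
        self._ranges = {}   # start -> (size, phys_start)

    def map(self, iova, phys, size):
        """Record one mapping; overlapping ranges are not handled."""
        bisect.insort(self._starts, iova)
        self._ranges[iova] = (size, phys)

    def iova_to_phys(self, iova):
        """Return the physical address for iova, or None if unmapped."""
        idx = bisect.bisect_right(self._starts, iova) - 1
        if idx < 0:
            return None
        start = self._starts[idx]
        size, phys = self._ranges[start]
        if iova >= start + size:
            return None
        return phys + (iova - start)


domain = ToyDomain()
# IOVA and size from the nvme backtrace; the physical address is made up.
domain.map(iova=4289724416, phys=0x12340000, size=131072)
print(domain.iova_to_phys(4289724416 + 0x10))
```

A lookup outside any recorded range returns None, which is the toy counterpart of a translation finding no mapping.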
For some reason, booting is extremely slow with this approach.
[ 2.306892] virtio-pci 0000:00:0b.0: Adding to iommu group 8
[ 2.307239] iommu: Failed to allocate default IOMMU domain of type 11 for group (null) - Falling back to IOMMU_DOMAIN_DMA
[ 2.316790] virtio_blk virtio6: 31/0/0 default/read/poll queues
[ 2.322378] virtio_blk virtio6: [vdc] 209715200 512-byte logical blocks (107 GB/100 GiB)
[ 2.325449] PM: Magic number: 11:459:734
[ 2.325730] hwmon hwmon1: hash matches
[ 2.326050] printk: console [netcon0] enabled
[ 2.326327] netconsole: network logging started
[ 2.326649] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[ 2.329037] modprobe (294) used greatest stack depth: 13376 bytes left
[ 2.330604] Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[ 2.331038] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[ 2.331570] cfg80211: failed to load regulatory.db
[ 2.332000] ALSA device list:
[ 2.332185] No soundcards found.
Boot hangs here for a long time.
Meanwhile, while the guest is still in the kernel, QEMU prints:
qemu-system-x86_64: virtio_iommu_translate no mapping for 0xa86b100 for sid=32
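To find out which device `sid=32` refers to, the ID can be decoded, assuming it uses the standard 16-bit PCI requester ID layout (bus in bits 15:8, device in bits 7:3, function in bits 2:0). That layout is an assumption about how QEMU builds the sid for virtio-iommu-pci, and the helper names are made up for illustration.

```python
def decode_sid(sid):
    """Split a 16-bit PCI requester ID into (bus, device, function)."""
    bus = (sid >> 8) & 0xFF
    device = (sid >> 3) & 0x1F
    function = sid & 0x7
    return bus, device, function


def format_bdf(sid, segment=0):
    """Format a requester ID the way dmesg prints PCI addresses."""
    bus, device, function = decode_sid(sid)
    return f"{segment:04x}:{bus:02x}:{device:02x}.{function:x}"


print(format_bdf(32))
```

Under that assumption, sid=32 is device 4, function 0 on bus 0, which can then be matched against the "Adding to iommu group" lines of the same boot.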
Test interrupt remapping inside the VM?
How is interrupt remapping implemented?
https://michael2012z.medium.com/virtio-iommu-789369049443
This does not seem to be needed:
,iommu_platform=true,disable-legacy=on
Does this parameter really exist?
Analyze it in gdb:
dma_map_page_attrs
$ p dev->dma_ops
$1 = (const struct dma_map_ops *) 0xffffffff824c1960 <iommu_dma_ops>
Why does adding a virtio-iommu cause such a large boot delay?
[ 0.545222] sd 3:0:0:10: Attached scsi generic sg3 type 0
[ 0.545257] sd 3:0:0:10: Power-on or device reset occurred
[ 0.546038] sd 3:0:0:10: [sdd] 3774873600 512-byte logical blocks: (1.93 TB/1.76 TiB)
[ 0.546848] sd 3:0:0:10: [sdd] Write Protect is off
[ 0.547385] sd 3:0:0:10: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 0.679263] sdd: sdd1 sdd2 sdd3
[ 0.679646] sd 3:0:0:10: [sdd] Attached SCSI disk
[ 0.933552] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3
[ 4.217152] virtio-pci 0000:00:04.0: Adding to iommu group 5
[ 4.218889] virtio-pci 0000:00:05.0: Adding to iommu group 6
[ 4.219447] ACPI: \_SB_.LNKA: Enabled at IRQ 10
[ 4.221282] virtio-pci 0000:00:06.0: Adding to iommu group 7
[ 4.223271] virtio-pci 0000:00:07.0: Adding to iommu group 8
[ 4.224752] virtio-pci 0000:00:09.0: Adding to iommu group 9
[ 4.226577] virtio-pci 0000:00:0c.0: Adding to iommu group 10
[ 4.246345] virtio-pci 0000:00:0d.0: Adding to iommu group 11
[ 4.251415] virtio-pci 0000:00:0e.0: Adding to iommu group 12
[ 4.256408] virtio-pci 0000:00:0f.0: Adding to iommu group 13
[ 4.257354] PM: Magic number: 4:70:675
[ 4.257594] misc mpt2ctl: hash matches
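The delay is easy to spot by eye here (0.93 s to 4.21 s), but on a longer log it helps to compute it. A minimal sketch that finds the largest jump between consecutive dmesg timestamps; `largest_gap` is a hypothetical helper and the sample lines are copied from the log above.

```python
import re

TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")

# Three consecutive lines from the log above.
SAMPLE = [
    "[ 0.679646] sd 3:0:0:10: [sdd] Attached SCSI disk",
    "[ 0.933552] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3",
    "[ 4.217152] virtio-pci 0000:00:04.0: Adding to iommu group 5",
]


def largest_gap(lines):
    """Return (gap_seconds, line_before, line_after) for the biggest jump."""
    best = None
    prev_time, prev_line = None, None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # skip lines without a timestamp
        now = float(match.group(1))
        if prev_time is not None:
            gap = now - prev_time
            if best is None or gap > best[0]:
                best = (gap, prev_line, line)
        prev_time, prev_line = now, line
    return best


gap, before, after = largest_gap(SAMPLE)
print(f"{gap:.3f}s between:\n  {before}\n  {after}")
```

In this log the gap sits right before the first virtio-pci device is added to an iommu group.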
So the iommu domain is allocated at PCI probe time:
[ 3.024665] CPU: 0 UID: 0 PID: 69 Comm: kworker/u258:2 Not tainted 6.12.1 #43
[ 3.025136] Hardware name: Martins3 Inc Hacking Alpine, BIOS 12 2022-2-2
[ 3.025136] Workqueue: events_unbound deferred_probe_work_func
[ 3.025136] Call Trace:
[ 3.025136] <TASK>
[ 3.025136] dump_stack_lvl+0x86/0xc0
[ 3.025136] viommu_domain_alloc+0x11/0x90 [virtio_iommu]
[ 3.025136] __iommu_domain_alloc+0x74/0x140
[ 3.025136] iommu_setup_default_domain+0x2d0/0x540
[ 3.028089] __iommu_probe_device+0x388/0x4a0
[ 3.028089] iommu_probe_device+0x24/0x70
[ 3.028089] acpi_dma_configure_id+0x9e/0xc0
[ 3.028089] pci_dma_configure+0x71/0xc0
[ 3.028089] really_probe+0x9d/0x440
[ 3.028089] __driver_probe_device+0x7c/0x140
[ 3.028089] driver_probe_device+0x1e/0x190
[ 3.028089] __device_attach_driver+0x11e/0x1a0
[ 3.028089] ? __pfx___device_attach_driver+0x10/0x10
[ 3.028089] bus_for_each_drv+0x113/0x170
[ 3.028089] __device_attach+0xc7/0x190
[ 3.028089] bus_probe_device+0x9e/0x120
[ 3.028089] deferred_probe_work_func+0x87/0xd0
[ 3.028089] process_scheduled_works+0x1bc/0x3f0
[ 3.028089] worker_thread+0x2c8/0x370
[ 3.028089] ? __pfx_worker_thread+0x10/0x10
[ 3.028089] kthread+0xf8/0x120
[ 3.028089] ? __pfx_kthread+0x10/0x10
[ 3.028089] ret_from_fork+0x37/0x50
[ 3.028089] ? __pfx_kthread+0x10/0x10
[ 3.028089] ret_from_fork_asm+0x1a/0x30
[ 3.028089] </TASK>
[ 3.045092] CPU: 0 UID: 0 PID: 69 Comm: kworker/u258:2 Not tainted 6.12.1 #43
[ 3.045689] Hardware name: Martins3 Inc Hacking Alpine, BIOS 12 2022-2-2
[ 3.046085] Workqueue: events_unbound deferred_probe_work_func
[ 3.046085] Call Trace:
[ 3.046085] <TASK>
[ 3.046085] dump_stack_lvl+0x86/0xc0
[ 3.046085] viommu_attach_dev+0x6a/0x580 [virtio_iommu]
[ 3.046085] ? iommu_create_device_direct_mappings+0x230/0x310
[ 3.046085] __iommu_device_set_domain+0x6e/0x1b0
[ 3.046085] iommu_setup_default_domain+0x434/0x540
[ 3.046085] __iommu_probe_device+0x388/0x4a0
[ 3.046085] iommu_probe_device+0x24/0x70
[ 3.046085] acpi_dma_configure_id+0x9e/0xc0
[ 3.046085] pci_dma_configure+0x71/0xc0
[ 3.046085] really_probe+0x9d/0x440
[ 3.046085] __driver_probe_device+0x7c/0x140
[ 3.046085] driver_probe_device+0x1e/0x190
[ 3.046085] __device_attach_driver+0x11e/0x1a0
[ 3.046085] bus_for_each_drv+0x113/0x170
[ 3.046085] __device_attach+0xc7/0x190
[ 3.046085] bus_probe_device+0x9e/0x120
[ 3.046085] deferred_probe_work_func+0x87/0xd0
[ 3.046085] process_scheduled_works+0x1bc/0x3f0
[ 3.046085] worker_thread+0x2c8/0x370
[ 3.046085] ? __pfx_worker_thread+0x10/0x10
[ 3.046085] kthread+0xf8/0x120
[ 3.046085] ? __pfx_kthread+0x10/0x10
[ 3.046085] ret_from_fork+0x37/0x50
[ 3.046085] ? __pfx_kthread+0x10/0x10
[ 3.046085] ret_from_fork_asm+0x1a/0x30
[ 3.046085] </TASK>
This is also called during boot:
sudo bpftrace -e "kprobe:viommu_device_group { @[kstack] = count(); }"
mdpy
Why does mdpy never trigger this at all? It doesn't even need vfio_pci, so how do its pages get mapped?
sudo bpftrace -e "kprobe:viommu_map_pages { @[kstack] = count(); }"
@[
viommu_map_pages+5
__iommu_map+366
iommu_map+129
vfio_pin_map_dma+338
vfio_iommu_type1_ioctl+3690
__se_sys_ioctl+107
do_syscall_64+237
entry_SYSCALL_64_after_hwframe+119
]: 1857
mdpy_mmap
Using iova_stress also works:
sudo bpftrace -e "kprobe:viommu_attach_dev { @[kstack] = count(); }"
Attaching 1 probe...
^C
@[
viommu_attach_dev+5
__iommu_device_set_domain+110
__iommu_group_set_domain_internal+110
__iommu_take_dma_ownership+420
iommu_group_claim_dma_owner+64
vfio_container_attach_group+140
vfio_group_fops_unl_ioctl+1183
__se_sys_ioctl+107
do_syscall_64+237
entry_SYSCALL_64_after_hwframe+119
]: 1
@[
viommu_attach_dev+5
__iommu_device_set_domain+110
__iommu_group_set_domain_internal+110
iommu_attach_group+125
vfio_iommu_type1_attach_group+405
vfio_fops_unl_ioctl+589
__se_sys_ioctl+107
do_syscall_64+237
entry_SYSCALL_64_after_hwframe+119
]: 1
So together with the teardown path, this is the complete flow:
@[
viommu_domain_free+5
vfio_iommu_type1_detach_group+631
vfio_group_detach_container+82
vfio_group_fops_release+67
__fput+133
task_work_run+137
do_exit+745
do_group_exit+120
get_signal+1707
arch_do_signal_or_restart+142
syscall_exit_to_user_mode+85
do_syscall_64+250
entry_SYSCALL_64_after_hwframe+119
]: 1
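So the rough logic is: some devices must sit in the same domain, while other devices can optionally be configured into the same domain.
How attach works
On the kernel side, viommu_attach_dev sends the VIRTIO_IOMMU_T_ATTACH command to QEMU, and QEMU handles it in virtio_iommu_attach.
Going one step further, it is easy to see how multiple iommu_groups can end up sharing one domain.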
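A toy model of that logic, assuming an ATTACH request carries a domain ID and an endpoint ID and that attaching an already attached endpoint simply rebinds it. `ToyViommu`, `attach_group` and the endpoint IDs are hypothetical; this is a sketch of the idea, not QEMU's virtio_iommu_attach.

```python
class ToyViommu:
    """Toy device-side state: endpoint ID -> domain ID (illustration only)."""

    def __init__(self):
        self.endpoint_domain = {}

    def attach(self, domain_id, endpoint_id):
        """Model of one ATTACH request: (re)bind an endpoint to a domain."""
        self.endpoint_domain[endpoint_id] = domain_id

    def endpoints_in(self, domain_id):
        """All endpoints currently bound to domain_id, sorted."""
        return sorted(
            ep for ep, dom in self.endpoint_domain.items() if dom == domain_id
        )


def attach_group(viommu, domain_id, group_endpoints):
    """Devices of one iommu group always move into a domain together."""
    for endpoint_id in group_endpoints:
        viommu.attach(domain_id, endpoint_id)


viommu = ToyViommu()
group_a = [0xF8, 0xFA, 0xFB]   # three functions of one device, one group
group_b = [0x28]               # a single-device group

# Two different groups attached to the same domain share its mappings.
attach_group(viommu, domain_id=1, group_endpoints=group_a)
attach_group(viommu, domain_id=1, group_endpoints=group_b)
print([hex(ep) for ep in viommu.endpoints_in(1)])
```

The group is the unit that has to move together; the domain is the unit that owns the mappings, so nothing stops several groups from pointing at one domain.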
The problem also shows up with 64k pages, using a 4.19 kernel:
[drm] features: -context_init
virtio-pci 0000:00:0c.0: Adding to iommu group 0
virtio-pci 0000:00:0c.0: granule 0x10000 larger than system page size 0x1000
[drm] number of scanouts: 1
------------[ cut here ]------------
[drm] number of cap sets: 0
WARNING: CPU: 7 PID: 82 at drivers/iommu/iommu.c:2979 iommu_setup_default_domain+0x3c4/0x564
[drm] Initialized virtio_gpu 0.1.0 for 0000:00:08.0 on minor 0
Modules linked in: virtio_gpu(+) virtio_dma_buf drm_shmem_helper drm_kms_helper virtio_console drm i2c_core drm_panel_orientation_quirks 9pnet_virtio virtio_balloon virtio_net net_failover failover virtio_blk virtio_iommu vfat fat nls_iso8859_1 nls_cp437 dm_mod bridge stp llc rpcsec_gss_krb5 af_packet auth_rpcgss oid_registry openvswitch libcrc32c crc32c_generic nsh tun 9p 9pnet netfs nfsv4 dns_resolver nfs lockd grace sunrpc configs efivarfs virtio_pci virtio_pci_legacy_dev virtio_pci_modern_dev virtio virtio_ring autofs4
CPU: 7 UID: 0 PID: 82 Comm: kworker/u56:0 Tainted: G W 6.13.1 #17
Tainted: [W]=WARN
Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
Workqueue: events_unbound deferred_probe_work_func
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : iommu_setup_default_domain+0x3c4/0x564
lr : iommu_setup_default_domain+0x3bc/0x564
sp : ffff800081c6ba60
x29: ffff800081c6ba60 x28: ffff0000c112f600 x27: ffff0000c1351098
x26: ffff8000807e7000 x25: ffff800080976a60 x24: ffff800080e2afdc
x23: 0000000000000000 x22: ffff0000c2737100 x21: ffff0000c2737000
x20: ffff0000c2737048 x19: 00000000ffffffed x18: ffffffffffffffff
x17: 65676170206d6574 x16: 737973206e616874 x15: 2072656772616c20
x14: 3030303031783020 x13: 3030303178302065 x12: 7a69732065676170
x11: 206d657473797320 x10: 6e61687420726567 x9 : 656c756e61726720
x8 : 3a302e63303a3030 x7 : 3a30303030206963 x6 : 702d6f6974726976
x5 : ffff0000fff65908 x4 : 0000000000000000 x3 : 0000000000000000
x2 : 0000000000000000 x1 : 0000000000000000 x0 : 00000000ffffffed
Call trace:
iommu_setup_default_domain+0x3c4/0x564 (P)
__iommu_probe_device+0x2f8/0x3a0
iommu_probe_device+0x34/0x78
acpi_dma_configure_id+0x8c/0x100
pci_dma_configure+0xec/0xf4
really_probe+0x60/0x298
__driver_probe_device+0x78/0x12c
driver_probe_device+0x3c/0x15c
__device_attach_driver+0xb8/0x134
bus_for_each_drv+0x88/0xe8
__device_attach+0xa0/0x190
device_initial_probe+0x14/0x20
bus_probe_device+0xac/0xb0
deferred_probe_work_func+0x88/0xc0
process_one_work+0x144/0x38c
worker_thread+0x27c/0x458
kthread+0xe0/0xe4
ret_from_fork+0x10/0x20
---[ end trace 0000000000000000 ]---
virtio-pci 0000:00:0c.0: enabling device (0000 -> 0003)
virtio-pci 0000:00:0d.0: Adding to iommu group 0
virtio-pci 0000:00:0d.0: granule 0x10000 larger than system page size 0x1000
------------[ cut here ]------------
WARNING: CPU: 7 PID: 82 at drivers/iommu/iommu.c:2979 iommu_setup_default_domain+0x3c4/0x564
Modules linked in: virtio_gpu virtio_dma_buf drm_shmem_helper drm_kms_helper virtio_console drm i2c_core drm_panel_orientation_quirks 9pnet_virtio virtio_balloon virtio_net net_failover failover virtio_blk virtio_iommu vfat fat nls_iso8859_1 nls_cp437 dm_mod bridge stp llc rpcsec_gss_krb5 af_packet auth_rpcgss oid_registry openvswitch libcrc32c crc32c_generic nsh tun 9p 9pnet netfs nfsv4 dns_resolver nfs lockd grace sunrpc configs efivarfs virtio_pci virtio_pci_legacy_dev virtio_pci_modern_dev virtio virtio_ring autofs4
CPU: 7 UID: 0 PID: 82 Comm: kworker/u56:0 Tainted: G W 6.13.1 #17
Tainted: [W]=WARN
Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
Workqueue: events_unbound deferred_probe_work_func
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : iommu_setup_default_domain+0x3c4/0x564
lr : iommu_setup_default_domain+0x3bc/0x564
sp : ffff800081c6ba60
x29: ffff800081c6ba60 x28: ffff0000c112f600 x27: ffff0000c1351098
x26: ffff8000807e7000 x25: ffff800080976a60 x24: ffff800080e2afdc
x23: 0000000000000000 x22: ffff0000c2733500 x21: ffff0000c2733400
x20: ffff0000c2733448 x19: 00000000ffffffed x18: ffffffffffffffff
x17: 65676170206d6574 x16: 737973206e616874 x15: 0773077907730720
x14: 076e076107680774 x13: ffff800080e40b08 x12: 0000000000000a65
x11: 0000000000000377 x10: ffff800080ef0b08 x9 : ffff800080e40b08
x8 : 00000000ffffdfff x7 : ffff800080ef0b08 x6 : 80000000ffffe000
x5 : ffff0000fff65908 x4 : 0000000000000000 x3 : 0000000000000000
x2 : 0000000000000000 x1 : 0000000000000000 x0 : 00000000ffffffed
Call trace:
iommu_setup_default_domain+0x3c4/0x564 (P)
__iommu_probe_device+0x2f8/0x3a0
iommu_probe_device+0x34/0x78
acpi_dma_configure_id+0x8c/0x100
pci_dma_configure+0xec/0xf4
really_probe+0x60/0x298
__driver_probe_device+0x78/0x12c
driver_probe_device+0x3c/0x15c
__device_attach_driver+0xb8/0x134
bus_for_each_drv+0x88/0xe8
__device_attach+0xa0/0x190
device_initial_probe+0x14/0x20
bus_probe_device+0xac/0xb0
deferred_probe_work_func+0x88/0xc0
process_one_work+0x144/0x38c
worker_thread+0x27c/0x458
kthread+0xe0/0xe4
ret_from_fork+0x10/0x20
---[ end trace 0000000000000000 ]---
virtio-pci 0000:00:0d.0: enabling device (0000 -> 0002)
random: crng init done
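Two numbers in the log above are worth decoding: the granule 0x10000 (64 KiB) is larger than the system page size 0x1000 (4 KiB), and x0/x19 hold 0xffffffed. A small sketch of both checks, assuming the granule is the lowest set bit of the IOMMU's page-size bitmap and that the register holds a signed 32-bit error code. The bitmap value 0xffff0000 is made up for illustration; only the granule and the page size come from the log.

```python
import errno


def lowest_page_size(pgsize_bitmap):
    """Smallest page size in a page-size bitmap (its lowest set bit)."""
    return pgsize_bitmap & -pgsize_bitmap


def granule_fits(pgsize_bitmap, system_page_size):
    """True when the IOMMU granule is no larger than the CPU page size."""
    return lowest_page_size(pgsize_bitmap) <= system_page_size


def to_signed32(value):
    """Interpret the low 32 bits of a register value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


# Hypothetical bitmap whose smallest page size is 0x10000, as in the log.
print(granule_fits(0xFFFF0000, 0x1000))

err = to_signed32(0xFFFFFFED)
# errno names are platform dependent; on Linux 19 is ENODEV.
print(err, errno.errorcode.get(-err))
```

Read this way, the warning fires because setting up the default domain failed with -19 once the granule check rejected the 4k guest.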
sudo bpftrace -e "kprobe:viommu_attach_dev { @[kstack] = count(); }"
sudo bpftrace -e "kprobe:viommu_domain_alloc { @[kstack] = count(); }"