User visible objects are backed by following datastructures:
Several terminologies when looking at these datastructures:
Automatic domain - refers to an iommu domain created automatically when attaching a device to an IOAS object. This is compatible to the semantics of VFIO type1.
Manual domain - refers to an iommu domain designated by the user as the target pagetable to be attached to by a device. Though currently there are no uAPIs to directly create such domain, the datastructure and algorithms are ready for handling that use case.
In-kernel user - refers to something like a VFIO mdev that is using the IOMMUFD access interface to access the IOAS. This starts by creating an iommufd_access object that is similar to the domain binding a physical device would do. The access object will then allow converting IOVA ranges into struct page * lists, or doing direct read/write to an IOVA.
iommufd_ioas serves as the metadata datastructure to manage how IOVA ranges are mapped to memory pages, composed of:
Each iopt_pages represents a logical linear array of full PFNs. The PFNs are ultimately derived from userspace VAs via an mm_struct. Once they have been pinned the PFNs are stored in IOPTEs of an iommu_domain or inside the pinned_pfns xarray if they have been pinned through an iommufd_access.
PFN have to be copied between all combinations of storage locations, depending on what domains are present and what kinds of in-kernel “software access” users exist. The mechanism ensures that a page is pinned only once.
An io_pagetable is composed of iopt_areas pointing at iopt_pages, along with a list of iommu_domains that mirror the IOVA to PFN map.
Multiple io_pagetable-s, through their iopt_area-s, can share a single iopt_pages which avoids multi-pinning and double accounting of page consumption.
iommufd_ioas is shareable between subsystems, e.g. VFIO and VDPA, as long as devices managed by different subsystems are bound to a same iommufd.