   ID-mapped mounts
       Creating an ID-mapped mount makes it possible to change the
       ownership of all files located under a mount.  Thus, ID-mapped
       mounts make it possible to change ownership in a temporary and
       localized way.  It is a localized change because the ownership
       changes are visible only via a specific mount.  All other users
       and locations where the filesystem is exposed are unaffected.  It
       is a temporary change because the ownership changes are tied to
       the lifetime of the mount.
       Whenever callers interact with the filesystem through an ID-
       mapped mount, the ID mapping of the mount will be applied to user
       and group IDs associated with filesystem objects.  This
       encompasses the user and group IDs associated with inodes and
       also the following xattr(7) keys:
       •  security.capability, whenever filesystem capabilities are
          stored or returned in the VFS_CAP_REVISION_3 format, which
          stores a root user ID alongside the capabilities (see
          capabilities(7)).
       •  system.posix_acl_access and system.posix_acl_default, whenever
          user IDs or group IDs are stored in ACL_USER or ACL_GROUP
          entries.
       The following conditions must be met in order to create an ID-
       mapped mount:
       •  The caller must have the CAP_SYS_ADMIN capability in the
          initial user namespace.
       •  The filesystem must be mounted in a mount namespace that is
          owned by the initial user namespace.
       •  The underlying filesystem must support ID-mapped mounts.
          Currently, the xfs(5), ext4(5), and FAT filesystems support
          ID-mapped mounts with more filesystems being actively worked
          on.
       •  The mount must not already be ID-mapped.  This also implies
          that the ID mapping of a mount cannot be altered.
       •  The mount must be a detached mount; that is, it must have been
          created by calling open_tree(2) with the OPEN_TREE_CLONE flag
          and it must not already have been visible in a mount
          namespace.  (To put things another way: the mount must not
          have been attached to the filesystem hierarchy with a system
          call such as move_mount(2).)
       ID mappings can be created for user IDs, group IDs, and project
       IDs.  An ID mapping is essentially a mapping of a range of user
       or group IDs into another or the same range of user or group IDs.
       ID mappings are written to map files as three numbers separated
       by white space.  The first two numbers specify the starting user
       or group ID in each of the two user namespaces.  The third number
       specifies the length of the mapped range.  For example, a mapping
       for user IDs such as "1000 1001 1" would indicate that user ID
       1000 in the caller's user namespace is mapped to user ID 1001 in
       its ancestor user namespace.  Since the map range is 1, only user
       ID 1000 is mapped.
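
       The three-number format described above can be illustrated with a
       small user-space sketch.  The structure and function names here
       (id_map_entry, parse_id_map_entry, map_id_down) are hypothetical
       and chosen only for illustration; the sketch models the
       arithmetic of a single mapping entry, not the kernel's
       implementation.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* One ID-mapping entry; the field names are illustrative only. */
struct id_map_entry {
    uint32_t first;        /* starting ID in the caller's user namespace */
    uint32_t lower_first;  /* starting ID in the ancestor user namespace */
    uint32_t count;        /* number of consecutive IDs covered */
};

/* Parse "first lower_first count".
   Returns 0 on success, -1 if the line is malformed or the count is 0. */
int parse_id_map_entry(const char *line, struct id_map_entry *e)
{
    if (sscanf(line, "%" SCNu32 " %" SCNu32 " %" SCNu32,
               &e->first, &e->lower_first, &e->count) != 3)
        return -1;
    return e->count == 0 ? -1 : 0;
}

/* Translate an ID from the caller's namespace to the ancestor namespace.
   Returns 1 and stores the result in *out if the entry covers id,
   otherwise returns 0 and leaves *out untouched. */
int map_id_down(const struct id_map_entry *e, uint32_t id, uint32_t *out)
{
    if (id < e->first || id - e->first >= e->count)
        return 0;
    *out = e->lower_first + (id - e->first);
    return 1;
}
```

       With the entry "1000 1001 1", map_id_down() translates 1000 to
       1001 and reports every other ID as not covered.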
       It is possible to specify up to 340 ID mappings for each ID
       mapping type.  If any user IDs or group IDs are not mapped, all
       files owned by that unmapped user or group ID will appear as
       being owned by the overflow user ID or overflow group ID
       respectively.
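
       The fallback to the overflow ID can be sketched as a lookup over
       a table of mapping entries.  This is a simplified user-space
       model: the names id_extent and map_or_overflow are hypothetical,
       and the value 65534 is assumed to be the overflow ID (it is the
       usual default; the actual values can be read from
       /proc/sys/kernel/overflowuid and /proc/sys/kernel/overflowgid).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed default overflow ID; the running system may use another value. */
#define OVERFLOW_ID 65534u

/* One mapping entry: count IDs starting at first map to lower_first. */
struct id_extent {
    uint32_t first;
    uint32_t lower_first;
    uint32_t count;
};

/* Translate id through the first entry that covers it.
   IDs covered by no entry are reported as the overflow ID. */
uint32_t map_or_overflow(const struct id_extent *map, size_t n, uint32_t id)
{
    for (size_t i = 0; i < n; i++) {
        if (id >= map[i].first && id - map[i].first < map[i].count)
            return map[i].lower_first + (id - map[i].first);
    }
    return OVERFLOW_ID;
}
```

       For a table holding the entries "0 100000 1000" and
       "1000 5000 1", the IDs 0 through 1000 are translated and every
       other ID is reported as 65534.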
       Further details on setting up ID mappings can be found in
       user_namespaces(7).
       In the common case, the user namespace passed in userns_fd
       (together with MOUNT_ATTR_IDMAP in attr_set) to create an ID-
       mapped mount will be the user namespace of a container.  In other
       scenarios it will be a dedicated user namespace associated with a
       user's login session (as is the case for portable home directories
       in systemd-homed.service(8)).  It is also perfectly fine to
       create a dedicated user namespace for the sake of ID mapping a
       mount.
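
       Putting the conditions above together, creating and attaching an
       ID-mapped mount might look like the following sketch.  The
       function name attach_idmapped_mount and its parameters are
       hypothetical.  The sketch assumes kernel headers recent enough to
       define struct mount_attr and SYS_mount_setattr, invokes the
       system calls through syscall(2) in case the C library provides no
       wrappers, clones only a single mount (AT_RECURSIVE is not used),
       and expects the caller to hold the required privileges and to
       supply a user namespace file descriptor.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Clone the mount at source, ID-map the detached clone with the user
   namespace referred to by userns_fd, then attach it at target.
   Returns 0 on success, or -1 with errno set on failure. */
int attach_idmapped_mount(const char *source, const char *target,
                          int userns_fd)
{
    /* OPEN_TREE_CLONE yields a detached mount, as required above. */
    int fd_tree = (int)syscall(SYS_open_tree, AT_FDCWD, source,
                               OPEN_TREE_CLONE | OPEN_TREE_CLOEXEC);
    if (fd_tree < 0)
        return -1;

    /* Designated initializer: all remaining fields are zero. */
    struct mount_attr attr = {
        .attr_set  = MOUNT_ATTR_IDMAP,
        .userns_fd = (__u64)userns_fd,
    };

    long ret = syscall(SYS_mount_setattr, fd_tree, "", AT_EMPTY_PATH,
                       &attr, sizeof(attr));
    if (ret == 0)
        ret = syscall(SYS_move_mount, fd_tree, "", AT_FDCWD, target,
                      MOVE_MOUNT_F_EMPTY_PATH);

    /* Preserve errno from the failing call across close(). */
    int saved_errno = errno;
    close(fd_tree);
    errno = saved_errno;
    return ret == 0 ? 0 : -1;
}
```

       The ID mapping is applied while the mount is still detached; only
       afterwards is it made visible with move_mount(2).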
       ID-mapped mounts can be useful in the following scenarios, among
       others:
       •  Sharing files or filesystems between multiple users or
          multiple machines, especially in complex scenarios.  For
          example, ID-mapped mounts are used to implement portable home
          directories in systemd-homed.service(8), where they allow
          users to move their home directory to an external storage
          device and use it on multiple computers where they are
          assigned different user IDs and group IDs.  This effectively
          makes it possible to assign random user IDs and group IDs at
          login time.
       •  Sharing files or filesystems from the host with unprivileged
          containers.  This allows a user to avoid having to change
          ownership permanently through chown(2).
       •  ID mapping a container's root filesystem.  Users don't need to
          change ownership permanently through chown(2).  Especially for
          large root filesystems, using chown(2) can be prohibitively
          expensive.
       •  Sharing files or filesystems between containers with non-
          overlapping ID mappings.
       •  Implementing discretionary access control (DAC) permission
          checking for filesystems lacking a concept of ownership.
       •  Efficiently changing ownership on a per-mount basis.  In
          contrast to chown(2), changing ownership of large sets of
          files is instantaneous with ID-mapped mounts.  This is
          especially useful when ownership of an entire root filesystem
          of a virtual machine or container is to be changed as
          mentioned above.  With ID-mapped mounts, a single
          mount_setattr() system call will be sufficient to change the
          ownership of all files.
       •  Taking the current ownership into account.  ID mappings
          specify precisely what a user or group ID is supposed to be
          mapped to.  This contrasts with the chown(2) system call which
          cannot by itself take the current ownership of the files it
          changes into account.  It simply changes the ownership to the
          specified user ID and group ID.
       •  Locally and temporarily restricted ownership changes.  ID-
          mapped mounts make it possible to change ownership locally,
          restricting the ownership changes to specific mounts, and
          temporarily as the ownership changes only apply as long as the
          mount exists.  By contrast, changing ownership via the
          chown(2) system call changes the ownership globally and
          permanently.
   Extensibility
       In order to allow for future extensibility, mount_setattr()
       requires the user-space application to specify the size of the
       mount_attr structure that it is passing.  By providing this
       information, it is possible for mount_setattr() to provide both
       forwards- and backwards-compatibility, with size acting as an
       implicit version number.  (Because new extension fields will
       always be appended, the structure size will always increase.)
       This extensibility design is very similar to other system calls
       such as sched_setattr(2), perf_event_open(2), clone3(2), and
       openat2(2).
       Let usize be the size of the structure as specified by the
       user-space application, and let ksize be the size of the
       structure which the kernel supports.  There are three cases to
       consider:
       •  If ksize equals usize, then there is no version mismatch and
          attr can be used verbatim.
       •  If ksize is larger than usize, then there are some extension
          fields that the kernel supports which the user-space
          application is unaware of.  Because a zero value in any added
          extension field signifies a no-op, the kernel treats all of
          the extension fields not provided by the user-space
          application as having zero values.  This provides backwards-
          compatibility.
       •  If ksize is smaller than usize, then there are some extension
          fields which the user-space application is aware of but which
          the kernel does not support.  Because any extension field must
          have its zero values signify a no-op, the kernel can safely
          ignore the unsupported extension fields if they are all zero.
          If any unsupported extension fields are non-zero, then -1 is
          returned and errno is set to E2BIG.  This provides forwards-
          compatibility.
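
       The three cases can be modeled in user space as follows.  This is
       a sketch of the size-handling rules only, not the kernel's code;
       the function name copy_versioned_struct is hypothetical, and the
       negative return value stands in for the -1/errno convention seen
       by the caller of the system call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Copy a caller-supplied structure of usize bytes into a destination
   of ksize bytes, following the rules above.
   Returns 0 on success or -E2BIG if the caller set a field that the
   receiving side does not know about. */
int copy_versioned_struct(void *dst, size_t ksize,
                          const void *src, size_t usize)
{
    const unsigned char *s = src;
    size_t common = usize < ksize ? usize : ksize;

    /* usize > ksize: the unknown trailing bytes must all be zero. */
    for (size_t i = ksize; i < usize; i++) {
        if (s[i] != 0)
            return -E2BIG;
    }

    /* Copy the part both sides understand. */
    memcpy(dst, src, common);

    /* ksize > usize: fields the caller did not supply read as zero. */
    memset((unsigned char *)dst + common, 0, ksize - common);
    return 0;
}
```

       When ksize equals usize, the loop and the memset(3) call do
       nothing and the structure is copied verbatim.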
       Because the definition of struct mount_attr may change in the
       future (with new fields being added when system headers are
       updated), user-space applications should zero-fill struct
       mount_attr to ensure that recompiling the program with new
       headers will not result in spurious errors at runtime.  The
       simplest way is to use a designated initializer:
           struct mount_attr attr = {
               .attr_set = MOUNT_ATTR_RDONLY,
               .attr_clr = MOUNT_ATTR_NODEV
           };
       Alternatively, the structure can be zero-filled using memset(3)
       or similar functions:
           struct mount_attr attr;
           memset(&attr, 0, sizeof(attr));
           attr.attr_set = MOUNT_ATTR_RDONLY;
           attr.attr_clr = MOUNT_ATTR_NODEV;
       A user-space application that wishes to determine which
       extensions the running kernel supports can do so by conducting a
       binary search on size with a structure which has every byte
       nonzero (to find the largest value which doesn't produce an error
       of E2BIG).
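
       Such a search might be structured as in the following sketch.
       The probe callback is hypothetical: a real probe would pass a
       buffer of the given size with every byte nonzero to
       mount_setattr() and return the resulting errno value (or 0).
       Here fake_probe stands in for a kernel whose structure is assumed
       to be 32 bytes, so that the search logic can be shown in
       isolation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical probe: returns 0 or a positive errno value for size. */
typedef int (*size_probe_fn)(size_t size);

/* Find the largest size in [lo, hi] for which the probe does not
   report E2BIG.  Assumes that lo itself does not produce E2BIG and
   that every size above the supported one does. */
size_t largest_supported_size(size_probe_fn probe, size_t lo, size_t hi)
{
    while (lo < hi) {
        /* Round up so that the range shrinks on every iteration. */
        size_t mid = lo + (hi - lo + 1) / 2;

        if (probe(mid) == E2BIG)
            hi = mid - 1;   /* mid is too large */
        else
            lo = mid;       /* mid is accepted; try larger sizes */
    }
    return lo;
}

/* Stand-in for a kernel that supports a 32-byte structure. */
int fake_probe(size_t size)
{
    return size > 32 ? E2BIG : 0;
}
```

       Errors other than E2BIG are treated as "size accepted", since an
       all-nonzero structure may well be rejected for other reasons
       (for example, with EINVAL).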