io_uring_register(2) — Linux manual page

io_uring_register(2)    Linux Programmer's Manual   io_uring_register(2)

NAME

       io_uring_register - register files or user buffers for
       asynchronous I/O

SYNOPSIS

       #include <liburing.h>

       int io_uring_register(unsigned int fd, unsigned int opcode,
                             void *arg, unsigned int nr_args);

DESCRIPTION

       The io_uring_register(2) system call registers resources (e.g.
       user buffers, files, eventfd, personality, restrictions) for use
       in an io_uring(7) instance referenced by fd.  Registering files
       or user buffers allows the kernel to take long term references to
       internal data structures or create long term mappings of
       application memory, greatly reducing per-I/O overhead.

       fd is the file descriptor returned by a call to
       io_uring_setup(2).  If opcode has the flag
       IORING_REGISTER_USE_REGISTERED_RING ORed into it, fd is instead
       the index of a registered ring fd.

       opcode can be one of:

       IORING_REGISTER_BUFFERS
              arg points to a struct iovec array of nr_args entries.
              The buffers associated with the iovecs will be locked in
              memory and charged against the user's RLIMIT_MEMLOCK
              resource limit.  See getrlimit(2) for more information.
              Additionally, there is a size limit of 1GiB per buffer.
              Currently, the buffers must be anonymous, non-file-backed
              memory, such as that returned by malloc(3) or mmap(2) with
              the MAP_ANONYMOUS flag set.  It is expected that this
              limitation will be lifted in the future. Huge pages are
              supported as well. Note that the entire huge page will be
              pinned in the kernel, even if only a portion of it is
              used.

              After a successful call, the supplied buffers are mapped
              into the kernel and eligible for I/O.  To make use of
              them, the application must specify the
              IORING_OP_READ_FIXED or IORING_OP_WRITE_FIXED opcodes in
              the submission queue entry (see the struct io_uring_sqe
              definition in io_uring_enter(2)), and set the buf_index
              field to the desired buffer index.  The memory range
              described by the submission queue entry's addr and len
              fields must fall within the indexed buffer.

              It is perfectly valid to set up a large buffer and then
              only use part of it for an I/O, as long as the range is
              within the originally mapped region.
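
              The containment rule above can be checked in user space
              before a request is queued. The following is a minimal
              sketch, not part of any library: fixed_range_ok() is a
              hypothetical helper that takes the registered iovec and
              the addr and len values intended for the submission queue
              entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/uio.h>

/* Hypothetical helper: true if [addr, addr + len) lies inside the
 * registered buffer described by buf. The comparison is arranged so
 * that addr + len is never computed and therefore cannot overflow. */
bool fixed_range_ok(const struct iovec *buf, uint64_t addr, uint32_t len)
{
    uint64_t start = (uint64_t)(uintptr_t)buf->iov_base;
    uint64_t end = start + buf->iov_len;

    assert(end >= start);   /* a registered buffer cannot wrap around */
    return addr >= start && addr <= end && len <= end - addr;
}
```

              A request whose range fails this check should not be
              submitted with IORING_OP_READ_FIXED or
              IORING_OP_WRITE_FIXED against that buffer index.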

              An application can increase or decrease the size or number
              of registered buffers by first unregistering the existing
              buffers, and then issuing a new call to
              io_uring_register(2) with the new buffers.

              Note that before 5.13, registering buffers would wait for
              the ring to idle: if the application had requests
              in-flight, the registration waited for those to finish
              before proceeding.

              An application need not unregister buffers explicitly
              before shutting down the io_uring instance. Note, however,
              that shutdown processing may run asynchronously within the
              kernel. As a result, it is not guaranteed that pages are
              immediately unpinned in this case. Available since 5.1.

       IORING_REGISTER_BUFFERS2
              Register buffers for I/O. Similar to
              IORING_REGISTER_BUFFERS but aims to have a more extensible
              ABI.

              arg points to a struct io_uring_rsrc_register, and nr_args
              should be set to the number of bytes in the structure.

               struct io_uring_rsrc_register {
                   __u32 nr;
                   __u32 resv;
                   __u64 resv2;
                   __aligned_u64 data;
                   __aligned_u64 tags;
               };

               The data field contains a pointer to a struct iovec array
               of nr entries.  The tags field should either be 0, in
               which case tagging is disabled, or point to an array of nr
               "tags" (unsigned 64 bit integers). If a tag is zero, then
               tagging for this particular resource (a buffer in this
               case) is disabled. Otherwise, after the resource has been
               unregistered and is no longer in use, a CQE will be
               posted with user_data set to the specified tag and all
               other fields zeroed.

               Note that resource updates, e.g.
               IORING_REGISTER_BUFFERS_UPDATE, don't necessarily
               deallocate resources by the time they return; the
               resources may be kept alive until all requests using them
               complete.

               Available since 5.13.
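
               As an illustration of how the structure might be filled
               in, here is a minimal sketch. The structure is
               re-declared with fixed-width standard types so that the
               fragment builds without kernel headers; real code would
               take it from <linux/io_uring.h>. prep_buffers2() is a
               hypothetical helper, and zeroing the reserved fields is
               an assumption about what the kernel expects.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/* Mirrors the layout shown above, using fixed-width standard types. */
struct io_uring_rsrc_register {
    uint32_t nr;
    uint32_t resv;
    uint64_t resv2;
    uint64_t data;   /* address of a struct iovec array of nr entries */
    uint64_t tags;   /* address of nr 64-bit tags, or 0 for no tagging */
};

/* Hypothetical helper: fills *reg and returns the value for nr_args. */
unsigned int prep_buffers2(struct io_uring_rsrc_register *reg,
                           const struct iovec *iov, const uint64_t *tags,
                           uint32_t nr)
{
    assert(nr == 0 || iov != NULL);
    memset(reg, 0, sizeof(*reg));           /* assumed: reserved fields zero */
    reg->nr = nr;
    reg->data = (uint64_t)(uintptr_t)iov;
    reg->tags = (uint64_t)(uintptr_t)tags;  /* NULL becomes 0: tagging off */
    return (unsigned int)sizeof(*reg);      /* nr_args is a byte count here */
}
```

               The filled structure would be passed as arg with opcode
               IORING_REGISTER_BUFFERS2. A tag of 0 in the tags array
               disables the completion notice for that one buffer.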

       IORING_REGISTER_BUFFERS_UPDATE
              Updates registered buffers with new ones, either turning a
              sparse entry into a real one, or replacing an existing
              entry.

              arg must contain a pointer to a struct
              io_uring_rsrc_update2, which contains an offset on which
              to start the update, and an array of struct iovec.  tags
              points to an array of tags.  nr must contain the number of
              descriptors in the passed in arrays.  See
              IORING_REGISTER_BUFFERS2 for the resource tagging
              description.

               struct io_uring_rsrc_update2 {
                   __u32 offset;
                   __u32 resv;
                   __aligned_u64 data;
                   __aligned_u64 tags;
                   __u32 nr;
                   __u32 resv2;
               };

               Available since 5.13.

       IORING_UNREGISTER_BUFFERS
              This operation takes no argument, and arg must be passed
              as NULL.  All previously registered buffers associated
              with the io_uring instance will be released synchronously.
              Available since 5.1.

       IORING_REGISTER_FILES
              Register files for I/O.  arg contains a pointer to an
              array of nr_args file descriptors (signed 32 bit
              integers).

              To make use of the registered files, the IOSQE_FIXED_FILE
              flag must be set in the flags member of the struct
              io_uring_sqe, and the fd member is set to the index of the
              file in the file descriptor array.

              The file set may be sparse, meaning that the fd field in
              the array may be set to -1.  See
              IORING_REGISTER_FILES_UPDATE for how to update files in
              place.

              Note that before 5.13, registering files would wait for the
              ring to idle: if the application had requests in-flight,
              the registration waited for those to finish before
              proceeding. See IORING_REGISTER_FILES_UPDATE for how to
              update an existing set without that limitation.

              Files are automatically unregistered when the io_uring
              instance is torn down. An application need only
              unregister if it wishes to register a new set of fds.
              Available since 5.1.

       IORING_REGISTER_FILES2
              Register files for I/O. Similar to IORING_REGISTER_FILES.

              arg points to a struct io_uring_rsrc_register, and nr_args
              should be set to the number of bytes in the structure.

              The data field contains a pointer to an array of nr file
              descriptors (signed 32 bit integers).  The tags field
              should either be 0 or point to an array of nr "tags"
              (unsigned 64 bit integers). See IORING_REGISTER_BUFFERS2
              for more info on resource tagging.

              Note that resource updates, e.g.
              IORING_REGISTER_FILES_UPDATE, don't necessarily deallocate
              resources; they might be held until all requests using
              that resource complete.

              Available since 5.13.

       IORING_REGISTER_FILES_UPDATE
              This operation replaces existing files in the registered
              file set with new ones, either turning a sparse entry (one
              where fd is equal to -1) into a real one, removing an
              existing entry (new one is set to -1), or replacing an
              existing entry with a new one.

              arg must contain a pointer to a struct
              io_uring_files_update, which contains an offset on which
              to start the update, and an array of file descriptors to
              use for the update.  nr_args must contain the number of
              descriptors in the passed in array. Available since 5.5.

              File descriptors can be skipped if they are set to
              IORING_REGISTER_FILES_SKIP.  Skipping an fd will not touch
              the file associated with the previous fd at that index.
              Available since 5.12.
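
              To illustrate skipping, the sketch below builds an update
              array that changes two non-adjacent slots and leaves the
              ones in between alone. prep_two_slot_update() is a
              hypothetical helper, and the fallback value -2 for
              IORING_REGISTER_FILES_SKIP is an assumption taken from the
              kernel UAPI header; real code would include
              <linux/io_uring.h> instead.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: same value as in the kernel UAPI header. */
#ifndef IORING_REGISTER_FILES_SKIP
#define IORING_REGISTER_FILES_SKIP (-2)
#endif

/* Hypothetical helper: fills fds[0..n) so that only slot_a and slot_b
 * change. Each new value may be a real descriptor, or -1 to clear. */
void prep_two_slot_update(int *fds, size_t n,
                          size_t slot_a, int fd_a,
                          size_t slot_b, int fd_b)
{
    assert(slot_a < n && slot_b < n);
    for (size_t i = 0; i < n; i++)
        fds[i] = IORING_REGISTER_FILES_SKIP;  /* leave this slot untouched */
    fds[slot_a] = fd_a;
    fds[slot_b] = fd_b;
}
```

              The resulting array would be referenced from the update
              structure, with the offset set to the first table index
              the array covers and nr_args set to n.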

       IORING_REGISTER_FILES_UPDATE2
              Similar to IORING_REGISTER_FILES_UPDATE, replaces existing
              files in the registered file set with new ones, either
              turning a sparse entry (one where fd is equal to -1) into
              a real one, removing an existing entry (new one is set to
              -1), or replacing an existing entry with a new one.

              arg must contain a pointer to a struct
              io_uring_rsrc_update2, which contains an offset on which
              to start the update, and an array of file descriptors to
              use for the update stored in data.  tags points to an
              array of tags.  nr must contain the number of descriptors
              in the passed in arrays.  See IORING_REGISTER_BUFFERS2 for
              the resource tagging description.

              Available since 5.13.

       IORING_UNREGISTER_FILES
              This operation requires no argument, and arg must be
              passed as NULL.  All previously registered files
              associated with the io_uring instance will be
              unregistered. Available since 5.1.

       IORING_REGISTER_EVENTFD
              It's possible to use eventfd(2) to get notified of
              completion events on an io_uring instance. If this is
              desired, an eventfd file descriptor can be registered
              through this operation.  arg must contain a pointer to the
              eventfd file descriptor, and nr_args must be 1. Note that
              while io_uring generally takes care to avoid spurious
              events, they can occur. Similarly, batched completions of
              CQEs may only trigger a single eventfd notification even
              if multiple CQEs are posted. The application should not
              assume that the number of eventfd notifications posted
              correlates directly with the number of events available.
              An eventfd notification must thus only be treated as a hint
              to check the CQ ring for completions. Available since 5.2.

              An application can temporarily disable notifications,
              coming through the registered eventfd, by setting the
              IORING_CQ_EVENTFD_DISABLED bit in the flags field of the
              CQ ring.  Available since 5.8.
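
              On the consumer side, treating the eventfd as a hint
              usually means draining its counter and then reaping the CQ
              ring until it is empty. The sketch below covers only the
              first half; drain_eventfd() is a hypothetical helper, and
              the eventfd is assumed to have been created without
              EFD_SEMAPHORE, so a single read(2) returns the accumulated
              count and resets it.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: reads and resets the eventfd counter. Returns the
 * number of notifications that had accumulated, or -1 on error. */
int64_t drain_eventfd(int efd)
{
    uint64_t count;

    assert(efd >= 0);
    if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return -1;
    return (int64_t)count;
}
```

              Several notifications that arrive before the read are
              folded into one counter value, which is one more reason
              not to derive the number of pending CQEs from it.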

       IORING_REGISTER_EVENTFD_ASYNC
              This works just like IORING_REGISTER_EVENTFD, except
              notifications are only posted for events that complete in
              an async manner. This means that events that complete
              inline while being submitted do not trigger a notification
              event. The arguments supplied are the same as for
              IORING_REGISTER_EVENTFD.  Available since 5.6.

       IORING_UNREGISTER_EVENTFD
              Unregister an eventfd file descriptor to stop
              notifications. Since only one eventfd descriptor is
              currently supported, this operation takes no argument, and
              arg must be passed as NULL and nr_args must be zero.
              Available since 5.2.

       IORING_REGISTER_PROBE
              This operation returns a structure, io_uring_probe, which
              contains information about the opcodes supported by
              io_uring on the running kernel.  arg must contain a
              pointer to a struct io_uring_probe, and nr_args must
              contain the size of the ops array in that probe struct.
              The ops array is of the type io_uring_probe_op, which
              holds the value of the opcode and a flags field. If the
              flags field has IO_URING_OP_SUPPORTED set, then this
              opcode is supported on the running kernel. Available since
              5.6.
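
              A sketch of how the probe result might be consumed
              follows. The two structures and the flag value are local
              mirrors of the kernel UAPI definitions, written from
              memory so that the fragment builds without
              <linux/io_uring.h>; treat the layout as an assumption and
              prefer the real header. alloc_probe() and
              probe_op_supported() are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: local mirrors of the kernel UAPI definitions. */
#define IO_URING_OP_SUPPORTED (1U << 0)

struct io_uring_probe_op {
    uint8_t  op;
    uint8_t  resv;
    uint16_t flags;
    uint32_t resv2;
};

struct io_uring_probe {
    uint8_t  last_op;   /* highest opcode known to the kernel */
    uint8_t  ops_len;   /* number of ops[] entries filled in */
    uint16_t resv;
    uint32_t resv2[3];
    struct io_uring_probe_op ops[];
};

/* Hypothetical helper: zeroed probe with room for nr_ops entries;
 * nr_ops is also the value to pass as nr_args. */
struct io_uring_probe *alloc_probe(unsigned int nr_ops)
{
    return calloc(1, sizeof(struct io_uring_probe) +
                     nr_ops * sizeof(struct io_uring_probe_op));
}

/* Hypothetical helper: true if the kernel reported op as supported. */
bool probe_op_supported(const struct io_uring_probe *p, uint8_t op)
{
    assert(p != NULL);
    for (unsigned int i = 0; i < p->ops_len; i++)
        if (p->ops[i].op == op)
            return (p->ops[i].flags & IO_URING_OP_SUPPORTED) != 0;
    return false;
}
```

              An opcode that does not appear in the returned array at
              all is treated as unsupported here.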

       IORING_REGISTER_PERSONALITY
              This operation registers credentials of the running
              application with io_uring, and returns an id associated
              with these credentials. Applications wishing to share a
              ring between separate users/processes can pass in this
              credential id in the sqe personality field. If set, that
              particular sqe will be issued with these credentials. Must
              be invoked with arg set to NULL and nr_args set to zero.
              Available since 5.6.

       IORING_UNREGISTER_PERSONALITY
              This operation unregisters a previously registered
              personality with io_uring.  nr_args must be set to the id
              in question, and arg must be set to NULL. Available since
              5.6.

       IORING_REGISTER_ENABLE_RINGS
              This operation enables an io_uring ring started in a
              disabled state (IORING_SETUP_R_DISABLED was specified in
              the call to io_uring_setup(2)).  While the io_uring ring
              is disabled, submissions are not allowed and registrations
              are not restricted.

              After the execution of this operation, the io_uring ring
              is enabled: submissions and registration are allowed, but
              they will be validated following the registered
              restrictions (if any).  This operation takes no argument,
              must be invoked with arg set to NULL and nr_args set to
              zero. Available since 5.10.

       IORING_REGISTER_RESTRICTIONS
              arg points to a struct io_uring_restriction array of
              nr_args entries.

              With an entry it is possible to allow an
              io_uring_register(2) opcode, or specify which opcode and
              flags of the submission queue entry are allowed, or
              require certain flags to be specified (these flags must be
              set on each submission queue entry).

              All the restrictions must be submitted with a single
              io_uring_register(2) call and they are handled as an
              allowlist (opcodes and flags not registered, are not
              allowed).

              Restrictions can be registered only if the io_uring ring
              started in a disabled state (IORING_SETUP_R_DISABLED must
              be specified in the call to io_uring_setup(2)).

              Available since 5.10.

       IORING_REGISTER_IOWQ_AFF
              By default, async workers created by io_uring will inherit
              the CPU mask of their parent. This is usually all the CPUs
              in the system, unless the parent is being run with a
              limited set. If this isn't the desired outcome, the
              application may explicitly tell io_uring what CPUs the
              async workers may run on.  arg must point to a cpu_set_t
              mask, and nr_args must be the byte size of that mask.

              Available since 5.14.
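
              Building the mask uses the CPU_SET(3) macros. The
              following is a minimal sketch; prep_iowq_mask() is a
              hypothetical helper that selects a contiguous range of
              CPUs and returns the byte size to pass as nr_args.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical helper: builds a mask covering CPUs [first, first + count)
 * and returns the byte size of the mask. */
size_t prep_iowq_mask(cpu_set_t *mask, int first, int count)
{
    assert(first >= 0 && count > 0 && first + count <= CPU_SETSIZE);
    CPU_ZERO(mask);
    for (int cpu = first; cpu < first + count; cpu++)
        CPU_SET(cpu, mask);
    return sizeof(*mask);
}
```

              The mask would be passed as arg and the returned size as
              nr_args, with opcode IORING_REGISTER_IOWQ_AFF.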

       IORING_UNREGISTER_IOWQ_AFF
              Undoes a CPU mask previously set with
              IORING_REGISTER_IOWQ_AFF.  Must not have arg or nr_args
              set.

              Available since 5.14.

       IORING_REGISTER_IOWQ_MAX_WORKERS
              By default, io_uring limits the number of unbounded workers
              to the maximum process count set by RLIMIT_NPROC, while the
              number of bounded workers is a function of the SQ ring size
              and the number of CPUs in the system. Sometimes this can be
              excessive (or too little, for bounded), and this command
              provides a way to change the count per ring (per NUMA
              node) instead.

              arg must be set to an unsigned int pointer to an array of
              two values, with the values in the array being set to the
              maximum count of workers per NUMA node. Index 0 holds the
              bounded worker count, and index 1 holds the unbounded
              worker count. On successful return, the passed in array
              will contain the previous maximum values for each type. If
              the count being passed in is 0, then this command returns
              the current maximum values and doesn't modify the current
              setting.  nr_args must be set to 2, as the command takes
              two values.

              Available since 5.15.

       IORING_REGISTER_RING_FDS
              Whenever io_uring_enter(2) is called to submit requests or
              wait for completions, the kernel must grab a reference to
              the file descriptor. If the application using io_uring is
              threaded, the file table is marked as shared, and the
              reference grab and put of the file descriptor count is
              more expensive than it is for a non-threaded application.

              Similarly to how io_uring allows registration of files,
              this allows registration of the ring file descriptor
              itself. This reduces the overhead of the io_uring_enter(2)
              system call.

              arg must be set to a pointer to an array of type struct
              io_uring_rsrc_update of nr_args number of entries. The
              data field of this struct must contain an io_uring file
              descriptor, and the offset field can be either -1 or an
              explicit offset desired for the registered file descriptor
              value. If -1 is used, then upon successful return of this
              system call, the field will contain the value of the
              registered file descriptor to be used for future
              io_uring_enter(2) system calls.

              On successful completion of this request, the returned
              descriptors may be used instead of the real file
              descriptor for io_uring_enter(2), provided that
              IORING_ENTER_REGISTERED_RING is set in the flags for the
              system call. This flag tells the kernel that a registered
              descriptor is used rather than a real file descriptor.

              Each thread or process using a ring must register the file
              descriptor directly by issuing this request.

              The maximum number of supported registered ring
              descriptors is currently limited to 16.

              Available since 5.18.

       IORING_UNREGISTER_RING_FDS
              Unregister descriptors previously registered with
              IORING_REGISTER_RING_FDS.

              arg must be set to a pointer to an array of type struct
              io_uring_rsrc_update of nr_args number of entries. Only
              the offset field should be set in the structure,
              containing the registered file descriptor offset
              previously returned from IORING_REGISTER_RING_FDS that the
              application wishes to unregister.

              Note that this isn't done automatically on ring exit, if
              the thread or task that previously registered a ring file
              descriptor isn't exiting. It is recommended to manually
              unregister any previously registered ring descriptors if
              the ring is closed and the task persists. This will free
              up a registration slot, making it available for future
              use.

              Available since 5.18.

       IORING_REGISTER_PBUF_RING
              Registers a shared buffer ring to be used with provided
              buffers. This is a newer alternative to using
              IORING_OP_PROVIDE_BUFFERS which is more efficient, to be
              used with request types that support the
              IOSQE_BUFFER_SELECT flag.

              The arg argument must be filled in with the appropriate
              information. It looks as follows:

                   struct io_uring_buf_reg {
                       __u64 ring_addr;
                       __u32 ring_entries;
                       __u16 bgid;
                       __u16 pad;
                       __u64 resv[3];
                   };

               The ring_addr field must contain the address to the
               memory allocated to fit this ring.  The memory must be
               page aligned and hence allocated appropriately using e.g.
               posix_memalign(3) or similar. The size of the ring is the
               product of ring_entries and the size of struct
               io_uring_buf.  ring_entries is the desired size of the
               ring, and must be a power-of-2 in size. The maximum size
               allowed is 2^15 (32768).  bgid is the buffer group ID
               associated with this ring. SQEs that select a buffer have
               a buffer group associated with them in their buf_group
               field, and the associated CQEs will have
               IORING_CQE_F_BUFFER set in their flags member, which will
               also contain the specific ID of the buffer selected. The
               rest of the fields are reserved and must be cleared to
               zero.

               nr_args must be set to 1.

               Also see io_uring_register_buf_ring(3) for more details.
               Available since 5.19.
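
               The sizing and alignment rules above can be sketched as
               follows. io_uring_buf_reg is re-declared from the text
               with fixed-width standard types; the io_uring_buf layout
               is an assumption written from the kernel UAPI header, and
               real code would include <linux/io_uring.h> for both.
               prep_pbuf_ring() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Mirrors the layout shown above, using fixed-width standard types. */
struct io_uring_buf_reg {
    uint64_t ring_addr;
    uint32_t ring_entries;
    uint16_t bgid;
    uint16_t pad;
    uint64_t resv[3];
};

/* Assumption: ring entry layout as in the kernel UAPI header. */
struct io_uring_buf {
    uint64_t addr;
    uint32_t len;
    uint16_t bid;
    uint16_t resv;
};

/* Hypothetical helper: allocates zeroed, page-aligned ring memory and
 * fills *reg. Returns the ring memory, or NULL if entries is not a valid
 * size or the allocation fails. The caller releases it with free(). */
void *prep_pbuf_ring(struct io_uring_buf_reg *reg, uint32_t entries,
                     uint16_t bgid)
{
    void *ring = NULL;
    size_t size = (size_t)entries * sizeof(struct io_uring_buf);
    long page = sysconf(_SC_PAGESIZE);

    /* entries must be a non-zero power of two, at most 2^15 */
    if (entries == 0 || entries > 32768 || (entries & (entries - 1)) != 0)
        return NULL;
    if (page <= 0 || posix_memalign(&ring, (size_t)page, size) != 0)
        return NULL;

    memset(ring, 0, size);
    memset(reg, 0, sizeof(*reg));   /* pad and resv must be zero */
    reg->ring_addr = (uint64_t)(uintptr_t)ring;
    reg->ring_entries = entries;
    reg->bgid = bgid;
    return ring;
}
```

               The filled structure would be passed as arg with nr_args
               set to 1. The ring memory must stay allocated until the
               ring is unregistered.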

       IORING_UNREGISTER_PBUF_RING
              Unregister a previously registered provided buffer ring.
              arg must be set to the address of a struct
              io_uring_buf_reg, with just the bgid field set to the
              buffer group ID of the previously registered provided
              buffer group.  nr_args must be set to 1. Also see
              IORING_REGISTER_PBUF_RING.

              Available since 5.19.

       IORING_REGISTER_SYNC_CANCEL
              Performs a synchronous cancelation request, which works in
              a similar fashion to IORING_OP_ASYNC_CANCEL except it
              completes inline. This can be useful for scenarios where
              cancelations should happen synchronously, rather than
              needing to issue an SQE and wait for completion of that
              specific CQE.

              arg must be set to a pointer to a struct
              io_uring_sync_cancel_reg structure, with the details
              filled in for what request(s) to target for cancelation.
              See io_uring_register_sync_cancel(3) for details on that.
              The return values are the same, except they are passed
              back synchronously rather than through the CQE res field.
              nr_args must be set to 1.

              Available since 6.0.

       IORING_REGISTER_FILE_ALLOC_RANGE
              Sets the allowable range for fixed file index allocations
              within the kernel. When requests that can instantiate a
              new fixed file are used with IORING_FILE_INDEX_ALLOC, the
              application is asking the kernel to allocate a new fixed
              file descriptor rather than pass in a specific value for
              one. By default, the kernel will pick any available fixed
              file descriptor within the range available.  Registering
              a range effectively allows the application to set aside a
              range just for dynamic allocations, with the remainder
              being used for specific values.

              nr_args must be set to 1 and arg must be set to a pointer
              to a struct io_uring_file_index_range:

                   struct io_uring_file_index_range {
                       __u32 off;
                       __u32 len;
                       __u64 resv;
                   };

               with off being set to the starting value for the range,
               and len being set to the number of descriptors. The
               reserved resv field must be cleared to zero.

               The application must have registered a file table first.

               Available since 6.0.

RETURN VALUE

       On success, io_uring_register(2) returns either 0 or a positive
       value, depending on the opcode used.  On error, a negative error
       value is returned. The caller should not rely on the errno
       variable.
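
       A program that bypasses liburing and invokes the system call
       through syscall(2) gets -1 with errno set on failure instead. A
       thin wrapper can restore the negative-error convention described
       above. This is a minimal sketch: raw_io_uring_register() is a
       hypothetical name, and the fallback syscall number 427 is an
       assumption that holds on most, but not all, architectures.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: 427 on most architectures; only used when the installed
 * headers predate io_uring. */
#ifndef __NR_io_uring_register
#define __NR_io_uring_register 427
#endif

/* Hypothetical helper: returns >= 0 on success, -errno on failure. */
int raw_io_uring_register(unsigned int fd, unsigned int opcode,
                          void *arg, unsigned int nr_args)
{
    long ret = syscall(__NR_io_uring_register, fd, opcode, arg, nr_args);

    if (ret < 0) {
        assert(errno > 0);   /* syscall(2) reports failure through errno */
        return -errno;
    }
    return (int)ret;
}
```

       With this wrapper, callers compare the return value against
       negated error codes such as -EBUSY and never consult errno.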

ERRORS

       EACCES The opcode field is not allowed due to registered
              restrictions.

       EBADF  One or more fds in the fd array are invalid.

       EBADFD IORING_REGISTER_ENABLE_RINGS or
              IORING_REGISTER_RESTRICTIONS was specified, but the
              io_uring ring is not disabled.

       EBUSY  IORING_REGISTER_BUFFERS or IORING_REGISTER_FILES or
              IORING_REGISTER_RESTRICTIONS was specified, but there were
              already buffers, files, or restrictions registered.

       EEXIST The thread performing the registration is invalid.

       EFAULT A buffer is outside of the process's accessible address
              space, or iov_len is greater than 1GiB.

       EINVAL IORING_REGISTER_BUFFERS or IORING_REGISTER_FILES was
              specified, but nr_args is 0.

       EINVAL IORING_REGISTER_BUFFERS was specified, but nr_args exceeds
              UIO_MAXIOV.

       EINVAL IORING_UNREGISTER_BUFFERS or IORING_UNREGISTER_FILES was
              specified, and nr_args is non-zero or arg is non-NULL.

       EINVAL IORING_REGISTER_RESTRICTIONS was specified, but nr_args
              exceeds the maximum allowed number of restrictions or
              restriction opcode is invalid.

       EMFILE IORING_REGISTER_FILES was specified and nr_args exceeds
              the maximum allowed number of files in a fixed file set.

       EMFILE IORING_REGISTER_FILES was specified and adding nr_args
              file references would exceed the maximum allowed number of
              files the user is allowed to have according to the
              RLIMIT_NOFILE resource limit and the caller does not have
              CAP_SYS_RESOURCE capability. Note that this is a per user
              limit, not per process.

       ENOMEM Insufficient kernel resources are available, or the caller
              had a non-zero RLIMIT_MEMLOCK soft resource limit, but
              tried to lock more memory than the limit permitted.  This
              limit is not enforced if the process is privileged
              (CAP_IPC_LOCK).

       ENXIO  IORING_UNREGISTER_BUFFERS or IORING_UNREGISTER_FILES was
              specified, but there were no buffers or files registered.

       ENXIO  Attempt to register files or buffers on an io_uring
              instance that is already undergoing file or buffer
              registration, or is being torn down.

       EOPNOTSUPP
              User buffers point to file-backed memory.

       EFAULT User buffers point to file-backed memory (newer kernels).

COLOPHON

       This page is part of the liburing (A library for io_uring)
       project.  Information about the project can be found at 
       ⟨https://github.com/axboe/liburing⟩.  If you have a bug report for
       this manual page, send it to io-uring@vger.kernel.org.  This page
       was obtained from the project's upstream Git repository
       ⟨https://github.com/axboe/liburing⟩ on 2023-12-22.  (At that
       time, the date of the most recent commit that was found in the
       repository was 2023-12-19.)  If you discover any rendering
       problems in this HTML version of the page, or you believe there
       is a better or more up-to-date source for the page, or you have
       corrections or improvements to the information in this COLOPHON
       (which is not part of the original manual page), send a mail to
       man-pages@man7.org

Linux                          2019-01-17           io_uring_register(2)

Pages that refer to this page: io_uring_enter2(2), io_uring_enter(2), io_uring_register(2), io_uring_setup(2), syscalls(2), io_uring_prep_accept(3), io_uring_prep_accept_direct(3), io_uring_prep_cmd(3), io_uring_prep_fadvise(3), io_uring_prep_files_update(3), io_uring_prep_madvise(3), io_uring_prep_multishot_accept(3), io_uring_prep_multishot_accept_direct(3), io_uring_prep_openat2(3), io_uring_prep_openat2_direct(3), io_uring_prep_openat(3), io_uring_prep_openat_direct(3), io_uring_prep_provide_buffers(3), io_uring_prep_remove_buffers(3), io_uring_prep_splice(3), io_uring_prep_tee(3), io_uring_register_buffers(3), io_uring_register_buffers_sparse(3), io_uring_register_buffers_tags(3), io_uring_register_buffers_update_tag(3), io_uring_register_files(3), io_uring_register_files_sparse(3), io_uring_register_files_tags(3), io_uring_register_files_update(3), io_uring_register_files_update_tag(3), io_uring_register_iowq_aff(3), io_uring_register_iowq_max_workers(3), io_uring_unregister_iowq_aff(3), io_uring(7)