Name: libfabric | Distribution: openSUSE Tumbleweed |
Version: 1.22.0 | Vendor: openSUSE |
Release: 3.1 | Build date: Mon Dec 2 09:47:15 2024 |
Group: Development/Libraries/C and C++ | Build host: reproducible |
Size: 288641 | Source RPM: libfabric-1.22.0-3.1.src.rpm |
Packager: https://bugs.opensuse.org | |
Url: http://www.github.com/ofiwg/libfabric | |
Summary: User-space RDMA Fabric Interfaces |
libfabric provides a user-space API to access high-performance fabric services, such as RDMA. This package only contains the fi_info binary.
License: BSD-2-Clause OR GPL-2.0-only
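Since this package ships only the fi_info binary, the usual way to use it is to list the fabric providers available on a host. The following is a minimal sketch, not part of the package, that runs `fi_info` (optionally with its `-p` provider filter) and groups the `key: value` lines of its default output by provider. The helper names `parse_fi_info` and `run_fi_info` are hypothetical, and `SAMPLE_OUTPUT` is illustrative text in the expected shape, not output captured from a real system.

```python
import shutil
import subprocess

# Illustrative text in the shape of fi_info's default output (an assumption,
# not captured from a real host). Used only when fi_info is not installed.
SAMPLE_OUTPUT = """\
provider: tcp
    fabric: 192.168.1.0/24
    domain: eth0
    type: FI_EP_MSG
provider: shm
    fabric: shm
    domain: shm
    type: FI_EP_RDM
"""


def parse_fi_info(text):
    """Group 'key: value' lines into one dict per 'provider:' block."""
    entries = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not "key: value"
        key, value = key.strip(), value.strip()
        if key == "provider":
            entries.append({})  # a "provider:" line opens a new block
        if entries:
            entries[-1][key] = value
    return entries


def run_fi_info(provider=None):
    """Run fi_info and parse its output; return None if it is not on PATH."""
    exe = shutil.which("fi_info")
    if exe is None:
        return None
    cmd = [exe] + (["-p", provider] if provider else [])
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return parse_fi_info(result.stdout)


if __name__ == "__main__":
    entries = run_fi_info()
    if entries is None:
        # Fall back to the illustrative sample so the sketch still runs.
        entries = parse_fi_info(SAMPLE_OUTPUT)
    for name in sorted({entry["provider"] for entry in entries}):
        print(name)
```

Splitting on the first colon only keeps values that themselves contain colons (for example IPv6-based fabric names) intact.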
* Mon Dec 02 2024 Nicolas Morey <nicolas.morey@suse.com>
- Completely remove building for AVX/AVX2 in PSM3 (bsc#1213538, bsc#1233356, bsc#1234014). Runtime detection before initializing the provider is not enough, as PSM3 uses constructors which may include AVX instructions. Only SSE4.2 is required, as it makes a large performance difference when calculating packet hashes.
- Remove psm3-fix-SIGILL-on-system-not-supporting-AVX.patch
- Add psm3-prevent-code-from-building-using-AVX-AVX2.patch
- Add _constraints to mark SSE4.2 as required
* Thu Nov 28 2024 Nicolas Morey <nicolas.morey@suse.com>
- Add psm3-fix-SIGILL-on-system-not-supporting-AVX.patch to fix SIGILL happening during init on older CPUs (bsc#1213538, bsc#1233356).
- Refresh libfabric-libtool.patch to support patch -p0
* Mon Aug 05 2024 Filip Kastl <filip.kastl@suse.com>
- Add -Wno-incompatible-pointer-types to CFLAGS to enable building for 32-bit with GCC 14.
* Sun Aug 04 2024 Nicolas Morey <nicolas.morey@suse.com>
- Update to 1.22.0 - Coll - Fix Coverity issues - Core - General bug fixes - hmem: change neuron get_dmabuf_fd error code - Fix an error in the error handling path of fi_param_define() - Makefile.am: Add Windows build files to distribution tarball - hmem: disable ZE IPC - Add profile variables for connections and memory allocated - hmem: Fix `cuDeviceCanAccessPeer()` error reporting - man: Update text for `len` parameter - Add page size MR attr field - man: Extend fi_mr_refresh support - man: Improve FI_MR_ALLOCATED documentation - man: Support optional MR desc - man: Improve FI_MR_HMEM documentation - Added ofi_get_realtime interfaces - Add endpoint options for max message size and inject size - Add Windows definition for `EREMOTEIO` - EFA - General improvement and bug fixes - Handle recv cancel for zero copy recv - Avoid iterating EP list in CQ read - Add RDMA core errno for remote unknown peer - Map EFA errnos to Libfabric codes - Improve the zero-copy receive feature - Improve the handshake enforcement
procedure - Support unsolicited rdma-write recv - Support FI_MORE for eager send and rdma-write - Improve the EFA_IO_COMP error code and explanation - Improve the unit test for LL128 protocol - Distinguish max RMA size from msg size - Hooks - dmabuf: Fix incompatible pointer warning - OPX - Add missing file needed for fabric direct build to release package - Fix performance issue caused by not setting ACK bit in the single SDMA packet case - TID cache debug improvements - Detection of driver lack of support for TID - Multi-CTS support for TID - Removal of statement that TID is not supported - OPX Tracer improvements - Improvements to OPX shared memory cleanup - H to H performance improvements for build that supports HMEM - Bug fix for a threshold check - Bug fix for FI_SELECTIVE_COMPLETION - CN5000 fixes - Parameterization of various thresholds - Further enhancements to support NVIDIA GPUs, included CUDA-allocated bounce buffers and in-provider support for GDRCopy - Enhancements to enable support for CN5000 hardware - Better checking for TID support - General TID enhancements - Pkey error handling - Send work queue splitting - Support for OPX tracer for profiling purposes - Coverity scan fixes - Fixes and enhancements to logging and debug messages - Intranode RMA read fixes - Fix compile issues - Fix shared memory segment index creation bug - PSM3 - Update provider to sync with IEFS 11.7.0.0.110 - Improved auto-tuning features for PSM3, including dynamic Credit Flows and detecting the presence of the rv kernel module - Improved PSM3 intra-node performance for large message sizes - SHM - Added support for write() method to submit DSA work - Touch all buffer pages after DSA page fault - Add return and more descriptive error message - Fix coverity about incorrect sign - Fix memory leaks for srx - Fix atomic read - Sockets - Fix Coverity issues - USNIC - Fix a few Coverity issues - Util - Discard outstanding operations in util_srx_close - Enable profile on the size of 
bufpool allocated. - Add more predefined profile variables. - Fix issue while displaying addresses with fi_info -a <addr_format> - fi_pingpong: Fix out of scope memory leak - Add source address to fi_pingpong - Verbs - Flush CQ for SQ on no SQ credit - Optimize search for device max inline size - Enable profiling - Fabtests - pytest/shm: reduce the msg size in test_unexpected_msg - Fix synapseai fabtests build - Add pytests for EFA zero-copy receive - Add benchmark option for `FI_OPT_MAX_MSG_SIZE` - benchmarks: Add synapseai support - Disable fi_rdm_tagged_peek test for ucx and psm3 - Add manual init sync to fi_rdm_multiclient and fi_rdm - Refactor ft_sock_sync to take in a socket - Add fi_rdm_bw test - Skip rma_pingpong write tests - Init rx_buf before sending data - Add rma_pingpong tests to makefile - pytest: use different message sizes for rma pingpong - Fix missing fixture memory_type in test_rma_pingpong_range_no_inject - pytest: account for process startup overhead in client-server tests - pytest: save client process output to a file - Support testing inject with cq data - multinode: update arguments - multi_ep: Fix memory leak - rdm_tagged_peek: Align rx's msg_order with tx's - Add backlog > 0 to listen call * Wed Apr 03 2024 Nicolas Morey <nicolas.morey@suse.com> - Enable ucx and new efa provider on 64b architectures. - Use a single changes file for libfabric and fabtests. 
- Update to 1.21.0 - Core - Various updates and fixes in man pages - Fix xpmem memory corruption - Extend FI_PROVIDER_PATH to allow setting preferred DL provider - Add a SECURITY.md file - Document preferred threading model for scalable endpoints - Move FI_PRIORITY to internal flag - Remove FI_PROV_SPECIFIC - Remove unimplemented or unused features - Support cntr byte counting - configure: Do not check for xpmem if disabled - Add FI_PROGRESS_CONTROL_UNIFIED - hmem/cuda: Get multiple attributes at once in cuda_is_addr_valid - configure: Add -pipe by default to CFLAGS - Selectively generate warnings on failed loading of DL providers - hmem: introduce ofi_dev_reg_copy_*_iov ops - Print provider path on fabric creation - Introduce FI_OPT_SHARED_MEMORY_PERMITTED - README.md: Add badge for openssf scorecard - man: Regulate the fi_setopt call sequence. - man: Clarify the usage of FI_REMOTE_CQ_DATA flag - man: Add ucx provider to the fi_provider man page - configure.ac: add extra check for 128 bit atomic support - include/osd: align atomic complex definitions - hmem/synapseai: Refine the error handling and warning - Specify C11 standard for Visual Studio builds - configure: Do not check for xpmem if disabled - man page fixes - EFA - General improvement and bug fixes - Propagate errnos from core functions untouched - Create 1:1 relationship between libfabric CQs and IBV CQs - Do not progress ep inside transmission call when hitting EAGAIN - Remove unnecessary check in rdma write. - Handle rx pkts error without ope - Add a new rx pkt counter - Enable runting for neuron with a different runt size - Distinguish unresponsive receiver errors - Remove unnecessary handshake in send path - Don't fail the whole domain init if cudamalloc failed - Introduce efa specific domain operations - Implement FI_OPT_SHARED_MEMORY_PERMITTED - Do not memset rxe to 0 on init - Reduce # of error cases in happy path - Add FI_EFA_USE_HUGE_PAGE to efa man page.
- Don't do handshake for local fi_write - Add pingpong test after exhausting MRs - Introduce utilities to exhaust MRs on EFA device - Test EFA with a 1GiB message - Do not abort on all deprecated env vars - Onboard fi_mr_dmabuf API in mem reg ops. - Try registering cuda memory via dmabuf when checking p2p - Introduce HAVE_EFA_DMABUF_MR macro in configure - Use long CTS protocol if long read and runting read protocols fail because of memory registration limits - Remove unnecessary check in rdma write. - Enable runting for neuron with a different runt size - Handle rx pkts error without ope - Distinguish unresponsive receiver errors - Add `efa_show_help()` - Refactor error code definitions - Remove error message assertions from CQ unit tests - Refactor `efa_strerror()` - Doxyfile: Configure tabs to 8 spaces - Rename Doxyfile - Hooks - dmabuf_peer_mem: initialize fd to suppress compiler warning - NETDIR - Removed. The functionality is integrated into the verbs provider. - OPX - Fix compiler warnings and coverity issues - General improvement and bug fixes - Add GPU support to expected TID - RZV RTS packet exclude empty immediate data - Add more efficient check for cuda-resident user buffer - Improve default HFI selection logic in multi rail environments - Flush dead list opportunistically - Add RISC-V support - Make update HDRQ register frequency configurable at build time - Removed all references to the reliability nack threshold env var - Added missing tuneables, rearranged to match fi_info -e output - Use BAR load/store macros - Check HFI driver version to allow GPU-enabled build/run - Added kernel and driver version check to allow/disallow expected receive TID - Fix max SHM connections to allow up to 16 HFIs - Use FI_HMEM_SYSTEM for Cuda-Managed (Unified) memory - Handle FI_OPT_CUDA_API_PERMITTED - Use contiguous send when only one iov present - Always replay TID packets over SDMA - Add Virtual Lane and Partition pkey (FI_OPX_SL and FI_OPX_PKEY) - Forced AV type to
be AV Map when requested AV is unsupported - Reduce size of opx_shm_tx - Add GPU support for RMA Atomic operations - Add GPU support for RMA reads and writes - Add HMEM debug counters - Print debug counters upon receiving SIGUSR1 - Fix multi-receive to work with contiguous rzv payload - Initial support for GPU / FI_HMEM - Limit multipacket eager implementation to tagged sends - Read, verify and store some hfi chip attributes - PSM3 - Update provider to sync with IEFS 11.6.0.0.231 - Fix some conditional build errors - RSTREAM - Removed. - RXM - Add option to auto detect hmem iface of user buffers - SHM - Manually align 8 byte fields in memory region - Close device_fds for connected peers when the EP is closed - Print shm name and error code when failed to open - Mark send as completed when a message is discarded - Don't close dmabuf-fd when a request is done - Revert the smr_region fields adjustment - Fix various coverity issues - Add ep to cq ep list once in cq bind - Add ofi_buf_alloc error handling - Revert the smr_region fields adjustment - Don't close dmabuf-fd when a request is done - Mark send as completed when a message is discarded - Print shm name and error code when failed to open - Close device_fds for connected peers when the EP is closed - SOCKETS - fix compiler warnings and coverity issues - UCX - Fix incorrect enum value in FI_DBG() and FI_WARN() - USNIC - Turn off compiler warnings of possible string truncation - Util - Make ep_list_lock noop for FI_PROGRESS_CONTROL_UNIFIED - Save control progress model to util_domain - Set import monitor state to idle upon close - Add name field to memory monitors - memhooks: Fix a bug when calculating mprotect region - Modify domain_attr based on FI_AV_AUTH_KEY - Verbs - Non-blocking EP creation - Address cm_id resource leak in rdma_reject path - Redirected error handle logic for dmabuf failure in verbs - Added rocr dmabuf support under verbs - Windows: Check error code from GetPrivateData - Add missing lock to 
protect SRX - Fix compiler warnings about out of boundary access - Fabtests - Fix various coverity issues - General improvement and bug fixes - Add multi_ep test - Serialize the run of fi_cq_test - Utilize `junitparser` module directly - Add progress models to SHM/EFA fabtests - Add option to change progress model - efa/rnr_cq_read_err: poll cq when hitting EAGAIN - Allow testing multi_ep with shared/non-shared cq and av - Print warning for HMEM iface init failure - efa: Add small tx_rx size test - pytest: Make ssh connection error pattern less stringent - Add new exclude file for io_uring tests - Add rma_pingpong benchmark - efa: Make 1G tests run faster - pytests: add command line argument for dmabuf reg - Bump Libfabric API version. - Add option to support dmabuf MR - Add dmabuf ops for cuda. - Replace strtok with strtok_r - Add new exclude file for io_uring tests * Mon Mar 25 2024 Nicolas Morey <nicolas.morey@suse.com> - Update to 1.20.1 - Core - hmem/ze: Change the library name passed to dlopen - hmem/ze: map device id to physical device - hmem/ze: skip duplicate initialization - hmem/ze: dynamically allocate device resources based on number of devices - hmem/ze: fix hmem_ze_copy_engine variable look up - hmem/ze: Increase ZE_MAX_DEVICES to 32 - man: Fix typo in fi_getinfo man page - Fix compiler warning when compiling with ICX - man: Fix fi_rxm.7 and fi_collective.3 man pages - man: Update EFA docs for FI_EFA_INTER_MIN_READ_WRITE_SIZE - EFA - efa_rdm_ep_record_tx_op_submitted() rm peer lookup - Remove peer lookup from efa_rdm_pke_sendv() - Make handshake response use txe - test: Only close SHM if SHM peer is Created - Handshake code allocs txe via efa util - Initialize txe.rma_iov_count to 0 - Switch fi_addr to efa_rdm_peer in trigger_handshake - Downgrade EFA Endpoint Creation WARN to INFO - Init srx_ctx before use - Clean up generic_send path - Pass in efa_rdm_ep to efa_rdm_msg_generic_recv() - Make recv path slightly more efficient - re-org rma write to 
avoid duplicate checks - Add missing sync_memops call to writedata - use peer pointer from txe in read, write and send - Pass in peer pointer to txe - Get rid of noop instruction from empty #define - Remove noop memset - Fix the ibv cq error handling. - Don't do handshake for local read - Fix a typo in configure.m4 - Make runt_size aligned - OPX - Initialize cq error data size - RXM - Fix data error with FI_OFI_RXM_USE_RNDV_WRITE=1 - SHM - Fix coverity issue about resource leak - Adjust the order of smr_region fields. - Allocate peer device fds dynamically - Util - Fix coverity issue about missing lock - Implement timeout in util_wait_yield_run() - Fix bug in util_cq startup error case - util_mem_hooks: add missing parentheses - Verbs - Windows: Resolve regression in user data retrieval - Fabtests - efa: Close ibv device after use - efa: Get device MR limit from ibv_query_device - efa: Add simple unexpected test to MR exhaustion test - pytest: add a new ssh connection error pattern
* Thu Feb 29 2024 pgajdos@suse.com
- Use the %autosetup macro. This eliminates the use of the deprecated %patchN macros.
* Sun Nov 19 2023 Nicolas Morey <nicolas.morey@suse.com>
- Update to 1.20.0 (jsc#PED-5777, jsc#PED-5893, jsc#PED-5889) - Core - General bug fixes and code clean-up - configure.ac: add extra check for 128 bit atomic support - hmem/synapseai: Refine the error handling and warning - Introduce FI_ENOMR - hmem/cuda: fix a bug when calculating aligned size. - Handle dmabuf for ofi_mr_cache* functions. - Handle dmabuf flag in ofi_mr_attr_update - Handle dmabuf for mr_map insert.
- man: Fix the description of virtual address when FI_MR_DMABUF is set - man: Clarify the definition of FI_OPT_MIN_MULTI_RECV - hmem/cuda: Add dmabuf fd ops functions - include/ofi_atomic_queue: Properly align atomic values - Define fi_av_set_user_id - Support multiple auth keys per EP - Simplify restricted-dl feature - hmem: Only initialize synapseai if device exists - Add "--enable-profile" option - windows: Updated config.h - Add environment variable for selective HMEM initialization - Add restricted dlopen flag to configure options - hmem: generalize the use of OFI_HMEM_DATA to non-cuda iface - hmem: fail cuda_dev_register if gdrcopy is not enabled - Add 1.7 ABI compat - Define fi_domain_attr::max_ep_auth_key - hmem: Add new op to hmem_ops for getting dmabuf fd - hmem/cuda: Update cuda_gdrcopy_dev_register's signature - mr_cache: Define ofi_mr_info::flags - Add ABI compat for fi_cq_err_entry::src_addr - Define fi_cq_err_entry::src_addr - Add base_addr to fi_mr_dmabuf - hmem: Set FI_HMEM_HOST_ALLOC for ze addr valid - hmem: Support dev reg with FI_HMEM_ZE - tostr: Added fi_tostr() for data type struct fi_cq_err_entry.
- hmem_ze: fix incorrect device id in copy function - Introduce new profiling interface for low-level statistics - hmem: Support dev reg with FI_HMEM_CUDA - hmem: Support dev reg with FI_HMEM_ROCR - hmem: Support dev reg with FI_HMEM_SYSTEM - hmem: Define optimized HMEM memcpy APIs - Implement memhooks atfork child handler - hmem: Support ofi_hmem_get_base_addr with sys mem - hmem: Add length field to ofi_hmem_get_base_addr - mr_cache: Improve cache hit rate - mr_cache: Purge dead regions in find - mr_cache: Update find to remove invalid MR entries - mr_cache: Update find with MM valid check - Add direct support for dma-buf memory registration - man/fi_tagged: Remove the peek for data ability - indexer: Add byte idx abstraction - Add missing FI_REMOTE_CQ_DATA for fi_inject_writedata - Add configure flags for more sanitizers - Fix fi_peer man page inconsistency - include/fi_peer: Add cq_data to rx_entry, allow peer to modify on unexp - Add XPMEM support - EFA - General bug fix and code clean-up - Do not abort on all deprecated env vars - Onboard fi_mr_dmabuf API in mem reg ops. 
- Try registering cuda memory via dmabuf when checking p2p - Introduce HAVE_EFA_DMABUF_MR macro in configure - Add read nack protocol docs - Receiver send NACK if runt read fails with ENOMR - Sender switch to long CTS protocol if runt read fails with ENOMR - Receiver send NACK if long read fails with ENOMR - Update efa_rdm_rxe_map_remove to accept msg_id and addr - Sender switch to long CTS protocol if long read fails with ENOMR - Introduce new READ_NACK feature - Use SHM's full inject size - Add testing for small messages without inject - Enable inject rdma write - Use bounce buffer for 0 byte writes - Onboard ofi_hmem_dev_register API - Update cuda_gdrcopy_dev_register's signature - Allocate pke_vec, recv_wr_vec, sge_vec from heap - Close shm resource when it is disabled in ep - Disable RUNTING for Neuron - Move cuda-sync-memops from MR to EP - Do not insert shm av inside efa progress engine - Enable shm when FI_HMEM and FI_ATOMIC are requested - Adjust posted receive size to pkt_size - Do not create SHM peer when SHM is disabled - Use correct threading model for shm - Restrict RDMA read to compatible EFA devices - Add EFA device version to handshake - Add missing locks in efa_cntr_wait. - Add writedata RNR fabtest - Handle RNRs from RDMA writedata - Check opt_len in efa_rdm_ep_getopt - Use correct tx/rx op_flags for shm - Hooks - dmabuf: Initialize fd to suppress compiler warning - trace: Add log on FI_VAR_UNEXP_MSG_CNT when enabled. - trace: Fixed trace log format on some attributes.
- OPX - Fix compiler warnings - PSM3 - Fix compiler warnings - Update provider to sync with IEFS 11.5.1.1.1 - RXM - Remove unused function - Use gdrcopy in rma when emulating injection - Use gdrcopy in eager send/recv - Add hmem gdrcopy functions - Remove unused dynamic rbuf support - SHM - General bug fixes and cleanup - Add ofi_buf_alloc error handling - Only copy header + msg on unexpected path - Add FI_HMEM atomic support - Add memory barrier before updating resp for atomic - Add more error output - Reduce atomic locking with ofi_mr_map_verify - Only increment tx cntr when inject rma succeeded. - Use peer cntr inc ops in smr_progress_cmd - Allow for inject protocol to buffer more unexpected messages - Change pending fs to bufpool to allow it to grow - Add unexpected SAR buffering - Use generic acronym for shm cap - Move CMA to use the p2p infrastructure - Add p2p abstraction - Load DSA dependency dynamically - Replace tx_lock with ep_lock - Calculate comp vars when writing completion - Move progress_sar above progress_cmd - Rename SAR status enum to be more clear - Make SAR protocol handle 0 byte transfer. 
- Move selection logic to smr_select_proto() - Sockets - Fix compiler warnings - Fix provider name and api version in returned fi_info struct - TCP - Add profiling interface support - Pass through rdm_ep flags to msg eps - Derive cq flags from op and msg flags - Do not progress ep that is disconnected - Set FI_MULTI_RECV for last completed RX slice - Return an error if invalid sequence number received - xnet_progress_rx() must only be called when connected - Reset ep->rx_avail to 0 after RX queue is flushed - Disable the EP if an error is detected for zero-copy - Add debug tracking of transfer entries - Negotiate support for rendezvous - Add rendezvous protocol option - Generalize xnet_send_ack - Flatten protocol header definitions - Remove unused dynamic rbuf support - Define tcp specific protocol ops - Remove unneeded and incorrect rx_entry init code - UCX - Add FI_HMEM support - Initialize ep_flush to 1 - Util - General bug fixes - memhooks: Fix a bug when calculating mprotect region - Check the return value of ofi_genlock_init() - Update checks for FI_AV_AUTH_KEY - Define domain primary and secondary caps - Add profiling util functions - Update util_cq to support err_data - Update ofi_cq_readerr to use new memcpy - Update ofi_cq_err_memcpy to handle err_data - Zero util cancel err entry - Move FI_REMOTE/LOCAL_COMM to secondary caps - Alter domain max_ep_auth_key - Add domain checks for max_ep_auth_key - Revert util_cntr->ep_list_lock to ofi_mutex - Add NIC FID functions to ofi.h - Add EP and domain auth key checking - Add bounds checks to ibuf get - Define dlist_first_entry_or_null - Update util_getinfo to dup auth_key - Revert util_av, util_cq and util_cntr to mutex - Add missing calls to (de)initialize monitor's mutexes - Avoid attempting to cleanup an uninitialized MR cache - Rename ofi_mr_info fields - Add rv64g support to memory hooks - Verbs - Windows: Check error code from GetPrivateData - Add missing lock to protect SRX - Add synapseai dmabuf mr support 
- Bug fix for matching domain name with device name - Windows: Fetch rejected connection data - Add support for DMA-buf memory registration - Windows: Fix use-after-free in case of failure in fi_listen - Windows: Map ND request type to ibverbs opcode - Fix memory leak when creating EQ with unsupported wait object - Track ep state to prevent duplicate shutdown events - Fabtests - Update man page - pytests/efa: onboard dmabuf argument for test_mr - pytest: make do_dmabuf_reg_for_hmem an cmdline argument - Bump Libfabric API version. - mr_test: Add dmabuf support - Introduce ft_get_dmabuf_from_iov - unexpected_msg: Use ft_reg_mr to register memory - pytest: Allow registering mr with dmabuf - Add dmabuf support to ft_reg_mr - Add dmabuf ops for cuda. - Test max inject size - Add FI_HMEM support to fi_rdm_rma_event and fi_rdm tests - memcopy-xe: Fix data verification error for device buffer - dmabuf-rdma: Increase the number of NICs that can be tested - dmabuf-rdma: Remove redundant libze_ops definition - fi-mr-reg-xe: Skip native dmabuf reg test for system memory - Check if fi_info is returned correctly in case of FI_CONNREQ - cq_data: relax CQ data validation to cq_data_size - Add ZE host alloc function - Use common device host buffer for check_buf - hmem_ze: allocate one cq and cl on init - fi-mr-reg-xe: Add testing for dmabuf registration - scripts: use yaml safe_load - macos: Fix build error with clang - multinode: Use FI_DELIVERY_COMPLETE for 'barrier' - Handle partial read scenario for fi_xe_rdmabw test For cross node tests - pytest/efa: add cuda memory marker - pytest/efa: Skip some configuration for unexp msg test on neuron. - runfabtests.py: ignore error due to no tests are collected. 
- pytest/efa: extend unexpected msg test range - pytest/shm: extend unexpected msg test range - pytest: Allow running shm fabtests in parallel - unexpected_msg.c: Allow running the test with FI_DELIVERY_COMPLETE - runfabtests.sh: run fi_unexpected_msg with data validation - pytest/shm: Extend test_unexpected_message - unexpected_msg: Make tx/rx_size large enough - pytest/shm: Extend shm's rma bw test - Update shm.exclude
* Mon Sep 04 2023 Nicolas Morey <nicolas.morey@suse.com>
- Update to 1.19.0 - Core - General code cleanup and restructuring - Add ofi_hmem_any_ipc_enabled() - ofi_consume_iov allows 0-byte consume - ofi_consume_iov consistency - ofi_indexer: return error code when iterating - getinfo: Add post filters for domain and fabric names - Filter loopback device if iface is specified - bsock: Fix error checking for -EAGAIN - windows/osd: Remove unneeded check to silence coverity - windows/osd: Move variable declaration to silence coverity - Introduce gdrcopy awareness to hmem copy - mr/cache: Fix fi_mr_info initialization - hmem_cuda: remove gdrcopy from cuda hmem copy path - iouring: Fix wrong indent in ofi_sockapi_accept_uring() - Implement ofi_sockctx_uring_poll_add() - hmem: introduce gdrcopy from/to cuda iov functions - hmem: Deprecate `FI_HMEM_CUDA_ENABLE_XFER` - hmem_cuda: Restrict CUDA IPC based on peer accessibility - hmem_cuda: Log number of CUDA devices detected - hmem_cuda: Refactor global variables - tostr: Remove the extra dir "shared/" from "include/" and "src/" . - hmem_ze: fix ZE is valid check - hmem_rocr: fix offset calculation - hmem_rocr: use ofi spinlock functions - hmem_rocr: minor fixes - hmem_neuron: convert warn to info for nrt_get_dmabuf_fd not found - hmem_neuron: check existence of neuron devices during initialization - tostr: Moved Windows functions in shared/ofi_str.c to windows/osd.h - tostr: Add helper functions ofi_tostr_size() and ofi_tostr_count().
- EFA - Onboard Peer API, use shm provider as a peer provider - Uses util SRX framework in shared receive procedures. - Register shm MR with hmem_data, allow shm to use gdrcopy for cuda data movement - Finish the refactor for rxr squash. - Use rdma-core WR API for send requests - Check optlen in getopt call - Fix the rdma-read support check in RMA and MSG operations - Optimize ep lock usage - Use an internal fi_mr_attr for memory registration - Hooks - Init field in mr_attr to silence coverity - Add profiling hook provider - Rename cq hooking functions' names - Added trace for resource creation operations - OPX - Initialize ofi_mr_info - Fix dput credit check - Only allocate replay buffer if psn is valid - Support SHM Intra-node communication between single server HFI devices - Fix incorrect packet size in packet header when sending CTS packet - Added check to address Coverity scan defect - Add multi-entry caching to TID rendezvous - Fall back to default domain name for TID fabric - Properly handle multiple IOVs in fi_opx_tsendmsg - Fix OPX Rzv RTS receive operation SHM error (DAOS-related) - Fix non-tagged sends may incorrectly set FI_TAGGED in send completions - Add more info to reliability IOV buffer validation check - Move dput packet build functions to new inline include - Use fi_mr_attr in fi_opx_mr - Disable Pre-NAKing by default, throttle until all outstanding replays ACK'd - Fix reliability bug when NAKing the last PSN - Update HeaderQ Register more frequently - No rbuf_wrap needed for expected receive (TID) - Fixes for Coverity scan issues - Enhanced tag matching - Tune expected recv for unaligned buffers - Observability: Add finer logging granularity - Reduce RTS immediate data and fix packet estimate for odd TID lengths - Add additional sources for FI_OPX_UUID - Peer - Add cq_data to rx_entry, allow peer to modify on unexp - Introduce peer cntr API - Add foreach_unspec_addr API - Add size as an input of the get_tag - PSM3 - Sync with IEFS 11.5.0.0.172 - 
SHM - Only poll IPC list when ROCR IPC is enabled - Allow for SAR and inject protocol to buffer more unexpected messages - Remove unused sar fields - Make SAR protocol handle 0 byte transfer - Load DSA dependency dynamically - Change recv entry freestack into bufpool - Remove shm signal - Use util peer cntr implementation - Make SHM default to domain level threading level - Replace internal shared receive implementation with util_srx - Lock entire progress loop - Fix ROCR data coherency - Add FI_LOCAL_COMM to shm attrs - Handle empty freestack - Fix bug in configure.m4 in atomics_happy assignment happy - Add memory barrier before update resp->status for SAR - Do not use inline/inject for read op - Allow shm to use gdrcopy - Refactor protocol selection code - Init map fi addrs to FI_ADDR_NOTAVAIL - TCP - General code cleanups - Restrict which EPs can be opened per domain - Increase CM error debug output - Avoid calling close() on an invalid socket after accept error - Mark the EP as disconnected before flushing the queues - Add assertion failures for xnet_{monitor,halt}_sock - Disable ofi_dynpoll_wait() for non-blocking progress - Move PEP pollin operations to io_uring - Move EP poll operations to io_uring - Early exit if ofi_bsock_flush() has operation in progress - Implement pollin sockctx in bsock - Add missing call to xnet_submit_uring() - Add return error to xnet_update_pollflag() - Remove the cancel sockctx from the EP structure - Move io_uring cqe from the stack to progress struct - Reduce stack size for epoll event array - handle NULL av in xnet_freeall_conns() - UCX - Publish FI_LOCAL_COMM and FI_REMOTE_COMM capabilities - Fix configure error with newer MOFED - Fix segfault in unsignalled completions - Util - Add FI_PEER support to util counter - Refactor the usage of cntrs - Change util_ep to be a genlock - Add util shared receive implementation - Update log message for invalid AV type message - Fix fi_mr_info initialization - Add peer ID to MR cache - 
Store hmem_data in ofi_mr_map - Split the cq progress and reading entries in ofi_cq_readfrom - Verbs - Add event lock to EQ to serialize closing ep - Remove saved_wc_list and use CQ directly - Consolidate peer_mem and dmabuf support check - Fix vrb_add_credits signature - Introduce new progress engine structure - Simplify (and correct) locking around progress operations - General code restructuring - Fabtests - Fix reading addressing options - Allow to change only the OOB address - Allow to use FI_ADDR_STR with -F - Fix bw buffer utilization - Separate RX and RMA counters - Fix tx counter with RMA - Add FI_CONTEXT mode to rdm_cntr_pingpong - Add HMEM support to fi_unexpected_msg test - Fix array OOB during fabtest list parsing - Enable shm tagged_peek test - Fix windows build warnings - Make tx_buf and rx_buf aligned to 64 bytes by default - Fix windows build warnings for sscanf - Use dummy ft_pin_core on macOS - Fix some header includes - sock_test: Do not use epoll if not available - recv_cancel: initialize error entry - Fix wrong size used to allocate tx_msg_buf - unexpected: change defaults to support tcp - unexpected: add unknown unexpected peer test - Enable a list of arbitrary message sizes - Enabled data validation for rma read & write - bw_rma operates on distinct buffer offsets - ft_post_rma issues reads from remote's tx_buf - General code cleanup and restructuring - rdm_tagged_peek: fix race condition synchronization - Add FI_LOCAL_COMM/FI_REMOTE_COMM presence check to fi_getinfo_test - Correct ft_exchange_keys in prefix-mode - Make rdm_tagged_peek test more general - Add unit test for fi_setopt * Mon Aug 07 2023 Nicolas Morey <nicolas.morey@suse.com> - Drop support for obsolete TrueScale (bsc#1212146) * Mon Jul 03 2023 Nicolas Morey <nicolas.morey@suse.com> - Update to 1.18.1 - Core - Fix build warning for ofi_dynpoll_get_fd - EFA - Handle 0-byte writes - Apply byte_in_order_128_byte for all memory type - Increase default shm_av_size to 256 - Force 
handshake before selecting rtm for non-system ifaces. - Only select readbase_rtm when both sides support rdma-read - Bugfix for initializing SHM offload - Correct CPPFLAGS during configure - Make setopt support sendrecv aligned 128 bytes - Make data size to be 128 byte multiples for in-order aligned send/recv - prepare local read pkt entry for in-order aligned send/recv. - Disable gdrcopy and cudamemcpy for in-order aligned recv. - Increase the pad size in rxr_pkt_entry - Make readcopy pkt pool 128 byte aligned - Introduce alignment to support in order aligned ops - Fix a bug when calling ibv_query_qp_data_in_order - RMA operations will ensure FI_ATOMIC cap - RMA operations will ensure FI_RMA cap - Unittest atomics without FI_ATOMIC cap. - Unittest RMA without FI_RMA cap. - Refactor pkt_entry assignment in poll_ibv loop - Fixes for RDMA Write and Writedata - RXM - Revert rxm util peer CQ support - Fix credit size parameter for flow ctrl - SHM - Fix DSA enable - Assert read op and inject proto are mutually exclusive - Fix ROCR data coherency - Add FI_LOCAL_COMM to shm attrs - Signal peer when peer is out of resources - Handle empty freestack - Fix bug in configure.m4 in atomics_happy assignment happy - Add memory barrier before update resp->status for SAR - Fix resource leak reported by coverity - Switch cmd_ctx pool from freestack to bufpool - Add iface parameter to smr_select_proto - TCP - Fix spinning on fi_trywait() - Handle truncation of active message - Handle prefetched data after reporting ETRUNC error - Progress all ep's on unexp_msg_list when posting recv - Removed unused saved_msg::ep field to fix assert - Continue receiving after truncation error - Create function to allocate internal msg buffer - Add runtime setting for max saved message size - Increase default max_saved value - Dynamically allocate large saved Rx buffers - Separate the max inject and recv buf size - Remove 1-line xnet_cq_add_progress function - Changed default wait object to epoll - 
Handle case where epoll isn't natively supported - Hold domain lock while deregistering memory - Rename DL package from libnet to libtcp - UCX - Align the provider version with the libfabric version - Verbs - Delay device initialization to when fi_getinfo is called - Consolidate peer_mem and dmabuf support check - verbs_nd: Init len to 0 for WCSGetProviderPath call - verbs_nd: Verify CQs are valid in rdma_create_qp - verbs_nd: Initialize ibv_wc fields - verbs_nd: Release lock in network direct error paths - Fix vrb_add_credits signature - Fix credit size parameter for flow ctrl - Recover RXM connection from verbs QP in error state - Fabtests - Add ze-dlopen functions to component tests - Call cudaSetDevice() for selected device - pytest/efa: Adjust get_efa_devices() - pytest/common: Support parallel neuron test - pytest/common: Use different cuda device for parallel cuda set - efa: Test_flood_peer.py increase timeout - pytest/efa: Test to flood peer during startup - fi-rdmabw-xe: Add option to set maximum message size - fi-rdmabw-xe: Add option to set batch size * Thu May 04 2023 Frederic Crozat <fcrozat@suse.com> - Add _multibuild to define additional spec files as additional flavors. Eliminates the need for source package links in OBS. 
* Tue Apr 18 2023 Nicolas Morey <nicolas.morey@suse.com> - Update to 1.18.0 - Core - rocr: fix offset calculation - rocr: use ofi spinlock functions - rocr: minor fixes - neuron: convert warn to info for nrt_get_dmabuf_fd not found - neuron: check existence of neuron devices during initialization - neuron: Add support for neuron dma-buf - ze: update ZE to support new driver index specification - List variables read from config file - Add switch to prefer system-config over environment - Add basic system-config support for setting library variables - Move peer provider defines into new header - rocr: Support asynchronous memory copies - rocr: Add support for ROCR IPC - rocr: rename rocr data-structures - synapseai: return 0 for host_register and host_deregister - fabric: Improve log level of provider mismatch - cuda: Allow CUDA IPC when P2P disabled - ze: add ZE command list pool to reuse command lists - cuda: implement cuda_get_xfer_setting for non cuda build - cuda: adjust FI_HMEM_CUDA_ENABLE_XFER behavior - cuda.c: Add const to param to remove warning - Add IFF_RUNNING check to indicate iface is up and running - io_uring support enhancements - EFA - Implement CUDA support on instance types that do not support GPUDirect RDMA - Implement fi_write using device's RDMA write capability - Enrich error messages with debug and connection info - Implement support for FI_OPT_EFA_USE_DEVICE_RDMA in fi_setopt - Implement support for FI_OPT_CUDA_API_PERMITTED in fi_setopt - Add support for neuron dma-buf - Use gdrcopy to improve the intra-node CUDA communication performance for small messages - Use shm provider's FI_AV_USER_ID support - Fix bugs in efa provider’s shm info initialization procedure - Hooks - dmabuf_peer_mem: Handle IPC handle caching in L0 - trace: Add trace log for CM operation APIs - trace: Change tag in trace log to hex format - trace: Enhance trace log for data transfer API calls - trace: Add trace log for API fi_cq_readerr() - trace: Add trace log for CQ 
operation APIs - Add tracing hook provider - Net - Net provider optimizations have been integrated into the tcp provider. - Net provider has been removed as a reported provider. - OPX - Fixes for Coverity scan issues - Enhanced tag matching - Tune expected recv for unaligned buffers - Add finer logging granularity - Reduce RTS immediate data and fix packet estimate for odd TID lengths - Add additional sources for FI_OPX_UUID - Exclude opx from build if missing needed defines - Move some logs to optimized builds - Fix build warnings for unused return code from posix_memalign - Add reliability sanity check to detect when send buffer is illegally altered - SDMA Completion workaround for driver cache invalidation race condition - Fix replay payload pointer increment - Handle completion counter across multiple writes in SDMA - Cleanup pointers after free() - Modify domain creation to handle soft cache errors - Two biband performance improvements - Fixes based on Coverity Scan related to auto progress patch - Changed poll many argument to rx_caps instead of caps - Resync with server configured for Multi-Engines (DAOS CART Self Tests) - Remove import_monitor as ENOSYS case - Address memory leaks reported on OFIWG issues page - General code cleanup - Add replays over SDMA - Implement basic TID Cache - Revert work_pending check change - Fix use_immediate_blocks - Restore state after replay packet is NULL - Fix memory leak from early arrival packets - Fix segfault in SHM operations from uninitialized value in atomic path - Prevent SDMA work entries from being reused with outstanding replays - Set runtime as default for OPX_AV - Fix RTS replay immediate data - Fix errors caught by the upstream libfabric Coverity Scan - fi_getInfo - Support multiple HFI devices - Support OFI_PORT and Contiguous endpoint addresses for CART & Mercury - Add fi_opx_tid.h to Makefile.include - Fix progress checks and default domain - Revert is_intranode simplification. 
- Don't inline handle_ud_ping function - Allow atomic fetch ops to use SDMA for sufficiently large counts - Cleaned up FI_LOG_LEVEL=warn output - Cleaned up unused macros for FI_REMOTE_COMM and FI_LOCAL_COMM - Reset default progress to FI_PROGRESS_MANUAL - Fixed GCC 10 build error with Auto Progress - Add support for FI_PROGRESS_AUTO - Use max allowed packet size in SDMA path when expected TID is off - Expected receive (TID) rendezvous - RMA Read/Write operations over SDMA - Remove origin_rs from cts and dput packet header - Fix for hang in DAOS CART tests - Use single IOV for bounce buffer in SDMA requests. - Check for FI_MULTI_RECV with bitwise OR instead of AND - Fix for intermittent intra-node deadlock hang (DAOS CART tests) - Fix to RPC transport error failure (DAOS CART tests) - Fix for context->buf set to NULL - Fix bad asserts - Ensure atomicity of atomic ops - fi_opx_cq_poll_inline count and head check fix - Fix intermittent intra-node hang causing RPC timeouts (DAOS CART tests) - PSM3 - Update provider to sync with IEFS 11.4.1.1.2 - Fix warnings from build - Add oneapi ZE support to OFI configure - RXD - Ignore error path in av_close return - RXM - Handle NULL av in rxm_freeall_conns() - Implement the FI_OPT_CUDA_API_PERMITTED option - Write "len" field for remote write - Ignore error path domain_close return - Free coll_pool on ep close - Update rxm to use util_cq FI_PEER support functions - Fix incorrect CQ completion field - Rename srx to msg_srx - Disable FI_SOURCE if not requested - Memory leaks removed - Set offload_coll_mask based on actual configuration - Report on coll offload capabilities with OFI_OFFLOAD_PROV_ONLY - Fabric setups collective offload fabric - Create eq for collective offload provider - Close collective providers ep when rxm_ep is closed - Fix incorrect use of OFI_UNUSED() - Rework collective support to use collective provider(s) - SHM - Fix potential deadlock in smr_generic_rma() - smr_generic_rma() write error completion with 
positive errno - Update SHM to use ROCR - Fix incorrect discard call when cleaning up unexpected queues - Separate smr_generic_msg into msg and tagged recv - Fix start_msg call - Implement the FI_OPT_CUDA_API_PERMITTED option - Assert not valid atomic op - Fix a bug in smr_av_insert - Optimize locking on the SAR path - Remove unneeded sar_cnt - Optimize locking - Enable multiple GPU/interface support - Remove HMEM specific calls from atomic path - Use util_cq FI_PEER support - Import shm as device host memory - Add HMEM flag to smr region - Fix user_id support - Write tx err comp to correct cq - Fix index when setting FI_ADDR_USER_ID - TCP - Provider source has been replaced by net provider source - Removed incorrect reporting of support for FI_ATOMIC - Do not save unmatched messages until we have the peer's fi_addr - Use internal flag for FI_CLAIM messages, versus a reserved tag bit - Fix updating error counter when discarding saved messages - Allow saved messages to be received after the underlying ep has been closed - Enhanced debug logging in connection path - Force CM progress on unconnected ep's when posting data transfers - Support connect and accept calls with io_uring - Fix segfault accessing an invalid fi_addr - Add io_uring support for CM message exchange - Move CM progress from fabric to EQ to improve multi-threaded performance - Fix small memory leak destroying an EQ - Fix race where same rx entry could be freed twice - Handle NULL av in rdm ep cleanup - Reduce stack use for epoll event array - UCX - New provider targeting Nvidia fabrics that layers over libucp - Util - Fix the behavior of cq_read for FI_PEER - rocr: Fix compilation issue - cuda: Use correct debug string calls - Free cq->peer_cq on close - Remove extra new line from av insert log - Check for count = 0 in ofi_ip_av_insert - rocr: Add support for ROCR IPC - Add FI_PEER support to util_cq - Disable FI_SOURCE if not requested - Remove FID events from the EQ when closing endpoint - Rework 
collective support to be a peer collective provider(s) - Allow FI_PEER to pass CQ, EQ and AV attr checking - Remove annoying WARNING message for FI_AFFINITY - Add utility collective provider - Verbs - Implement the FI_OPT_CUDA_API_PERMITTED option - Add support for ROCR IPC - Fabtests - Add fi_setopt_test unit test - Update ze device registration calls - fi-rdmabw-xe: Always use host buffer for synchronization - Fix bug in posting RMA operation - fi_cq_data: Extend test to fi_writedata - fi_cq_data: Extend validation of completion data - Rename fi_msg_inject tests to fi_inject_test to reflect its use - fi_rdm_stress: Add count option to json key/pair options - Add and fix OOB option handling in several tests - fi_eq_test: Fix incorrect return value - fi_rdm_multi_client: Increase the size of ep name buffer - Add FI_MR_RAW to default mr_mode - Support larger control messages needed by newer providers - fi-rdmabw-xe: Update to work with the ucx provider - fi_ubertest: Cleanup allocations in failure cases - Change ft_reg_mr to not assume hmem iface & device - fi_multinode: Bugfix multinode test for ze + verbs - fi_multinode: Remove unused validation print - fi_multinode: Skip tests for unsupported collective operations - fi_ubertest: Fix data validation with device memory - fi_peek_tagged: Restructure and expand test * Mon Mar 20 2023 Nicolas Morey <nicolas.morey@suse.com> - Update to 1.17.1 - Core - hmem_cuda Add const to param to remove warning - Fix typos in fi_ext.h - ofi_epoll: Remove unused hot_index struct member - EFA - Print local/peer addresses for RX write errors - Unit test to verify no copy with shm for small host message - Avoid unnecessary copy when sending data from shm - Compare pci bus id in hints - Fix double free in rxr endpoint init - Hooks - dmabuf_peer_mem: Handle IPC handle caching in L0 - OPX - Exclude from build if missing needed defines - Move some logs to optimized builds - Fix build warnings for unused return code from posix_memalign - Add 
reliability sanity check to detect when send buffer is illegally altered - SDMA Completion workaround for driver cache invalidation race condition - Fix replay payload pointer increment - Handle completion counter across multiple writes in SDMA - Cleanup pointers after free() - Modify domain creation to handle soft cache errors - Two biband performance improvements - Fixes based on Coverity Scan related to auto progress patch - Changed poll many argument to rx_caps instead of caps - Resynch with server configured for Multi-Engines (DAOS CART Self Tests) - Remove import_monitor as ENOSYS case - Address memory leaks reported on OFIWG issues page - Remove unused fields - Fix unwanted print statement case - Add replays over SDMA - Implement basic TID Cache - Revert work_pending check change - Fix use_immediate_blocks - Restore state after replay packet is NULL - Fix memory leak from early arrival packets. - Fix segfault in SHM operations from uninitialized value in atomic path. - Prevent SDMA work entries from being reused with outstanding replays pointing to bounce buf. 
- Set runtime as default for OPX_AV - Fix RTS replay immediate data - Fix errors caught by the upstream libfabric Coverity Scan - Support multiple HFI devices - Support OFI_PORT and Contiguous endpoint addresses - Update man pages - Util - util_cq: Remove annoying WARNING message for FI_AFFINITY * Mon Dec 19 2022 Nicolas Morey <nicolas.morey@suse.com> - Update to 1.17.0 - Core - Add IFF_RUNNING check to indicate iface is up and running - General code cleanups - Add abstraction for common io_uring operations - Support ROCR get_base_addr - Add a 'flags' parameter to fi_barrier() - Introduce new calls for opening domain and endpoint with flags - Add ability to re-sort the fi_info list - Allowing layering of rxm over net provider - General cleanup of provider filtering functions - Add io_uring operations to be used by sockapi - Modify internal handling of async socket operations - Sockets operations are moved to a common sockapi abstraction - Add support for Ze host register/unregister - Add new offload provider type - Rename fi_prov_context and simplify its use - Convert interface prefix string checks to exact checks - EFA - Code cleanups and various bug fixes - Improved debug logging and warnings and assertions - Do not ignore hints->domain_attr->name - Fix the calculation of REQ header size for a packet entry - Fix default value for host memory's max_medium_msg_size - Add tracepoints to send/recv/read ops - Simplified emulated read protocol - Set use_device_rdma according to efa device id - Fix shm initialization path on error - Fix Implementation of FI_EFA_INTER_MIN_READ_MESSAGE_SIZE - Do not enable rdma_read if rxr_env.use_device_rdma is false - Remove de-allocated CUDA memory region during registration - Fix the error handling path of efa_mr_reg_impl() - Fix rxr_ep unit tests involving ibv_cq_ex - Add check of rdma-read capability for synapseai - Report correct default for runt_size parameter - Toggle cuda sync memops via environment variable. 
- Net - Continued fork of tcp provider, will eventually merge changes back - Fix inject support - Fix memory leak in peek/claim path - General code cleanups and bug fixes from initial fork - Allow looking ahead in tcp stream to handle out-of-order messages - Add message tracing ability - Fetch correct ep when posting to a loopback connection - Release lock in case of error in rdm_close - Fix error path in xnet_enable_rdm - Add missing progress lock in srx cleanup - Code restructuring and enhancements with longer term goal of supporting io_uring - Disable the progress thread in most situations - Rename DL from libxnet-fi to libnet-fi - Add missing initialization calls for DL provider - Add support for FI_PEEK, FI_CLAIM, and FI_DISCARD - Include source address with CQ entry - Fix support for FI_MULTI_RECV - OPX - Bug fixes and general code cleanup - Fix progress checks and default domain - Allow atomic fetch ops to use SDMA for sufficiently large counts - Cleaned up FI_LOG_LEVEL=warn output - Reset default progress to FI_PROGRESS_MANUAL - Fixed GCC 10 build error with Auto Progress - Add support for FI_PROGRESS_AUTO - Use max allowed packet size in SDMA path when expected TID is turned off - Expected receive (TID) rendezvous - RMA Read/Write operations over SDMA - Remove origin_rs from cts and dput packet header. - Fix for hang - unable to match inbound packets with receive context->src_addr (DAOS CART tests) - Use single IOV for bounce buffer in SDMA requests. 
- Check for FI_MULTI_RECV with bitwise OR instead of AND - Fix for intermittent intra-node deadlock hang (DAOS CART tests) - Fix to RPC transport error failure (DAOS CART tests) - Fix for context->buf set to NULL - Fix bad asserts - Ensure atomicity of atomic ops - fi_opx_cq_poll_inline count and head check fix - Fix intermittent intra-node hang causing RPC timeouts (DAOS CART tests) - Temporarily reduce SDMA queue ring size for possible driver bug workaround - Fix alignment issue and asserts - Enable more parallel SDMA operations - PSM3 - Synced to IEFS 11.4.0.0.198 - Tech Preview Ubuntu 22.04 Support - Tech Preview Intel DSA Support - Improved Intel GPU Support - Various performance improvements - Various bug fixes - RxM - Always use rendezvous protocol for ZE device memory send - Code cleanup - Add option to free resources on AV removal - SHM - Fix user_id support - Write tx err comp to correct cq - Fix index when setting FI_ADDR_USER_ID - Remove extraneous ofi_cirque_next() call - Add support for FI_AV_USER_ID - Fix multi_recv messaging - General code restructuring for maintainability - Implement shared completion queues - Decouple error processing from cq completion path to avoid switch - Fix incorrect op passed into recv cancel operation - Enhanced SHM implementation with DSA offload - Use multiple SAR buffers per copy operation - Fix ZE IPC race condition on startup - TCP - Minor updates in preparation for io_uring support (via net provider) - Util - Add option to free resources on AV removal - Add 'flags' parameter to new fi_barrier2() call - Add debugging in ofi_mr_map_verify - Rename internal bitmask struct to include ofi prefix - Verbs - Add option to disable dmabuf support - FI_SOCKADDR includes support of FI_SOCKADDR_IB - Fabtests - shared: Expand hmem support - fi_loopback: Add support for tagged messages - fi_mr_test: add support of hmem - fi_rdm_atomic: Fix hmem support - fi_rdm_tagged_peek: Read messages in order, code cleanup and fixes - 
fi_multinode: Add performance and runtime control options, cleanups - benchmarks: Add data verification to some bw tests - fi_multi_recv: Fix possible crash in cleanup - Drop prov-net-fix-error-path-in-xnet_enable_rdm.patch which was merged upstream. * Tue Nov 08 2022 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Add prov-net-fix-error-path-in-xnet_enable_rdm.patch to fix a deadlock when no network interfaces are available (bsc#1205139) * Mon Oct 10 2022 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Update to 1.16.1 - Core - Fix windows implementation to remove fd from poll set - PSM3 - Add missing files to release tarball - Util - Handle NULL address insertion to fi_av_insert - Drop prov-rxm-Disable-128-bit-atomics.patch which was merged upstream * Thu Oct 06 2022 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Add prov-rxm-Disable-128-bit-atomics.patch to fix a potential segfault on misaligned buffers. * Fri Sep 30 2022 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Update to 1.16.0 (jsc#PED-351, jsc#PED-190) - Core - Added HMEM IPC cache - Use exact string comparison checks for network interfaces - Restructuring of poll/epoll abstraction - Add ability to disable locks completely in debug builds - Serialize access to modifying the logging calls - Minor fixes to fi_tostr text formatting - Add hmem interface checks to memory registration - EFA - Added support of Synapse AI memory. 
- Improved error message - Net - Temporarily forked, optimized version of tcp provider - Focused on improved performance and scalability over tcp sockets - Fork ensures tcp provider stability while net provider is developed - Shares the tcp provider protocol and base implementation for msg endpoints - Integrates direct support for rdm endpoints, using a derivative from rxm - Implements own protocol for rdm endpoints, separate from rxm;tcp - OPX - Added initial support for SDMA - General performance enhancements - Performance improvements to reliability protocol - Improved deferred work pending complete - Added support for OPX_AV=runtime - Support iov memory registration ops - Added DAOS RPC support - Atomic ops enhancements - Improved documentation - Debug build enhancements - Fixed compiler warnings - Reduced time to compile prov/opx code - General bug fixes - Fixed PSN wrapping scaling - Added intranode fence - Addressed bugs discovered by coverity scan - PSM2 - Fix sending CQ data in some instances of fi_tsendmsg - PSM3 - Updated to match Intel Ethernet Fabric Suite (IEFS) 11.3 release - RxM - Update to read multiple completions at once from msg provider - Move RxM AV implementation to util code to share with net provider - Minor code cleanups - SHM - Implement and use ipc_cache - Add log messages for debugging and error tracking - Fix check for FI_MR_HMEM mr_mode - Move shm signal handlers initialization to EP - Added log messages for errors detected - TCP - Fix incorrect signaling of the CQ - Increase max number of poll events to retrieve - Acquire ep lock prior to flushing socket in shutdown - Verify ep state prior to progressing socket data - Read cm error data when receiving connreq response - Log error on connect failure - Fix assertion failure in CQ progress function - Util - Fix text in log of UFFD ioctl failure - Introduce cuda ipc monitor - Fix CQ memory leak handling overflow - Fix MR mode bit check for ver 1.5 and greater - Add max_array_size to 
track/check array overflow - Always progress transfers when reading from a CQ - Handle NULL address insertion - Try IPv4 before IPv6 addresses when starting name server - Fix IP util av default address length - Fix util IP getinfo path to read hints->addr_format - Fix debug print mismatch - Fix return code when memory allocation fails. - Fix build sign warning in ofi_bufpool_region_alloc - Minor code cleanups - Print warning if an addr is inserted into an AV again - Verbs - Fix support of FI_SOCKADDR_IB when requested by the application - Ensure all posted receives are flushed to the application - Update ofi_mr_cache_search API for hmem IPC support - Reduce logging verbosity for "no active ports" - Fix incorrect length used in memory registration - Various minor bug fixes for test failures - Fix a memory leak getting IB address - Implement verbs provider on Windows over NetworkDirect API - Set and check address format correctly - Only close qp if it was initialized - Portable detection of loopback device - Fabtests - multi_ep: Separate EP resources and fix MR registration - multi_recv: Fix possible crash and check for valid buffer - unexpected_msg: Fix printf compiler warning - dgram_pingpong.c: Use out-of-band sync - multinode: Make multinode tests platform agnostic, fix formatting - ubertest: Fix string comparison to include length, fix writedata completion check - av_test: add support for -e <ep_type> - New tests: - dmabuf-rdma: Component level test for dma-buf RDMA - sock_test: Component level performance test of poll, epoll, and select - rdm_stress: Multi-threaded, multi-process stress test for RDM endpoints - sighandler_test: Regression test for signal handler restoration - Drop patches fixed upstream: - prov-opx-Correctly-disable-OPX-if-unsupported.patch - disable-flatten-attr.patch * Mon Aug 01 2022 Martin Liška <mliska@suse.cz> - Add disable-flatten-attr.patch that drops flatten attribute. 
Note that the flatten attribute causes a huge compile-time hog in the inliner (and the binary size would be huge as well). - Use %make_build and enable LTO (boo#1133235). - Synchronize used Patches. * Thu Jun 23 2022 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Update to 1.15.1 - Core - Fix fi_info indentation error in fi_tostr - hmem_ze: Add runtime option to choose specific copy engine - Cleanup of configure HMEM checks - Fixed stringop-truncation in ofi_ifaddr_get_speed - Add utility provider log suffix to make logs easier to read - Fix truncation of ipv6 addressing - hmem: add support for AWS Trainium devices - Fix potential sscanf overflows - hmem: pass through device and flags when querying memory interface - Rework locking in several areas to convert spinlocks to mutexes - Add new locking abstractions to select lock types at runtime - Add new FI_PROTO_RXM_TCP for optimized rxm over tcp path - Fix windows implementation to remove fd from poll set - EFA - Added windows support through efawin (https://github.com/aws/efawin) - Added support of AWS neuron. - Added support of using gdrcopy to copy data from host to device. - Fixed a bug that caused 0 byte read to fail. - Fixed a memory corruption issue that could cause a forked process to crash. - Extended testing coverage through new pytest based testing framework. 
- HOOKS - Add new hooking provider dmabuf_peer_mem - Enable DL build of hooking providers - Add HMEM memory registration hook - OPX - New provider supporting Cornelis Networks Omni-path hardware - PSM3 - Updated psm3 to match IEFS 11.2.0.0 release - Added support for sockets (TCP/UDP) via a runtime selectable Hardware Abstraction Layer (HAL) - Added support for IPv6 addressing in RoCE and sockets - Added various NIC selection filtering options (wildcarded NIC name, address format, wildcarded IP subnet, link speed) - Performance tuning in conjunction with OneAPI and OneCCL - Improved PSM3_IDENTIFY output - Rename most internal symbols to psm3_ - Corrected vulnerabilities found during Coverity scans - configure options refined and help text improved - PSM3_MULTI_EP has been deprecated (recommend always enabled, default is enabled [same default as previous releases]) - Various bug fixes - RxM - Add check that atomic size is valid - Add support to passthru calls to tcp provider in specific - TCP - Add assert to verify RMA source/target msg sizes match - Wake-up threads blocked on CQ to update their poll events - Fix use of incorrect events in progress handler - Fixes for various compile warnings, mostly on Windows - Add support for FI_RMA_EVENT capability - Add support for completion counters - Fix check for CQ data in tagged messages - Add cancel support to shared rx context - Add src_addr receive buffer matching - Add provider control to assign a src_addr with an ep - Handle trecv with FI_PEEK flag - Allow binding a CQ with an SRX - Restructuring of code in source files - Handle EWOULDBLOCK returned by send call - Add hot (active) pollfd - SHM - Properly chain the original signal handlers - Avoid uninitialized variable with invalid atomic parameters - Fix 0 byte SAR read - Initialize len parameter to accept - Refactor and simplify protocol code - Remove broken support for 128-bit atomics - Fix FI_INJECT flag support - Add assert to verify RMA source/target msg sizes 
match - Set domain threading to thread safe - Fix possible use of uninitialized var in av_insert - Util - Fix sign warning in ofi_bufpool_region_alloc - Remove unused variable from ofi_bufpool_destroy - Fix check for valid datatype in ofi_atomic_valid - Return with error if util_coll_sched_copy fails - Fix use of uninitialized variable in ofi_ep_allreduce - Fix memory access in ip_av_insertsym - Track ep per collective operation not with multicast - Restructure collective av set creation/destruction - Change most locks from spin locks to mutexes - Allow selection of spinlocks for CQ and domain objects - Fix AV default addrlen - Update fi_getinfo checks to include hints->addr_ - Handle NULL address insertion to fi_av_insert - Verbs - Initial changes for compiling on Windows (via NetworkDirect) - Add a failover path to dma-buf based memory registration - Replace use of spin locks with mutexes - Check for valid qp prior to cleanup - Set and check for address format correctly in fi_getinfo - Fabtests - hmem_cuda: used device allocated host buff to fill device buf - Add python scripts to control test execution - test_configs: include util provider in core config file - Add option "--pin-core" - Only call nrt_init once - Fix a bug in ft_neuron_cleanup - Correct help for unit test programs - Remove duplicate help prints from fi_mcast - configure.ac: fix --enable-debug=no not properly detected - msg_inject: handle the case ft_tsendmsg return -FI_EAGAIN - Add AWS Trainium device support - fi_inj_complete: Add FI_INJECT to fabtests - inj_complete.c: Make arguments align with the other tests - dgram_pingpong: handle the error return of fi_recv - recv_cancel: Remove requirement for unexpected msg handling - poll: Fix crash if unable to allocate pollset - ubertest: Add GPU testing and validation support - Add HMEM options parsing support - Update and re-enable fi_multi_ep test - Add prov-opx-Correctly-disable-OPX-if-unsupported.patch to disable OPX compilation on non x86_64 systems 
* Tue Apr 19 2022 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Update to 1.14.1 - Core - Use non-shared memory allocations to use MADV_DONTFORK safely - Fix incorrect use of gdr_copy_from_mapping - Ensure proper timeout time for pollfds to avoid early exit - EFA - Handle read completion properly for multi_recv - Use shm's inject write when possible - Support 0 byte read - RxM - Ensure signaling the CQ fd after writing completion - Fix inject path for sending tagged messages with cq data - Negotiate credit based flow control support over CM - Add PID to CM messages to detect stale vs duplicate connections - Fix race handling unexpected messages from unknown peers - Fix possible leak of stack data in cm_accept - Restrict reported caps based on core provider - Delay starting listen until endpoint fully initialized - Verify valid atomic size - Sockets - Fix coverity reports on uninitialized data - Check for NULL pointers passed to memcpy - Add missing error return code from sock_ep_enable - TCP - Fix performance regression resulting from sparse pollfd sets - Fix assertion failure in CQ progress function - Do not generate error completions for inject msgs - Fix use of incorrect event names in progress handler - Fix check for CQ data in tagged messages - Make start_op array a static to reduce memory - Wake-up threads blocked on CQ to update their poll events - Verbs - Generate error completions for all failed transmits - Set all fields in the fi_fabric_attr for FI_CONNREQ events - Set proper completion flags for all failed transfers - Ensure that all attributes are provided when opening an endpoint - Fix error handling in vrb_eq_read - Fix memory leak in error case in vrb_get_sib - Work-around bug in verbs HW not reporting correct send opcodes - Only call ibv_reg_dmabuf_mr when kernel support exists - Add a failover path to dma-buf based memory registration - Negotiate credit based flow control support over CM * Mon Nov 22 2021 Nicolas Morey-Chaisemartin 
<nmoreychaisemartin@suse.com> - Update to 1.14.0 - Add time stamps to log messages - Fix gdrcopy calculation of memory region size when aligned - Allow user to disable use of p2p transfers - Update fi_tostr to print FI_SHARED_CONTEXT text instead of value - Update fi_tostr to output field names matching header file names - Fix narrow race condition in ofi_init - Add new fi_log_sparse API to rate limit repeated log output - Define memory registration for buffers used for collective operations - EFA, SHM, TCP, RXM, and verbs fixes * Wed Nov 03 2021 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Enable PSM3 provider (jsc#SLE-18754) * Fri Oct 29 2021 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Update to 1.13.2 - Sort DL providers to ensure consistent load ordering - Update hooking providers to handle fi_open_ops calls to avoid crashes - Replace cassert with assert.h to avoid C++ headers in C code - Enhance serialization for memory monitors to handle external monitors - EFA, SHM, TCP, RxM and verbs fixes * Wed Aug 25 2021 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Update to 1.13.1 - Enable loading ZE library with dlopen() - Add IPv6 support to fi_pingpong - EFA, PSM3 and SHM fixes * Wed Jul 07 2021 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Update to 1.13.0 - Fix behavior of fi_param_get parsing an invalid boolean value - Add new APIs to open, export, and import specialized fid's - Define ability to import a monitor into the registration cache - Add API support for INT128/UINT128 atomics - Fix incorrect check for provider name in getinfo filtering path - Allow core providers to return default attributes which are lower than the maximum supported attributes in getinfo call - Add option to prefer external providers (in order discovered) over internal providers, regardless of provider version - Separate Ze (level-0) and DRM dependencies - Always maintain a list of all discovered providers - Fix incorrect CUDA warnings - Fix 
bug in cuda init/cleanup checking for gdrcopy support - Shift order providers are called from in fi_getinfo, move psm2 ahead of psm3 and efa ahead of psmX - See NEWS.md for changelog * Fri Apr 02 2021 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Update to 1.12.1 - Fix initialization checks for CUDA HMEM support - Fail if a memory monitor is requested but not available - Adjust priority of psm3 provider to prefer HW specific providers, such as efa and psm2 - EFA and PSM3 fixes - See NEWS.md for changelog * Tue Mar 09 2021 Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> - Update to 1.12.0 - See NEWS.md for changelog
/usr/bin/fi_info
/usr/bin/fi_pingpong
/usr/bin/fi_strerror
/usr/share/doc/packages/libfabric
/usr/share/doc/packages/libfabric/NEWS.md
/usr/share/licenses/libfabric
/usr/share/licenses/libfabric/COPYING
/usr/share/man/man1/fi_info.1.gz
/usr/share/man/man1/fi_pingpong.1.gz
/usr/share/man/man1/fi_strerror.1.gz
Generated by rpm2html 1.8.1
Fabrice Bellet, Wed Dec 18 00:32:28 2024