LLFIO v2.00

LLFIO on GitHub | API Documentation | CTest summary dashboard | Prebuilt binaries
Herein lies my proposed zero whole machine memory copy file i/o and filesystem library for the C++ standard, intended for storage devices with ~1 microsecond 4Kb transfer latencies and those supporting Storage Class Memory (SCM)/Direct Access Storage (DAX). Its i/o overhead, including syscall overhead, has been benchmarked to 100 nanoseconds on Linux, which corresponds to a theoretical maximum of 10M IOPS @ QD1, approx 40Gb/sec per thread. It has particularly strong support for writing portable filesystem algorithms which work well with directly mapped non-volatile storage such as Intel Optane.
It is a complete rewrite after a Boost peer review in August 2015. LLFIO is the reference implementation for these C++ standardisations:
- `llfio::path_view` is expected to enter the C++ 26 standard (P1030).
- `llfio::file_handle` and `llfio::mapped_file_handle` are on track for entering the C++ 26 standard (P1883).

Other characteristics:

- On some Linux distributions, libstdc++ does not provide `<codecvt>` when building in C++ 20, so linking LLFIO programs against libstdc++ there in C++ 20 will fail. Either use a different STL, manually rebuild libstdc++, or use C++ 17.

Examples of use (more examples: https://github.com/ned14/llfio/tree/develop/example):

Memory map in a file for read:

```c++
// Open the mapped file for read
llfio::mapped_file_handle mh = llfio::mapped_file(
  {},       // path_handle to base directory
  "foo"     // path_view to path fragment relative to base directory
            // default mode is read only
            // default creation is open existing
            // default caching is all
            // default flags is none
).value();  // If failed, throw a filesystem_error exception

// The current length of the file, and thus of the map
auto length = mh.maximum_extent().value();

// Find my text
for(char *p = reinterpret_cast<char *>(mh.address());
    (p = (char *) memchr(p, 'h', reinterpret_cast<char *>(mh.address()) + length - p));
    p++)
{
  if(0 == strncmp(p, "hello", 5))  // does the match begin with "hello"?
  {
    std::cout << "Happy days!" << std::endl;
  }
}
```

Sparsely allocate a one trillion element integer array backed by a temporary inode:

```c++
// Make me a 1 trillion element sparsely allocated integer array!
llfio::mapped_file_handle mfh = llfio::mapped_temp_inode().value();

// On an extents based filing system, doesn't actually allocate any physical
// storage but does map approximately 4Tb of all bits zero data into memory
mfh.truncate(1000000000000ULL * sizeof(int)).value();

// Create a typed view of the one trillion integers
llfio::attached<int> one_trillion_int_array(mfh);

// Write and read as you see fit, if you exceed physical RAM it'll be paged out
one_trillion_int_array[0] = 5;
one_trillion_int_array[999999999999ULL] = 6;
```

Read metadata about a file and its storage:

```c++
// Open the file for read
llfio::file_handle fh = llfio::file(
  {},       // path_handle to base directory
  "foo"     // path_view to path fragment relative to base directory
            // default mode is read only
            // default creation is open existing
            // default caching is all
            // default flags is none
).value();  // If failed, throw a filesystem_error exception

// warning: default constructor does nothing. If you want all bits
// zero, construct with `nullptr`.
llfio::stat_t fhstat;
fhstat.fill(
  fh        // file handle from which to fill stat_t
            // default stat_t::want is all
).value();
std::cout << "The file was last modified on "
          << print(fhstat.st_mtim) << std::endl;

// Note that default constructor fills with all bits one. This
// lets you do partial fills and usually detect what wasn't filled.
llfio::statfs_t fhfsstat;
fhfsstat.fill(
  fh        // file handle from which to fill statfs_t
            // default statfs_t::want is all
).value();
std::cout << "The file's storage has "
          << (fhfsstat.f_bfree * fhfsstat.f_bsize)
          << " bytes free." << std::endl;
```
See https://github.com/ned14/llfio/blob/master/programs/fs-probe/fs_probe_results.yaml for a database of latencies for various previously tested operating systems, filing systems and storage devices.
Todo list for already implemented parts: https://ned14.github.io/llfio/todo.html
NEW in v2 | Boost peer review feedback | |
---|---|---|
✔ | ✔ | Universal native handle/fd abstraction instead of `void *`. |
✔ | ✔ | Minimal memory (de)allocation per op (usually none). |
✔ | ✔ | noexcept API throughout returning error_code for failure instead of throwing exceptions. |
✔ | ✔ | LLFIO v1 handle type split into a hierarchy of types. |
✔ | ✔ | Cancelable i/o (made possible thanks to dropping XP support). |
✔ | ✔ | All shared_ptr usage removed as all use of multiple threads removed. |
✔ | ✔ | Use of std::vector to transport scatter-gather sequences replaced with C++ 20 span<> borrowed views. |
✔ | ✔ | Completion callbacks are now some arbitrary type U&& instead of a future continuation. Type erasure for its storage is bound into the one single memory allocation for everything needed to execute the op, and so therefore overhead is optimal. |
✔ | ✔ | Filing system algorithms made generic and broken out into public llfio::algorithms template library (the LLFIO FTL). |
✔ | ✔ | Abstraction of native handle management via bitfield specified "characteristics". |
✔ | | Storage profiles, a YAML database of behaviours of hardware, OS and filing system combinations. |
✔ | | Absolute and interval deadline timed i/o throughout (made possible thanks to dropping XP support). |
✔ | | Dependency on ASIO/Networking TS removed completely. |
✔ | | Four choices of algorithm implementing a shared filing system mutex. |
✔ | | Uses CMake, CTest, CDash and CPack with automatic usage of C++ Modules or precompiled headers where available. |
✔ | | Far more comprehensive memory map and virtual memory facilities. |
✔ | | Much more granular, micro level unit testing of individual functions. |
✔ | | Much more granular, micro level internal logging of every code path taken. |
✔ | | Path views used throughout, thus avoiding string copying and allocation in `std::filesystem::path` (sketched below this table). |
✔ | | Paths are equally interpreted as UTF-8 on all platforms. |
✔ | | We never store nor retain a path, as they are inherently racy and are best avoided. |
✔ | ✔ | Parent handle caching is no longer hard coded in; it is now an optional user applied templated adapter class. |
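
To illustrate the path view entry above: an `llfio::path_view` simply refers to the caller's characters, so no string is copied or allocated. A minimal sketch, assuming the same "foo" file as in the earlier examples:

```c++
// Views the string literal in place, unlike constructing a std::filesystem::path
llfio::path_view pv("foo");

// Any LLFIO function taking a path accepts a path_view directly
llfio::file_handle fh = llfio::file({}, pv).value();
```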
Todo:
NEW in v2 | Boost peer review feedback | |
---|---|---|
✔ | | clang AST assisted SWIG bindings for other languages. |
✔ | | Statistical tracking of operation latencies so realtime IOPS can be measured. |
NEW in v2 | Windows | POSIX | |
---|---|---|---|
✔ | ✔ | ✔ | Native handle cloning. |
✔ (up from four) | ✔ | ✔ | Maximum possible (seven) forms of kernel caching. |
| | ✔ | ✔ | Absolute path open. |
| | ✔ | ✔ | Relative "anchored" path open enabling race free file system (sketched after this table). |
✔ | ✔ | | Win32 path support (260 path limit). |
| | ✔ | | NT kernel path support (32,768 path limit). |
✔ | ✔ | ✔ | Synchronous universal scatter-gather i/o. |
✔ (POSIX AIO support) | ✔ | ✔ | Asynchronous universal scatter-gather i/o. |
✔ | ✔ | ✔ | i/o deadlines and cancellation. |
| | ✔ | ✔ | Retrieving and setting the current maximum extent (size) of an open file. |
| | ✔ | ✔ | Retrieving the current path of an open file irrespective of where it has been renamed to by third parties. |
| | ✔ | ✔ | statfs_t ported over from LLFIO v1. |
| | ✔ | ✔ | utils namespace ported over from LLFIO v1. |
✔ | ✔ | ✔ | shared_fs_mutex shared/exclusive entities locking based on lock files |
✔ | ✔ | ✔ | Byte range shared/exclusive locking. |
✔ | ✔ | ✔ | shared_fs_mutex shared/exclusive entities locking based on byte ranges |
✔ | ✔ | ✔ | shared_fs_mutex shared/exclusive entities locking based on atomic append |
| | ✔ | ✔ | Memory mapped files and virtual memory management (section_handle, map_handle and mapped_file_handle). |
✔ | ✔ | ✔ | shared_fs_mutex shared/exclusive entities locking based on memory maps |
✔ | ✔ | ✔ | Universal portable UTF-8 path views. |
✔ | ✔ | "Hole punching" and hole enumeration ported over from LLFIO v1. | |
✔ | ✔ | Directory handles and very fast directory enumeration ported over from LLFIO v1. | |
✔ | ✔ | ✔ | shared_fs_mutex shared/exclusive entities locking based on safe byte ranges |
| | ✔ | ✔ | Set random or sequential i/o (prefetch). |
✔ | ✔ | ✔ | llfio::algorithm::trivial_vector<T> with constant time reallocation if T is trivially copyable (sketched after this table). |
| | ✔ | ✔ | symlink_handle. |
✔ | ✔ | ✔ | Large, huge and massive page size support for memory allocation and (POSIX only) file maps. |
✔ | ✔ | ✔ | A mechanism for writing a stat_t onto an inode. |
✔ | ✔ | ✔ | Graph based directory hierarchy traversal algorithm. |
✔ | ✔ | ✔ | Graph based directory hierarchy summary algorithm. |
✔ | ✔ | ✔ | Graph based reliable directory hierarchy deletion algorithm. |
✔ | ✔ | ✔ | Intelligent file contents cloning between file handles. |
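
To illustrate the relative "anchored" path opens ticked above, here is a minimal sketch; the directory name "dir" and file name "foo" are hypothetical:

```c++
// Open a handle to the base directory itself
llfio::directory_handle dirh = llfio::directory({}, "dir").value();

// Open dir/foo relative to that handle. Even if a third party renames or moves
// "dir" concurrently, the open still resolves inside that same directory inode,
// which is what makes race free filesystem algorithms possible.
llfio::file_handle fh = llfio::file(dirh, "foo").value();
```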
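
And a minimal sketch of `llfio::algorithm::trivial_vector<T>` from the same table, assuming its `std::vector` compatible interface (`push_back`, `size`, `operator[]`):

```c++
// Behaves like std::vector<int>, but is backed by a memory section, so growing
// it remaps pages instead of copying elements: "reallocation" is constant time
// for trivially copyable types.
llfio::algorithm::trivial_vector<int> v;
for(int i = 0; i < 1000000; i++)
{
  v.push_back(i);
}
std::cout << v.size() << " items, last is " << v[v.size() - 1] << std::endl;
```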
Todo thereafter in order of priority:
NEW in v2 | Windows | POSIX | |
---|---|---|---|
✔ | | | Page allocator based on an index of linked list of free pages. See notes. |
✔ | | | Optionally concurrent B+ tree index based on page allocator for key-value store. |
✔ | | | Attributes extending span<buffers_type> with DMA colouring. |
✔ | | | Coroutine generator for iterating a file's contents in DMA friendly way. |
✔ | | | Ranges & Concurrency based reliable directory hierarchy copy algorithm. |
✔ | | | Ranges & Concurrency based reliable directory hierarchy update (two and three way) algorithm. |
✔ | | | Linux io_uring support for native non-blocking O_DIRECT i/o. |
✔ | | | std::pmr::memory_resource adapting a file backing if on C++ 17. |
✔ | | | Extended attributes support. |
✔ | | | Algorithm to replace all duplicate content with hard links. |
✔ | | | Algorithm to figure out all paths for a hard linked inode. |
✔ | | | Algorithm to compare two or three directory enumerations and give differences. |
Features possibly to be added after a Boost peer review:
Why you might need LLFIO:

- Manufacturer claimed 4Kb transfer latencies for the physical hardware.
- 100% read QD1 4Kb direct transfer latencies for the software with LLFIO.
- 75% read 25% write QD4 4Kb direct transfer latencies for the software with LLFIO.
- Max bandwidth for the physical hardware.