# Filesystem

```admonish warning
I have *no* idea what I'm doing here. If you do, *please* let me know, and fix this!
This is just some light brainstorming of how I think this might work.
```

## Prelude
Right now, [actors](/development/design/actor.md) are stored in **RAM** only. But what if we want them to persist across a system reboot? They need to be saved to the disk.

However, I don't want to provide a simple filesystem interface to programs like **UNIX** does. Instead, all data should just be stored in *actors*, and the actors will decide whether or not they should be saved. They can save at any time, save immediately, or just save on a *shutdown* signal.

Therefore, the "filesystem" code will just be a library that's simply a low-level interface for the `kernel` to use. *Actors* will simply make requests to save.

## Filesystem Layout

| Name | Size | Header |
|------|------|--------|
| Boot Sector | `128` | `None` |
| Kernel Sector | `1024` | `None` |
| Config Sector | `u64` | `PartitionHeader` |
| User Sector(s) | `u64` | `PartitionHeader` |

### Partition
A virtual section of the disk. It's identified simply by numerical order.

```rust
use uuid::Uuid; // Assumes the `uuid` crate; any 128-bit ID type would do

const LABEL_SIZE: usize = 32; // Number of characters that can be used in the partition label (example value)

struct PartitionHeader {
    boot: bool,                // Boot flag
    label: [char; LABEL_SIZE], // Human-readable label. Not UTF-8 though :/
    index: Vec<(u64, Uuid)>,   // Tuples mapping chunk indexes to Actor UUIDs; a Vec, as the count is only known at runtime
    // TODO: What if a Uuid is on multiple chunks?
    num_chunks: u64,           // Number of chunks in this partition
}
```

### Chunk
Small pieces that each partition is split into. Each contains fixed-length metadata (checksum, extension flag) at the beginning, and then arbitrary data afterwards. If the saved data extends past a single chunk, the `extends` flag is set.
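
To make the `extends` rule concrete, here is a minimal sketch of splitting a payload across chunks. It assumes that every chunk except the last one in a run sets the flag (the notes above only say the flag is set when data runs past one chunk). `EXAMPLE_CHUNK_SIZE` and `split_into_chunks` are hypothetical names used for illustration, and the tiny size is only there to keep the example readable.

```rust
/// Hypothetical payload size per chunk, kept tiny for illustration.
const EXAMPLE_CHUNK_SIZE: usize = 4;

/// Splits `data` into chunk-sized pieces and pairs each piece with its
/// `extends` flag. Assumption: the flag is true for every piece except
/// the last, meaning "the data continues in another chunk".
fn split_into_chunks(data: &[u8]) -> Vec<(bool, &[u8])> {
    // `chunks` yields an exact-size iterator, so the piece count is known up front.
    let total = data.chunks(EXAMPLE_CHUNK_SIZE).len();
    data.chunks(EXAMPLE_CHUNK_SIZE)
        .enumerate()
        .map(|(i, piece)| (i + 1 < total, piece))
        .collect()
}
```

With a chunk size of 4, a 9-byte payload would occupy three chunks, with the first two flagged as extending. The on-disk layout of each chunk is:
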
```rust
const CHUNK_SIZE: usize = 4096; // Example static chunk size (placeholder value)

struct ChunkHeader {
    checksum: u64,   // Hash of the chunk's data
    extends: bool,   // Set if the data continues past this chunk
    encrypted: bool, // Set if the data is encrypted
    modified: u64,   // Timestamp of last modification
}

struct Chunk {
    header: ChunkHeader,
    data: [u8; CHUNK_SIZE],
}
```
This struct is then encoded into bytes and written to the disk.

Drivers for the disk are *to be implemented*. It *should* be possible to do autodetection, and maybe for *Actors* to specify which disk/partition they want to be saved to.

Compression of the data should also be possible, since the bytes that `bincode` produces can be passed through [flate2](https://lib.rs/crates/flate2) compression. Similarly, **AES** encryption can be used, and this allows for only specific chunks to be encrypted.[^encryption]

### Reading
On boot, we start executing code from the **Boot Sector**. This contains the assembly instructions, which then jump to the `kernel` code in the **Kernel Sector**. The `kernel` then reads in bytes from the first partition *(as the sectors are fixed-size, we know where this starts)* into memory, deserializing them into a `PartitionHeader` struct via [bincode](https://lib.rs/crates/bincode).

From here, as we have a fixed `CHUNK_SIZE` and know how many chunks are in our first partition, we can read from any chunk on any partition.

On startup, an *Actor* can request to read data from the disk. If it has the right [capabilities](/development/design/actor.md#ocap), we find the chunk it's looking for[^find_chunk], parse the data (using `bincode` again), and send it back.

We are also able to verify data. Before passing off the data, we re-hash it using [HighwayHash](https://lib.rs/crates/highway) to see if it matches the stored checksum. If it does, we simply pass it along like normal. If not, we refuse, and send an error [message](/development/design/actor.md#messages).

Basically:
- `part1_offset = BOOT_PARTITION_SIZE`
- `part1_data_start = part1_offset + part_header_size`
- `chunk1_data_start = part1_data_start + chunk_header_size`

### Writing
Writing uses a similar process.
An *Actor* can request to write data. If it has the proper capabilities, we serialize the data, allocate a free chunk[^free_chunk], and write to it. We *hash* the data first to generate a checksum, and set the proper metadata if the data extends past the `CHUNK_SIZE`. Then the `PartitionHeader` *index* is updated to contain the new chunk(s) being used.

### Permissions
Again, whether actors can:
- Write to a specific disk/partition
- Write to disk at all
- Read from disk

will be determined via [capabilities](/development/design/actor.md#ocap).

### To-Do
- Snapshots
- Isolation

## Executable Format
Programs written in userspace will need to follow a specific format.

First, users will write a program in **Rust**, using the **Mercury** libraries, and with `no_std`. They'll use [Actors](/development/design/actor.md) to communicate with the `kernel`.

Then, they'll compile it for the proper platform and get a pure binary. This will be run through an *executable packer* program, the output of which can be downloaded by the package manager, put on disk, etc. It'll then be parsed via `bincode`, and the core is run by the `kernel` in userspace.

Additionally, the raw bytes will be compressed. Then, whether reading [chunks](#chunk) from memory or disk, we can know whether it will run on the current system, how long to read for, and where the compressed bytes start (due to the fixed-length header). It is then simple to decompress the raw bytes and run them from the `kernel`.

```rust
enum Architecture {
    RiscV,
    Arm,
}

struct PackedExecutable {
    arch: Architecture,     // Platform the binary was compiled for
    size: u64,              // How many bytes to read
    compressed_bytes: [u8], // Compressed program, placed after the fixed-length header
}
```

[^encryption]: Specific details to be figured out later.

[^find_chunk]: The `PartitionHeader` index holds `(u64, Uuid)` tuples, each pairing a chunk number with an `Actor`, allowing for easy finding of a specific chunk from an actor-provided `Uuid`.

[^free_chunk]: Because we know which chunks are used, we know which ones aren't.
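
As a rough illustration of why the fixed-length header in the packed executable format helps, here is a sketch that checks the architecture, reads the size, and slices out the compressed bytes. The byte layout is purely an assumption made for this example (one tag byte, then an 8-byte little-endian size); the actual design would rely on `bincode` for encoding. `HEADER_LEN`, `ParseError`, `parse_packed`, and the tag values `0` and `1` are all hypothetical.

```rust
use std::convert::TryFrom;

#[derive(Debug, PartialEq, Clone, Copy)]
enum Architecture {
    RiscV,
    Arm,
}

/// Assumed header layout: 1 architecture tag byte + 8 bytes of payload length.
const HEADER_LEN: usize = 9;

#[derive(Debug, PartialEq)]
enum ParseError {
    TooShort,        // Fewer bytes than a full header
    UnknownArch(u8), // Tag does not map to a known architecture
    Truncated,       // Header promises more payload than is present
}

/// Returns the target architecture and the compressed payload slice.
fn parse_packed(bytes: &[u8]) -> Result<(Architecture, &[u8]), ParseError> {
    if bytes.len() < HEADER_LEN {
        return Err(ParseError::TooShort);
    }
    // Hypothetical tag values, chosen only for this sketch.
    let arch = match bytes[0] {
        0 => Architecture::RiscV,
        1 => Architecture::Arm,
        other => return Err(ParseError::UnknownArch(other)),
    };
    let mut size_bytes = [0u8; 8];
    size_bytes.copy_from_slice(&bytes[1..HEADER_LEN]);
    let size = usize::try_from(u64::from_le_bytes(size_bytes))
        .map_err(|_| ParseError::Truncated)?;
    // The payload starts right after the header and runs for `size` bytes.
    let end = HEADER_LEN.checked_add(size).ok_or(ParseError::Truncated)?;
    let payload = bytes.get(HEADER_LEN..end).ok_or(ParseError::Truncated)?;
    Ok((arch, payload))
}
```

Because the header length is known ahead of time, the reader can reject a binary for the wrong platform before touching the payload, and any trailing bytes past `size` are simply ignored.
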