|  |  |  | @ -15,54 +15,37 @@ They can save at any time, save immediately, or just save on a *shutdown* signal | 
		
	
		
			
				|  |  |  |  | Therefore, the "filesystem" code will just be a library that's simply a low-level interface for the `kernel` to use. | 
		
	
		
			
				|  |  |  |  | *Actors* will simply make requests to save. | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | ## Performance | 
		
	
		
			
				|  |  |  |  | I believe that this format should be fairly fast, but only implementation and testing will tell for sure. | 
		
	
		
			
				|  |  |  |  | 1. Minimal data needs to be read in - bit offsets can be used, and only fixed-size metadata must be known | 
		
	
		
			
				|  |  |  |  | 2. `serde` is fairly optimized for deserialization/serialization | 
		
	
		
			
				|  |  |  |  | 3. `HighwayHash` is a very fast and well-optimized hashing algorithm | 
		
	
		
			
				|  |  |  |  | 4. Async and multithreading will allow for concurrent access, and splitting of resource-intensive tasks across threads | 
		
	
		
			
				|  |  |  |  | 5. `hashbrown` is quite high-performance | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | ## Filesystem Layout | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | | Name | Size | Header | | 
		
	
		
			
				|  |  |  |  | |------|------|--------| | 
		
	
		
			
				|  |  |  |  | | Boot Sector | `128 B` | `None` | | 
		
	
		
			
				|  |  |  |  | | Kernel Sector | `4096 KB` | `None` | | 
		
	
		
			
				|  |  |  |  | | Index Sector | `4096 KB` | `None` | | 
		
	
		
			
				|  |  |  |  | | Config Sector | `u64` | `PartitionHeader` | | 
		
	
		
			
				|  |  |  |  | | User Sector(s) | `u64` | `PartitionHeader` | | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | ### Partition | 
		
	
		
			
				|  |  |  |  | A virtual section of the disk. | 
		
	
		
			
				|  |  |  |  | Additionally, it has a **UUID** generated via [lolid](https://lib.rs/crates/lolid) to enable identifying a specific partition. | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | It's identified simply by numerical order. | 
		
	
		
			
				|  |  |  |  | ```rust | 
		
	
		
			
				|  |  |  |  | const LABEL_SIZE: u16 = 128; // Example number of characters that can be used in the partition label | 
		
	
		
			
				|  |  |  |  | const BOOT_SIZE: u64; // How large the BOOT partition will be | 
		
	
		
			
				|  |  |  |  | const LABEL_SIZE: u64; // Number of characters that can be used in the partition label | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | struct PartitionHeader { | 
		
	
		
			
				|  |  |  |  |     boot: bool, // Boot flag | 
		
	
		
			
				|  |  |  |  |     label: [char; LABEL_SIZE], // Human-readable label. Not UTF-8 though :/ | 
		
	
		
			
				|  |  |  |  |     num_chunks: u64, // Chunks in this partition | 
		
	
		
			
				|  |  |  |  |     uuid: Uuid, | 
		
	
		
			
				|  |  |  |  | } | 
		
	
		
			
				|  |  |  |  | ``` | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | ### Chunk | 
		
	
		
			
				|  |  |  |  | Small pieces that each partition is split into. | 
		
	
		
			
				|  |  |  |  | Contains fixed-length metadata (checksum, encryption flag, modification date, etc.) at the beginning, and then arbitrary data afterwards. | 
		
	
		
			
				|  |  |  |  | Contains fixed-length metadata (checksum, extension flag, uuid) at the beginning, and then arbitrary data afterwards. | 
		
	
		
			
				|  |  |  |  | If the saved data extends past a single chunk, the `extends` flag is set. | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | Additionally, it has a **UUID** generated via [lolid](https://lib.rs/crates/lolid) to enable identifying a specific chunk. | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | ```rust | 
		
	
		
			
				|  |  |  |  | const CHUNK_SIZE: u64 = 4096; // Example static chunk size | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | struct ChunkHeader { | 
		
	
		
			
				|  |  |  |  |     checksum: u64, | 
		
	
		
			
				|  |  |  |  |     encrypted: bool, | 
		
	
		
			
				|  |  |  |  |     modified: u64, // Timestamp of last modified | 
		
	
		
			
				|  |  |  |  |     uuid: Uuid, | 
		
	
		
			
				|  |  |  |  | } | 
		
	
		
			
				|  |  |  |  | const CHUNK_SIZE: u64; // Example static chunk size | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | struct Chunk { | 
		
	
		
			
				|  |  |  |  |     header: ChunkHeader, | 
		
	
		
			
				|  |  |  |  |     checksum: u64, | 
		
	
		
			
				|  |  |  |  |     extends: bool, | 
		
	
		
			
				|  |  |  |  |     encrypted: bool, | 
		
	
		
			
				|  |  |  |  |     uuid: Uuid, | 
		
	
		
			
				|  |  |  |  |     data: [u8; CHUNK_SIZE], | 
		
	
		
			
				|  |  |  |  | } | 
		
	
		
			
				|  |  |  |  | ``` | 
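
To make the `extends` flag concrete, here is a minimal sketch of splitting a payload across several chunks. `ChunkSketch` and `split_into_chunks` are hypothetical names, the struct is a cut-down stand-in for `Chunk` (no checksum, UUID, or encryption flag), and the `4096`-byte size is just the example value used above.

```rust
/// Example chunk payload size in bytes (the document's example value).
const CHUNK_SIZE: usize = 4096;

/// Hypothetical, simplified stand-in for `Chunk`: only the fields this sketch needs.
#[derive(Debug, Clone, PartialEq)]
struct ChunkSketch {
    extends: bool, // true when the data continues in a following chunk
    data: Vec<u8>, // payload, zero-padded to CHUNK_SIZE
}

/// Split `payload` into fixed-size chunks, setting `extends` on every chunk except the last.
fn split_into_chunks(payload: &[u8]) -> Vec<ChunkSketch> {
    if payload.is_empty() {
        // Empty data still occupies one (all-zero) chunk in this sketch.
        return vec![ChunkSketch { extends: false, data: vec![0; CHUNK_SIZE] }];
    }
    let pieces: Vec<&[u8]> = payload.chunks(CHUNK_SIZE).collect();
    let last = pieces.len() - 1;
    pieces
        .iter()
        .enumerate()
        .map(|(i, piece)| {
            let mut data = piece.to_vec();
            data.resize(CHUNK_SIZE, 0); // pad the final, shorter piece with zeroes
            ChunkSketch { extends: i < last, data }
        })
        .collect()
}
```

Zero-padding throws away the original length, so a real format would presumably need to record it somewhere (or rely on the serialized data being self-delimiting).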
		
	
	
		
			
				
					|  |  |  | @ -70,21 +53,21 @@ This struct is then encoded into bytes and written to the disk. Drivers for the | 
		
	
		
			
				|  |  |  |  | It *should* be possible to do autodetection, and maybe for *Actors* to specify which disk/partition they want to be saved to. | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | Compression of the data should also be possible, due to `bincode` supporting [flate2](https://lib.rs/crates/flate2) compression. | 
		
	
		
			
				|  |  |  |  | Similarly **AES** encryption can be used, and this allows for only specific chunks to be encrypted.[^encryption] | 
		
	
		
			
				|  |  |  |  | Similarly **AES** encryption can be used, and this allows for only specific chunks to be encrypted.[^encryption] | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | ### Reading | 
		
	
		
			
				|  |  |  |  | On boot, we start executing code from the **Boot Sector**. This contains the assembly instructions, which then jump to the `kernel` code in the **Kernel Sector**. | 
		
	
		
			
				|  |  |  |  | The `kernel` then reads in bytes from the first partition *(as the sectors are fixed-size, we know where this starts)* into memory, deserializing it into a `PartitionHeader` struct via [bincode](https://lib.rs/crates/bincode). | 
		
	
		
			
				|  |  |  |  | On boot, we start executing code from the beginning of the disk (the boot partition, although that's meaningless at this point). | 
		
	
		
			
				|  |  |  |  | The `kernel` then reads in bytes from the first partition *(as the **BOOT** partition is fixed-size, we know where this starts)* into memory, deserializing it into a `PartitionHeader` struct via [bincode](https://lib.rs/crates/bincode). | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | From here, as we have a fixed `CHUNK_SIZE`, and know how many chunks are in our first partition, we can read from any chunk on any partition now. | 
		
	
		
			
				|  |  |  |  | On startup, an *Actor* can request to read data from the disk. If it has the right [capabilities](/development/design/actor.md#ocap), we find the chunk it's looking for from the index, parse the data (using `bincode` again), and send it back. | 
		
	
		
			
				|  |  |  |  | On startup, an *Actor* can request to read data from the disk. If it has the right [capabilities](/development/design/actor.md#ocap), we find the chunk it's looking for[^find_chunk], parse the data (using `bincode` again), and send it back. | 
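
The offset arithmetic this relies on can be sketched roughly as follows. The three sector sizes come from the layout table (treating `KB` as 1024 bytes); `PARTITION_HEADER_SIZE` and `CHUNK_HEADER_SIZE` are made-up placeholders, since the real encoded sizes depend on how `bincode` lays out the structs, and all function names are hypothetical.

```rust
// Sector sizes from the layout table; everything marked "assumed" is a placeholder.
const BOOT_SECTOR_SIZE: u64 = 128; // 128 B
const KERNEL_SECTOR_SIZE: u64 = 4096 * 1024; // 4096 KB
const INDEX_SECTOR_SIZE: u64 = 4096 * 1024; // 4096 KB
const PARTITION_HEADER_SIZE: u64 = 256; // assumed encoded size of `PartitionHeader`
const CHUNK_HEADER_SIZE: u64 = 32; // assumed encoded size of the chunk metadata
const CHUNK_SIZE: u64 = 4096; // example chunk payload size

/// Byte offset where the first partition header begins (right after the fixed sectors).
const fn first_partition_offset() -> u64 {
    BOOT_SECTOR_SIZE + KERNEL_SECTOR_SIZE + INDEX_SECTOR_SIZE
}

/// Byte offset of chunk `index` inside a partition that starts at `partition_start`,
/// or `None` if the partition holds fewer chunks than that.
fn chunk_offset(partition_start: u64, num_chunks: u64, index: u64) -> Option<u64> {
    if index >= num_chunks {
        return None;
    }
    let stride = CHUNK_HEADER_SIZE + CHUNK_SIZE; // on-disk size of one chunk
    Some(partition_start + PARTITION_HEADER_SIZE + index * stride)
}

/// Byte offset where the partition following this one would begin.
fn next_partition_offset(partition_start: u64, num_chunks: u64) -> u64 {
    partition_start + PARTITION_HEADER_SIZE + num_chunks * (CHUNK_HEADER_SIZE + CHUNK_SIZE)
}
```

Since each header stores `num_chunks`, walking `next_partition_offset` from one partition to the next is enough to locate every partition without any extra table.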
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | Also, we are able to verify data. Before passing off the data, we re-hash it using [HighwayHash](https://lib.rs/crates/highway) to see if it matches. | 
		
	
		
			
				|  |  |  |  | If it does, we simply pass it along like normal. If not, we refuse, and send an error [message](/development/design/actor.md#messages). | 
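
A rough sketch of that verify-before-returning step is below. To keep it dependency-free it uses the standard library's `DefaultHasher` as a stand-in, **not** HighwayHash; its algorithm isn't guaranteed to stay the same between Rust releases, so it wouldn't be suitable for real on-disk checksums. `checksum`, `verify_chunk`, and `ReadError` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Stand-in checksum using std's `DefaultHasher` (an assumption; the design calls for HighwayHash).
fn checksum(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(data);
    hasher.finish()
}

/// Hypothetical error that would be turned into an error message for the actor.
#[derive(Debug, PartialEq)]
enum ReadError {
    ChecksumMismatch { expected: u64, actual: u64 },
}

/// Re-hash `data` and only hand it back if it matches the checksum stored in the chunk.
fn verify_chunk(stored_checksum: u64, data: &[u8]) -> Result<&[u8], ReadError> {
    let actual = checksum(data);
    if actual == stored_checksum {
        Ok(data)
    } else {
        Err(ReadError::ChecksumMismatch { expected: stored_checksum, actual })
    }
}
```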
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | ### Writing | 
		
	
		
			
				|  |  |  |  | Writing uses a similar process. An *Actor* can request to write data. If it has proper capabilities, we serialize the data, allocate a free chunk[^free_chunk], and write to it. | 
		
	
		
			
				|  |  |  |  | We *hash* the data first to generate a checksum, and set proper metadata. | 
		
	
		
			
				|  |  |  |  | We *hash* the data first to generate a checksum, and set proper metadata if the data extends past the `CHUNK_SIZE`. | 
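
One possible shape for the "allocate a free chunk" step, purely as a sketch: keep a per-partition set of free chunk numbers and hand out the lowest ones first. `FreeChunks` and `chunks_needed` are hypothetical, and an ordered set is only one option (the footnote leaves the actual approach open).

```rust
use std::collections::BTreeSet;

/// Hypothetical free-space tracker for a single partition.
struct FreeChunks {
    free: BTreeSet<u64>, // chunk numbers that currently hold no data
}

impl FreeChunks {
    /// Start with every chunk in the partition marked as free.
    fn new(num_chunks: u64) -> Self {
        Self { free: (0..num_chunks).collect() }
    }

    /// Reserve `count` chunks, lowest-numbered first.
    /// Returns `None` (and reserves nothing) if not enough chunks are free.
    fn allocate(&mut self, count: usize) -> Option<Vec<u64>> {
        if self.free.len() < count {
            return None;
        }
        let picked: Vec<u64> = self.free.iter().take(count).copied().collect();
        for chunk in &picked {
            self.free.remove(chunk);
        }
        Some(picked)
    }

    /// Mark chunks as free again, e.g. after their data is deleted or rewritten elsewhere.
    fn release(&mut self, chunks: &[u64]) {
        self.free.extend(chunks.iter().copied());
    }
}

/// Number of chunks needed for `len` bytes of serialized data (at least one).
fn chunks_needed(len: usize, chunk_size: usize) -> usize {
    ((len + chunk_size - 1) / chunk_size).max(1)
}
```

The allocated chunks are not necessarily contiguous, which is fine as long as whatever records the location stores the full list of chunk numbers.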
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | ### Permissions | 
		
	
		
			
				|  |  |  |  | Again, whether actors can: | 
		
	
	
		
			
				
					|  |  |  | @ -94,44 +77,9 @@ Again, whether actors can: | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | will be determined via [capabilities](/development/design/actor.md#ocap) | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | ### Indexing | 
		
	
		
			
				|  |  |  |  | Created in-memory on startup, modified directly whenever the filesystem is modified. | 
		
	
		
			
				|  |  |  |  | It's saved in the *Index Sector* (which is at a known offset & size), allowing it to be read in easily on boot. | 
		
	
		
			
				|  |  |  |  | It again simply uses `bincode` and compression. | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | While the index is not necessarily a fixed size, we read until we have enough data from the fixed sector size. | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | ```rust | 
		
	
		
			
				|  |  |  |  | use hashbrown::HashMap; | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | let mut index = HashMap::new(); // Create the Uuid storage index | 
		
	
		
			
				|  |  |  |  | let mut free_index = HashMap::new(); // Create the freespace index | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | struct Location { | 
		
	
		
			
				|  |  |  |  |     partition: Uuid, // Partition identified via Uuid | 
		
	
		
			
				|  |  |  |  |     chunks: Vec<u64>, // Which chunk(s) in the partition it is | 
		
	
		
			
				|  |  |  |  | } | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | let new_data = (Uuid::new(), b"data"); // Test data w/ an actor Uuid & bytes | 
		
	
		
			
				|  |  |  |  | let new_data_location = Location { | 
		
	
		
			
				|  |  |  |  |     partition: Uuid::new(), | 
		
	
		
			
				|  |  |  |  |     chunks: vec![5, 8], // 5th & 8th chunk in that partition | 
		
	
		
			
				|  |  |  |  | }; | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | index.insert(&new_data.0, &new_data_location); // Insert a new entry mapping a data Uuid to a location | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | let uuid_location = index.get(&new_data.0).unwrap(); // Get the location of a Uuid | 
		
	
		
			
				|  |  |  |  | ``` | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | This then allows the index to be searched easily to find the data location of a specific `Uuid`. | 
		
	
		
			
				|  |  |  |  | Whenever an actor makes a request to save data to its `Uuid` location, this can be easily found. | 
		
	
		
			
				|  |  |  |  | It also allows us to tell if an actor *hasn't* been saved yet, allowing us to know whether we need to allocate new space for writing, or if there's actually something to read. | 
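
The "has this actor been saved yet?" decision can be sketched as a single index lookup. This uses `std::collections::HashMap` and a `u128` alias in place of `hashbrown` and the `lolid` `Uuid` so it stays dependency-free; `SaveTarget` and `save_target` are hypothetical names.

```rust
use std::collections::HashMap;

/// Stand-in for the real `Uuid` type (an assumption to avoid external crates).
type Uuid = u128;

#[derive(Debug, Clone, PartialEq)]
struct Location {
    partition: Uuid,  // Partition identified via Uuid
    chunks: Vec<u64>, // Which chunk(s) in the partition it is
}

/// What a save request should do, based on whether the actor is already in the index.
#[derive(Debug, PartialEq)]
enum SaveTarget {
    Overwrite(Location), // actor was saved before: reuse its recorded location
    AllocateNew,         // actor has never been saved: new space must be allocated
}

/// Look up the actor's `Uuid` in the index and decide where its data should go.
fn save_target(index: &HashMap<Uuid, Location>, actor: Uuid) -> SaveTarget {
    match index.get(&actor) {
        Some(location) => SaveTarget::Overwrite(location.clone()),
        None => SaveTarget::AllocateNew,
    }
}
```

The same lookup answers the read side: a `None` means there is nothing on disk to read for that actor.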
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | ### To-Do | 
		
	
		
			
				|  |  |  |  | - Snapshots | 
		
	
		
			
				|  |  |  |  | - Isolation | 
		
	
		
			
				|  |  |  |  | - Journaling | 
		
	
		
			
				|  |  |  |  | - Resizing | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | ## Executable Format | 
		
	
		
			
				|  |  |  |  | Programs written in userspace will need to follow a specific format. | 
		
	
	
		
			
				
					|  |  |  | @ -161,4 +109,6 @@ struct PackedExecutable { | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | [^encryption]: Specific details to be figured out later | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | [^free_chunk]: Need to figure out how to efficiently do this. **XFS** seems to just keep another index of free chunks. It also uses a **B+Tree** rather than a hashmap - to look into. | 
		
	
		
			
				|  |  |  |  | [^find_chunk]: Currently via magic. I have no idea how to do this other than a simple search. Maybe generate an index, or use a **UUID**? | 
		
	
		
			
				|  |  |  |  | 
 | 
		
	
		
			
				|  |  |  |  | [^free_chunk]: Again, no idea how. | 
		
	
	
		
			
				