FS performance notes
Parent: 84a077834c
Commit: e0097ada5d
1 changed file with 18 additions and 8 deletions
		|  | @ -15,6 +15,14 @@ They can save at any time, save immediately, or just save on a *shutdown* signal | ||||||
| Therefore, the "filesystem" code will just be a library that's simply a low-level interface for the `kernel` to use. | Therefore, the "filesystem" code will just be a library that's simply a low-level interface for the `kernel` to use. | ||||||
| *Actors* will simply make requests to save. | *Actors* will simply make requests to save. | ||||||
| 
 | 
 | ||||||
|  | ## Performance | ||||||
|  | I believe that this format should be fairly fast, but only implementation and testing will tell for sure. | ||||||
|  | 1. Minimal data needs to be read in - bit offsets can be used, and only fixed-size metadata must be known | ||||||
|  | 2. `serde` is fairly optimized for deserialization/serialization | ||||||
|  | 3. `HighwayHash` is a very fast and well-optimized hashing algorithm | ||||||
|  | 4. Async and multithreading will allow for concurrent access, and splitting of resource-intensive tasks across threads | ||||||
|  | 5. `hashbrown` is quite high-performance | ||||||
|  | 
 | ||||||
| ## Filesystem Layout | ## Filesystem Layout | ||||||
| 
 | 
 | ||||||
| | Name | Size | Header | | | Name | Size | Header | | ||||||
|  | @ -35,7 +43,7 @@ const LABEL_SIZE: u16 = 128; // Example number of characters that can be used in | ||||||
| struct PartitionHeader { | struct PartitionHeader { | ||||||
|     label: [char; LABEL_SIZE], // Human-readable label. Not UTF-8 though :/ |     label: [char; LABEL_SIZE], // Human-readable label. Not UTF-8 though :/ | ||||||
|     num_chunks: u64, // Chunks in this partition |     num_chunks: u64, // Chunks in this partition | ||||||
|     uuid: Uuid, |     uuid: Uuid,4096 | ||||||
| } | } | ||||||
| ``` | ``` | ||||||
| 
 | 
 | ||||||
|  | @ -69,7 +77,7 @@ On boot, we start executing code from the **Boot Sector**. This contains the ass | ||||||
| The `kernel` then reads in bytes from the first partition *(as the sectors are fixed-size, we know when this starts)* into memory, deserializing it into a `PartitionHeader` struct via [bincode](https://lib.rs/crates/bincode). | The `kernel` then reads in bytes from the first partition *(as the sectors are fixed-size, we know when this starts)* into memory, deserializing it into a `PartitionHeader` struct via [bincode](https://lib.rs/crates/bincode). | ||||||
| 
 | 
 | ||||||
| From here, as we have a fixed `CHUNK_SIZE`, and know how many chunks are in our first partition, we can read from any chunk on any partition now. | From here, as we have a fixed `CHUNK_SIZE`, and know how many chunks are in our first partition, we can read from any chunk on any partition now. | ||||||
| On startup, an *Actor* can request to read data from the disk. If it has the right [capabilities](/development/design/actor.md#ocap), we find the chunk it's looking for[^find_chunk], parse the data (using `bincode` again), and send it back. | On startup, an *Actor* can request to read data from the disk. If it has the right [capabilities](/development/design/actor.md#ocap), we find the chunk it's looking for from the index, parse the data (using `bincode` again), and send it back. | ||||||
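
As a rough illustration of why a fixed `CHUNK_SIZE` makes any chunk directly addressable, here is a minimal sketch of the offset arithmetic. The value of `CHUNK_SIZE`, the `chunk_offset` helper, and the assumption that chunks sit directly after the partition header are all hypothetical and not taken from the design above.

```rust
/// Hypothetical chunk size in bytes; the notes do not fix a value here.
const CHUNK_SIZE: u64 = 4096;

/// Byte offset on disk of chunk number `chunk`, assuming the partition begins
/// at `partition_start`, its header takes `header_size` bytes, and the chunks
/// follow the header back to back. Returns `None` if the chunk does not exist
/// in this partition or the arithmetic would overflow.
fn chunk_offset(
    partition_start: u64,
    header_size: u64,
    num_chunks: u64,
    chunk: u64,
) -> Option<u64> {
    if chunk >= num_chunks {
        return None; // Out of range for this partition
    }
    chunk
        .checked_mul(CHUNK_SIZE)?
        .checked_add(header_size)?
        .checked_add(partition_start)
}

fn main() {
    // Example numbers only: a partition at byte 512 with a 160-byte header
    // and 10 chunks.
    println!("{:?}", chunk_offset(512, 160, 10, 5));
    println!("{:?}", chunk_offset(512, 160, 10, 10));
}
```

Only `num_chunks` from the `PartitionHeader` and the fixed sizes are needed, so no data has to be scanned to locate a chunk.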
| 
 | 
 | ||||||
| Also, we are able to verify data. Before passing off the data, we re-hash it using [HighwayHash](https://lib.rs/crates/highway) to see if it matches. | Also, we are able to verify data. Before passing off the data, we re-hash it using [HighwayHash](https://lib.rs/crates/highway) to see if it matches. | ||||||
| If it does, we simply pass it along like normal. If not, we refuse, and send an error [message](/development/design/actor.md#messages). | If it does, we simply pass it along like normal. If not, we refuse, and send an error [message](/development/design/actor.md#messages). | ||||||
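
The verify-then-forward step could look roughly like the sketch below. To keep it buildable with the standard library alone, `DefaultHasher` stands in for HighwayHash; the `checksum` and `verify_chunk` helpers and the `ReadError` type are hypothetical names, not part of the design.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Stand-in checksum. The notes name HighwayHash; `DefaultHasher` is used
/// here only so the sketch has no external dependencies.
fn checksum(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(data);
    hasher.finish()
}

/// Hypothetical error that would be turned into an error message for the actor.
#[derive(Debug, PartialEq)]
enum ReadError {
    ChecksumMismatch { expected: u64, actual: u64 },
}

/// Re-hash `data` and hand it over only if it matches the stored checksum.
fn verify_chunk(data: &[u8], stored: u64) -> Result<&[u8], ReadError> {
    let actual = checksum(data);
    if actual == stored {
        Ok(data)
    } else {
        Err(ReadError::ChecksumMismatch { expected: stored, actual })
    }
}

fn main() {
    let stored = checksum(b"data"); // Checksum recorded when the data was saved

    // Unmodified data passes through unchanged.
    println!("{:?}", verify_chunk(b"data", stored).is_ok());

    // Corrupted data is refused with an error instead of being forwarded.
    println!("{:?}", verify_chunk(b"dat4", stored));
}
```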
|  | @ -96,7 +104,9 @@ While the index is not necessarily a fixed size, we read until we have enough da | ||||||
| ```rust | ```rust | ||||||
| use hashbrown::HashMap; | use hashbrown::HashMap; | ||||||
| 
 | 
 | ||||||
| let mut index = HashMap::new(); // Create the index | let mut index = HashMap::new(); // Create the Uuid storage index | ||||||
|  | let mut free_index = HashMap::new(); // Create the freespace index | ||||||
|  | 
 | ||||||
| struct Location { | struct Location { | ||||||
|     partition: Uuid, // Partition identified via Uuid |     partition: Uuid, // Partition identified via Uuid | ||||||
|     chunks: Vec<u64>, // Which chunk(s) in the partition it is |     chunks: Vec<u64>, // Which chunk(s) in the partition it is | ||||||
|  | @ -104,11 +114,11 @@ struct Location { | ||||||
| 
 | 
 | ||||||
| let new_data = (Uuid::new(), b"data"); // Test data w/ an actor Uuid & bytes | let new_data = (Uuid::new(), b"data"); // Test data w/ an actor Uuid & bytes | ||||||
| let new_data_location = Location { | let new_data_location = Location { | ||||||
|     partition_offset: Uuid::new(), |     partition: Uuid::new(), | ||||||
|     chunks: vec![5, 8], // 5th & 8th chunk in that partition |     chunks: vec![5, 8], // 5th & 8th chunk in that partition | ||||||
| }; | }; | ||||||
| 
 | 
 | ||||||
| index.insert(&new_data.0, new_data_location); // Insert a new entry mapping a data Uuid to a location | index.insert(&new_data.0, &new_data_location); // Insert a new entry mapping a data Uuid to a location | ||||||
| 
 | 
 | ||||||
| let uuid_location = index.get(&new_data.0).unwrap(); // Get the location of a Uuid | let uuid_location = index.get(&new_data.0).unwrap(); // Get the location of a Uuid | ||||||
| ``` | ``` | ||||||
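
One way the storage index and the freespace index might work together is sketched below. This is only an illustration: the standard library `HashMap` stands in for `hashbrown`, `Uuid` is replaced by a `u128` alias, and the `Indexes` type with its `add_partition`, `allocate`, and `release` methods is hypothetical. How free chunks are actually tracked is still an open question in these notes.

```rust
use std::collections::{BTreeSet, HashMap};

type Uuid = u128; // Stand-in for a real Uuid type

#[derive(Debug, Clone, PartialEq)]
struct Location {
    partition: Uuid,  // Partition identified via Uuid
    chunks: Vec<u64>, // Which chunk(s) in the partition it is
}

struct Indexes {
    index: HashMap<Uuid, Location>,           // Data Uuid -> where it is stored
    free_index: HashMap<Uuid, BTreeSet<u64>>, // Partition Uuid -> unused chunks
}

impl Indexes {
    fn new() -> Self {
        Indexes { index: HashMap::new(), free_index: HashMap::new() }
    }

    /// Register a partition whose chunks all start out free.
    fn add_partition(&mut self, partition: Uuid, num_chunks: u64) {
        self.free_index.insert(partition, (0..num_chunks).collect());
    }

    /// Move `count` free chunks of `partition` into the storage index for `data`.
    /// Returns `None` if `data` is already stored or there is not enough space.
    fn allocate(&mut self, data: Uuid, partition: Uuid, count: usize) -> Option<&Location> {
        if self.index.contains_key(&data) {
            return None;
        }
        let free = self.free_index.get_mut(&partition)?;
        if free.len() < count {
            return None;
        }
        // Lowest-numbered free chunks first (BTreeSet iterates in order).
        let chunks: Vec<u64> = free.iter().copied().take(count).collect();
        for chunk in &chunks {
            free.remove(chunk);
        }
        self.index.insert(data, Location { partition, chunks });
        self.index.get(&data)
    }

    /// Drop `data` from the storage index and mark its chunks free again.
    fn release(&mut self, data: Uuid) -> bool {
        match self.index.remove(&data) {
            Some(location) => {
                self.free_index
                    .entry(location.partition)
                    .or_default()
                    .extend(location.chunks);
                true
            }
            None => false,
        }
    }
}

fn main() {
    let mut indexes = Indexes::new();
    indexes.add_partition(1, 4); // Partition 1 with 4 chunks

    println!("{:?}", indexes.allocate(42, 1, 2)); // Store data 42 in 2 chunks
    println!("{:?}", indexes.free_index[&1]);     // Remaining free chunks
    println!("{}", indexes.release(42));          // Free them again
}
```

Keeping the free chunks in an ordered set per partition is just one option; a B+Tree or a bitmap would change the lookup cost but not the overall shape of the two-index approach.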
|  | @ -120,6 +130,8 @@ It also allows us to tell if an actor *hasn't* been saved yet, allowing us to kn | ||||||
| ### To-Do | ### To-Do | ||||||
| - Snapshots | - Snapshots | ||||||
| - Isolation | - Isolation | ||||||
|  | - Journaling | ||||||
|  | - Resizing | ||||||
| 
 | 
 | ||||||
| ## Executable Format | ## Executable Format | ||||||
| Programs written in userspace will need to follow a specific format. | Programs written in userspace will need to follow a specific format. | ||||||
|  | @ -149,6 +161,4 @@ struct PackedExecutable { | ||||||
| 
 | 
 | ||||||
| [^encryption]: Specific details to be figured out later | [^encryption]: Specific details to be figured out later | ||||||
| 
 | 
 | ||||||
| [^find_chunk]: On startup, the `kernel` builds an index of the filesystem in-memory. This is then modified whenever chunks are modified, and saved on disk on shutdown, and read again on startup. | [^free_chunk]: Need to figure out how to efficiently do this. **XFS** seems to just keep another index of free chunks. It also uses a **B+Tree** rather than a hashmap - to look into. | ||||||
| 
 |  | ||||||
| [^free_chunk]: Because we know which chunks are used, we know which ones aren't. |  | ||||||
|  |  | ||||||