FS performance notes
parent
84a077834c
commit
e0097ada5d
|
@ -15,6 +15,14 @@ They can save at any time, save immediately, or just save on a *shutdown* signal
|
||||||
Therefore, the "filesystem" code will just be a library that's simply a low-level interface for the `kernel` to use.
|
Therefore, the "filesystem" code will just be a library that's simply a low-level interface for the `kernel` to use.
|
||||||
*Actors* will simply make requests to save.
|
*Actors* will simply make requests to save.
|
||||||
|
|
||||||
|
## Performance
|
||||||
|
I believe that this format should be fairly fast, but only implementation and testing will tell for sure.
|
||||||
|
1. Minimal data needs to be read in - bit offsets can be used, and only fixed-size metadata must be known
|
||||||
|
2. `serde` is fairly optimized for deserialization/serialization
|
||||||
|
3. `HighwayHash` is a very fast and well-optimized hashing algorithm
|
||||||
|
4. Async and multithreading will allow for concurrent access, and splitting of resource-intensive tasks across threads
|
||||||
|
5. `hashbrown` is quite high-performance
|
||||||
|
|
||||||
## Filesystem Layout
|
## Filesystem Layout
|
||||||
|
|
||||||
| Name | Size | Header |
|
| Name | Size | Header |
|
||||||
|
@ -35,7 +43,7 @@ const LABEL_SIZE: u16 = 128; // Example number of characters that can be used in
|
||||||
struct PartitionHeader {
|
struct PartitionHeader {
|
||||||
label: [char; LABEL_SIZE], // Human-readable label. Not UTF-8 though :/
|
label: [char; LABEL_SIZE], // Human-readable label. Not UTF-8 though :/
|
||||||
num_chunks: u64, // Chunks in this partition
|
num_chunks: u64, // Chunks in this partition
|
||||||
uuid: Uuid,
|
uuid: Uuid,
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
|
@ -69,7 +77,7 @@ On boot, we start executing code from the **Boot Sector**. This contains the ass
|
||||||
The `kernel` then reads in bytes from the first partition *(as the sectors are fixed-size, we know when this starts)* into memory, deserializing them into a `PartitionHeader` struct via [bincode](https://lib.rs/crates/bincode).
|
The `kernel` then reads in bytes from the first partition *(as the sectors are fixed-size, we know when this starts)* into memory, deserializing them into a `PartitionHeader` struct via [bincode](https://lib.rs/crates/bincode).
|
||||||
|
|
||||||
From here, as we have a fixed `CHUNK_SIZE`, and know how many chunks are in our first partition, we can read from any chunk on any partition now.
|
From here, as we have a fixed `CHUNK_SIZE`, and know how many chunks are in our first partition, we can read from any chunk on any partition now.
|
||||||
On startup, an *Actor* can request to read data from the disk. If it has the right [capabilities](/development/design/actor.md#ocap), we find the chunk it's looking for[^find_chunk], parse the data (using `bincode` again), and send it back.
|
On startup, an *Actor* can request to read data from the disk. If it has the right [capabilities](/development/design/actor.md#ocap), we find the chunk it's looking for from the index, parse the data (using `bincode` again), and send it back.
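
As a rough illustration of why a fixed `CHUNK_SIZE` makes any chunk directly addressable, the sketch below computes a chunk's byte offset. It assumes that chunks are stored back-to-back immediately after a fixed-size serialized header; the document does not confirm that layout. `PartitionLayout`, its fields, and `chunk_offset` are hypothetical names introduced only for this example.

```rust
/// Hypothetical description of one partition's on-disk layout.
/// None of these values are specified in this document.
pub struct PartitionLayout {
    pub start: u64,       // Byte offset where the partition begins (assumed known)
    pub header_size: u64, // Size in bytes of the serialized `PartitionHeader` (assumed fixed)
    pub chunk_size: u64,  // The fixed `CHUNK_SIZE`, in bytes
    pub num_chunks: u64,  // Taken from `PartitionHeader::num_chunks`
}

impl PartitionLayout {
    /// Byte offset of chunk `index` from the start of the disk.
    /// Returns `None` if the chunk does not exist or the arithmetic overflows.
    pub fn chunk_offset(&self, index: u64) -> Option<u64> {
        if index >= self.num_chunks {
            return None;
        }
        index
            .checked_mul(self.chunk_size)?
            .checked_add(self.header_size)?
            .checked_add(self.start)
    }
}
```

With `start = 512`, `header_size = 1024` and `chunk_size = 4096`, chunk 0 would begin at byte 1536, and a request for a chunk number at or beyond `num_chunks` yields `None` instead of reading past the partition.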
|
||||||
|
|
||||||
Also, we are able to verify data. Before passing off the data, we re-hash it using [HighwayHash](https://lib.rs/crates/highway) to see if it matches.
|
Also, we are able to verify data. Before passing off the data, we re-hash it using [HighwayHash](https://lib.rs/crates/highway) to see if it matches.
|
||||||
If it does, we simply pass it along like normal. If not, we refuse, and send an error [message](/development/design/actor.md#messages).
|
If it does, we simply pass it along like normal. If not, we refuse, and send an error [message](/development/design/actor.md#messages).
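
The verify-then-pass-along step could look roughly like the sketch below. To stay dependency-free it uses the standard library's `DefaultHasher` as a stand-in; the design itself calls for HighwayHash, so `checksum` here is only a placeholder. `ReadError` and `verify_chunk` are hypothetical names, and where the stored hash actually lives is not specified in this document.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Placeholder checksum. The real design uses HighwayHash;
/// `DefaultHasher` is used only to keep this sketch self-contained.
pub fn checksum(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(data);
    hasher.finish()
}

/// Hypothetical error returned to the requesting actor.
#[derive(Debug, PartialEq)]
pub enum ReadError {
    HashMismatch { expected: u64, actual: u64 },
}

/// Re-hash `data` and hand it back only if it matches the stored hash.
pub fn verify_chunk(data: &[u8], stored_hash: u64) -> Result<&[u8], ReadError> {
    let actual = checksum(data);
    if actual == stored_hash {
        Ok(data)
    } else {
        Err(ReadError::HashMismatch { expected: stored_hash, actual })
    }
}
```

Returning a `Result` keeps the refusal explicit: the `kernel` can turn the `Err` variant into the error message sent back to the actor.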
|
||||||
|
@ -96,7 +104,9 @@ While the index is not necessarily a fixed size, we read until we have enough da
|
||||||
```rust
|
```rust
|
||||||
use hashbrown::HashMap;
|
use hashbrown::HashMap;
|
||||||
|
|
||||||
let mut index = HashMap::new(); // Create the index
|
let mut index = HashMap::new(); // Create the Uuid storage index
|
||||||
|
let mut free_index = HashMap::new(); // Create the freespace index
|
||||||
|
|
||||||
struct Location {
|
struct Location {
|
||||||
partition: Uuid, // Partition identified via Uuid
|
partition: Uuid, // Partition identified via Uuid
|
||||||
chunks: Vec<u64>, // Which chunk(s) in the partition it is
|
chunks: Vec<u64>, // Which chunk(s) in the partition it is
|
||||||
|
@ -104,11 +114,11 @@ struct Location {
|
||||||
|
|
||||||
let new_data = (Uuid::new(), b"data"); // Test data w/ an actor Uuid & bytes
|
let new_data = (Uuid::new(), b"data"); // Test data w/ an actor Uuid & bytes
|
||||||
let new_data_location = Location {
|
let new_data_location = Location {
|
||||||
partition_offset: Uuid::new(),
|
partition: Uuid::new(),
|
||||||
chunks: vec![5, 8], // 5th & 8th chunk in that partition
|
chunks: vec![5, 8], // 5th & 8th chunk in that partition
|
||||||
};
|
};
|
||||||
|
|
||||||
index.insert(&new_data.0, new_data_location); // Insert a new entry mapping a data Uuid to a location
|
index.insert(&new_data.0, &new_data_location); // Insert a new entry mapping a data Uuid to a location
|
||||||
|
|
||||||
let uuid_location = index.get(&new_data.0).unwrap(); // Get the location of a Uuid
|
let uuid_location = index.get(&new_data.0).unwrap(); // Get the location of a Uuid
|
||||||
```
|
```
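
For comparison, here is a self-contained variant of the two indexes that compiles without external crates. It is a sketch under several assumptions: `std::collections::HashMap` stands in for `hashbrown::HashMap` (the two expose the same core API), `Uuid` is replaced by a plain `u128` alias, and the free-space index is modelled as a per-partition list of free chunk numbers. The document leaves the real free-space structure open, so `Index`, `add_partition`, `allocate` and `locate` are hypothetical names.

```rust
use std::collections::HashMap; // Stand-in for `hashbrown::HashMap`

pub type Uuid = u128; // Stand-in for a real UUID type (assumption)

#[derive(Debug, Clone, PartialEq)]
pub struct Location {
    pub partition: Uuid,  // Partition identified via Uuid
    pub chunks: Vec<u64>, // Which chunk(s) in the partition it is
}

pub struct Index {
    used: HashMap<Uuid, Location>, // Data Uuid -> where it is stored
    free: HashMap<Uuid, Vec<u64>>, // Partition Uuid -> free chunk numbers
}

impl Index {
    pub fn new() -> Self {
        Index { used: HashMap::new(), free: HashMap::new() }
    }

    /// Register a partition whose chunks are all initially free.
    pub fn add_partition(&mut self, partition: Uuid, num_chunks: u64) {
        self.free.insert(partition, (0..num_chunks).collect());
    }

    /// Claim `count` free chunks in `partition` for `data`.
    /// Returns `None` if `data` is already stored, the partition is
    /// unknown, or there are not enough free chunks.
    pub fn allocate(&mut self, data: Uuid, partition: Uuid, count: usize) -> Option<&Location> {
        if self.used.contains_key(&data) {
            return None;
        }
        let free = self.free.get_mut(&partition)?;
        if free.len() < count {
            return None;
        }
        let chunks: Vec<u64> = free.drain(..count).collect();
        self.used.insert(data, Location { partition, chunks });
        self.used.get(&data)
    }

    /// Look up where a data Uuid is stored, if it has been saved.
    pub fn locate(&self, data: &Uuid) -> Option<&Location> {
        self.used.get(data)
    }
}
```

Storing owned `Location` values rather than references avoids tying the map's lifetime to local variables, and a `None` from `locate` doubles as the "not saved yet" signal.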
|
||||||
|
@ -120,6 +130,8 @@ It also allows us to tell if an actor *hasn't* been saved yet, allowing us to kn
|
||||||
### To-Do
|
### To-Do
|
||||||
- Snapshots
|
- Snapshots
|
||||||
- Isolation
|
- Isolation
|
||||||
|
- Journaling
|
||||||
|
- Resizing
|
||||||
|
|
||||||
## Executable Format
|
## Executable Format
|
||||||
Programs written in userspace will need to follow a specific format.
|
Programs written in userspace will need to follow a specific format.
|
||||||
|
@ -149,6 +161,4 @@ struct PackedExecutable {
|
||||||
|
|
||||||
[^encryption]: Specific details to be figured out later
|
[^encryption]: Specific details to be figured out later
|
||||||
|
|
||||||
[^find_chunk]: On startup, the `kernel` builds an index of the filesystem in-memory. This is then modified whenever chunks are modified, and saved on disk on shutdown, and read again on startup.
|
[^free_chunk]: Need to figure out how to efficiently do this. **XFS** seems to just keep another index of free chunks. It also uses a **B+Tree** rather than a hashmap - to look into.
|
||||||
|
|
||||||
[^free_chunk]: Because we know which chunks are used, we know which ones aren't.
|
|
||||||
|
|