Free chunk list

Switch binary decoding library
Basic queue & actor notes
2023-04-20 17:26:40 -04:00 · 2023-04-20 17:20:16 -04:00 · 2023-04-20 14:40:38 -04:00 · 2023-04-20 14:38:55 -04:00 · 2023-04-20 14:38:24 -04:00
7 changed files with 85 additions and 39 deletions
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@@ -15,6 +15,7 @@
 - [Debugging](debugging/README.md)
    - [GDB]()
    - [Logging]()
+    - [Automated Tests]()
    - [Performance Profiling]()

 # User Guide
--- a/src/development/README.md
+++ b/src/development/README.md
@@ -46,6 +46,7 @@ A thorough series of steps might be:
 5. Read the [RISC-V Guide](https://github.com/mikeroyal/RISC-V-Guide)/[RISC-V Bytes](https://danielmangum.com/categories/risc-v-bytes/) to learn more about the **RISC-V** architecture
 6. Read the OSDev Wiki entries on [Microkernels](https://wiki.osdev.org/Microkernel) and [Message Passing](https://wiki.osdev.org/Message_Passing)
 7. Read the [Async Book](https://rust-lang.github.io/async-book/01_getting_started/01_chapter.html)
+8. Read the [sled performance guide](https://sled.rs/perf.html), which has some good information about performance

 Additionally you might want to learn about **Vulkan** if you're going to be hacking on the [GUI](/development/design/gui.md):
 1. Go through the [Vulkan Tutorial (Rust)](https://kylemayes.github.io/vulkanalia/introduction.html) to learn some of the basics
--- a/src/development/design/README.md
+++ b/src/development/design/README.md
@@ -33,7 +33,7 @@ Further design decisions are gone into detail in the next few chapters.
 These names and layout are all **WIP**.
 ```

-All of the code will take place in seperate repositories.
+All of the code will take place in separate repositories.
 Information on actually committing, pulling, etc. is in the [Workflow](/development/workflow.md) chapter.

 Most of the code will be implemented as libraries, enabling them to be used across systems and worked on separately.
--- a/src/development/design/actor.md
+++ b/src/development/design/actor.md
@@ -1 +1,11 @@
 # Actor System
+
+## OCAP
+**TODO**
+
+## Messages
+**TODO**
+- [postcard](https://lib.rs/crates/postcard) for message passing
+- Priority Queue for processing multiple messages, while dealing with higher-priority ones first
+
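The priority queue noted under `## Messages` could be sketched roughly as follows. This is only an illustration built on the standard library's `BinaryHeap`: the `Message` struct, its `priority` and `seq` fields, and the `drain_in_order` helper are hypothetical names, and `payload` merely stands in for bytes that would be encoded with `postcard`. A `no_std` kernel would presumably use `alloc::collections::BinaryHeap` instead.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical message type; `payload` stands in for already-encoded bytes.
#[derive(Debug, PartialEq, Eq)]
struct Message {
    priority: u8,     // Higher value = processed sooner
    seq: u64,         // Arrival counter, keeps equal priorities in FIFO order
    payload: Vec<u8>, // Encoded message body
}

impl Ord for Message {
    fn cmp(&self, other: &Self) -> Ordering {
        self.priority
            .cmp(&other.priority)
            // Lower sequence number (older message) ranks higher within a priority
            .then_with(|| other.seq.cmp(&self.seq))
            // Final tie-break so that `Ord` stays consistent with the derived `Eq`
            .then_with(|| self.payload.cmp(&other.payload))
    }
}

impl PartialOrd for Message {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Empties the queue and returns the payloads in the order they would be processed.
fn drain_in_order(queue: &mut BinaryHeap<Message>) -> Vec<Vec<u8>> {
    let mut processed = Vec::with_capacity(queue.len());
    while let Some(message) = queue.pop() {
        processed.push(message.payload);
    }
    processed
}
```

`BinaryHeap` is a max-heap, so the highest `priority` is popped first; the `seq` comparison is reversed so that older messages of the same priority are not starved by newer ones.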
+### Latency
--- a/src/development/design/filesystem.md
+++ b/src/development/design/filesystem.md
@@ -17,11 +17,18 @@ Therefore, the "filesystem" code will just be a library that's simple a low-leve

 ## Performance
 I believe that this format should be fairly fast, but only implementation and testing will tell for sure.
+Throughput is the main concern here, rather than latency. We can be asynchronous and wait for many requests to finish, rather than worrying about when each one finishes. This is also better for **SSD** performance.
 1. Minimal data needs to be read in - bit offsets can be used, and only fixed-size metadata must be known
 2. `serde` is fairly optimized for deserialization/serialization
-3. `HighwayHash` is a very fast and well-optimized hashing algorithm
-4. Async and multithreading will allow for concurrent access, and splitting of resource-intensive tasks across threads
+3. `BTreeMap` is a very fast and simple data structure
+4. Async and multithreading will allow for concurrent access, and splitting of resource-intensive tasks across threads.
 5. `hashbrown` is quite high-performance
+6. Batch processing increases throughput
+
+### Buffering
+The `kernel` will hold two read/write buffers in-memory and will queue reading & writing operations into them.
+They can then be organized and batch processed, in order to optimize **HDD** speed (not having to move the head around), and **SSD** performance (minimizing operations).
+
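The batching described under `### Buffering` might look something like the sketch below. It covers only the write side and makes several assumptions: `WriteRequest`, `WriteBuffer`, `queue`, and `take_batch` are hypothetical names, requests are addressed by chunk number, and "organizing" is reduced to sorting by chunk so that a disk can service the batch in a single pass.

```rust
/// Hypothetical queued write: the target chunk number plus the bytes to store.
#[derive(Debug, Clone, PartialEq, Eq)]
struct WriteRequest {
    chunk: u64,
    data: Vec<u8>,
}

/// Hypothetical in-memory buffer that collects writes and releases them in batches.
#[derive(Default)]
struct WriteBuffer {
    pending: Vec<WriteRequest>,
}

impl WriteBuffer {
    /// Queues a request without touching the disk.
    fn queue(&mut self, request: WriteRequest) {
        self.pending.push(request);
    }

    /// Takes every pending request, ordered by chunk number.
    /// The sort is stable, so writes to the same chunk keep their arrival order.
    fn take_batch(&mut self) -> Vec<WriteRequest> {
        let mut batch = std::mem::take(&mut self.pending);
        batch.sort_by_key(|request| request.chunk);
        batch
    }
}
```

Keeping same-chunk writes in arrival order matters because the last write to a chunk must win once the batch is flushed.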

 ## Filesystem Layout

@@ -29,7 +36,7 @@ I believe that this format should be fairly fast, but only implementation and te
 |------|------|--------|
 | Boot Sector | `128 B` | `None` |
 | Kernel Sector | `4096 KB` | `None` |
-| Index Sector | `4096 KB` | `None` |
+| Index Sector | `u64` | `PartitionHeader` |
 | Config Sector | `u64` | `PartitionHeader` |
 | User Sector(s) | `u64` | `PartitionHeader` |

@@ -37,13 +44,31 @@ I believe that this format should be fairly fast, but only implementation and te
 A virtual section of the disk.
 Additionally, it has a **UUID** generated via [lolid](https://lib.rs/crates/lolid) to enable identifying a specific partition.

+[binary-layout](https://lib.rs/crates/binary-layout) can be used to parse data from raw bytes on the disk into a structured format, and it supports `no_std`.
+
 ```rust
+use binary_layout::prelude::*;
 const LABEL_SIZE: u16 = 128; // Example number of characters that can be used in the partition label

-struct PartitionHeader {
-    label: [char; LABEL_SIZE], // Human-readable label. Not UTF-8 though :/
+define_layout!(partition_header, BigEndian, {
+    partition_type: PartitionType, // Which type of partition it is
    num_chunks: u64, // Chunks in this partition
-    uuid: Uuid,4096
+    uuid: Uuid
+});
+
+enum PartitionType {
+    Index, // Used for FS indexing
+    Config, // Used for system configuration
+    User, // User-defined partition
+}
+
+fn parse_data(partition_data: &mut [u8]) -> partition_header::View<&mut [u8]> {
+    let mut view = partition_header::View::new(partition_data);
+
+    let id: u64 = view.uuid().read(); // Read some data
+    view.num_chunks_mut().write(10); // Write data
+
+    return view;
 }
 ```

@@ -51,39 +76,36 @@ struct PartitionHeader {
 Small pieces that each partition is split into.
 Contains fixed-length metadata (checksum, encryption flag, modification date, etc.) at the beginning, and then arbitrary data afterwards.

+`binary-layout` is similarly used to parse the raw bytes of a chunk.
+
 ```rust
-const CHUNK_SIZE: u64 = 4096; // Example static chunk size
+use binary_layout::prelude::*;
+const CHUNK_SIZE: usize = 4096; // Example static chunk size (in bytes); array lengths must be `usize`

-struct ChunkHeader {
+define_layout!(chunk, BigEndian, {
    checksum: u64,
-    encrypted: bool,
    modified: u64, // Timestamp of last modified
-    uuid: Uuid,
-}
-
-struct Chunk {
-    header: ChunkHeader,
+    uuid: u128,
    data: [u8; CHUNK_SIZE],
-}
+});
 ```
 This struct is then encoded into bytes and written to the disk. Drivers for the disk are *to be implemented*.
 It *should* be possible to do autodetection, and maybe for *Actors* to specify which disk/partition they want to be saved to.

-Compression of the data should also be possible, due to `bincode` supporting [flate2](https://lib.rs/crates/flate2) compression.
-Similarly **AES** encryption can be used, and this allows for only specific chunks to be encrypted.[^encryption]
+**AES** encryption can be used, and this allows for only specific chunks to be encrypted.[^encryption]

 ### Reading
 On boot, we start executing code from the **Boot Sector**. This contains the assembly instructions, which then jump to the `kernel` code in the **Kernel Sector**.
-The `kernel` then reads in bytes from the first partition *(as the sectors are fixed-size, we know when this starts)* into memory, serializing it into a `PartitionHeader` struct via [bincode](https://lib.rs/crates/bincode).
+The `kernel` then reads in bytes from the first partition *(as the sectors are fixed-size, we know when this starts)* into memory, parsing it into a structured form.

 From here, as we have a fixed `CHUNK_SIZE`, and know how many chunks are in our first partition, we can read from any chunk on any partition now.
-On startup, an *Actor* can request to read data from the disk. If it has the right [capabilities](/development/design/actor.md#ocap), we find the chunk it's looking for from the index, parse the data (using `bincode` again), and send it back.
+On startup, an *Actor* can request to read data from the disk. If it has the right [capabilities](/development/design/actor.md#ocap), we find the chunk it's looking for from the index, parse the data, and send it back.

 Also, we are able to verify data. Before passing off the data, we re-hash it using [HighwayHash](https://lib.rs/crates/highway) to see if it matches.
 If it does, we simply pass it along like normal. If not, we refuse, and send an error [message](/development/design/actor.md#messages).

 ### Writing
-Writing uses a similar process. An *Actor* can request to write data. If it has proper capabilties, we serialize the data, allocate a free chunk[^free_chunk], and write to it.
+Writing uses a similar process. An *Actor* can request to write data. If it has proper capabilities, we serialize the data, allocate a free chunk, and write to it.
 We *hash* the data first to generate a checksum, and set proper metadata.

 ### Permissions
@@ -96,31 +118,37 @@ will be determined via [capabilities](/development/design/actor.md#ocap)

 ### Indexing
 Created in-memory on startup, modified directly whenever the filesystem is modified.
-It's saved in the *Index Sector* (which is at a known offset & size), allowing it to be read in easily on boot.
-It again simply uses `bincode` and compression.
+It's saved in the *Index Sector* (which is at a known offset), allowing it to be read in easily on boot.

-While the index is not necessarily a fixed size, we read until we have enough data from the fixed sector size.
+The index is simply an `alloc::` [BTreeMap](https://doc.rust-lang.org/stable/alloc/collections/btree_map/struct.BTreeMap.html).
+
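+We also have a simple `Vec` of the chunks that are free, which we modify in reverse to the index: chunks are removed from it when they are allocated, and added back when they are freed.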

 ```rust
-use hashbrown::HashMap;
-
-let mut index = HashMap::new(); // Create the Uuid storage index
-let mut free_index = HashMap::new(); // Create the freespace index
+let mut index = BTreeMap::new(); // Basic Actor index
+let mut free_index: Vec<u64> = Vec::new(); // Index of free chunks

 struct Location {
    partition: Uuid, // Partition identified via Uuid
    chunks: Vec<u64>, // Which chunk(s) in the partition it is
 }

-let new_data = (Uuid::new(), b"data"); // Test data w/ an actor Uuid & bytes
 let new_data_location = Location {
    partition: Uuid::new(),
    chunks: vec![5, 8], // 5th & 8th chunk in that partition
-};
+}

-index.insert(&new_data.0, &new_data_location); // Insert a new entry mapping a data Uuid to a location
+index.entry(&actor.uuid).or_insert(&new_data_location); // Insert an Actor's storage location if it's not already stored
+for i in &new_data_location.chunks {
+    free_index.retain(|chunk| chunk != i); // Remove used chunks from the free chunks list
+}

-let uuid_location = index.get(&new_data.0).unwrap(); // Get the location of a Uuid
+index.contains_key(&actor.uuid); // Check if the index contains an Actor's data
+index.get(&actor.uuid); // Get the Location of the actor
+index.remove(&actor.uuid); // Remove an Actor's data from the index (e.g. on deletion)
+for i in &new_data_location.chunks {
+    free_index.push(*i); // Add back the now free chunks
+}
 ```

 This then allows the index to be searched easily to find the data location of a specific `Uuid`.
@@ -132,6 +160,7 @@ It also allows us to tell if an actor *hasn't* been saved yet, allowing us to kn
 - Isolation
 - Journaling
 - Resizing
+- Atomic Operations 

 ## Executable Format
 Programs written in userspace will need to follow a specific format.
@@ -160,5 +189,3 @@ struct PackedExecutable {
 ```

 [^encryption]: Specific details to be figured out later
-
-[^free_chunk]: Need to figure out how to efficiently do this. **XFS** seems to just keep another index of free chunks. It also uses a **B+Tree** rather than a hashmap - to look into.
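The `BTreeMap` index and free chunk list introduced in the `### Indexing` hunk above can be combined into one self-contained sketch. Several things here are assumptions rather than part of the design: the `Index` wrapper and its `allocate`/`release` methods are hypothetical, `u128` stands in for the `Uuid` type, a single partition-wide free list is assumed, and `std` is used where the kernel would use `alloc`.

```rust
use std::collections::BTreeMap;

/// Where an Actor's data lives. `u128` stands in for the `Uuid` type.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Location {
    partition: u128,  // Partition identifier
    chunks: Vec<u64>, // Which chunk(s) in the partition hold the data
}

/// Hypothetical wrapper tying the Actor index and the free chunk list together.
struct Index {
    entries: BTreeMap<u128, Location>,
    free_chunks: Vec<u64>,
}

impl Index {
    /// Starts with every chunk free, stored in reverse so that taking from
    /// the end of the `Vec` hands out the lowest-numbered chunks first.
    fn new(num_chunks: u64) -> Self {
        Index {
            entries: BTreeMap::new(),
            free_chunks: (0..num_chunks).rev().collect(),
        }
    }

    /// Reserves `count` chunks for `actor`. Returns `None` if the Actor is
    /// already stored or if there are not enough free chunks.
    fn allocate(&mut self, actor: u128, partition: u128, count: usize) -> Option<&Location> {
        if self.entries.contains_key(&actor) || self.free_chunks.len() < count {
            return None;
        }
        let split = self.free_chunks.len() - count;
        let mut chunks = self.free_chunks.split_off(split); // Take from the end
        chunks.reverse(); // Store in ascending order
        self.entries.insert(actor, Location { partition, chunks });
        self.entries.get(&actor)
    }

    /// Removes an Actor's entry and returns its chunks to the free list.
    /// Returns `false` if the Actor had no stored data.
    fn release(&mut self, actor: u128) -> bool {
        match self.entries.remove(&actor) {
            Some(location) => {
                self.free_chunks.extend(location.chunks);
                true
            }
            None => false,
        }
    }
}
```

Taking chunks from the end of the `Vec` avoids shifting elements on every allocation, which is one possible reading of modifying the list "in reverse".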
--- a/src/development/design/gui.md
+++ b/src/development/design/gui.md
@@ -2,9 +2,17 @@
 Eventually, programs will be able to use the `photon` library to have access to a graphics API.
 This will initialize various [actors](/development/design/actor.md) to represent parts of the UI.

+## Performance
+The **GUI** is one of the systems where latency is far more important than throughput.
+There are several things that aid with performance of this system:
+1. Each window is drawn in a separate buffer, allowing for easy concurrency
+2. Messages use the [latency bus](/development/design/actor.md#latency)
+3. All draws are based on optimized and simple low-level operations
+4. Only changes are re-rendered
+
 ## Drawing
 When a **GUI** element wants to update, it first sends a [message](/development/design/actor.md#messages) to the `kernel`.
-The `kernel` then calculates the overlaying of each window, writes each window to it's own buffer, then updates the screen buffer with ones that have changed, which is then drawn to the screen.
+The `kernel` then calculates the overlaying of each window, writes each window to its own buffer, then updates the screen buffer with ones that have changed, which is then drawn to the screen.
 This ensures that only necessary parts are re-rendered, and the rendering can be done asynchronously/threaded.

 The `photon` library will not only provide a high-level API for applications to use, but also lower-level drawing methods for the `kernel` to use.
--- a/src/development/design/kernel.md
+++ b/src/development/design/kernel.md
@@ -20,14 +20,13 @@ This will include the [filesystem](/development/design/filesystem.md), [actors](
 This might include operations like *hashing*, *encryption*, and *indexing*.

 ## Boot Process
-*To be implemented*
+**TODO**

 ## Memory Management
-*To-Do*
+**TODO**

 ## Processes
-*To-Do*
- [postcard](https://lib.rs/crates/postcard) for message passing
+**TODO**

 ## Error Handling
 All errors must be handled gracefully by the `kernel`. If possible, they should simply log an error.
Author	SHA1	Message	Date
Erin Abicht	e70be2c638	Free chunk list	2023-04-20 17:26:40 -04:00
Erin Abicht	af21da0fe1	Switch binary decoding library	2023-04-20 17:20:16 -04:00
Erin Abicht	9546c11798	Basic queue & actor notes	2023-04-20 14:40:38 -04:00
Erin Abicht	323a0c9a63	Switch to BTreeMap index	2023-04-20 14:38:55 -04:00
Erin Abicht	4aad43a939	Performance notes	2023-04-20 14:38:24 -04:00