Factor Studios

Two-Tier Virtual SSD Implementation and Testing Report

Introduction

This report details the implementation and testing of a two-tier virtual Solid State Drive (SSD) system. The primary goal was to create a virtual storage solution that provides data persistence without directly interacting with the host operating system's file system for its internal data storage. This was achieved by introducing a persistent virtual disk (PVD) layer and a volatile virtual disk (VVD) layer that caches data from the PVD.

Architecture Design

The proposed architecture consists of two main virtual disk components:

  1. Persistent Virtual Disk (PVD): This layer is responsible for the long-term storage of data. It simulates the underlying flash memory, file system mapping, and SSD controller. Its state is saved to a snapshot file within the sandbox environment, allowing data to persist across virtual SSD mount/unmount cycles within the same sandbox session.

  2. Volatile Virtual Disk (VVD): This layer acts as a caching mechanism for the PVD. All read and write operations from the application interface initially interact with the VVD. The VVD maintains an in-memory cache of pages and tracks dirty pages. When the virtual SSD is shut down, the VVD flushes all dirty pages to the PVD, ensuring data integrity and persistence.

Implementation Details

Persistent Virtual Disk (PVD)

The PersistentVirtualDisk class encapsulates the VirtualFlash, FileSystemMap, and SSDController components. It provides methods to save its entire state to a JSON file (pvd_snapshot.json) and load it back. This file is stored within the sandbox environment in a directory named virtual_ssd_data. This ensures that the PVD's state is preserved across instantiations of the VirtualSSD class within the same sandbox session.
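The snapshot round trip described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the real class also wraps VirtualFlash, FileSystemMap, and SSDController, which are omitted here. The `pages` attribute, the `save`/`load` method names, the `base_dir` parameter, and the hex encoding of page contents are all assumptions. Only the directory and file names come from the report.

```python
import json
import os

SNAPSHOT_DIR = "virtual_ssd_data"    # directory name taken from the report
SNAPSHOT_FILE = "pvd_snapshot.json"  # file name taken from the report


class PersistentVirtualDisk:
    """Hypothetical, simplified stand-in that persists only raw pages."""

    def __init__(self, base_dir="."):
        # base_dir is an assumed parameter so the sketch can run anywhere.
        self.snapshot_path = os.path.join(base_dir, SNAPSHOT_DIR, SNAPSHOT_FILE)
        self.pages = {}  # page number (int) -> page contents (bytes)

    def save(self):
        """Write the whole state to the JSON snapshot file."""
        os.makedirs(os.path.dirname(self.snapshot_path), exist_ok=True)
        # JSON has no bytes type and only string keys, so encode both.
        state = {"pages": {str(n): data.hex() for n, data in self.pages.items()}}
        with open(self.snapshot_path, "w", encoding="utf-8") as f:
            json.dump(state, f)

    def load(self):
        """Restore state from the snapshot; return False if none exists yet."""
        if not os.path.exists(self.snapshot_path):
            return False
        with open(self.snapshot_path, "r", encoding="utf-8") as f:
            state = json.load(f)
        self.pages = {int(n): bytes.fromhex(h) for n, h in state["pages"].items()}
        return True
```

Encoding page contents as hex strings is one simple way to fit binary data into JSON; base64 would be more compact if snapshot size matters.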

Volatile Virtual Disk (VVD)

The VolatileVirtualDisk class was introduced to act as an intermediary between the VirtualDriver and the PersistentVirtualDisk. It maintains an in-memory page_cache and a dirty_pages set. All write_page, read_page, and erase_block operations from the VirtualDriver are now routed through the VVD. When a page is written, it's stored in the page_cache and marked as dirty. Reads first check the page_cache before falling back to the PVD. During shutdown, the flush_dirty_pages method is called to write all modified pages from the page_cache to the PVD.
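A minimal sketch of this write-back caching behavior is shown below. The names page_cache, dirty_pages, write_page, read_page, and flush_dirty_pages come from the report. The method signatures, the `InMemoryBackingStore` stand-in for the PVD, and the choice to cache pages on a read miss are assumptions made for illustration. The erase_block operation is left out because the report does not say how it interacts with the cache.

```python
class InMemoryBackingStore:
    """Hypothetical stand-in for the PVD: a dict of page number -> bytes."""

    def __init__(self):
        self.pages = {}
        self.reads = 0  # counts how often the backing store is actually read

    def read_page(self, page_no):
        self.reads += 1
        return self.pages.get(page_no)

    def write_page(self, page_no, data):
        self.pages[page_no] = data


class VolatileVirtualDisk:
    """Write-back cache in front of a persistent page store."""

    def __init__(self, pvd):
        self.pvd = pvd
        self.page_cache = {}      # page number -> bytes
        self.dirty_pages = set()  # page numbers modified since the last flush

    def write_page(self, page_no, data):
        # Writes stay in memory until flush_dirty_pages is called.
        self.page_cache[page_no] = data
        self.dirty_pages.add(page_no)

    def read_page(self, page_no):
        # Serve from the cache first, then fall back to the persistent layer.
        if page_no in self.page_cache:
            return self.page_cache[page_no]
        data = self.pvd.read_page(page_no)
        if data is not None:
            self.page_cache[page_no] = data  # cached as a clean page
        return data

    def flush_dirty_pages(self):
        """Write every dirty page to the PVD and return how many were written."""
        flushed = len(self.dirty_pages)
        for page_no in sorted(self.dirty_pages):
            self.pvd.write_page(page_no, self.page_cache[page_no])
        self.dirty_pages.clear()
        return flushed
```

With this design a page written several times between flushes reaches the persistent layer only once, at the cost of losing unflushed writes if the process exits without a shutdown.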

Integration with VirtualSSD

The VirtualSSD class was modified to instantiate both the PersistentVirtualDisk and VolatileVirtualDisk. The VirtualDriver now interacts with the VolatileVirtualDisk instead of directly with the SSDController. The mount method loads the PVD state, and the shutdown method ensures that the VVD flushes its dirty pages to the PVD before the PVD's state is saved.
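The ordering that matters here is flush first, save second: if the PVD snapshot were written before the VVD flushed, the dirty pages would never reach disk. The sketch below illustrates only that ordering. The two `Recording*` classes are hypothetical stubs that log calls, the `load` and `save` method names are assumed, and the layers are passed in as arguments for brevity, whereas the report says VirtualSSD instantiates them itself.

```python
class RecordingPVD:
    """Hypothetical stub: records calls instead of touching a snapshot file."""

    def __init__(self, log):
        self.log = log

    def load(self):
        self.log.append("pvd.load")

    def save(self):
        self.log.append("pvd.save")


class RecordingVVD:
    """Hypothetical stub for the caching layer."""

    def __init__(self, log):
        self.log = log

    def flush_dirty_pages(self):
        self.log.append("vvd.flush_dirty_pages")


class VirtualSSD:
    """Simplified lifecycle: only mount and shutdown are sketched."""

    def __init__(self, pvd, vvd):
        self.pvd = pvd
        self.vvd = vvd
        self.mounted = False

    def mount(self):
        # Restore the persistent layer before any reads or writes happen.
        self.pvd.load()
        self.mounted = True

    def shutdown(self):
        if not self.mounted:
            return
        # Dirty pages must reach the PVD before its state is snapshotted.
        self.vvd.flush_dirty_pages()
        self.pvd.save()
        self.mounted = False


log = []
ssd = VirtualSSD(RecordingPVD(log), RecordingVVD(log))
ssd.mount()
ssd.shutdown()
print(log)
```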

Testing and Verification

The two-tier virtual SSD was tested for both functionality and persistence. The test script in virtual_ssd.py performs the following steps:

  1. Initialization and File Saves: A VirtualSSD instance is created and mounted. Several test files of varying sizes (small, large, very large) are saved to the virtual SSD. Additionally, a file from the host OS (test_upload_file.txt) is uploaded to the virtual SSD.
  2. File Listing and Capacity Check: The files stored on the virtual SSD are listed, and capacity information is retrieved to confirm successful writes and accurate space utilization.
  3. File Reading and Verification: All saved files, including the uploaded host file, are read back from the virtual SSD, and their content is asserted against the original data to ensure data integrity.
  4. File Deletion: A file is deleted from the virtual SSD, and the file list and capacity are re-checked to confirm successful deletion and space reclamation.
  5. Persistence Test: The virtual SSD is shut down, and a new VirtualSSD instance is created and mounted. The files are then listed and read again. This step is crucial to verify that the data persists across shutdown and re-mount cycles, demonstrating the effectiveness of the PVD and VVD flushing mechanism.
  6. Formatting Test: The virtual SSD is formatted, and the file list is checked to ensure all data has been erased, confirming the format functionality.
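The save, delete, persistence, and format steps above could be exercised with a check along these lines. This is a hedged sketch rather than the actual test script: `TinySSD` is a hypothetical stand-in for VirtualSSD with no flash simulation or caching, its method names (`save_file`, `read_file`, `list_files`, `delete_file`, `format`) are assumptions, and the file names and sizes are invented for illustration.

```python
import json
import os


class TinySSD:
    """Hypothetical stand-in for VirtualSSD; method names are assumptions."""

    def __init__(self, snapshot_path):
        self.snapshot_path = snapshot_path
        self.files = {}  # file name -> bytes

    def mount(self):
        # Load the previous state if a snapshot exists.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path, encoding="utf-8") as f:
                self.files = {k: bytes.fromhex(v) for k, v in json.load(f).items()}

    def save_file(self, name, data):
        self.files[name] = data

    def read_file(self, name):
        return self.files[name]

    def list_files(self):
        return sorted(self.files)

    def delete_file(self, name):
        del self.files[name]

    def format(self):
        self.files.clear()

    def shutdown(self):
        # Persist everything to the snapshot file.
        with open(self.snapshot_path, "w", encoding="utf-8") as f:
            json.dump({k: v.hex() for k, v in self.files.items()}, f)


def run_persistence_check(snapshot_path):
    originals = {"small.txt": b"hi", "large.bin": bytes(range(256)) * 64}

    ssd = TinySSD(snapshot_path)
    ssd.mount()
    for name, data in originals.items():
        ssd.save_file(name, data)
    ssd.delete_file("small.txt")  # deletion step
    ssd.shutdown()

    # A fresh instance must see exactly what the first one left behind.
    reopened = TinySSD(snapshot_path)
    reopened.mount()
    assert reopened.list_files() == ["large.bin"]
    assert reopened.read_file("large.bin") == originals["large.bin"]

    reopened.format()  # formatting step
    assert reopened.list_files() == []
    return True
```

Creating a second instance instead of re-mounting the first one is what makes the check meaningful, since nothing held in the first object's memory can leak into the comparison.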

Results

All tests passed successfully. The virtual SSD demonstrated the ability to:

  • Save and retrieve files of various sizes.
  • Maintain data integrity during read/write operations.
  • Persist data across shutdown and re-mount cycles within the sandbox environment.
  • Isolate its storage from the host OS: its internal state is saved only to a dedicated snapshot file (virtual_ssd_data/pvd_snapshot.json) within the sandbox, and no host filesystem content outside that directory is modified.
  • Successfully format and clear all stored data.

Conclusion

The implementation of the two-tier virtual SSD with a volatile caching layer and a persistent storage layer successfully addresses the requirement for data persistence without direct host OS interaction. The system effectively simulates SSD behavior, provides reliable file operations, and ensures data integrity and persistence within the defined sandbox environment. The pvd_snapshot.json file serves as the single point of persistence for the virtual SSD's state, allowing for data to be saved and loaded as needed.