graph LR
Entry_Point["Entry Point"]
Configuration["Configuration"]
Model_Abstraction["Model Abstraction"]
Data_Pipeline["Data Pipeline"]
Training_Logic["Training Logic"]
Utilities["Utilities"]
Scripts["Scripts"]
Requirements_Management["Requirements Management"]
Entry_Point -- "initializes" --> Configuration
Entry_Point -- "initializes" --> Model_Abstraction
Entry_Point -- "initializes" --> Data_Pipeline
Entry_Point -- "invokes" --> Training_Logic
Configuration -- "provides settings to" --> Model_Abstraction
Configuration -- "provides settings to" --> Data_Pipeline
Configuration -- "provides settings to" --> Training_Logic
Model_Abstraction -- "provides model to" --> Training_Logic
Data_Pipeline -- "provides data to" --> Training_Logic
Training_Logic -- "utilizes" --> Model_Abstraction
Training_Logic -- "utilizes" --> Data_Pipeline
Training_Logic -- "utilizes" --> Configuration
Training_Logic -- "utilizes" --> Utilities
Data_Pipeline -- "uses" --> Utilities
Model_Abstraction -- "uses" --> Utilities
Scripts -- "supports" --> Data_Pipeline
Scripts -- "supports" --> Model_Abstraction
Requirements_Management -- "defines environment for" --> Entry_Point
Requirements_Management -- "defines environment for" --> Configuration
Requirements_Management -- "defines environment for" --> Model_Abstraction
Requirements_Management -- "defines environment for" --> Data_Pipeline
Requirements_Management -- "defines environment for" --> Training_Logic
Requirements_Management -- "defines environment for" --> Utilities
Requirements_Management -- "defines environment for" --> Scripts
click Entry_Point href "https://github.com/Josephrp/SmolFactory/blob/main/docs/Entry_Point.md" "Details"
click Model_Abstraction href "https://github.com/Josephrp/SmolFactory/blob/main/docs/Model_Abstraction.md" "Details"
click Data_Pipeline href "https://github.com/Josephrp/SmolFactory/blob/main/docs/Data_Pipeline.md" "Details"
Details
Component overview for the Machine Learning Training and Fine-tuning Framework.
Entry Point
The primary execution script that orchestrates the entire training process. It initializes all other major components, loads configurations, sets up the training environment, and invokes the core training logic.
Related Classes/Methods:
train.py
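The orchestration order described above (load configuration, build the model and data, then invoke training) can be sketched as follows. This is a minimal illustration, not the contents of `train.py`: the flags `--config` and `--max-steps` and the helpers `load_config`, `build_model`, `build_data`, and `run_training` are hypothetical stand-ins.

```python
import argparse


def parse_args(argv=None):
    # Hypothetical command-line flags; the real script may expose different ones.
    parser = argparse.ArgumentParser(description="Illustrative training entry point")
    parser.add_argument("--config", default="base")
    parser.add_argument("--max-steps", type=int, default=3)
    return parser.parse_args(argv)


def load_config(name, max_steps):
    # Stand-in for the Configuration component.
    return {"name": name, "max_steps": max_steps}


def build_model(config):
    # Stand-in for the Model Abstraction component.
    return {"weights": 0.0}


def build_data(config):
    # Stand-in for the Data Pipeline component.
    return [1.0, 2.0, 3.0]


def run_training(config, model, data):
    # Stand-in for the Training Logic component; only reports what it received.
    return {
        "config": config["name"],
        "steps": config["max_steps"],
        "examples": len(data),
    }


def main(argv=None):
    args = parse_args(argv)
    config = load_config(args.config, args.max_steps)  # 1. configuration
    model = build_model(config)                        # 2. model
    data = build_data(config)                          # 3. data
    return run_training(config, model, data)           # 4. training


summary = main(["--config", "custom", "--max-steps", "5"])
```

Passing `argv` explicitly keeps `main` easy to call from tests or notebooks without touching `sys.argv`.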
Configuration
Centralized management of all training parameters, model hyperparameters, dataset paths, and other environment settings. It defines the schema for configurations, often using dataclasses, and supports both base and custom configurations.
Related Classes/Methods:
config/
(1:1)
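A dataclass-based schema with a base configuration and a custom override might look like the sketch below. The class name `TrainingConfig` and all of its fields and default values are assumptions made for illustration; the actual schema lives under `config/`.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical base schema; field names and defaults are illustrative."""

    model_name: str = "base-model"
    dataset_path: str = "data/train.jsonl"
    learning_rate: float = 5e-5
    batch_size: int = 8
    max_steps: int = 1000

    def __post_init__(self):
        # Validate once at construction so bad values fail early.
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")
        if self.batch_size < 1:
            raise ValueError("batch_size must be at least 1")


# Base configuration with default values.
base_config = TrainingConfig()

# Custom configuration: copy the base and override selected fields.
custom_config = replace(base_config, learning_rate=1e-4, batch_size=16)

# A plain dict is convenient for logging or serialization.
custom_as_dict = asdict(custom_config)
```

Using `replace` on a frozen dataclass leaves the base configuration unchanged and reruns the validation in `__post_init__` for the derived one.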
Model Abstraction
Responsible for abstracting the underlying machine learning model. This includes loading pre-trained models, handling different model architectures or variants, and preparing the model for training (e.g., quantization, device placement).
Related Classes/Methods:
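One common way to hide model variants behind a single loading function is a small registry, sketched below. Everything here is hypothetical: `ModelHandle` is a placeholder for a real model object, and the `register` and `load_model` names, the `"small"` variant, and the `device` and `quantize` parameters are assumptions rather than the project's API.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ModelHandle:
    """Placeholder for a loaded model; a real handle would wrap actual weights."""

    name: str
    device: str = "cpu"
    quantized: bool = False


# Maps a variant name to a function that constructs that variant.
_LOADERS: Dict[str, Callable[[], ModelHandle]] = {}


def register(variant: str):
    """Decorator that registers a loader function under a variant name."""

    def decorator(fn: Callable[[], ModelHandle]):
        _LOADERS[variant] = fn
        return fn

    return decorator


@register("small")
def _load_small() -> ModelHandle:
    return ModelHandle(name="small")


def load_model(variant: str, device: str = "cpu", quantize: bool = False) -> ModelHandle:
    """Load a variant, then record the preparation steps (device, quantization)."""
    if variant not in _LOADERS:
        raise KeyError(f"unknown model variant: {variant!r}")
    handle = _LOADERS[variant]()
    handle.device = device
    handle.quantized = quantize
    return handle


model = load_model("small", device="cuda:0", quantize=True)
```

Callers only depend on `load_model`, so adding a variant means registering one more loader rather than changing the training code.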
Data Pipeline
Manages the entire data flow, from loading raw datasets to preprocessing, tokenization, and creating efficient data loaders (e.g., PyTorch DataLoader) for batching and shuffling data during training and evaluation.
Related Classes/Methods:
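The preprocessing, batching, and shuffling steps can be illustrated with the standard library alone. The sketch below is a simplified stand-in for what a data loader does, not the project's pipeline: `tokenize` is a toy whitespace tokenizer, and `iter_batches` with its `seed` parameter is a hypothetical helper.

```python
import random
from typing import Iterator, List, Sequence


def tokenize(text: str) -> List[str]:
    """Toy whitespace tokenizer; a real pipeline would use the model's tokenizer."""
    return text.lower().split()


def iter_batches(
    examples: Sequence[str],
    batch_size: int,
    shuffle: bool = True,
    seed: int = 0,
) -> Iterator[List[List[str]]]:
    """Yield lists of tokenized examples, optionally in shuffled order."""
    indices = list(range(len(examples)))
    if shuffle:
        # A seeded generator keeps the shuffled order reproducible.
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        chunk = indices[start:start + batch_size]
        yield [tokenize(examples[i]) for i in chunk]


raw = ["Hello world", "Fine tuning is fun", "Small models", "Batch me", "Last one"]

# Shuffled batches for training; the final batch may be smaller than batch_size.
train_batches = list(iter_batches(raw, batch_size=2))

# Unshuffled batches, as is typical for evaluation.
eval_batches = list(iter_batches(raw, batch_size=2, shuffle=False))
```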
Training Logic
Encapsulates the core training loop, including forward and backward passes, loss calculation, optimization steps, and integration of callbacks for monitoring and control. It may include specialized trainers for different fine-tuning methods.
Related Classes/Methods:
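The structure of such a loop (forward pass, loss, backward pass, optimizer step, callbacks) is shown below on a deliberately tiny problem: fitting a single weight `w` so that `w * x` matches `y`. This is a sketch under simplifying assumptions, with the gradient written out by hand and plain SGD as the optimizer; the `train` function and its callback signature are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def train(
    data: Sequence[Tuple[float, float]],
    lr: float = 0.05,
    epochs: int = 20,
    callbacks: Sequence[Callable[[int, float], None]] = (),
) -> float:
    """Fit y = w * x with per-example SGD and return the learned weight."""
    w = 0.0
    for epoch in range(epochs):
        total_loss = 0.0
        for x, y in data:
            pred = w * x              # forward pass
            error = pred - y
            total_loss += error ** 2  # squared-error loss
            grad = 2 * error * x      # backward pass: d(error^2)/dw
            w -= lr * grad            # optimization step (plain SGD)
        mean_loss = total_loss / len(data)
        for callback in callbacks:
            # Callbacks receive the epoch index and the mean loss for monitoring.
            callback(epoch, mean_loss)
    return w


history: List[float] = []
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # generated by y = 2 * x
learned_w = train(samples, callbacks=[lambda epoch, loss: history.append(loss)])
```

Keeping monitoring in callbacks means logging or early-stopping logic can be added without editing the loop itself.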
Utilities
Provides common helper functions, classes, and modules used across the other components, covering logging, metric calculation, checkpointing, and general data manipulation.
Related Classes/Methods:
utils/
(1:1)
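Two of the helper categories named above, metric calculation and checkpointing, can be sketched with the standard library. The function names `accuracy`, `save_checkpoint`, and `load_checkpoint`, the JSON file format, and the `checkpoint-<step>.json` naming are assumptions for illustration; the actual helpers live under `utils/`.

```python
import json
import logging
import tempfile
from pathlib import Path

logger = logging.getLogger("training_utils")


def accuracy(predictions, labels):
    """Return the fraction of predictions that equal their labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)


def save_checkpoint(state, directory, step):
    """Write a JSON-serializable state dict to <directory>/checkpoint-<step>.json."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"checkpoint-{step}.json"
    path.write_text(json.dumps(state))
    logger.info("saved checkpoint to %s", path)
    return path


def load_checkpoint(path):
    """Read back a state dict written by save_checkpoint."""
    return json.loads(Path(path).read_text())


# Round trip through a temporary directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    checkpoint_path = save_checkpoint({"step": 10, "weight": 1.5}, tmp, step=10)
    restored = load_checkpoint(checkpoint_path)

score = accuracy([1, 0, 1, 1], [1, 1, 1, 0])
```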
Scripts
Contains auxiliary scripts that support the overall project but are separate from the main training pipeline. Examples include data preparation scripts, model conversion tools, or deployment-related utilities.
Related Classes/Methods:
scripts/
(1:1)
Requirements Management
Defines and manages all project dependencies, ensuring a consistent and reproducible development and deployment environment. This typically involves requirements.txt files or similar dependency management tools.
Related Classes/Methods:
requirements/
(1:1)