Spaces:
				
			
			
	
			
			
					
		Running
		
	
	
	
			
			
	
	
	
	
		
		graph LR
    Entry_Point["Entry Point"]
    Configuration_Management["Configuration Management"]
    Data_Pipeline["Data Pipeline"]
    Model_Abstraction["Model Abstraction"]
    Training_Orchestrator["Training Orchestrator"]
    Entry_Point -- "Initializes and Uses" --> Configuration_Management
    Entry_Point -- "Initializes" --> Data_Pipeline
    Entry_Point -- "Initializes" --> Model_Abstraction
    Entry_Point -- "Initializes and Invokes" --> Training_Orchestrator
    Configuration_Management -- "Provides Configuration To" --> Model_Abstraction
    Configuration_Management -- "Provides Configuration To" --> Data_Pipeline
    Configuration_Management -- "Provides Configuration To" --> Training_Orchestrator
    Data_Pipeline -- "Provides Data To" --> Training_Orchestrator
    Model_Abstraction -- "Provides Model To" --> Training_Orchestrator
    click Entry_Point href "https://github.com/Josephrp/SmolFactory/blob/main/docs/Entry_Point.md" "Details"
    click Configuration_Management href "https://github.com/Josephrp/SmolFactory/blob/main/docs/Configuration_Management.md" "Details"
    click Data_Pipeline href "https://github.com/Josephrp/SmolFactory/blob/main/docs/Data_Pipeline.md" "Details"
    click Model_Abstraction href "https://github.com/Josephrp/SmolFactory/blob/main/docs/Model_Abstraction.md" "Details"
    click Training_Orchestrator href "https://github.com/Josephrp/SmolFactory/blob/main/docs/Training_Orchestrator.md" "Details"
Details
Abstract Components Overview
Entry Point [Expand]
The primary execution script that orchestrates the entire training process. It initializes all other major components, loads configurations, sets up the training environment, and invokes the Training Orchestrator.
Related Classes/Methods:
- train
Configuration Management [Expand]
Centralized management of all training parameters, model specifications, data paths, and hyper-parameters. It is responsible for loading, validating, and providing access to configuration settings, supporting base and custom configurations.
Related Classes/Methods:
Data Pipeline [Expand]
Handles the entire data lifecycle, including dataset loading, preprocessing (e.g., tokenization, formatting), and creating efficient data loaders for both training and evaluation phases.
Related Classes/Methods:
Model Abstraction [Expand]
Encapsulates the logic for loading pre-trained models, defining model architectures, and managing different model variants (e.g., quantization, LoRA adapters). It provides a consistent interface for model interaction.
Related Classes/Methods:
Training Orchestrator [Expand]
Implements the core training and fine-tuning loop. This includes managing forward and backward passes, optimization, loss calculation, and integration with acceleration libraries (e.g., accelerate). It also handles callbacks and evaluation logic.
Related Classes/Methods:
