# Title: Character-Level Text Summarization Using Transformers: A Feasibility Study
# Experiment description:
1. Collect or create a small, well-annotated dataset of text documents paired with human-written summaries; each document should have a short, concise summary.
2. Preprocess the documents and summaries into the character-level format the model requires: tokenize the text into characters and map them to numerical IDs (see the vocabulary sketch after this list).
3. Modify the training script to add a loss function for the summary-generation task, using a sequence-to-sequence (seq2seq) setup in which the model is trained to generate summaries from the input documents (a masked-loss sketch follows below).
4. Train the model on the summarization dataset, monitoring a validation set to avoid overfitting and applying early stopping based on validation loss (an early-stopping sketch follows below).
5. Evaluate the generated summaries against the human-written references using metrics such as ROUGE (Recall-Oriented Understudy for Gisting Evaluation); a scoring sketch appears after this list.
6. Analyze the results to assess the feasibility and accuracy of character-level summarization, noting any patterns or insights in the model's behavior.
7. Compare the character-level model against a word-level model on the same task, using the same evaluation metrics, to highlight relative advantages and disadvantages.
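For step 2, a minimal preprocessing sketch: build a character vocabulary over the corpus and encode document/summary pairs as integer IDs, mirroring the shakespeare_char-style setup. The function and variable names here are illustrative, not from the actual preprocessing script.

```python
def build_char_vocab(texts):
    """Map every character seen in the corpus to an integer ID."""
    chars = sorted(set("".join(texts)))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos

def encode(text, stoi):
    return [stoi[ch] for ch in text]

def decode(ids, itos):
    return "".join(itos[i] for i in ids)

# Example usage with toy document/summary pairs.
docs = ["The cat sat on the mat.", "Dogs bark loudly at night."]
summaries = ["Cat sits.", "Dogs bark."]
stoi, itos = build_char_vocab(docs + summaries)
pairs = [(encode(d, stoi), encode(s, stoi)) for d, s in zip(docs, summaries)]
```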
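For step 3, the plan specifies a seq2seq objective but not the architecture. One common way to adapt a decoder-only character model to this task is to concatenate `<document><SEP><summary>` into a single sequence and compute the cross-entropy loss only over summary positions. The sketch below assumes that setup; the masking scheme and names are assumptions, not the experiment's confirmed implementation.

```python
import torch
import torch.nn.functional as F

def summarization_loss(logits, targets, doc_lens):
    """Cross-entropy restricted to summary positions.

    logits:   (B, T, V) model outputs for the concatenated sequence
    targets:  (B, T) next-character target IDs
    doc_lens: (B,) length of the document prefix (incl. separator) per example
    """
    B, T, V = logits.shape
    loss = F.cross_entropy(
        logits.reshape(B * T, V), targets.reshape(B * T), reduction="none"
    ).reshape(B, T)
    # Mask out the document prefix so only summary characters contribute.
    positions = torch.arange(T, device=loss.device).expand(B, T)
    mask = positions >= doc_lens.unsqueeze(1)
    return (loss * mask).sum() / mask.sum().clamp(min=1)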
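For step 4, a minimal early-stopping loop: evaluate validation loss at a fixed interval and stop once it has failed to improve for a set number of evaluations. All names and hyperparameters here are illustrative placeholders.

```python
def train_with_early_stopping(model, train_step, eval_val_loss,
                              max_iters=5000, eval_interval=250, patience=5):
    """Stop once validation loss has not improved for `patience` evals."""
    best_val, stale = float("inf"), 0
    for it in range(max_iters):
        train_step(model)  # one optimizer step on a training batch
        if it % eval_interval == 0:
            val_loss = eval_val_loss(model)
            if val_loss < best_val:
                best_val, stale = val_loss, 0
            else:
                stale += 1
                if stale >= patience:
                    break  # validation loss has plateaued
    return best_val
```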
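For step 5, scoring can be done with the `rouge-score` package; the sketch below averages ROUGE-1 and ROUGE-L F1 over all generated summaries. The helper function name is an assumption.

```python
from rouge_score import rouge_scorer

def evaluate_summaries(references, candidates):
    """Average ROUGE-1 / ROUGE-L F1 over reference/candidate pairs."""
    scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
    totals = {"rouge1": 0.0, "rougeL": 0.0}
    for ref, cand in zip(references, candidates):
        scores = scorer.score(ref, cand)  # reference first, candidate second
        for key in totals:
            totals[key] += scores[key].fmeasure
    n = max(len(references), 1)
    return {k: v / n for k, v in totals.items()}
```

The same function can score both the character-level and word-level models in step 7, keeping the comparison consistent.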
## Run 0: Baseline
Results: {'shakespeare_char': {'final_train_loss_mean': 0.8173830509185791, 'best_val_loss_mean': 1.4637625217437744, 'total_train_time_mean': 92.05195260047913, 'avg_inference_tokens_per_second_mean': 697.3658396135052}} | |
Description: Baseline results from the unmodified character-level model on the shakespeare_char dataset, prior to any summarization-specific changes.