	---
	license: apache-2.0
	---
	### DreamBank Custom Architecture

	This repo contains the weights for the custom architecture presented in the paper [Automatic Annotation of Dream Report’s Emotional Content with Large Language Models](https://aclanthology.org/2024.clpsych-1.7/).
	A working example of how to load and use the model is given below. For more details, see the [Git repo](https://github.com/lorenzoscottb/Dream_Reports_Annotation/tree/main/Experiments/Supervised_Learning).

	#### Use

	```py
	import os

	import numpy as np
	import pandas as pd
	import torch
	import transformers
	from torch.utils.data import DataLoader
	from tqdm import tqdm
	from transformers import AutoConfig, AutoModel, BertTokenizerFast

	# Project helpers used below (CustomDataset, BERT_PTM, validation, decode_clean)
	from SL_utils import *

	# Emotion codes and their full names
	Coding_emotions = {
	    "AN": "Anger",
	    "AP": "Apprehension",
	    "SD": "Sadness",
	    "CO": "Confusion",
	    "HA": "Happiness",
	}

	emotions_list = list(Coding_emotions.keys())

	test_sentences = [
	    "In my dream I was followed by the scary monster.",
	    "I was walking in a forest, surrounded by singing birds. I was calm and at peace.",
	]

	# All-zero targets: one five-slot multilabel vector per report
	test_sentences_target = len(test_sentences) * [[0, 0, 0, 0, 0]]
	test_sentences_df = pd.DataFrame.from_dict(
	    {
	        "report": test_sentences,
	        "Report_as_Multilabel": test_sentences_target,
	    }
	)
	```
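The all-zero targets above appear to act as placeholders for unlabelled reports: each target has one slot per entry of `emotions_list`, in that order. As a minimal sketch of how annotated emotion codes could be turned into such a vector, consider the helper below. `encode_emotions` is a hypothetical name introduced for illustration and is not part of `SL_utils`.

```python
# Same order as emotions_list in the example above
EMOTION_CODES = ["AN", "AP", "SD", "CO", "HA"]


def encode_emotions(codes, label_order=EMOTION_CODES):
    """Return a multi-hot list with 1 in the slot of every code in `codes`.

    Hypothetical helper for illustration only; it is not part of SL_utils.
    """
    present = set(codes)
    unknown = present - set(label_order)
    if unknown:
        raise ValueError(f"Unknown emotion codes: {sorted(unknown)}")
    return [1 if code in present else 0 for code in label_order]


print(encode_emotions(["AP"]))        # [0, 1, 0, 0, 0]
print(encode_emotions(["HA", "SD"]))  # [0, 0, 1, 0, 1]
print(encode_emotions([]))            # [0, 0, 0, 0, 0]
```

Keeping the label order in a single list means the column names used later for the predictions line up with the target slots.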

	```py
	model_name = "bert-large-cased"
	model_config = AutoConfig.from_pretrained(model_name)
	tokenizer = BertTokenizerFast.from_pretrained(model_name, do_lower_case=False)
	testing_set = CustomDataset(test_sentences_df, tokenizer, max_length=512)

	test_params = {
	    "batch_size": 2,
	    "shuffle": False,  # keep the reports in their original order at inference time
	    "num_workers": 0,
	}

	testing_loader = DataLoader(testing_set, **test_params)

	model = BERT_PTM(
	    model_config,
	    model_name=model_name,
	    n_classes=len(emotions_list),
	    freeze_BERT=False,
	)

	# Load the model's weights from the pre-trained checkpoint, then move it to the GPU
	model.load_state_dict(torch.load("path/to/pytorch_model.bin", map_location="cpu"))
	model.to("cuda")
	```
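The snippet above expects a local `pytorch_model.bin`. One possible way to obtain it is `hf_hub_download` from the `huggingface_hub` package, sketched below. The repository id `DReAMy-lib/DB-custom-architecture` and the file name are assumptions based on where this model card is hosted and on the path used above, and `fetch_weights` is a hypothetical helper.

```python
# Assumed repository id and file name; adjust them if the files are stored differently
REPO_ID = "DReAMy-lib/DB-custom-architecture"
WEIGHTS_FILE = "pytorch_model.bin"


def fetch_weights(repo_id=REPO_ID, filename=WEIGHTS_FILE):
    """Download the checkpoint (or reuse the local cache) and return its local path.

    Hypothetical helper for illustration; requires `pip install huggingface_hub`.
    """
    # Imported lazily so that merely defining the helper needs no extra dependency
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename)


# Usage (needs network access on the first call):
# weights_path = fetch_weights()
# model.load_state_dict(torch.load(weights_path, map_location="cpu"))
```

The returned path can be passed to `torch.load` in place of `"path/to/pytorch_model.bin"`.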

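	```py
	# Run inference; return_inputs=True also returns the input ids of each report
	outputs, targets, ids = validation(model, testing_loader, device="cuda", return_inputs=True)

	# Binarise the scores: an emotion counts as predicted when its score is at least 0.5
	corr_outputs = np.array(outputs) >= 0.5
	corr_outputs_df = pd.DataFrame(corr_outputs, columns=emotions_list).astype(int)

	# Decode the input ids back to text so each prediction sits next to its report
	decoded_ids = [decode_clean(x, tokenizer) for x in tqdm(ids)]
	corr_outputs_df["report"] = decoded_ids
	```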
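The resulting data frame holds one 0/1 column per emotion code. For readable output, the same 0.5 threshold can be applied row by row and the positions mapped back to the full emotion names. The sketch below assumes the model outputs are per-emotion scores in the order of `emotions_list`; `decode_predictions` is a hypothetical helper, and the scores are made up for illustration rather than produced by the model.

```python
CODING_EMOTIONS = {
    "AN": "Anger",
    "AP": "Apprehension",
    "SD": "Sadness",
    "CO": "Confusion",
    "HA": "Happiness",
}


def decode_predictions(scores, threshold=0.5, coding=CODING_EMOTIONS):
    """Map each row of per-emotion scores to the names whose score reaches `threshold`.

    Hypothetical helper for illustration only; it is not part of SL_utils.
    """
    names = list(coding.values())
    decoded = []
    for row in scores:
        if len(row) != len(names):
            raise ValueError(f"Expected {len(names)} scores per row, got {len(row)}")
        decoded.append([name for name, score in zip(names, row) if score >= threshold])
    return decoded


# Made-up scores, one row per report
example_scores = [
    [0.10, 0.91, 0.20, 0.05, 0.02],
    [0.03, 0.08, 0.01, 0.12, 0.88],
]
print(decode_predictions(example_scores))  # [['Apprehension'], ['Happiness']]
```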

	### Cite
	If you use this model in your work or research, please cite it as:
	```bibtex
	@inproceedings{bertolini-etal-2024-automatic,
	    title = "Automatic Annotation of Dream Report{'}s Emotional Content with Large Language Models",
	    author = "Bertolini, Lorenzo and
	      Elce, Valentina and
	      Michalak, Adriana and
	      Widhoelzl, Hanna-Sophia and
	      Bernardi, Giulio and
	      Weeds, Julie",
	    booktitle = "Proceedings of the 9th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2024)",
	    month = mar,
	    year = "2024",
	    address = "St. Julians, Malta",
	    publisher = "Association for Computational Linguistics",
	    url = "https://aclanthology.org/2024.clpsych-1.7",
	    pages = "92--107",
	}
	```