slurm submission log: 2024-05-24 11:42:10.904769
created following sbatch script: 

###############################

#!/bin/bash

#SBATCH --account=nlp
#SBATCH --cpus-per-task=16
#SBATCH --dependency=afterok:7648449
#SBATCH --gres=gpu:1
#SBATCH --job-name=tthrush-job-2437039
#SBATCH --mem=60G
#SBATCH --nodelist=sphinx1
#SBATCH --open-mode=append
#SBATCH --output=/juice5/scr5/tthrush/pretraining-coreset-selection/llm_pretraining/test_ordinal_constrained_initial_init_min_threshold/llms/pythia-70m_sciq_1/eval_job_output.txt
#SBATCH --partition=sphinx
#SBATCH --time=14-0

# activate your desired anaconda environment
. /nlp/scr/tthrush/miniconda3/envs/pretraining-coreset-selection/etc/profile.d/conda.sh ; conda activate pretraining-coreset-selection

# cd to working directory
cd .

# launch commands
srun --unbuffered run_as_child_processes 'lm_eval --model hf --model_args pretrained=/juice5/scr5/tthrush/pretraining-coreset-selection/llm_pretraining/test_ordinal_constrained_initial_init_min_threshold/llms/pythia-70m_sciq_1,revision=main,dtype=float16,trust_remote_code=True --tasks xnli_en,xnli_fr,sciq,piqa,lambada,arc_easy --device cuda --output_path /juice5/scr5/tthrush/pretraining-coreset-selection/llm_pretraining/test_ordinal_constrained_initial_init_min_threshold/llms/pythia-70m_sciq_1/perf'

###############################

submission to slurm complete!


###############################
slurm submission output

Submitted batch job 7648450



###############################

/var/lib/slurm/slurmd/job7648450/slurm_script: line 16: /nlp/scr/tthrush/miniconda3/envs/pretraining-coreset-selection/etc/profile.d/conda.sh: No such file or directory

CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.
To initialize your shell, run

    $ conda init <SHELL_NAME>

Currently supported shells are:
  - bash
  - fish
  - tcsh
  - xonsh
  - zsh
  - powershell

See 'conda init --help' for more information and options.

IMPORTANT: You may need to close and restart your shell after running 'conda init'.


###############################
start time: 2024-05-24 11:44:53.492934
machine: sphinx1
conda env: pretraining-coreset-selection
###############################
running following processes

	lm_eval --model hf --model_args pretrained=/juice5/scr5/tthrush/pretraining-coreset-selection/llm_pretraining/test_ordinal_constrained_initial_init_min_threshold/llms/pythia-70m_sciq_1,revision=main,dtype=float16,trust_remote_code=True --tasks xnli_en,xnli_fr,sciq,piqa,lambada,arc_easy --device cuda --output_path /juice5/scr5/tthrush/pretraining-coreset-selection/llm_pretraining/test_ordinal_constrained_initial_init_min_threshold/llms/pythia-70m_sciq_1/perf


###############################
command outputs: 


2024-05-24:11:44:56,229 INFO     [utils.py:145] Note: detected 255 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
2024-05-24:11:44:56,229 INFO     [utils.py:148] Note: NumExpr detected 255 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
2024-05-24:11:44:56,229 INFO     [utils.py:160] NumExpr defaulting to 8 threads.
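(Editor's note: the NumExpr messages above indicate that `NUMEXPR_MAX_THREADS` was never set, so NumExpr capped itself at 8 threads despite 16 CPUs being allocated. A minimal fix is to export the variable in the sbatch script before launching; the value 16, chosen to match `--cpus-per-task=16`, is an assumption, not something the original script sets.)

```shell
# Pin NumExpr's thread pool to the allocated CPU count (16 here, matching
# --cpus-per-task=16). Without this, NumExpr enforces a safe limit of 8
# and emits the "NUMEXPR_MAX_THREADS not set" warnings seen in the log.
export NUMEXPR_MAX_THREADS=16
```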
2024-05-24:11:44:56,591 INFO     [config.py:58] PyTorch version 2.2.2 available.
2024-05-24:11:45:00,567 INFO     [__main__.py:156] Verbosity set to INFO
2024-05-24:11:45:07,210 WARNING  [__init__.py:194] Some tasks could not be loaded due to missing dependencies. Run with `--verbosity DEBUG` for full details.
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
slurmstepd: error: *** JOB 7648450 ON sphinx1 CANCELLED AT 2024-05-24T11:45:39 ***
slurmstepd: error: *** STEP 7648450.0 ON sphinx1 CANCELLED AT 2024-05-24T11:45:39 ***
Received SIGTERM, job terminating, terminating 1 processes...
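(Editor's note: the failure above comes from sourcing a `conda.sh` path that does not exist on the node, so the script falls through to a bare `conda activate` and hits `CommandNotFoundError`. A minimal guard, sketched below, fails fast with a clear message instead; the `activate_env` helper name is hypothetical and not part of the original sbatch script.)

```shell
# Hypothetical guard around conda activation for the sbatch script:
# verify that conda.sh exists on this node before sourcing it, so a
# missing scratch path produces an immediate, explicit error rather
# than a confusing CommandNotFoundError later in the job.
activate_env() {
    conda_sh="$1"   # path to .../etc/profile.d/conda.sh
    env_name="$2"   # conda environment to activate
    if [ ! -f "$conda_sh" ]; then
        echo "error: $conda_sh not found on $(hostname)" >&2
        return 1
    fi
    . "$conda_sh" && conda activate "$env_name"
}
```

Used in place of the one-liner in the script, e.g. `activate_env /nlp/scr/tthrush/miniconda3/envs/pretraining-coreset-selection/etc/profile.d/conda.sh pretraining-coreset-selection || exit 1`, the job would abort at submission-script level with the hostname in the error message.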
slurm submission log: 2024-05-24 11:46:16.500701
created following sbatch script: 

###############################

#!/bin/bash

#SBATCH --account=nlp
#SBATCH --cpus-per-task=16
#SBATCH --dependency=afterok:7648481
#SBATCH --gres=gpu:1
#SBATCH --job-name=tthrush-job-372659
#SBATCH --mem=60G
#SBATCH --nodelist=sphinx1
#SBATCH --open-mode=append
#SBATCH --output=/juice5/scr5/tthrush/pretraining-coreset-selection/llm_pretraining/test_ordinal_constrained_initial_init_min_threshold/llms/pythia-70m_sciq_1/eval_job_output.txt
#SBATCH --partition=sphinx
#SBATCH --time=14-0

# activate your desired anaconda environment
. /nlp/scr/tthrush/miniconda3/envs/pretraining-coreset-selection/etc/profile.d/conda.sh ; conda activate pretraining-coreset-selection

# cd to working directory
cd .

# launch commands
srun --unbuffered run_as_child_processes 'lm_eval --model hf --model_args pretrained=/juice5/scr5/tthrush/pretraining-coreset-selection/llm_pretraining/test_ordinal_constrained_initial_init_min_threshold/llms/pythia-70m_sciq_1,revision=main,dtype=float16,trust_remote_code=True --tasks xnli_en,xnli_fr,sciq,piqa,lambada,arc_easy --device cuda --output_path /juice5/scr5/tthrush/pretraining-coreset-selection/llm_pretraining/test_ordinal_constrained_initial_init_min_threshold/llms/pythia-70m_sciq_1/perf'

###############################

submission to slurm complete!


###############################
slurm submission output

Submitted batch job 7648482



###############################