Commit 1983ed9
1 Parent(s): 67b9f84
sanchit-gandhi (HF staff) committed:

2hx8pk65: saving weights and logs of step 30k
flax_model.msgpack CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1b20c6ce9070a647fc4b56ff847349b2a6ad959d336f8591e71fc135e07d67a9
+oid sha256:32e9052e8f2cf429458f533122ce50b5a0fc3cfc6e3096288daefda2766c5f0b
 size 2353616717
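
The file above is a Git LFS pointer, not the checkpoint itself: when new weights are pushed, only the sha256 oid changes (the size stays at ~2.35 GB). A minimal sketch of checking a locally fetched copy of the checkpoint against this pointer; the oid and size are copied from the `+` side of the diff, while the local path is hypothetical:

```python
# Verify a fetched flax_model.msgpack against the LFS pointer above.
# EXPECTED_OID / EXPECTED_SIZE come from the diff; the path is hypothetical.
import hashlib
import os

EXPECTED_OID = "32e9052e8f2cf429458f533122ce50b5a0fc3cfc6e3096288daefda2766c5f0b"
EXPECTED_SIZE = 2353616717  # bytes, ~2.35 GB

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash in 1 MiB chunks so the checkpoint never sits fully in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

path = "flax_model.msgpack"  # hypothetical local copy, e.g. after `git lfs pull`
assert os.path.getsize(path) == EXPECTED_SIZE, "size mismatch with LFS pointer"
assert sha256_of(path) == EXPECTED_OID, "sha256 mismatch with LFS pointer"
```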
nohup.out CHANGED
The diff for this file is too large to render. See raw diff
 
wandb/run-20220828_085247-2hx8pk65/files/media/table/eval/step_30k_30000_509ad8614e16ae2800f1.table.json ADDED
@@ -0,0 +1 @@
+ {"columns": ["id", "label_str", "beam_1", "beam_2", "beam_3", "beam_4", "beam_5"], "data": [["2277-149896-0000", "he was in a fevered state of mind owing to the blight his wife's action threatened to cast upon his entire future", "he was in a fevered state of mind owing to the blight his wife's action threatened to cast upon his entire future", "he was in a fevered state of mind owing to the blight his wife's action threatened to cast up upon his entire future", "he was in a fevered state of mind owing to the blight his wife's action threatened to cast up on his entire future", "he was in a fervored state of mind owing to the blight his wife's action threatened to cast upon his entire future", "he was in a fevered state of mind owing to blight his wife's action threatened to cast upon his entire future"], ["2277-149896-0001", "he would have to pay her the money which she would now regularly demand or there would be trouble it did not matter what he did", "he would have to pay her the money which she would now regularly demand or there would be trouble it did not matter what he did", "he would have to pay her her the money which she would now regularly demand or there would be trouble it did not matter what he did", "he would have to pay her the money which he would now regularly demand or there would be trouble it did not matter what he did", "he would have to pay her her money which she would now regularly demand or there would be trouble it did not matter what he did", "he would have to pay her the money which she would now regularly demand or there would be trouble it did not matter what he did it"], ["2277-149896-0002", "hurstwood walked the floor mentally arranging the chief points of his situation", "hurstwood walked the floor mentally arranging the chief points of his situation", "hurstwood walked to the floor mentally arranging the chief points of his situation", "whistwood walked the floor mentally arranging the chief points of his situation", "hirstwood walked the floor mentally arranging the chief points of his situation", "hurstwood walked the floor mentally arranging the chief points of his situation"], ["2277-149896-0003", "he also thought of his managerial position", "he also thought of his managerial position", "he also thought of this managerial position", "he also thought in his managerial position", "he also thought of his managerial position and", "he also thought of his managemental position"], ["2277-149896-0004", "how would the papers talk about it", "how would the papers talk about it", "how would the papers talk about it you", "how would the papers talk about it and", "how would the papers talk about it i", "oh would the papers talk about it"], ["2277-149896-0005", "many little wrinkles gathered between his eyes as he contemplated this and his brow moistened", "many little wrinkles gathered between his eyes as he contemplated this and his brow moistened", "many little wrinkles gathered between his eyes as he considered this and his brow moistened", "many little wrinkles gathering between his eyes as he contemplated this and his brow moistened", "many little wrinkles gathered between his eyes as he contemplated this and his brow moistened", "many little wrinkles gathered between his eyes as he contemplated this then his brow moistened"], ["2277-149896-0006", "he could arrange that satisfactorily for carrie would be glad to wait if necessary", "he could arrange that satisfactorily for carrie would be glad to wait if necessary", "he could arrange that satisfactorily 
for carry would be glad to wait if necessary", "he could arrange that satisfactorily for perry would be glad to wait if necessary", "he could arrange this satisfactorily for carrie would be glad to wait if necessary", "he could arrange as satisfactorily for carrie would be glad to wait if necessary"], ["2277-149896-0007", "he would see how things turned out to morrow and then he would talk to her they were going to meet as usual", "he would see how things turned out to morrow and then he would talk to her they were going to meet as usual", "he would see how things turned out tomorrow and then he would talk to her they were going to meet as usual", "he would see how things turned out of morrow and then he would talk to her they were going to meet as usual", "he would see how things turned out today and then he would talk to her they were going to meet as usual", "he would see how things turned out to morrow then he would talk to her they were going to meet as usual"], ["2277-149896-0008", "for some reason he felt as if something might come that way and was relieved when all the envelopes had been scanned and nothing suspicious noticed", "for some reason he felt as if something might come that way and was relieved when all the envelopes had been scanned and nothing suspicious noticed", "for some reason he felt as if something might come that way and was relieved when all of the envelopes had been scanned and nothing suspicious noticed", "for some reason he felt as if something might come that way and was relieved when all the envelops had been scanned and nothing suspicious noticed", "for some reason he felt as if something might come this way and was relieved when all the envelopes had been scanned and nothing suspicious noticed", "for some reason he felt as if something might come that way and was relieved while all the envelopes had been scanned and nothing suspicious noticed"], ["2277-149896-0009", "while the danger had not lessened it had not as yet materialised and with him no news was good news", "while the danger had not lessened it had not as yet materialized and with him no news was good news", "while the danger had not lessened it had not as yet materialised and with him no news was good news", "while the danger had not lessen'd it had not as yet materialized and with him no news was good news", "while this danger had not lessened it had not as yet materialized and with him no news was good news", "while the danger had not lessen it it had not as yet materialized and with him no news was good news"], ["2277-149896-0010", "so little did he consider drouet that it never once occurred to him to worry about his finding out", "so little did he consider drue that it never once occurred to him to worry about his finding out", "so little did he consider druda that it never once occurred to him to worry about his finding out", "so little did he consider drua that it never once occurred to him to worry about his finding out", "so little did he consider drude that it never once occurred to him to worry about his finding out", "so little did he consider drusa that it never once occurred to him to worry about his finding out"], ["2277-149896-0011", "he grew restless as he ruminated and then decided that perhaps it was nothing", "he grew restless as he ruminated and then decided that perhaps it was nothing", "he grew restless as he ruminated and then decided that perhaps it was nothing i", "he grew restless as he ruminated and then deciding that perhaps it was nothing", "he grew restless as he 
ruminated and then decided that perhaps it was nothing and", "he grew restless as he ruminated and then decided that perhaps it was nothing it"], ["2277-149896-0012", "she had not been able to get away this morning", "she had not been able to get away this morning", "she had not been able to get away this morning she had not been able to get away this morning", "she had not been able to get away this morning i had not been able to get away this morning", "she had not been able to get away this morning she had not been able to go away this morning", "she had not been able to get away this morning and"], ["2277-149896-0013", "he would get one to day it would probably be on his desk when he got back he would look for it at once", "he would get one to day it would probably be on his desk when he got back he would look for it at once", "he would get one today it would probably be on his desk when he got back he would look for it at once", "he would get one to day it would probably be upon his desk when he got back he would look for it at once", "he would get one to day it would probably be in his desk when he got back he would look for it at once", "he would get one to day it would probably be on his desk when he got up he would look for it at once"], ["2277-149896-0014", "after a time he gave up waiting and drearily headed for the madison car", "after a time he gave up waiting and drearily headed for the madison car", "after a time he gave up waiting and drearily heading for the madison car", "after a time he gave up waiting and drearily headed for the maddison car", "after a time he gave up waiting and drearily headed for the hadsan car", "after a time he gave up waiting and dreadily headed for the madison car"], ["2277-149896-0015", "he went in and examined his letters but there was nothing from carrie", "he went in and examined his letters but there was nothing from carrie", "he went in and examined his letters but there was nothing from kerry", "he went in and examined his letters but there was nothing from carry", "he went in and examined his letters but there was nothing from cary", "he went in and examined his letters but there was nothing from perry"], ["2277-149896-0016", "fortunately there was nothing from his wife either", "fortunately there was nothing from his wife either", "fortunately there was nothing from his wife either", "fortunately there was nothing from his wife either", "fortunally there was nothing from his wife either", "fortunate there was nothing from his wife either"], ["2277-149896-0017", "at one thirty he went to rector's for lunch and when he returned a messenger was waiting for him", "at one thirty he went to rector's for lunch and when he returned a messenger was waiting for him", "at one thirty he went into rector's for lunch and when he returned a messenger was waiting for him", "at one thirty he went to rector's for lunch and when he returned a messenger was waiting for him", "i at one thirty he went to rector's for lunch and when he returned a messenger was waiting for him", "at one thirty he went to rectors for lunch and when he returned a messenger was waiting for him"], ["2277-149896-0018", "his first impulse was to write but four words in reply go to the devil", "his first impulse was to write but four words in reply go to the devil", "his first impulse was to write but four words in reply go to that devil", "his first impulse was to write but four words and reply go to the devil", "his first impulse was to write but four words in reply go to the devil 
i", "his first impulse was to write but four words in reply go to the devil'"], ["2277-149896-0019", "but he compromised by telling the boy that there would be no reply", "but he compromised by telling the boy that there would be no reply", "but he compromised by telling the boy there would be no reply", "but he compromised by telling the boy that there would be no reply i", "but hecompromised by telling the boy that there would be no reply", "but he compromised by telling the boy that there would be no reply to"], ["2277-149896-0020", "then he sat down in his chair and gazed without seeing contemplating the result of his work", "then he sat down in his chair and gazed without seeing contemplating the result of his work", "then he sate down in his chair and gazed without seeing contemplating the result of his work", "then he sat down in his chair and gazed without seeing contemplating the result of his works", "then he set down in his chair and gazed without seeing contemplating the result of his work", "then he sat down in his chair and gazed without seeing contemplating the results of his work"], ["2277-149896-0021", "what would she do about that the confounded wretch", "what would she do about that the confounded wretch", "what would she do about that that confounded wretch", "what would she do about that the con founded wretch", "what would she do about that the confounded wretch and", "what would she do about that the confounded wretch oh"], ["2277-149896-0022", "later however his old discretion asserted itself", "later however his old discretion asserted itself", "later however his old discretion asserted itself and", "later however his old discretion asserted itself", "later however this old discretion asserted itself", "later however his old discretion asserted itself if"], ["2277-149896-0023", "something had to be done a climax was near and she would not sit idle", "something had to be done a climax was near and she would not sit idle", "something had to be done a climax was near she would not sit idle", "something had to be done a climax was nearer and she would not sit idle", "something had to be done a climax was near as she would not sit idle", "something had to be done the climax was near and she would not sit idle"], ["2277-149896-0024", "he knew her well enough to know that when she had decided upon a plan she would follow it up", "he knew her well enough to know that when she had decided upon a plan she would follow it up", "he knew her well enough to know that when she decided upon a plan she would follow it up", "he knew her well enough to know that when she had decided upon the plan she would follow it up", "he knew her well enough to know that when she had decided up on a plan she would follow it up", "he knew her well enough to know that when she had decided upon plan she would follow it up"], ["2277-149896-0025", "he arose from his chair and went and looked out into the street", "he arose from his chair and went and looked out into the street", "he rose from his chair and went and looked out into the street", "he arose from his chair and went and looked out into his street", "he arose from his chair and went and looked out into the street and", "he arose from his chair and went and looked out into the street it"], ["2277-149896-0026", "the long drizzle had begun pedestrians had turned up collars and trousers at the bottom", "the long drizzle had begun pedestrians had turned up collars and trousers at the bottom", "the long drizzle had begun petersians had turned up 
collars and trousers at the bottom", "the long drizzle had begun pedestrians had turned up collars and trousers on the bottom", "the long drizzle had begun pedestrian had turned up collars and trousers at the bottom", "the long drizzle had begun pedestrians had turned up collars and trousers in the bottom"], ["2277-149896-0027", "hurstwood almost exclaimed out loud at the insistency of this thing", "hurstwood almost exclaimed out loud at the insistence of this thing", "hurstwood almost exclaimed out loud at the insistency of this thing", "hurstwood almost exclaimed out loud at the insistency of this thing", "hurstwood almost exclaimed out loud at the insistence of this thing", " hurstwood almost exclaimed out loud at the insistence of this thing"], ["2277-149896-0028", "he put on his hat and looked around for his umbrella", "he put on his hat and looked around for his umbrella", "he put on his hat and looked round for his umbrella", "he put on his hat and looked around for his umbrella and", "he put on his shirt and looked around for his umbrella", "he put on his hand and looked around for his umbrella"], ["2277-149896-0029", "he would have some arrangement of this thing", "he would have some arrangement of this thing", "he would have some arrangement of the thing", "he would have some arrangement of this thing he would have some arrangement of this thing", "he would have some arrangement of his thing", "he would have some arrangement of this thing he would have some arrangement of the thing"], ["2277-149896-0030", "he began to wish that he had compromised in some way or other that he had sent the money perhaps he could do it up here", "he began to wish that he had compromised in some way or other that he had sent the money perhaps he could do it up here", "he began to wish he had compromised in some way or other that he had sent the money perhaps he could do it up here", "he began to wish that he had compromised in some way or other than he had sent the money perhaps he could do it up here", "he began to wish that he had compromised in some way or other that he had set the money perhaps he could do it up here", "he began to wish that he had compromised in some way or other that he had sent the money perhaps he would do it up here"], ["2277-149896-0031", "he would go in and see anyhow he would have no row", "he would go in and see anyhow he would have no row", "he would go in and see anyhow he would have no row he", "he would go in and see anyhow he would have no rue", "he would go in and see anyhow he would have no rowhe", "he would go in and see anyhow he would have no row and"], ["2277-149896-0032", "by the time he reached his own street he was keenly alive to the difficulties of his situation and wished over and over that some solution would offer itself that he could see his way out", "by the time he reached his own street he was keenly alive to the difficulties of his situation and wished over and over that some solution would offer itself that he could see his way out", "by the time he reached his own street he was keenly alive to the difficulties of this situation and wished over and over that some solution would offer itself that he could see his way out", "by that time he reached his own street he was keenly alive to the difficulties of his situation and wished over and over that some solution would offer itself that he could see his way out", "by the time he reached his own street he was keenly alive to the difficulties of his situation and wished over and over that some 
solution would offer itself that he can see his way out", ""], ["2277-149896-0033", "then he rang the bell no answer", "then he rang the bell no answer", "then he ranged the bell no answer", "then he rang the bell no answer i", "then he rang the bell no answer to", "then he rang the bell no answer you"], ["2277-149896-0034", "he rang again this time harder still no answer", "he rang again this time harder still no answer", "he ring again this time harder still no answer", "he ringed again this time harder still no answer", "he rang again this time harder still no answer i", "he ranged again this time harder still no answer"], ["2277-149897-0000", "when hurstwood got back to his office again he was in a greater quandary than ever", "when hurstwood got back to his office again he was in a greater quandary than ever", "when hurstwood got back to his office again he was in a greater quondary than ever", "when hurstwood got back to his office again he was in a greater quadrille than ever", "when hurstwood got back to his office again he was in a greater quadry than ever", "when hurstwood went back to his office again he was in a greater quandary than ever"], ["2277-149897-0001", "he could hardly realise how it had all come about", "he could hardly realise how it had all come about", "he could hardly realize how it had all come about", "he could hardly realise how it had all come about", "he could hardly realise how it had all come above", "he could hardly realize how it had all come above"], ["2277-149897-0002", "no letter had come no word of any kind and yet here it was late in the evening and she had agreed to meet him that morning", "no letter had come no word of any kind and yet here it was late in the evening and she had agreed to meet him that morning", "no letter had come no word of an kind and yet here it was late in the evening and she had agreed to meet him that morning", "no letter had come no word of any kind and yet here it was late in the evening she had agreed to meet him that morning", "no letter had come no word of any kind and yet here it was late at the evening and she had agreed to meet him that morning", "no letter had come no word of the kind and yet here it was late in the evening and she had agreed to meet him that morning"], ["2277-149897-0003", "he saw that in the excitement of recent events he had not formulated a plan upon that score", "he saw that in the excitement of recent events he had not formulated a plan upon that score", "he saw that in the excitement of recent events he had not formulated a plan upon the score", "he saw that in the excitement of recent events he had not formulated a plan upon this score", "he saw that in the excitement of recent events he had not formulated a plan upon his score", "he saw that in the excitement of recent events he had not formulated a plan upon that score and"], ["2277-149897-0004", "he was getting some vague comfort out of a good cigar but it was no panacea for the ill which affected him", "he was getting some vague comfort out of a good cigar but it was no panacea for the ill which affected him", "he was getting some vague comfort out of a good cigar but it was no panatia for the ill which affected him", "he was getting some vague comfort out of a good cigar but it was no panatya for the ill which affected him", "he was getting some vague comfort out of a good cigar but it was no panaty for the ill which affected him", "he was getting some vague comfort out of a good cigar but it was no penatia for the ill which affected 
him"], ["2277-149897-0005", "it was with great opposition after two or three hours of the most urgent mental affirmation and denial that at last he got an envelope placed in it the requested amount and slowly sealed it up", "it was with great opposition after two or three hours of the most urgent mental affirmation and denial that at last he got an envelope placed in it the requested amount and slowly sealed it up", "it was with great opposition after two or three hours of the most urgent mental affirmation and denial that at last he got an envelope placed in it the requested amount and slowly sealed it up", "it was with great opposition after two or three hours of the most urgent personal affirmation and denial that at last he got an envelope placed in it the requested amount and slowly sealed it up", "it was with great opposition after two or three hours of that most urgent mental affirmation and denial that at last he got an envelope placed in it the requested amount and slowly sealed it up", "it was with great opposition after two or three hours of the most urgent mental affirmation and denial that at last he got an envelope placed in it the requested account and slowly sealed it up"], ["2277-149897-0006", "then he called harry the boy of all work around the place", "then he called harry the boy of all work around the place", "then he called harry the boy of all work round the place", "then he called harry that boy of all work around the place", "then he called harry the boy of all work around the place and", "then he called harry the boy of all work around the place i"], ["2277-149897-0007", "you take this to this address he said handing him the envelope and give it to missus hurstwood yes sir said the boy", "you take this to this address he said handing him the envelope and give it to missus hurstwood yes sir said the boy", "you take this to this address he said handing him the envelope and give it to missus hurstwood yes sir said the boy i", "you take this to this address he said handing him the envelope and give it me to missus hurstwood yes sir said the boy", "you take this to this address he said handing him an envelope and give it to missus hurstwood yes sir said the boy", "you take this to this address he said handing him the envelope an give it to missus hurstwood yes sir said the boy"], ["2277-149897-0008", "any answer i guess not", "any answer i guess not", "any answer i guess no", "any answer i guess not i", "any answer i guess not you", "any answer i guess not to"], ["2277-149897-0009", "the boy hastened away and the manager fell to his musings", "the boy hastened away and the manager fell to his musings", "the boy hasted away and the manager fell to his musings", "the boy hastily away and the manager fell to his musings", "the boy hurried away and the manager fell to his musings", "the boy hastened away and the major fell to his musings"], ["2277-149897-0010", "he was beaten for to night and he might just as well make the best of it", "he was beaten for to night and he might just as well make the best of it", "he was beaten for tonight and he might just as well make the best of it", "he was beaten for to night and he might just as well make best of it", "he was beaten for to night and i might just as well make the best of it", "he was beaten for today and he might just as well make the best of it"], ["2277-149897-0011", "she would take the envelope and know that she had triumphed", "she would take the envelope and know that she had triumphed", "she would take the envelope 
and know what she had triumphed", "she would take a envelope and know that she had triumphed", "she would take an envelope and know that she had triumphed", "she would take the envelope and know that she had triumphed and"], ["2277-149897-0012", "if he only had that letter back he wouldn't send it", "if he only had that letter back he wouldn't send it", "if he only had that letter back he wouldn't send it", "if he only had that letter backhe wouldn't send it", "if he only had that letter round he wouldn't send it", "if he only had that letter back he wouldn't send itif"], ["2277-149897-0013", "for relief he arose and joined in conversation with a few friends who were drinking", "for relief he arose and joined in the conversation with a few friends who were drinking", "for relief he arose and joined in the conversation with the few friends who were drinking", "for relief he arose and rejoined in the conversation with a few friends who were drinking", "for relief he arose and joined in his conversation with a few friends who were drinking", "or relief he arose and joined in the conversation with a few friends who were drinking"], ["2277-149897-0014", "all the time his thoughts would run out to his home and see the scene being therein enacted", "all the time his thoughts would run out to his home and see the scene being therein enacted", "all this time his thoughts would run out to his home and see the scene being therein enacted", "all the time his thoughts would run out to his home and see the scene being therein enacted", "all the time his thoughts would run out to his home and see that scene being therein enacted", "all that time his thoughts would run out to his home and see the scene being therein enacted"]]}
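
The JSON above is a flattened wandb evaluation table: columns `id` (LibriSpeech utterance id), `label_str` (the reference transcript), and `beam_1` through `beam_5` (the five beam-search hypotheses per utterance). A hedged sketch of how a table like this is typically logged with the wandb API; the column names and the sample row are taken from the JSON, while the project name and metric key are illustrative:

```python
# Illustrative only: the training script that produced this table is not in
# this commit. Column names and the example row come from the JSON above.
import wandb

run = wandb.init(project="flax-asr", id="2hx8pk65", resume="allow")  # project name is hypothetical

columns = ["id", "label_str", "beam_1", "beam_2", "beam_3", "beam_4", "beam_5"]
table = wandb.Table(columns=columns)

# One row per utterance: id, reference transcript, then the top-5
# beam-search hypotheses in score order.
table.add_data(
    "2277-149896-0003",
    "he also thought of his managerial position",
    "he also thought of his managerial position",
    "he also thought of this managerial position",
    "he also thought in his managerial position",
    "he also thought of his managerial position and",
    "he also thought of his managemental position",
)

wandb.log({"eval/step_30k": table}, step=30000)
```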
wandb/run-20220828_085247-2hx8pk65/files/output.log CHANGED
@@ -23303,5 +23303,10315 @@ To disable this warning, you can either:
 - Avoid using `tokenizers` before the fork if possible
 - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
 huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
 To disable this warning, you can either:
 - Avoid using `tokenizers` before the fork if possible
+To disable this warning, you can either:
+- Avoid using `tokenizers` before the fork if possible
+- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
+huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
+To disable this warning, you can either:
+- Avoid using `tokenizers` before the fork if possible
+- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
+huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
+To disable this warning, you can either:
+- Avoid using `tokenizers` before the fork if possible
+- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
+huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
+To disable this warning, you can either:
+- Avoid using `tokenizers` before the fork if possible
+- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
+huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
+To disable this warning, you can either:
+- Avoid using `tokenizers` before the fork if possible
+- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
+Training...: 55% 2428/4393 [3:21:24<91:44:40, 168.08s/it]
+huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
+To disable this warning, you can either:
+- Avoid using `tokenizers` before the fork if possible
+- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
+huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
+To disable this warning, you can either:
+- Avoid using `tokenizers` before the fork if possible
+return jax.tree_map(/4393 [3:21:24<91:44:40, 168.08s/it]
+return jax.tree_map(lambda x: x[0], tree)7, 4.17s/it]
+run_flax_speech_recognition_seq2seq.py:336: FutureWarning: jax.tree_map is deprecated, and will be removed in a future release. Use jax.tree_util.tree_map instead.
+return jax.tree_map(lambda x: x.astype(jnp.float32) if x.dtype == jnp.bfloat16 else x, t)
+Step... (20000/50000 | Eval Loss: 1.021510124206543 | Eval wer: 0.05054961214661226 | Eval cer: 0.0362100285658818 |): 33% 4/12 [26:43:52<46:43:02, 21022.76s/it]
+
24439
+
24440
+
24441
+
24442
+
24443
+
24444
+
24445
+
24446
+
24447
+
24448
+
24449
+
24450
+
24451
+
24452
+
24453
+
24454
+
24455
+
24456
+
24457
+
24458
+
24459
+
24460
+
24461
+
24462
+
24463
+
24464
+
24465
+
24466
+
24467
+
24468
+
24469
+
24470
+
24471
+
24472
+
24473
+
24474
+
24475
+
24476
+
24477
+
24478
+
24479
+
24480
+
24481
+
24482
+
24483
+
24484
+
24485
+
24486
+
24487
+
24488
+
24489
+
24490
+
24491
+
24492
+
24493
+
24494
+
24495
+
24496
+
24497
+
24498
+
24499
+
24500
+
24501
+
24502
+
24503
+
24504
+
24505
+
24506
+
24507
+
24508
+
24509
+
24510
+
24511
+
24512
+
24513
+
24514
+
24515
+
24516
+
24517
+
24518
+
24519
+
24520
+
24521
+
24522
+
24523
+
24524
+
24525
+
24526
+
24527
+
24528
+
24529
+
24530
+
24531
+
24532
+
24533
+
24534
+
24535
+
24536
+
24537
+
24538
+
24539
+
24540
+
24541
+
24542
+
24543
+
24544
+
24545
+
24546
+
24547
+
24548
+
24549
+
24550
+
24551
+
24552
+
24553
+
24554
+
24555
+
24556
+
24557
+
24558
+
24559
+
24560
+
24561
+
24562
+
24563
+
24564
+
24565
+
24566
+
24567
+
24568
+
24569
+
24570
+
24571
+
24572
+
24573
+
24574
+
24575
+
24576
+
24577
+
24578
+
24579
+
24580
+
24581
+
24582
+
24583
+
24584
+
24585
+
24586
+
24587
+
24588
+
24589
+
24590
+
24591
+
24592
+
24593
+
24594
+
24595
+
24596
+
24597
+
24598
+
24599
+
24600
+
24601
+
24602
+
24603
+
24604
+
24605
+
24606
+
24607
+
24608
+
24609
+
24610
+
24611
+
24612
+
24613
+
24614
+
24615
+
24616
+
24617
+
24618
+
24619
+
24620
+
24621
+
24622
+
24623
+
24624
+
24625
+
24626
+
24627
+
24628
+
24629
+
24630
+
24631
+
24632
+
24633
+
24634
+
24635
+
24636
+
24637
+
24638
+
24639
+
24640
+
24641
+
24642
+
24643
+
24644
+
24645
+
24646
+
24647
+
24648
+
24649
+
24650
+
24651
+
24652
+
24653
+
24654
+
24655
+
24656
+
24657
+
24658
+
24659
+
24660
+
24661
+
24662
+
24663
+
24664
+
24665
+
24666
+
24667
+
24668
+
24669
+
24670
+
24671
+
24672
+
24673
+
24674
+
24675
+
24676
+
24677
+
24678
+
24679
+
24680
+
24681
+
24682
+
24683
+
24684
+
24685
+
24686
+
24687
+
24688
+
24689
+
24690
+
24691
+
24692
+
24693
+
24694
+
24695
+
24696
+
24697
+
24698
+
24699
+
24700
+
24701
+
24702
+
24703
+
24704
+
24705
+
24706
+
24707
+
24708
+
24709
+
24710
+
24711
+
24712
+
24713
+
24714
+
24715
+
24716
+
24717
+
24718
+
24719
+
24720
+
24721
+
24722
+
24723
+
24724
+
24725
+
24726
+
24727
+
24728
+
24729
+
24730
+
24731
+
24732
+
24733
+
24734
+
24735
+
24736
+
24737
+
24738
+
24739
+
24740
+
24741
+
24742
+
24743
+
24744
+
24745
+
24746
+
24747
+
24748
+
24749
+
24750
+
24751
+
24752
+
24753
+
24754
+
24755
+
24756
+
24757
+
24758
+
24759
+
24760
+
24761
+
24762
+
24763
+
24764
+
24765
+
24766
+
24767
+
24768
+
24769
+
24770
+
24771
+
24772
+
24773
+
24774
+
24775
+
24776
+
24777
+
24778
+
24779
+
24780
+
24781
+
24782
+
24783
+
24784
+
24785
+
24786
+
24787
+
24788
+
24789
+
24790
+
24791
+
24792
+
24793
+
24794
+
24795
+
24796
+
24797
+
24798
+
24799
+
24800
+
24801
+
24802
+
24803
+
24804
+
24805
+
24806
+
24807
+
24808
+
24809
+
24810
+
24811
+
24812
+
24813
+
24814
+
24815
+
24816
+
24817
+
24818
+
24819
+
24820
+
24821
+
24822
+
24823
+
24824
+
24825
+
24826
+
24827
+
24828
+
24829
+
24830
+
24831
+
24832
+
24833
+
24834
+
24835
+
24836
+
24837
+
24838
+
24839
+
24840
+
24841
+
24842
+
24843
+
24844
+
24845
+
24846
+
24847
+
24848
+
24849
+
24850
+
24851
+
24852
+
24853
+
24854
+
24855
+
24856
+
24857
+
24858
+
24859
+
24860
+
24861
+
24862
+
24863
+
24864
+
24865
+
24866
+
24867
+
24868
+
24869
+
24870
+
24871
+
24872
+
24873
+
24874
+
24875
+
24876
+
24877
+
24878
+
24879
+
24880
+
24881
+
24882
+
24883
+
24884
+
24885
+
24886
+
24887
+
24888
+
24889
+
24890
+
24891
+
24892
+
24893
+
24894
+
24895
+
24896
+
24897
+
24898
+
24899
+
24900
+
24901
+
24902
+
24903
+
24904
+
24905
+
24906
+
24907
+
24908
+
24909
+
24910
+
24911
+
24912
+
24913
+
24914
+
24915
+
24916
+
24917
+
24918
+
24919
+
24920
+
24921
+
24922
+
24923
+
24924
+
24925
+
24926
+
24927
+
24928
+
24929
+
24930
+
24931
+
24932
+
24933
+
24934
+
24935
+
24936
+
24937
+
24938
+
24939
+
24940
+
24941
+
24942
+
24943
+
24944
+
24945
+
24946
+
24947
+
24948
+
24949
+
24950
+
24951
+
24952
+
24953
+
24954
+
24955
+
24956
+
24957
+
24958
+
24959
+
24960
+
24961
+
24962
+
24963
+
24964
+
24965
+
24966
+
24967
+
24968
+
24969
+
24970
+
24971
+
24972
+
24973
+
24974
+
24975
+
24976
+
24977
+
24978
+
24979
+
24980
+
24981
+
24982
+
24983
+
24984
+
24985
+
24986
+
24987
+
24988
+
24989
+
24990
+
24991
+
24992
+
24993
+
24994
+
24995
+
24996
+
24997
+
24998
+
24999
+
25000
+
25001
+
25002
+
25003
+
25004
+
25005
+
25006
+
25007
+
25008
+
25009
+
25010
+
25011
+
25012
+
25013
+
25014
+
25015
+
25016
+
+ Training...: 95% 4152/4393 [5:36:46<15:55, 3.96s/it]
+ Step... (20000/50000 | Eval Loss: 1.021510124206543 | Eval wer: 0.05054961214661226 | Eval cer: 0.0362100285658818 |)
+ Step... (20025 | Loss: 0.04840395227074623, Learning Rate: 6.055757694412023e-05, Gradient Norm: 0.3350682854652405)
+ Step... (20050 | Loss: 0.040021102875471115, Learning Rate: 6.0507070884341374e-05, Gradient Norm: 0.3626783788204193)
+ Step... (20075 | Loss: 0.06721880286931992, Learning Rate: 6.045656118658371e-05, Gradient Norm: 0.661290168762207)
+ Step... (20100 | Loss: 0.039792388677597046, Learning Rate: 6.040606240276247e-05, Gradient Norm: 0.41450798511505127)
+ Step... (20125 | Loss: 0.06491632759571075, Learning Rate: 6.035555270500481e-05, Gradient Norm: 0.5559267401695251)
+ Step... (20150 | Loss: 0.04650925472378731, Learning Rate: 6.030504664522596e-05, Gradient Norm: 0.35466456413269043)
+ Step... (20175 | Loss: 0.04336107149720192, Learning Rate: 6.025454786140472e-05, Gradient Norm: 0.3757917582988739)
+ Step... (20200 | Loss: 0.05388178676366806, Learning Rate: 6.0204038163647056e-05, Gradient Norm: 0.39275360107421875)
+ Step... (20225 | Loss: 0.033549536019563675, Learning Rate: 6.01535321038682e-05, Gradient Norm: 0.33024901151657104)
+ Step... (20250 | Loss: 0.04022737964987755, Learning Rate: 6.010303332004696e-05, Gradient Norm: 0.31290462613105774)
+ Step... (20275 | Loss: 0.04750262200832367, Learning Rate: 6.00525236222893e-05, Gradient Norm: 0.303996205329895)
+ Step... (20300 | Loss: 0.05061691999435425, Learning Rate: 6.000201392453164e-05, Gradient Norm: 0.5085467100143433)
+ Step... (20325 | Loss: 0.03429495915770531, Learning Rate: 5.995151150273159e-05, Gradient Norm: 0.4001585841178894)
+ Step... (20350 | Loss: 0.04250844568014145, Learning Rate: 5.9901009080931544e-05, Gradient Norm: 0.3081212639808655)
+ Step... (20375 | Loss: 0.03005947545170784, Learning Rate: 5.985049938317388e-05, Gradient Norm: 0.25889503955841064)
+ Step... (20400 | Loss: 0.03023005835711956, Learning Rate: 5.9799996961373836e-05, Gradient Norm: 0.323447048664093)
+ Step... (20425 | Loss: 0.03575645387172699, Learning Rate: 5.974949453957379e-05, Gradient Norm: 0.28628385066986084)
+ Step... (20450 | Loss: 0.03535941615700722, Learning Rate: 5.969898484181613e-05, Gradient Norm: 0.29935961961746216)
+ Step... (20475 | Loss: 0.035111647099256516, Learning Rate: 5.964848242001608e-05, Gradient Norm: 0.32121407985687256)
+ Step... (20500 | Loss: 0.07113418728113174, Learning Rate: 5.959797999821603e-05, Gradient Norm: 0.7147608399391174)
+ Step... (20525 | Loss: 0.04110198840498924, Learning Rate: 5.954747030045837e-05, Gradient Norm: 0.2974580228328705)
+ Step... (20550 | Loss: 0.06231129169464111, Learning Rate: 5.9496967878658324e-05, Gradient Norm: 0.40457063913345337)
+ Step... (20575 | Loss: 0.03932349756360054, Learning Rate: 5.944646545685828e-05, Gradient Norm: 0.28113406896591187)
+ Step... (20600 | Loss: 0.035716500133275986, Learning Rate: 5.9395955759100616e-05, Gradient Norm: 0.30872511863708496)
+ Step... (20625 | Loss: 0.0612182579934597, Learning Rate: 5.934545333730057e-05, Gradient Norm: 0.4794360101222992)
+ Step... (20650 | Loss: 0.06335697323083878, Learning Rate: 5.929495091550052e-05, Gradient Norm: 0.9689046144485474)
+ Step... (20675 | Loss: 0.056457117199897766, Learning Rate: 5.924444121774286e-05, Gradient Norm: 0.4037126302719116)
+ Step... (20700 | Loss: 0.04219113290309906, Learning Rate: 5.919393879594281e-05, Gradient Norm: 0.49446389079093933)
+ Step... (20725 | Loss: 0.03880039229989052, Learning Rate: 5.914343273616396e-05, Gradient Norm: 0.522761344909668)
+ Step... (20750 | Loss: 0.051740579307079315, Learning Rate: 5.9092926676385105e-05, Gradient Norm: 0.4382433295249939)
+ Step... (20775 | Loss: 0.04807671159505844, Learning Rate: 5.904242425458506e-05, Gradient Norm: 0.3645383417606354)
+ Step... (20800 | Loss: 0.03817301243543625, Learning Rate: 5.89919181948062e-05, Gradient Norm: 0.3344947099685669)
+ Step... (20825 | Loss: 0.04550916701555252, Learning Rate: 5.894141213502735e-05, Gradient Norm: 0.30412057042121887)
+ Step... (20850 | Loss: 0.02587965875864029, Learning Rate: 5.8890906075248495e-05, Gradient Norm: 0.2640466094017029)
+ Step... (20875 | Loss: 0.04702045023441315, Learning Rate: 5.884040365344845e-05, Gradient Norm: 0.405086487531662)
+ Step... (20900 | Loss: 0.03893226757645607, Learning Rate: 5.8789893955690786e-05, Gradient Norm: 0.2916216552257538)
+ Step... (20925 | Loss: 0.05458155274391174, Learning Rate: 5.873939153389074e-05, Gradient Norm: 0.5137640237808228)
+ Step... (20950 | Loss: 0.026842793449759483, Learning Rate: 5.868888911209069e-05, Gradient Norm: 0.29405421018600464)
+ Step... (20975 | Loss: 0.04555520787835121, Learning Rate: 5.863837941433303e-05, Gradient Norm: 0.39362606406211853)
+ Step... (21000 | Loss: 0.06246699020266533, Learning Rate: 5.8587876992532983e-05, Gradient Norm: 0.6405248641967773)
+ Step... (21025 | Loss: 0.07497163116931915, Learning Rate: 5.8537374570732936e-05, Gradient Norm: 0.649531364440918)
+ Step... (21050 | Loss: 0.05232103168964386, Learning Rate: 5.8486864872975275e-05, Gradient Norm: 0.40732765197753906)
+ Step... (21075 | Loss: 0.055781468749046326, Learning Rate: 5.843636245117523e-05, Gradient Norm: 0.3719920217990875)
+ Step... (21100 | Loss: 0.04000868275761604, Learning Rate: 5.838586002937518e-05, Gradient Norm: 0.30638593435287476)
+ Step... (21125 | Loss: 0.0311842430382967, Learning Rate: 5.833535033161752e-05, Gradient Norm: 0.31067410111427307)
+ Step... (21150 | Loss: 0.06602907925844193, Learning Rate: 5.828484790981747e-05, Gradient Norm: 0.4416637122631073)
+ Step... (21175 | Loss: 0.04677121713757515, Learning Rate: 5.8234345488017425e-05, Gradient Norm: 0.5177642107009888)
+ Step... (21200 | Loss: 0.045037493109703064, Learning Rate: 5.8183835790259764e-05, Gradient Norm: 0.34402015805244446)
+ Step... (21225 | Loss: 0.03223634138703346, Learning Rate: 5.8133333368459716e-05, Gradient Norm: 0.3138290047645569)
+ Step... (21250 | Loss: 0.0355977863073349, Learning Rate: 5.808283094665967e-05, Gradient Norm: 0.2971906363964081)
+ Step... (21275 | Loss: 0.07619132101535797, Learning Rate: 5.803232124890201e-05, Gradient Norm: 0.5961235165596008)
+ Step... (21300 | Loss: 0.031767457723617554, Learning Rate: 5.798181882710196e-05, Gradient Norm: 0.3297065496444702)
+ Step... (21325 | Loss: 0.04114881902933121, Learning Rate: 5.7931312767323107e-05, Gradient Norm: 0.29805976152420044)
+ Step... (21350 | Loss: 0.06210632622241974, Learning Rate: 5.788080670754425e-05, Gradient Norm: 0.4565739035606384)
+ Step... (21375 | Loss: 0.05203666165471077, Learning Rate: 5.78303006477654e-05, Gradient Norm: 0.5167878270149231)
+ Step... (21400 | Loss: 0.04761577770113945, Learning Rate: 5.777979822596535e-05, Gradient Norm: 0.7846924662590027)
+ Step... (21425 | Loss: 0.05161000043153763, Learning Rate: 5.77292921661865e-05, Gradient Norm: 0.4291480779647827)
+ Step... (21450 | Loss: 0.04251234605908394, Learning Rate: 5.767878610640764e-05, Gradient Norm: 0.402444064617157)
+ Step... (21475 | Loss: 0.05318749323487282, Learning Rate: 5.7628283684607595e-05, Gradient Norm: 0.4376218020915985)
+ Step... (21500 | Loss: 0.04826676845550537, Learning Rate: 5.7577773986849934e-05, Gradient Norm: 0.38663220405578613)
+ Step... (21525 | Loss: 0.06910662353038788, Learning Rate: 5.752727156504989e-05, Gradient Norm: 0.44211477041244507)
+ Step... (21550 | Loss: 0.06470490247011185, Learning Rate: 5.747676914324984e-05, Gradient Norm: 0.4820682406425476)
+ Step... (21575 | Loss: 0.036486972123384476, Learning Rate: 5.742625944549218e-05, Gradient Norm: 0.3340790271759033)
+ Step... (21600 | Loss: 0.043834902346134186, Learning Rate: 5.737575702369213e-05, Gradient Norm: 0.40074360370635986)
+ Step... (21625 | Loss: 0.029955435544252396, Learning Rate: 5.7325254601892084e-05, Gradient Norm: 0.3754229247570038)
+ Step... (21650 | Loss: 0.039664458483457565, Learning Rate: 5.727474490413442e-05, Gradient Norm: 0.4904446005821228)
+ Step... (21675 | Loss: 0.04634978622198105, Learning Rate: 5.7224242482334375e-05, Gradient Norm: 0.40155136585235596)
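The eval header above reports word and character error rates next to the eval loss. As a rough illustration of what those numbers mean (a minimal sketch, not the training script's actual metric code, which this log does not show), WER and CER are Levenshtein edit distances normalized by reference length, taken over words and characters respectively:

```python
# Minimal sketch of WER/CER as length-normalized Levenshtein distances.
# Illustrative only: the run may well use a metrics library instead.

def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (one-row DP)."""
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, start=1):
            prev, d[j] = d[j], min(
                d[j] + 1,         # delete r
                d[j - 1] + 1,     # insert h
                prev + (r != h),  # substitute (free if tokens match)
            )
    return d[-1]

def wer(ref: str, hyp: str) -> float:
    words = ref.split()
    return edit_distance(words, hyp.split()) / max(len(words), 1)

def cer(ref: str, hyp: str) -> float:
    return edit_distance(list(ref), list(hyp)) / max(len(ref), 1)
```

Averaged over the eval set, a WER around 0.0505 like the one logged above amounts to roughly one word error per twenty reference words.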
+ Step... (20000/50000 | Eval Loss: 1.021510124206543 | Eval wer: 0.05054961214661226 | Eval cer: 0.0362100285658818 |): 42% 5/12 [29:16:34<41:06:17, 21139.65s/it]
+ Training...: 0% 0/4393 [00:00<?, ?it/s]
+ Step... (21725 | Loss: 0.044285327196121216, Learning Rate: 5.712323036277667e-05, Gradient Norm: 0.30181413888931274)
+ Step... (21750 | Loss: 0.04501153156161308, Learning Rate: 5.707272794097662e-05, Gradient Norm: 0.3694307208061218)
+ Step... (21775 | Loss: 0.0468716062605381, Learning Rate: 5.702222551917657e-05, Gradient Norm: 0.4017857015132904)
+ Step... (21800 | Loss: 0.08984293788671494, Learning Rate: 5.697171582141891e-05, Gradient Norm: 0.7361651062965393)
+ Step... (21825 | Loss: 0.06769392639398575, Learning Rate: 5.692120612366125e-05, Gradient Norm: 0.44246554374694824)
+ Step... (21850 | Loss: 0.04700716957449913, Learning Rate: 5.687071097781882e-05, Gradient Norm: 0.3806375563144684)
+ Step... (21875 | Loss: 0.052144188433885574, Learning Rate: 5.6820201280061156e-05, Gradient Norm: 0.5597290992736816)
+ Step... (21900 | Loss: 0.04265350103378296, Learning Rate: 5.6769691582303494e-05, Gradient Norm: 0.7775082588195801)
+ Step... (21925 | Loss: 0.07321219891309738, Learning Rate: 5.6719192798482254e-05, Gradient Norm: 0.4794605076313019)
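The logged learning rate falls by about 5.05e-8 every 25 steps, i.e. roughly 2.02e-9 per step, which extrapolates to zero at step 50,000 (the run length shown in the eval header). That is consistent with a peak of about 1e-4 decayed linearly over the run. A hedged sketch of such a schedule in optax; the warmup length is a pure assumption, since no warmup is visible in this excerpt:

```python
import optax

PEAK_LR = 1e-4        # inferred by extrapolating the logged values
TOTAL_STEPS = 50_000  # from the "(20000/50000 | ...)" eval header
WARMUP_STEPS = 500    # hypothetical; not recoverable from this log

# Linear warmup to the peak, then linear decay to zero at TOTAL_STEPS.
lr_schedule = optax.join_schedules(
    schedules=[
        optax.linear_schedule(0.0, PEAK_LR, WARMUP_STEPS),
        optax.linear_schedule(PEAK_LR, 0.0, TOTAL_STEPS - WARMUP_STEPS),
    ],
    boundaries=[WARMUP_STEPS],
)

# lr_schedule(20_025) ~= 6.056e-05, close to the 6.0558e-05 logged above.
```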
+ Training...: 39% 1735/4393 [2:17:11<3:39:45, 4.96s/it]
+ Step... (21975 | Loss: 0.028883015736937523, Learning Rate: 5.661817704094574e-05, Gradient Norm: 0.27709388732910156)
+ Step... (22000 | Loss: 0.032305553555488586, Learning Rate: 5.65676782571245e-05, Gradient Norm: 0.36947354674339294)
+ Step... (22025 | Loss: 0.029987312853336334, Learning Rate: 5.6517172197345644e-05, Gradient Norm: 0.3085069954395294)
+ Step... (22050 | Loss: 0.03197702020406723, Learning Rate: 5.646666249958798e-05, Gradient Norm: 0.27674245834350586)
+ Step... (22075 | Loss: 0.04232995584607124, Learning Rate: 5.641616371576674e-05, Gradient Norm: 0.34863319993019104)
+ Step... (22100 | Loss: 0.01950874924659729, Learning Rate: 5.636565401800908e-05, Gradient Norm: 0.258765310049057)
+ Step... (22125 | Loss: 0.0584186315536499, Learning Rate: 5.631514795823023e-05, Gradient Norm: 0.5440660715103149)
+ Step... (22150 | Loss: 0.029707320034503937, Learning Rate: 5.626464917440899e-05, Gradient Norm: 0.3775123953819275)
+ Step... (22175 | Loss: 0.04275369271636009, Learning Rate: 5.6214139476651326e-05, Gradient Norm: 0.33941468596458435)
+ Step... (22200 | Loss: 0.0313958078622818, Learning Rate: 5.6163629778893664e-05, Gradient Norm: 0.38821160793304443)
+ Step... (22225 | Loss: 0.03712507337331772, Learning Rate: 5.611313463305123e-05, Gradient Norm: 0.28947848081588745)
+ Step... (22250 | Loss: 0.04902244731783867, Learning Rate: 5.606262493529357e-05, Gradient Norm: 0.3721737265586853)
+ Step... (22275 | Loss: 0.027602573856711388, Learning Rate: 5.601211523753591e-05, Gradient Norm: 0.2835336923599243)
+ Step... (22300 | Loss: 0.04686018079519272, Learning Rate: 5.5961620091693476e-05, Gradient Norm: 0.43980643153190613)
+ Step... (22325 | Loss: 0.031802356243133545, Learning Rate: 5.5911110393935814e-05, Gradient Norm: 0.3008425235748291)
+ Step... (22350 | Loss: 0.02849770523607731, Learning Rate: 5.586060069617815e-05, Gradient Norm: 0.365633100271225)
+ Step... (22375 | Loss: 0.028170589357614517, Learning Rate: 5.5810098274378106e-05, Gradient Norm: 0.31893348693847656)
+ Step... (22400 | Loss: 0.027144117280840874, Learning Rate: 5.575959585257806e-05, Gradient Norm: 0.3039880096912384)
+ Step... (22425 | Loss: 0.040041469037532806, Learning Rate: 5.57090861548204e-05, Gradient Norm: 0.4088221788406372)
+ Step... (22450 | Loss: 0.03567557409405708, Learning Rate: 5.565858373302035e-05, Gradient Norm: 0.35828736424446106)
+ Step... (22475 | Loss: 0.020282002165913582, Learning Rate: 5.56080813112203e-05, Gradient Norm: 0.254996120929718)
+ Step... (22500 | Loss: 0.04376010596752167, Learning Rate: 5.555757161346264e-05, Gradient Norm: 0.7653269171714783)
+ Step... (22525 | Loss: 0.04300732538104057, Learning Rate: 5.5507069191662595e-05, Gradient Norm: 0.3325899541378021)
+ Step... (22550 | Loss: 0.04675268009305, Learning Rate: 5.545656676986255e-05, Gradient Norm: 0.4212055802345276)
+ Step... (22575 | Loss: 0.03605210781097412, Learning Rate: 5.5406057072104886e-05, Gradient Norm: 0.5294095277786255)
+ Step... (22600 | Loss: 0.02685914747416973, Learning Rate: 5.535555465030484e-05, Gradient Norm: 0.378471314907074)
+ Step... (22625 | Loss: 0.037933044135570526, Learning Rate: 5.5305048590525985e-05, Gradient Norm: 0.6845491528511047)
+ Step... (22650 | Loss: 0.03763023763895035, Learning Rate: 5.525454253074713e-05, Gradient Norm: 0.3493119180202484)
+ Step... (22675 | Loss: 0.03691916540265083, Learning Rate: 5.520404010894708e-05, Gradient Norm: 0.3682186007499695)
+ Step... (22700 | Loss: 0.02609202265739441, Learning Rate: 5.515353404916823e-05, Gradient Norm: 0.6833606362342834)
+ Step... (22725 | Loss: 0.047541745007038116, Learning Rate: 5.5103027989389375e-05, Gradient Norm: 0.4514157474040985)
+ Step... (22750 | Loss: 0.033493347465991974, Learning Rate: 5.505252192961052e-05, Gradient Norm: 0.3551611006259918)
+ Step... (22775 | Loss: 0.04429641366004944, Learning Rate: 5.5002019507810473e-05, Gradient Norm: 0.4048822522163391)
+ Step... (22800 | Loss: 0.04791601374745369, Learning Rate: 5.495150981005281e-05, Gradient Norm: 0.44947221875190735)
+ Step... (22825 | Loss: 0.031418006867170334, Learning Rate: 5.4901007388252765e-05, Gradient Norm: 0.44471606612205505)
+ Step... (22850 | Loss: 0.03244362398982048, Learning Rate: 5.485050496645272e-05, Gradient Norm: 0.4113829433917999)
+ Step... (22875 | Loss: 0.03053554892539978, Learning Rate: 5.4799995268695056e-05, Gradient Norm: 0.3072253167629242)
+ Step... (22900 | Loss: 0.04872916638851166, Learning Rate: 5.474949284689501e-05, Gradient Norm: 0.5141371488571167)
+ Step... (22925 | Loss: 0.03374813497066498, Learning Rate: 5.469899042509496e-05, Gradient Norm: 0.5266069173812866)
+ Step... (22950 | Loss: 0.031113576143980026, Learning Rate: 5.46484807273373e-05, Gradient Norm: 0.3346673548221588)
+ Step... (22975 | Loss: 0.03883219510316849, Learning Rate: 5.4597978305537254e-05, Gradient Norm: 0.3280452489852905)
+ Step... (23000 | Loss: 0.025303302332758904, Learning Rate: 5.4547475883737206e-05, Gradient Norm: 0.26770514249801636)
+ Step... (23025 | Loss: 0.03750527650117874, Learning Rate: 5.4496966185979545e-05, Gradient Norm: 0.29126110672950745)
+ Step... (23050 | Loss: 0.04249219223856926, Learning Rate: 5.44464637641795e-05, Gradient Norm: 0.374131441116333)
+ Step... (23075 | Loss: 0.04200226441025734, Learning Rate: 5.439596134237945e-05, Gradient Norm: 0.43781113624572754)
+ Step... (23100 | Loss: 0.047300681471824646, Learning Rate: 5.434545164462179e-05, Gradient Norm: 0.40795448422431946)
+ Step... (23125 | Loss: 0.049821797758340836, Learning Rate: 5.429494922282174e-05, Gradient Norm: 0.375959187746048)
+ Step... (23150 | Loss: 0.02656714618206024, Learning Rate: 5.4244446801021695e-05, Gradient Norm: 0.376505583524704)
+ Step... (23175 | Loss: 0.03251540660858154, Learning Rate: 5.4193937103264034e-05, Gradient Norm: 0.327536940574646)
+ Step... (23200 | Loss: 0.029152745380997658, Learning Rate: 5.414343468146399e-05, Gradient Norm: 0.39311301708221436)
+ Step... (23225 | Loss: 0.03586068004369736, Learning Rate: 5.409292862168513e-05, Gradient Norm: 0.36502012610435486)
+ Step... (23250 | Loss: 0.03459014371037483, Learning Rate: 5.404242256190628e-05, Gradient Norm: 0.4572829306125641)
+ Step... (23275 | Loss: 0.04172315075993538, Learning Rate: 5.399192014010623e-05, Gradient Norm: 0.3692755401134491)
+ Step... (23300 | Loss: 0.02289118990302086, Learning Rate: 5.394141408032738e-05, Gradient Norm: 0.36271020770072937)
+ Step... (23325 | Loss: 0.02884475328028202, Learning Rate: 5.389090802054852e-05, Gradient Norm: 0.4460407793521881)
+ Step... (23350 | Loss: 0.033021628856658936, Learning Rate: 5.384040196076967e-05, Gradient Norm: 0.3724048435688019)
+ Step... (23375 | Loss: 0.057164158672094345, Learning Rate: 5.378989953896962e-05, Gradient Norm: 0.3597463369369507)
+ Step... (23400 | Loss: 0.020201342180371284, Learning Rate: 5.373938984121196e-05, Gradient Norm: 0.28142717480659485)
+ Step... (23425 | Loss: 0.022390734404325485, Learning Rate: 5.368888741941191e-05, Gradient Norm: 0.3492378890514374)
+ Step... (23450 | Loss: 0.03480976074934006, Learning Rate: 5.3638384997611865e-05, Gradient Norm: 0.37160617113113403)
+ Step... (23475 | Loss: 0.027854222804307938, Learning Rate: 5.3587875299854204e-05, Gradient Norm: 0.2876638174057007)
+ Step... (23500 | Loss: 0.04249017685651779, Learning Rate: 5.353737287805416e-05, Gradient Norm: 0.41824018955230713)
+ Step... (23525 | Loss: 0.0388215109705925, Learning Rate: 5.348687045625411e-05, Gradient Norm: 0.32904794812202454)
+ Step... (23550 | Loss: 0.024838045239448547, Learning Rate: 5.343636075849645e-05, Gradient Norm: 0.25835245847702026)
+ Step... (23575 | Loss: 0.043581828474998474, Learning Rate: 5.33858583366964e-05, Gradient Norm: 0.3848859369754791)
+ Step... (23600 | Loss: 0.03622807189822197, Learning Rate: 5.3335355914896354e-05, Gradient Norm: 0.36198726296424866)
+ Step... (23625 | Loss: 0.03490854427218437, Learning Rate: 5.328484621713869e-05, Gradient Norm: 0.33471915125846863)
+ Step... (23650 | Loss: 0.04838581383228302, Learning Rate: 5.3234343795338646e-05, Gradient Norm: 7.672711372375488)
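The "Gradient Norm" field is the global L2 norm over the full gradient pytree, and the jump from the usual ~0.3 to 7.67 at step 23650 just above is the kind of outlier that global-norm clipping is meant to bound. A minimal JAX/optax sketch, assuming an AdamW optimizer and a clipping threshold of 1.0 (both hypothetical; the run's actual optimizer config is not in this log):

```python
import optax

# Sketch of logging and bounding the global gradient norm.
# max_norm=1.0 and adamw are assumptions, not the run's verified setup.
optimizer = optax.chain(
    optax.clip_by_global_norm(1.0),   # bounds spikes like the 7.67 above
    optax.adamw(learning_rate=1e-4),  # placeholder learning rate
)

def grad_norm_metric(grads):
    # Same quantity as the logged "Gradient Norm": the L2 norm taken
    # jointly over every leaf of the gradient pytree.
    return optax.global_norm(grads)
```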
+ Training...: 79% 3459/4393 [4:33:00<1:30:43, 5.83s/it]
+ Step... (23700 | Loss: 0.04192772135138512, Learning Rate: 5.313333167578094e-05, Gradient Norm: 0.35688653588294983)
+ Step... (23725 | Loss: 0.03448999300599098, Learning Rate: 5.308282925398089e-05, Gradient Norm: 0.3494829535484314)
+ Step... (23750 | Loss: 0.02678181789815426, Learning Rate: 5.303232683218084e-05, Gradient Norm: 0.29735514521598816)
+ Step... (23775 | Loss: 0.035396356135606766, Learning Rate: 5.298181713442318e-05, Gradient Norm: 0.27143338322639465)
+ Step... (23800 | Loss: 0.023322325199842453, Learning Rate: 5.2931314712623134e-05, Gradient Norm: 0.2984119653701782)
+ Step... (23825 | Loss: 0.037303145974874496, Learning Rate: 5.288080865284428e-05, Gradient Norm: 0.3529098331928253)
+ Step... (23850 | Loss: 0.026544779539108276, Learning Rate: 5.2830302593065426e-05, Gradient Norm: 0.3024252653121948)
+ Step... (23875 | Loss: 0.026226522400975227, Learning Rate: 5.277980017126538e-05, Gradient Norm: 0.242228701710701)
+ Step... (23900 | Loss: 0.025965316221117973, Learning Rate: 5.2729294111486524e-05, Gradient Norm: 0.28718680143356323)
+ Step... (23925 | Loss: 0.03736674785614014, Learning Rate: 5.267878805170767e-05, Gradient Norm: 0.45029348134994507)
+ Step... (23950 | Loss: 0.05321226641535759, Learning Rate: 5.262827835395001e-05, Gradient Norm: 1.0017284154891968)
+ Step... (23975 | Loss: 0.029298288747668266, Learning Rate: 5.257777957012877e-05, Gradient Norm: 0.552095890045166)
+ Step... (24000 | Loss: 0.045580193400382996, Learning Rate: 5.252726987237111e-05, Gradient Norm: 0.6694679856300354)
+ Step... (24025 | Loss: 0.038904935121536255, Learning Rate: 5.247676381259225e-05, Gradient Norm: 0.6009917259216309)
+ Step... (24050 | Loss: 0.06289307773113251, Learning Rate: 5.242626502877101e-05, Gradient Norm: 0.5505169034004211)
+ Step... (24075 | Loss: 0.03959391266107559, Learning Rate: 5.237575533101335e-05, Gradient Norm: 0.3400160074234009)
+ Step... (24100 | Loss: 0.022501932457089424, Learning Rate: 5.232524563325569e-05, Gradient Norm: 0.3429368734359741)
+ Step... (24125 | Loss: 0.03071441687643528, Learning Rate: 5.227475048741326e-05, Gradient Norm: 0.679693877696991)
+ Step... (24150 | Loss: 0.043432459235191345, Learning Rate: 5.2224240789655596e-05, Gradient Norm: 0.3760949671268463)
+ Step... (24175 | Loss: 0.03176066651940346, Learning Rate: 5.2173731091897935e-05, Gradient Norm: 0.3627036213874817)
+ Step... (24200 | Loss: 0.029321245849132538, Learning Rate: 5.21232359460555e-05, Gradient Norm: 0.2902766466140747)
+ Step... (24225 | Loss: 0.031011415645480156, Learning Rate: 5.207272624829784e-05, Gradient Norm: 0.3715132474899292)
+ Step... (24250 | Loss: 0.04117705672979355, Learning Rate: 5.202221655054018e-05, Gradient Norm: 0.3848026990890503)
+ Step... (24275 | Loss: 0.035337090492248535, Learning Rate: 5.1971721404697746e-05, Gradient Norm: 0.35255369544029236)
+ Step... (24300 | Loss: 0.03448829799890518, Learning Rate: 5.1921211706940085e-05, Gradient Norm: 0.36964043974876404)
+ Step... (24325 | Loss: 0.05197534337639809, Learning Rate: 5.187070200918242e-05, Gradient Norm: 0.45723956823349)
+ Step... (24350 | Loss: 0.05632118508219719, Learning Rate: 5.182020686333999e-05, Gradient Norm: 3.6309027671813965)
+ Step... (24375 | Loss: 0.04953671619296074, Learning Rate: 5.176969716558233e-05, Gradient Norm: 1.7830593585968018)
+ Step... (24400 | Loss: 0.03154705464839935, Learning Rate: 5.171918746782467e-05, Gradient Norm: 0.4155527651309967)
+ Step... (24425 | Loss: 0.03356127068400383, Learning Rate: 5.166868504602462e-05, Gradient Norm: 0.31667256355285645)
+ Step... (24450 | Loss: 0.035732828080654144, Learning Rate: 5.161818262422457e-05, Gradient Norm: 0.399539977312088)
+ Step... (24475 | Loss: 0.036945439875125885, Learning Rate: 5.156767292646691e-05, Gradient Norm: 0.38079139590263367)
+ Step... (24500 | Loss: 0.04229236766695976, Learning Rate: 5.1517170504666865e-05, Gradient Norm: 0.48107612133026123)
+ Step... (24525 | Loss: 0.02834494039416313, Learning Rate: 5.146666808286682e-05, Gradient Norm: 0.2878149449825287)
+ Step... (24550 | Loss: 0.028665006160736084, Learning Rate: 5.1416158385109156e-05, Gradient Norm: 0.44036924839019775)
+ Step... (24575 | Loss: 0.04935942962765694, Learning Rate: 5.136565596330911e-05, Gradient Norm: 0.40158262848854065)
+ Step... (24600 | Loss: 0.034180428832769394, Learning Rate: 5.1315149903530255e-05, Gradient Norm: 3.239790439605713)
+ Step... (24625 | Loss: 0.043823905289173126, Learning Rate: 5.12646438437514e-05, Gradient Norm: 0.3326380252838135)
+ Step... (24650 | Loss: 0.035595715045928955, Learning Rate: 5.1214137783972546e-05, Gradient Norm: 0.3283407688140869)
+ Step... (24675 | Loss: 0.03204268217086792, Learning Rate: 5.11636353621725e-05, Gradient Norm: 0.3907879590988159)
+ Step... (24700 | Loss: 0.0314224548637867, Learning Rate: 5.111312566441484e-05, Gradient Norm: 0.3861796259880066)
+ Step... (24725 | Loss: 0.02714911662042141, Learning Rate: 5.106262324261479e-05, Gradient Norm: 0.2878110110759735)
+ Step... (24750 | Loss: 0.03068208508193493, Learning Rate: 5.1012120820814744e-05, Gradient Norm: 0.38559654355049133)
+ Step... (24775 | Loss: 0.03362426161766052, Learning Rate: 5.096161112305708e-05, Gradient Norm: 0.327539324760437)
+ Step... (24800 | Loss: 0.04312548413872719, Learning Rate: 5.0911108701257035e-05, Gradient Norm: 0.5235893130302429)
+ Step... (24825 | Loss: 0.055415429174900055, Learning Rate: 5.086060627945699e-05, Gradient Norm: 0.3906141221523285)
+ Step... (24850 | Loss: 0.0417981818318367, Learning Rate: 5.081009658169933e-05, Gradient Norm: 0.36772969365119934)
+ Step... (24875 | Loss: 0.027505910024046898, Learning Rate: 5.075959415989928e-05, Gradient Norm: 0.48655566573143005)
+ Step... (24900 | Loss: 0.04489564523100853, Learning Rate: 5.070909173809923e-05, Gradient Norm: 0.39025452733039856)
+ Step... (24925 | Loss: 0.03899327293038368, Learning Rate: 5.065858204034157e-05, Gradient Norm: 0.3016781806945801)
+ Step... (24950 | Loss: 0.017189113423228264, Learning Rate: 5.0608079618541524e-05, Gradient Norm: 0.29946625232696533)
+ Step... (24975 | Loss: 0.045142460614442825, Learning Rate: 5.055757719674148e-05, Gradient Norm: 0.373174250125885)
+ Step... (25000 | Loss: 0.02781319059431553, Learning Rate: 5.0507067498983815e-05, Gradient Norm: 0.31940504908561707)
+ Step... (25025 | Loss: 0.03962790593504906, Learning Rate: 5.045656507718377e-05, Gradient Norm: 0.3029711842536926)
+ Step... (25050 | Loss: 0.03775897994637489, Learning Rate: 5.040606265538372e-05, Gradient Norm: 0.43249836564064026)
+ Step... (25075 | Loss: 0.04975719004869461, Learning Rate: 5.035555295762606e-05, Gradient Norm: 0.33092451095581055)
+ Step... (25100 | Loss: 0.05254314839839935, Learning Rate: 5.030505053582601e-05, Gradient Norm: 0.43144869804382324)
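
The log records above follow one fixed format: a tqdm progress line, then a "Step... (step | Loss, Learning Rate, Gradient Norm)" record every 25 optimizer steps. For mining this log offline, here is a minimal parsing sketch; the regex and the parse_step_line helper are hypothetical, written against the exact record format shown here, and are not part of the training script itself.

import re

# Hypothetical helper matching the record format in this log.
STEP_RE = re.compile(
    r"Step\.\.\. \((\d+) \| Loss: ([\d.e+-]+), "
    r"Learning Rate: ([\d.e+-]+), Gradient Norm: ([\d.e+-]+)\)"
)

def parse_step_line(line):
    """Return (step, loss, lr, grad_norm), or None for non-record lines."""
    m = STEP_RE.search(line)
    if m is None:
        return None
    step, loss, lr, grad_norm = m.groups()
    return int(step), float(loss), float(lr), float(grad_norm)

# Example, taken verbatim from the first record above:
parse_step_line(
    "Step... (23700 | Loss: 0.04192772135138512, "
    "Learning Rate: 5.313333167578094e-05, Gradient Norm: 0.35688653588294983)"
)
# -> (23700, 0.04192772135138512, 5.313333167578094e-05, 0.35688653588294983)

Lines that do not match, including the interleaved eval and tqdm lines, simply return None, so the helper can be mapped over the whole file.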
+ Step... (20000/50000 | Eval Loss: 1.021510124206543 | Eval wer: 0.05054961214661226 | Eval cer: 0.0362100285658818 |): 50% 6/12 [35:03:02<35:02:00, 21020.10s/it]
+ Step... (25175 | Loss: 0.032494306564331055, Learning Rate: 5.015353599446826e-05, Gradient Norm: 0.25956735014915466)
+ Step... (25200 | Loss: 0.027480341494083405, Learning Rate: 5.01030299346894e-05, Gradient Norm: 0.28603485226631165)
+ Step... (25225 | Loss: 0.036957938224077225, Learning Rate: 5.005252387491055e-05, Gradient Norm: 0.3290182650089264)
+ Step... (25250 | Loss: 0.03833230957388878, Learning Rate: 5.0002017815131694e-05, Gradient Norm: 0.38001635670661926)
+ Step... (25275 | Loss: 0.02907697670161724, Learning Rate: 4.995151539333165e-05, Gradient Norm: 0.25869879126548767)
+ Step... (25300 | Loss: 0.03300327807664871, Learning Rate: 4.9901005695573986e-05, Gradient Norm: 0.31227272748947144)
+ Step... (25325 | Loss: 0.03420973941683769, Learning Rate: 4.985050327377394e-05, Gradient Norm: 0.3454754948616028)
+ Step... (25350 | Loss: 0.022977076470851898, Learning Rate: 4.980000085197389e-05, Gradient Norm: 0.37838831543922424)
+ Step... (25375 | Loss: 0.025919809937477112, Learning Rate: 4.974949115421623e-05, Gradient Norm: 0.2612961530685425)
+ Step... (25400 | Loss: 0.04246160387992859, Learning Rate: 4.969898873241618e-05, Gradient Norm: 0.3997235894203186)
+ Step... (25425 | Loss: 0.016840150579810143, Learning Rate: 4.9648486310616136e-05, Gradient Norm: 0.2279246747493744)
+ Step... (25450 | Loss: 0.03332209214568138, Learning Rate: 4.9597976612858474e-05, Gradient Norm: 0.41686922311782837)
+ Step... (25475 | Loss: 0.03769592195749283, Learning Rate: 4.954747419105843e-05, Gradient Norm: 0.36882728338241577)
+ Step... (25500 | Loss: 0.026205237954854965, Learning Rate: 4.949697176925838e-05, Gradient Norm: 0.2956397533416748)
+ Step... (25525 | Loss: 0.03775778040289879, Learning Rate: 4.944646207150072e-05, Gradient Norm: 0.41357094049453735)
+ Step... (25550 | Loss: 0.042417582124471664, Learning Rate: 4.939595964970067e-05, Gradient Norm: 0.41240692138671875)
+ Step... (25575 | Loss: 0.05006255954504013, Learning Rate: 4.9345457227900624e-05, Gradient Norm: 0.34435006976127625)
+ Step... (25600 | Loss: 0.03163505718111992, Learning Rate: 4.929494753014296e-05, Gradient Norm: 0.33239689469337463)
+ Step... (25625 | Loss: 0.028710873797535896, Learning Rate: 4.9244445108342916e-05, Gradient Norm: 0.3661336600780487)
+ Step... (25650 | Loss: 0.026209404692053795, Learning Rate: 4.919394268654287e-05, Gradient Norm: 0.33108606934547424)
+ Step... (25675 | Loss: 0.04295961931347847, Learning Rate: 4.914343298878521e-05, Gradient Norm: 0.4011436700820923)
+ Step... (25700 | Loss: 0.03574109822511673, Learning Rate: 4.909293056698516e-05, Gradient Norm: 0.36241382360458374)
+ Step... (25725 | Loss: 0.04625528305768967, Learning Rate: 4.90424208692275e-05, Gradient Norm: 0.3794752061367035)
+ Step... (25750 | Loss: 0.04854767024517059, Learning Rate: 4.899191844742745e-05, Gradient Norm: 0.3879258930683136)
+ Step... (25775 | Loss: 0.035202860832214355, Learning Rate: 4.8941416025627404e-05, Gradient Norm: 0.3439372777938843)
+ Step... (25800 | Loss: 0.0253062155097723, Learning Rate: 4.889090632786974e-05, Gradient Norm: 0.3394845128059387)
+ Step... (25825 | Loss: 0.0289002638310194, Learning Rate: 4.8840403906069696e-05, Gradient Norm: 0.29701897501945496)
+ Step... (25850 | Loss: 0.0273846834897995, Learning Rate: 4.878989784629084e-05, Gradient Norm: 0.2742794454097748)
+ Step... (25875 | Loss: 0.0390518382191658, Learning Rate: 4.873939178651199e-05, Gradient Norm: 1.3369203805923462)
+ Step... (25900 | Loss: 0.03895574063062668, Learning Rate: 4.868888572673313e-05, Gradient Norm: 0.42404425144195557)
+ Step... (25925 | Loss: 0.021992146968841553, Learning Rate: 4.8638383304933086e-05, Gradient Norm: 0.28285321593284607)
+ Step... (25950 | Loss: 0.03738517314195633, Learning Rate: 4.8587873607175425e-05, Gradient Norm: 0.3228248655796051)
+ Step... (25975 | Loss: 0.0590066984295845, Learning Rate: 4.853737118537538e-05, Gradient Norm: 0.40467754006385803)
+ Step... (26000 | Loss: 0.018190395087003708, Learning Rate: 4.848686876357533e-05, Gradient Norm: 0.2537316083908081)
+ Step... (26025 | Loss: 0.043122727423906326, Learning Rate: 4.843635906581767e-05, Gradient Norm: 0.45839107036590576)
+ Step... (26050 | Loss: 0.034402068704366684, Learning Rate: 4.838585664401762e-05, Gradient Norm: 0.3322671949863434)
+ Step... (26075 | Loss: 0.03683105483651161, Learning Rate: 4.8335354222217575e-05, Gradient Norm: 0.3035377860069275)
+ Step... (26100 | Loss: 0.0251045860350132, Learning Rate: 4.828484452445991e-05, Gradient Norm: 0.35312619805336)
+ Step... (26125 | Loss: 0.06226085126399994, Learning Rate: 4.8234342102659866e-05, Gradient Norm: 0.409700870513916)
+ Step... (26150 | Loss: 0.060523878782987595, Learning Rate: 4.818383968085982e-05, Gradient Norm: 0.48786985874176025)
+ Step... (26175 | Loss: 0.026466630399227142, Learning Rate: 4.813332998310216e-05, Gradient Norm: 0.39357778429985046)
+ Step... (26200 | Loss: 0.051018837839365005, Learning Rate: 4.808282756130211e-05, Gradient Norm: 0.49015292525291443)
+ Step... (26225 | Loss: 0.04816717654466629, Learning Rate: 4.8032325139502063e-05, Gradient Norm: 0.3389943242073059)
+ Step... (26250 | Loss: 0.04806462675333023, Learning Rate: 4.79818154417444e-05, Gradient Norm: 0.39536547660827637)
+ Step... (26275 | Loss: 0.03313664346933365, Learning Rate: 4.7931313019944355e-05, Gradient Norm: 0.3253754675388336)
+ Step... (26300 | Loss: 0.031168362125754356, Learning Rate: 4.788081059814431e-05, Gradient Norm: 0.3615038990974426)
+ Step... (26325 | Loss: 0.02266540192067623, Learning Rate: 4.7830300900386646e-05, Gradient Norm: 0.29001644253730774)
+ Step... (26350 | Loss: 0.043101128190755844, Learning Rate: 4.77797984785866e-05, Gradient Norm: 0.5296374559402466)
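
The interleaved eval line above reports Eval wer 0.0505 and Eval cer 0.0362 at the 20k checkpoint, i.e. roughly five word errors per hundred reference words. Assuming "wer" here is the standard word error rate (word-level Levenshtein distance divided by reference length, as computed by e.g. jiwer), a self-contained sketch of the metric; this is a generic illustration, not the repo's own evaluation code.

def wer(reference, hypothesis):
    # Word error rate: word-level edit distance / number of reference
    # words; substitutions, insertions and deletions all cost 1.
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(r)][len(h)] / len(r)

wer("the cat sat", "the cat sat down")  # 1 insertion / 3 words = 0.333...

CER is the same computation over characters instead of words, which is why it typically comes out lower than WER on the same predictions.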
+ Training...: 40% 1742/4393 [2:17:29<2:51:06, 3.87s/it]
+ Step... (26375 | Loss: 0.015395677648484707, Learning Rate: 4.772929605678655e-05, Gradient Norm: 1.1583268642425537)
+ Step... (26400 | Loss: 0.009922747500240803, Learning Rate: 4.767878635902889e-05, Gradient Norm: 0.23366865515708923)
+ Step... (26425 | Loss: 0.040154799818992615, Learning Rate: 4.7628283937228844e-05, Gradient Norm: 0.45613422989845276)
+ Step... (26450 | Loss: 0.02349100075662136, Learning Rate: 4.757777787744999e-05, Gradient Norm: 0.38283392786979675)
+ Step... (26475 | Loss: 0.03603263571858406, Learning Rate: 4.7527271817671135e-05, Gradient Norm: 0.3727433979511261)
+ Step... (26500 | Loss: 0.018448349088430405, Learning Rate: 4.747676575789228e-05, Gradient Norm: 1.0056108236312866)
+ Step... (26525 | Loss: 0.02613736502826214, Learning Rate: 4.7426263336092234e-05, Gradient Norm: 0.33278515934944153)
+ Step... (26550 | Loss: 0.021993344649672508, Learning Rate: 4.737575363833457e-05, Gradient Norm: 0.3807253837585449)
+ Step... (26575 | Loss: 0.018662679940462112, Learning Rate: 4.7325251216534525e-05, Gradient Norm: 0.2747914493083954)
+ Step... (26600 | Loss: 0.015389377251267433, Learning Rate: 4.727474879473448e-05, Gradient Norm: 0.30432483553886414)
+ Step... (26625 | Loss: 0.018875302746891975, Learning Rate: 4.722423909697682e-05, Gradient Norm: 0.21066243946552277)
+ Step... (26650 | Loss: 0.013026373460888863, Learning Rate: 4.717373667517677e-05, Gradient Norm: 0.33200427889823914)
+ Step... (26675 | Loss: 0.03511090576648712, Learning Rate: 4.712323425337672e-05, Gradient Norm: 0.4361608326435089)
+ Step... (26700 | Loss: 0.019190248101949692, Learning Rate: 4.707272455561906e-05, Gradient Norm: 0.9028236865997314)
+ Step... (26725 | Loss: 0.03395620360970497, Learning Rate: 4.7022222133819014e-05, Gradient Norm: 0.3553292155265808)
+ Step... (26750 | Loss: 0.014008201658725739, Learning Rate: 4.697171243606135e-05, Gradient Norm: 0.3180076479911804)
+ Step... (26775 | Loss: 0.021352039650082588, Learning Rate: 4.6921210014261305e-05, Gradient Norm: 0.34052228927612305)
+ Step... (26800 | Loss: 0.01608300395309925, Learning Rate: 4.687070759246126e-05, Gradient Norm: 0.43110185861587524)
+ Step... (26825 | Loss: 0.04146311432123184, Learning Rate: 4.68201978947036e-05, Gradient Norm: 0.3803328573703766)
+ Step... (26850 | Loss: 0.01878603734076023, Learning Rate: 4.676969547290355e-05, Gradient Norm: 1.2297662496566772)
+ Step... (26875 | Loss: 0.024104727432131767, Learning Rate: 4.67191930511035e-05, Gradient Norm: 0.32827281951904297)
+ Step... (26900 | Loss: 0.006425368599593639, Learning Rate: 4.666868335334584e-05, Gradient Norm: 0.19023503363132477)
+ Step... (26925 | Loss: 0.03244623914361, Learning Rate: 4.6618180931545794e-05, Gradient Norm: 0.4636324346065521)
+ Step... (26950 | Loss: 0.018195796757936478, Learning Rate: 4.656767850974575e-05, Gradient Norm: 0.3885265588760376)
+ Step... (26975 | Loss: 0.018303479999303818, Learning Rate: 4.6517168811988086e-05, Gradient Norm: 0.22915388643741608)
+ Step... (27000 | Loss: 0.01953260228037834, Learning Rate: 4.646666639018804e-05, Gradient Norm: 0.3849063813686371)
+ Step... (27025 | Loss: 0.024017462506890297, Learning Rate: 4.641616396838799e-05, Gradient Norm: 0.29185715317726135)
+ Step... (27050 | Loss: 0.02381003648042679, Learning Rate: 4.636565427063033e-05, Gradient Norm: 0.3834833800792694)
+ Step... (27075 | Loss: 0.02851175330579281, Learning Rate: 4.631515184883028e-05, Gradient Norm: 0.33930420875549316)
+ Step... (27100 | Loss: 0.029051026329398155, Learning Rate: 4.626464578905143e-05, Gradient Norm: 0.4754555821418762)
+ Step... (27125 | Loss: 0.027669742703437805, Learning Rate: 4.6214139729272574e-05, Gradient Norm: 0.23227062821388245)
+ Step... (27150 | Loss: 0.02711222507059574, Learning Rate: 4.616363366949372e-05, Gradient Norm: 0.3472621738910675)
+ Step... (27175 | Loss: 0.030751440674066544, Learning Rate: 4.611313124769367e-05, Gradient Norm: 0.37772148847579956)
+ Step... (27200 | Loss: 0.02963493950664997, Learning Rate: 4.606262154993601e-05, Gradient Norm: 0.3551703989505768)
+ Step... (27225 | Loss: 0.034281305968761444, Learning Rate: 4.6012119128135964e-05, Gradient Norm: 0.25948455929756165)
+ Step... (27250 | Loss: 0.01915004849433899, Learning Rate: 4.596161670633592e-05, Gradient Norm: 0.30437567830085754)
+ Step... (27275 | Loss: 0.026445778086781502, Learning Rate: 4.5911107008578256e-05, Gradient Norm: 0.348853200674057)
+ Step... (27300 | Loss: 0.018407680094242096, Learning Rate: 4.586060458677821e-05, Gradient Norm: 0.40228012204170227)
+ Step... (27325 | Loss: 0.016526585444808006, Learning Rate: 4.581010216497816e-05, Gradient Norm: 0.2467903345823288)
+ Step... (27350 | Loss: 0.03916558250784874, Learning Rate: 4.57595924672205e-05, Gradient Norm: 0.3872826099395752)
+ Step... (27375 | Loss: 0.027904687449336052, Learning Rate: 4.570909004542045e-05, Gradient Norm: 0.29256585240364075)
+ Step... (27400 | Loss: 0.013314536772668362, Learning Rate: 4.5658587623620406e-05, Gradient Norm: 0.49819448590278625)
+ Step... (27425 | Loss: 0.02740517258644104, Learning Rate: 4.5608077925862744e-05, Gradient Norm: 0.3470984995365143)
+ Step... (27450 | Loss: 0.011915629729628563, Learning Rate: 4.55575755040627e-05, Gradient Norm: 0.28347650170326233)
+ Step... (27475 | Loss: 0.037782225757837296, Learning Rate: 4.550707308226265e-05, Gradient Norm: 0.603645384311676)
+ Step... (27500 | Loss: 0.020660242065787315, Learning Rate: 4.545656338450499e-05, Gradient Norm: 0.40294796228408813)
+ Step... (27525 | Loss: 0.019880592823028564, Learning Rate: 4.540606096270494e-05, Gradient Norm: 0.262466698884964)
+ Step... (27550 | Loss: 0.02121094986796379, Learning Rate: 4.5355558540904894e-05, Gradient Norm: 0.3522772789001465)
+ Step... (27575 | Loss: 0.020857077091932297, Learning Rate: 4.530504884314723e-05, Gradient Norm: 0.35182562470436096)
+ Step... (27600 | Loss: 0.01385863684117794, Learning Rate: 4.5254546421347186e-05, Gradient Norm: 0.2593011260032654)
+ Step... (27625 | Loss: 0.027512112632393837, Learning Rate: 4.520404399954714e-05, Gradient Norm: 0.4048057496547699)
+ Step... (27650 | Loss: 0.04779520258307457, Learning Rate: 4.515353430178948e-05, Gradient Norm: 0.5996983051300049)
+ Step... (27675 | Loss: 0.037076059728860855, Learning Rate: 4.510303187998943e-05, Gradient Norm: 0.4130019247531891)
+ Step... (27700 | Loss: 0.0145907336845994, Learning Rate: 4.5052525820210576e-05, Gradient Norm: 0.26569056510925293)
+ Step... (27725 | Loss: 0.026986515149474144, Learning Rate: 4.500201976043172e-05, Gradient Norm: 0.3900778591632843)
+ Step... (27750 | Loss: 0.021214786916971207, Learning Rate: 4.495151370065287e-05, Gradient Norm: 0.35070163011550903)
+ Step... (27775 | Loss: 0.020700503140687943, Learning Rate: 4.490100764087401e-05, Gradient Norm: 0.23256829380989075)
+ Step... (27800 | Loss: 0.011971977539360523, Learning Rate: 4.485050158109516e-05, Gradient Norm: 0.3890122175216675)
+ Step... (27825 | Loss: 0.029216863214969635, Learning Rate: 4.479999915929511e-05, Gradient Norm: 0.3727366030216217)
+ Step... (27850 | Loss: 0.021556446328759193, Learning Rate: 4.474948946153745e-05, Gradient Norm: 0.3240383565425873)
+ Step... (27875 | Loss: 0.024696046486496925, Learning Rate: 4.4698987039737403e-05, Gradient Norm: 0.30146247148513794)
+ Step... (27900 | Loss: 0.0251696165651083, Learning Rate: 4.4648484617937356e-05, Gradient Norm: 0.37362998723983765)
+ Step... (27925 | Loss: 0.015774782747030258, Learning Rate: 4.4597974920179695e-05, Gradient Norm: 0.24134306609630585)
+ Step... (27950 | Loss: 0.014067770913243294, Learning Rate: 4.454747249837965e-05, Gradient Norm: 0.3240377604961395)
+ Step... (27975 | Loss: 0.019330274313688278, Learning Rate: 4.44969700765796e-05, Gradient Norm: 0.27255779504776)
+ Step... (28000 | Loss: 0.022736601531505585, Learning Rate: 4.444646037882194e-05, Gradient Norm: 0.36978673934936523)
+ Step... (28025 | Loss: 0.030382484197616577, Learning Rate: 4.439595795702189e-05, Gradient Norm: 0.45521456003189087)
+ Step... (28050 | Loss: 0.025521015748381615, Learning Rate: 4.4345455535221845e-05, Gradient Norm: 0.40030258893966675)
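
Across every Step record in this diff the learning rate falls by 5.05e-08 per 25 steps (about 2.02e-09 per step), which is consistent with a linear decay hitting zero at step 50,000, the run length visible in the eval line earlier. A sketch of such a schedule in optax; the peak value (1e-4) and warmup length (500 steps) are inferred from the slope, not confirmed by the training script.

import optax

# PEAK_LR and WARMUP are inferred guesses that reproduce the logged
# slope; TOTAL comes from the "(20000/50000 ...)" eval line above.
PEAK_LR, WARMUP, TOTAL = 1e-4, 500, 50_000

schedule = optax.join_schedules(
    schedules=[
        optax.linear_schedule(0.0, PEAK_LR, WARMUP),          # warmup
        optax.linear_schedule(PEAK_LR, 0.0, TOTAL - WARMUP),  # linear decay
    ],
    boundaries=[WARMUP],
)

schedule(26_400)  # ~4.768e-05, close to the 4.76788e-05 logged at step 26400

Any (peak, warmup) pair with peak / (50,000 - warmup) ≈ 2.02e-09 would fit the logged values equally well; only the slope and the zero-crossing at 50k are actually pinned down by this log.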
33247
+ Training...: 79% 3466/4393 [4:33:16<1:29:13, 5.77s/it]
+ Step... (28100 | Loss: 0.017779188230633736, Learning Rate: 4.4244443415664136e-05, Gradient Norm: 0.3853922486305237)
+ Step... (28125 | Loss: 0.029410462826490402, Learning Rate: 4.419394099386409e-05, Gradient Norm: 0.41001659631729126)
+ Step... (28150 | Loss: 0.016860324889421463, Learning Rate: 4.414343129610643e-05, Gradient Norm: 0.3353402614593506)
+ Step... (28175 | Loss: 0.1199326440691948, Learning Rate: 4.409292887430638e-05, Gradient Norm: 1.177916407585144)
+ Step... (28200 | Loss: 0.009715979918837547, Learning Rate: 4.4042426452506334e-05, Gradient Norm: 0.23744286596775055)
+ Step... (28225 | Loss: 0.023713188245892525, Learning Rate: 4.399191675474867e-05, Gradient Norm: 0.27059558033943176)
+ Step... (28250 | Loss: 0.015286052599549294, Learning Rate: 4.3941414332948625e-05, Gradient Norm: 0.3356326222419739)
+ Step... (28275 | Loss: 0.03733113408088684, Learning Rate: 4.389091191114858e-05, Gradient Norm: 0.5076051950454712)
+ Step... (28300 | Loss: 0.024148428812623024, Learning Rate: 4.3840402213390917e-05, Gradient Norm: 0.40085557103157043)
+ Step... (28325 | Loss: 0.024370620027184486, Learning Rate: 4.378989979159087e-05, Gradient Norm: 0.27008843421936035)
+ Step... (28350 | Loss: 0.12309176474809647, Learning Rate: 4.3739393731812015e-05, Gradient Norm: 0.5491603016853333)
+ Step... (28375 | Loss: 0.020386263728141785, Learning Rate: 4.368888767203316e-05, Gradient Norm: 0.2690469026565552)
+ Step... (28400 | Loss: 0.018853355199098587, Learning Rate: 4.363838161225431e-05, Gradient Norm: 0.40446701645851135)
+ Step... (28425 | Loss: 0.029719140380620956, Learning Rate: 4.358787919045426e-05, Gradient Norm: 0.30341053009033203)
+ Step... (28450 | Loss: 0.03215150907635689, Learning Rate: 4.35373694926966e-05, Gradient Norm: 0.6584086418151855)
+ Step... (28475 | Loss: 0.019242705777287483, Learning Rate: 4.348686707089655e-05, Gradient Norm: 0.3172548711299896)
+ Step... (28500 | Loss: 0.034587059170007706, Learning Rate: 4.3436364649096504e-05, Gradient Norm: 0.4526202380657196)
+ Step... (28525 | Loss: 0.03248746320605278, Learning Rate: 4.338585495133884e-05, Gradient Norm: 0.3442862331867218)
+ Step... (28550 | Loss: 0.017472591251134872, Learning Rate: 4.3335352529538795e-05, Gradient Norm: 0.33737820386886597)
+ Step... (28575 | Loss: 0.03616875782608986, Learning Rate: 4.328485010773875e-05, Gradient Norm: 0.44748085737228394)
+ Step... (28600 | Loss: 0.010958717204630375, Learning Rate: 4.323434040998109e-05, Gradient Norm: 0.3348379135131836)
+ Step... (28625 | Loss: 0.021889686584472656, Learning Rate: 4.318383798818104e-05, Gradient Norm: 0.291972815990448)
+ Step... (28650 | Loss: 0.018253032118082047, Learning Rate: 4.313333556638099e-05, Gradient Norm: 0.39204180240631104)
+ Step... (28675 | Loss: 0.03230177238583565, Learning Rate: 4.308282586862333e-05, Gradient Norm: 0.31189021468162537)
+ Step... (28700 | Loss: 0.025849582627415657, Learning Rate: 4.3032323446823284e-05, Gradient Norm: 0.5001385807991028)
+ Step... (28725 | Loss: 0.02958354540169239, Learning Rate: 4.298182102502324e-05, Gradient Norm: 0.3614922761917114)
+ Step... (28750 | Loss: 0.02733941376209259, Learning Rate: 4.2931311327265576e-05, Gradient Norm: 0.3756287693977356)
+ Step... (28775 | Loss: 0.029227962717413902, Learning Rate: 4.288080890546553e-05, Gradient Norm: 0.37987276911735535)
+ Step... (28800 | Loss: 0.02478295937180519, Learning Rate: 4.283030648366548e-05, Gradient Norm: 0.43760961294174194)
+ Step... (28825 | Loss: 0.02181088924407959, Learning Rate: 4.277979678590782e-05, Gradient Norm: 0.3122001886367798)
+ Step... (28850 | Loss: 0.027777288109064102, Learning Rate: 4.272929436410777e-05, Gradient Norm: 0.39810433983802795)
+ Step... (28875 | Loss: 0.038595400750637054, Learning Rate: 4.267878466635011e-05, Gradient Norm: 0.38269856572151184)
+ Step... (28900 | Loss: 0.03247010335326195, Learning Rate: 4.2628282244550064e-05, Gradient Norm: 0.4291834831237793)
+ Step... (28925 | Loss: 0.014019368216395378, Learning Rate: 4.257777982275002e-05, Gradient Norm: 0.2331201434135437)
+ Step... (28950 | Loss: 0.017961695790290833, Learning Rate: 4.2527270124992356e-05, Gradient Norm: 0.3652317225933075)
+ Step... (28975 | Loss: 0.03309032693505287, Learning Rate: 4.247676770319231e-05, Gradient Norm: 0.43095332384109497)
+ Step... (29000 | Loss: 0.01431436650454998, Learning Rate: 4.2426261643413454e-05, Gradient Norm: 0.3215678930282593)
+ Step... (29025 | Loss: 0.030018476769328117, Learning Rate: 4.23757555836346e-05, Gradient Norm: 0.41584596037864685)
+ Step... (29050 | Loss: 0.02051413059234619, Learning Rate: 4.2325249523855746e-05, Gradient Norm: 0.43135276436805725)
+ Step... (29075 | Loss: 0.03962719812989235, Learning Rate: 4.22747471020557e-05, Gradient Norm: 0.46567052602767944)
+ Step... (29100 | Loss: 0.02515358477830887, Learning Rate: 4.2224241042276844e-05, Gradient Norm: 1.7703990936279297)
+ Step... (29125 | Loss: 0.0183755811303854, Learning Rate: 4.217373498249799e-05, Gradient Norm: 0.46105656027793884)
+ Step... (29150 | Loss: 0.023963920772075653, Learning Rate: 4.212323256069794e-05, Gradient Norm: 0.3423953354358673)
+ Step... (29175 | Loss: 0.02140207402408123, Learning Rate: 4.207272286294028e-05, Gradient Norm: 0.29520702362060547)
+ Step... (29200 | Loss: 0.03680615872144699, Learning Rate: 4.2022220441140234e-05, Gradient Norm: 1.6429659128189087)
+ Step... (29225 | Loss: 0.03201254457235336, Learning Rate: 4.197171801934019e-05, Gradient Norm: 0.355790913105011)
+ Step... (29250 | Loss: 0.01865505427122116, Learning Rate: 4.1921208321582526e-05, Gradient Norm: 0.37935853004455566)
+ Step... (29275 | Loss: 0.04450344666838646, Learning Rate: 4.187070589978248e-05, Gradient Norm: 0.42739537358283997)
+ Step... (29300 | Loss: 0.02708597481250763, Learning Rate: 4.182020347798243e-05, Gradient Norm: 0.4142377972602844)
+ Step... (29325 | Loss: 0.017092134803533554, Learning Rate: 4.176969378022477e-05, Gradient Norm: 0.2180645912885666)
+ Step... (29350 | Loss: 0.02320004440844059, Learning Rate: 4.171919135842472e-05, Gradient Norm: 0.4315558671951294)
+ Step... (29375 | Loss: 0.022841814905405045, Learning Rate: 4.1668688936624676e-05, Gradient Norm: 0.2843684256076813)
+ Step... (29400 | Loss: 0.012752613984048367, Learning Rate: 4.1618179238867015e-05, Gradient Norm: 0.2700633704662323)
+ Step... (29425 | Loss: 0.032406099140644073, Learning Rate: 4.156767681706697e-05, Gradient Norm: 0.3672173023223877)
+ Step... (29450 | Loss: 0.013478636741638184, Learning Rate: 4.151717439526692e-05, Gradient Norm: 0.36269333958625793)
+ Step... (29475 | Loss: 0.02877821959555149, Learning Rate: 4.146666469750926e-05, Gradient Norm: 1.2450404167175293)
+ Step... (29500 | Loss: 0.026545371860265732, Learning Rate: 4.141616227570921e-05, Gradient Norm: 0.7389810085296631)
+ Step... (29525 | Loss: 0.04219873994588852, Learning Rate: 4.1365659853909165e-05, Gradient Norm: 0.48379987478256226)
+ Step... (29550 | Loss: 0.025202322751283646, Learning Rate: 4.13151501561515e-05, Gradient Norm: 0.36057695746421814)
+ Step... (29575 | Loss: 0.036751795560121536, Learning Rate: 4.1264647734351456e-05, Gradient Norm: 0.3288840651512146)
+ Step... (29600 | Loss: 0.04314383491873741, Learning Rate: 4.12141416745726e-05, Gradient Norm: 1.1218085289001465)
+ Step... (29625 | Loss: 0.02685457281768322, Learning Rate: 4.116363561479375e-05, Gradient Norm: 0.3273230791091919)
+ Step... (29650 | Loss: 0.017800813540816307, Learning Rate: 4.1113129555014893e-05, Gradient Norm: 0.41265296936035156)
+ Step... (29675 | Loss: 0.023478511720895767, Learning Rate: 4.1062627133214846e-05, Gradient Norm: 0.2817476987838745)
+ Step... (29700 | Loss: 0.018604634329676628, Learning Rate: 4.1012117435457185e-05, Gradient Norm: 0.3600842356681824)
+ Step... (29725 | Loss: 0.01911950670182705, Learning Rate: 4.096161501365714e-05, Gradient Norm: 0.24088819324970245)
+ Step... (29750 | Loss: 0.010041565634310246, Learning Rate: 4.091111259185709e-05, Gradient Norm: 0.26652786135673523)
+ Step... (29775 | Loss: 0.020387249067425728, Learning Rate: 4.086060289409943e-05, Gradient Norm: 0.2445187121629715)
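
The learning rate in the step logs above falls by almost exactly 2.02e-9 per step, which is consistent with a linear decay to zero at step 50,000 from a peak of roughly 1e-4. A minimal sketch that approximately reproduces the logged values, assuming a linear warmup/decay schedule with a 1e-4 peak and 500 warmup steps (both inferred from the log, not confirmed by the script):

    import optax

    # Assumed values, inferred from the logged learning rates above.
    peak_lr, warmup_steps, total_steps = 1e-4, 500, 50_000
    schedule = optax.join_schedules(
        schedules=[
            optax.linear_schedule(0.0, peak_lr, warmup_steps),
            optax.linear_schedule(peak_lr, 0.0, total_steps - warmup_steps),
        ],
        boundaries=[warmup_steps],
    )
    print(schedule(28100))  # ~4.4242e-05, close to the 4.4244e-05 logged at step 28100
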
+ Training...: 83% 3641/4393 [4:47:16<48:57, 3.91s/it]
+ Evaluating ...: 0% 0/85 [00:00<?, ?it/s]
+ Step... (29825 | Loss: 0.01673395000398159, Learning Rate: 4.0759598050499335e-05, Gradient Norm: 0.21883606910705566)
+ Step... (29850 | Loss: 0.03240591287612915, Learning Rate: 4.0709088352741674e-05, Gradient Norm: 0.40065884590148926)
+ Step... (29875 | Loss: 0.019736234098672867, Learning Rate: 4.0658585930941626e-05, Gradient Norm: 0.32982364296913147)
+ Step... (29900 | Loss: 0.018150361254811287, Learning Rate: 4.0608076233183965e-05, Gradient Norm: 0.387714684009552)
+ Step... (29925 | Loss: 0.018563171848654747, Learning Rate: 4.055757381138392e-05, Gradient Norm: 0.22470852732658386)
+ Step... (29950 | Loss: 0.01995537057518959, Learning Rate: 4.050707138958387e-05, Gradient Norm: 0.373696506023407)
+ Step... (29975 | Loss: 0.02121199481189251, Learning Rate: 4.045656169182621e-05, Gradient Norm: 0.24509090185165405)
+ /home/sanchitgandhi/hf/lib/python3.8/site-packages/flax/jax_utils.py:312: FutureWarning: jax.tree_map is deprecated, and will be removed in a future release. Use jax.tree_util.tree_map instead.
+ return jax.tree_map(pad, tree)
+ /home/sanchitgandhi/hf/lib/python3.8/site-packages/flax/jax_utils.py:321: FutureWarning: jax.tree_map is deprecated, and will be removed in a future release. Use jax.tree_util.tree_map instead.
+ return out if static_return else jax.tree_map(unpad, out)
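
The FutureWarnings above come from flax's jax_utils calling the deprecated jax.tree_map alias while padding eval batches; the fix is the mechanical rename the warning itself suggests. A minimal illustration:

    import jax

    tree = {"loss": 1.0, "metrics": (2.0, 3.0)}
    # Deprecated spelling (warns, later removed): jax.tree_map(f, tree)
    doubled = jax.tree_util.tree_map(lambda x: x * 2, tree)
    print(doubled)  # {'loss': 2.0, 'metrics': (4.0, 6.0)}
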
+ device_metrics = jax.tree_map(lambda x: x[0], device_metrics)
+ /home/sanchitgandhi/hf/lib/python3.8/site-packages/flax/training/common_utils.py:45: FutureWarning: jax.tree_map is deprecated, and will be removed in a future release. Use jax.tree_util.tree_map instead.
+ return jax.tree_map(stack_args, *forest)
+ run_flax_speech_recognition_seq2seq.py:1392: FutureWarning: jax.tree_map is deprecated, and will be removed in a future release. Use jax.tree_util.tree_map instead.
+ eval_metrics = jax.tree_map(jnp.mean, eval_metrics)
+ Step... (20000/50000 | Eval Loss: 1.021510124206543 | run_flax_speech_recognition_seq2seq.py:1425: FutureWarning: jax.tree_map is deprecated, and will be removed in a future release. Use jax.tree_util.tree_map instead.
+ params = jax.device_get(jax.tree_map(lambda x: x[0], state.params))
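
The params = jax.device_get(jax.tree_map(lambda x: x[0], state.params)) line above is the standard way to pull pmap-replicated parameters back to the host before checkpointing: each leaf carries a leading device axis, so indexing [0] takes one copy and device_get materializes it as NumPy. A self-contained sketch of the same pattern (the script's actual train state is not shown here, so a fake replicated tree stands in for state.params):

    import jax
    import jax.numpy as jnp
    from flax import jax_utils

    # Fake "replicated" params with a leading device axis, as pmap would produce.
    replicated = jax_utils.replicate({"w": jnp.ones((2, 2))})
    params = jax.device_get(jax.tree_util.tree_map(lambda x: x[0], replicated))
    # flax.jax_utils.unreplicate(replicated) is the equivalent one-liner.
    print(params["w"].shape)  # (2, 2) -- device axis stripped
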
+ Configuration saved in /home/sanchitgandhi/flax-wav2vec2-2-bart-large-ls-960h-black-box/config.json
+ tcmalloc: large alloc 2586787840 bytes == 0x36da2e000 @ 0x7f7cba873680 0x7f7cba893bdd 0x7f7b690721ff 0x7f7b6908142c 0x7f7b6908241d 0x7f7b6908241d 0x7f7b6908241d 0x7f7b6908241d 0x7f7b6908241d 0x7f7b6908241d 0x7f7b6908241d 0x7f7b6907c164 0x7f7b6907c91e 0x505166 0x56bbfa 0x569dba 0x5f6eb3 0x56cc1f 0x569dba 0x5f6eb3 0x56cc1f 0x5f6cd6 0x56bacd 0x569dba 0x50bca0 0x56cc1f 0x569dba 0x5f6eb3 0x56bacd 0x569dba 0x5f6eb3
+ Model weights saved in /home/sanchitgandhi/flax-wav2vec2-2-bart-large-ls-960h-black-box/flax_model.msgpack
+ tokenizer config file saved in ./tokenizer_config.json
+ Special tokens file saved in ./special_tokens_map.json
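
The ~2.59 GB tcmalloc allocation lines up with serializing the ~2.35 GB flax_model.msgpack: Flax's msgpack writer materializes the whole parameter tree as one bytes object before it hits disk. A minimal sketch of that serialization step (illustrative only; judging by the "Model weights saved in ..." message, the script's actual save goes through save_pretrained):

    import jax.numpy as jnp
    from flax import serialization

    params = {"dense": {"kernel": jnp.ones((4, 4)), "bias": jnp.zeros(4)}}
    blob = serialization.to_bytes(params)  # whole tree held in memory at once
    with open("flax_model.msgpack", "wb") as f:
        f.write(blob)
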
+ huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
+ To disable this warning, you can either:
+ - Avoid using `tokenizers` before the fork if possible
+ - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
+ huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
+ To disable this warning, you can either:
+ - Avoid using `tokenizers` before the fork if possible
+ - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
+ huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
+ To disable this warning, you can either:
+ - Avoid using `tokenizers` before the fork if possible
+ - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
+ huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
+ To disable this warning, you can either:
+ - Avoid using `tokenizers` before the fork if possible
+ - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
+ huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
+ To disable this warning, you can either:
+ - Avoid using `tokenizers` before the fork if possible
+ - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
+ huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
  To disable this warning, you can either:
  - Avoid using `tokenizers` before the fork if possible
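
The fork warning repeats once per forked worker and is harmless here; the log itself states the remedy. Setting the variable before any tokenizer use in the parent process silences it:

    import os

    # Must be set before `tokenizers` is used in the process that forks.
    os.environ["TOKENIZERS_PARALLELISM"] = "false"
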
wandb/run-20220828_085247-2hx8pk65/files/wandb-summary.json CHANGED
@@ -1 +1 @@
- {"train/decoder_grad_norm": 0.24299649894237518, "train/decoder_param_norm": 1060.7144775390625, "train/encoder_grad_norm": 0.20758813619613647, "train/encoder_param_norm": 2320.3076171875, "train/grad_norm": 0.3195936977863312, "layer_grad_norm/": {"decoder": {"model": {"decoder": {"embed_positions": {"embedding": 0.015636462718248367}, "embed_tokens": {"embedding": 0.08405420929193497}, "layernorm_embedding": {"bias": 0.006145514082163572, "scale": 0.004968359600752592}, "layers": {"FlaxBartDecoderLayers": {"encoder_attn": {"k_proj": {"bias": 6.667326488241088e-06, "kernel": 0.014865289442241192}, "out_proj": {"bias": 0.01036052592098713, "kernel": 0.049787431955337524}, "q_proj": {"bias": 0.0007108663558028638, "kernel": 0.015181932598352432}, "v_proj": {"bias": 0.019716547802090645, "kernel": 0.03999572619795799}}, "encoder_attn_layer_norm": {"bias": 0.015086976811289787, "scale": 0.016625721007585526}, "fc1": {"bias": 0.0057020955719053745, "kernel": 0.14046090841293335}, "fc2": {"bias": 0.0147400489076972, "kernel": 0.12780840694904327}, "final_layer_norm": {"bias": 0.032790299504995346, "scale": 0.03732256218791008}, "self_attn": {"k_proj": {"bias": 2.526882781239692e-06, "kernel": 0.01344823744148016}, "out_proj": {"bias": 0.020945662632584572, "kernel": 0.0472058430314064}, "q_proj": {"bias": 0.001085575670003891, "kernel": 0.014002838172018528}, "v_proj": {"bias": 0.020695650950074196, "kernel": 0.06064840778708458}}, "self_attn_layer_norm": {"bias": 0.009373994544148445, "scale": 0.011083677411079407}}}}}}, "encoder": {"adapter": {"layers": {"0": {"conv": {"bias": 0.02808896079659462, "kernel": 0.06710191071033478}}, "1": {"conv": {"bias": 0.02258674055337906, "kernel": 0.046897199004888535}}, "2": {"conv": {"bias": 0.02590387687087059, "kernel": 0.07143399119377136}}}}, "encoder": {"layer_norm": {"bias": 0.09637241810560226, "scale": 0.0566645972430706}, "layers": {"FlaxWav2Vec2EncoderLayers": {"attention": {"k_proj": {"bias": 2.509430260033696e-06, "kernel": 0.019797371700406075}, "out_proj": {"bias": 0.0026633520610630512, "kernel": 0.042062435299158096}, "q_proj": {"bias": 0.003061411203816533, "kernel": 0.01928599737584591}, "v_proj": {"bias": 0.011404206976294518, "kernel": 0.04114246740937233}}, "feed_forward": {"intermediate_dense": {"bias": 0.006081512663513422, "kernel": 0.05225696042180061}, "output_dense": {"bias": 0.002437157789245248, "kernel": 0.045792415738105774}}, "final_layer_norm": {"bias": 0.031238090246915817, "scale": 0.03307477384805679}, "layer_norm": {"bias": 0.04994071274995804, "scale": 0.04129469022154808}}}, "pos_conv_embed": {"conv": {"bias": 0.0008167774649336934, "weight_g": 0.0036110610235482454, "weight_v": 0.012701401486992836}}}, "feature_extractor": {"conv_layers": {"0": {"conv": {"bias": 0.0, "kernel": 0.0}, "layer_norm": {"bias": 0.0, "scale": 0.0}}, "1": {"conv": {"bias": 0.0, "kernel": 0.0}, "layer_norm": {"bias": 0.0, "scale": 0.0}}, "2": {"conv": {"bias": 0.0, "kernel": 0.0}, "layer_norm": {"bias": 0.0, "scale": 0.0}}, "3": {"conv": {"bias": 0.0, "kernel": 0.0}, "layer_norm": {"bias": 0.0, "scale": 0.0}}, "4": {"conv": {"bias": 0.0, "kernel": 0.0}, "layer_norm": {"bias": 0.0, "scale": 0.0}}, "5": {"conv": {"bias": 0.0, "kernel": 0.0}, "layer_norm": {"bias": 0.0, "scale": 0.0}}, "6": {"conv": {"bias": 0.0, "kernel": 0.0}, "layer_norm": {"bias": 0.0, "scale": 0.0}}}}, "feature_projection": {"layer_norm": {"bias": 0.0036688754335045815, "scale": 0.004511996638029814}, "projection": {"bias": 0.0011850270675495267, "kernel": 0.033120427280664444}}, "masked_spec_embed": 0.0}}, "layer_param_norm/": {"decoder": {"model": {"decoder": {"embed_positions": {"embedding": 58.625732421875}, "embed_tokens": {"embedding": 628.6316528320312}, "layernorm_embedding": {"bias": 2.4100914001464844, "scale": 13.898597717285156}, "layers": {"FlaxBartDecoderLayers": {"encoder_attn": {"k_proj": {"bias": 47.98827362060547, "kernel": 330.9457092285156}, "out_proj": {"bias": 6.15840482711792, "kernel": 227.3690948486328}, "q_proj": {"bias": 20.846906661987305, "kernel": 337.8941650390625}, "v_proj": {"bias": 3.635528802871704, "kernel": 231.59678649902344}}, "encoder_attn_layer_norm": {"bias": 10.790207862854004, "scale": 57.176551818847656}, "fc1": {"bias": 25.858234405517578, "kernel": 343.7656555175781}, "fc2": {"bias": 7.86165714263916, "kernel": 246.748291015625}, "final_layer_norm": {"bias": 3.946004867553711, "scale": 63.51469421386719}, "self_attn": {"k_proj": {"bias": 59.53274917602539, "kernel": 279.572998046875}, "out_proj": {"bias": 3.8042359352111816, "kernel": 132.37606811523438}, "q_proj": {"bias": 32.1674919128418, "kernel": 282.6742248535156}, "v_proj": {"bias": 2.5869321823120117, "kernel": 140.7692108154297}}, "self_attn_layer_norm": {"bias": 8.874273300170898, "scale": 84.6916732788086}}}}}}, "encoder": {"adapter": {"layers": {"0": {"conv": {"bias": 0.9373133182525635, "kernel": 60.65754699707031}}, "1": {"conv": {"bias": 1.0780301094055176, "kernel": 58.4447135925293}}, "2": {"conv": {"bias": 1.3075517416000366, "kernel": 58.37184524536133}}}}, "encoder": {"layer_norm": {"bias": 0.29216912388801575, "scale": 4.3043694496154785}, "layers": {"FlaxWav2Vec2EncoderLayers": {"attention": {"k_proj": {"bias": 19.379070281982422, "kernel": 552.49365234375}, "out_proj": {"bias": 16.84808349609375, "kernel": 704.6502075195312}, "q_proj": {"bias": 40.86162567138672, "kernel": 545.0123291015625}, "v_proj": {"bias": 15.593074798583984, "kernel": 696.20166015625}}, "feed_forward": {"intermediate_dense": {"bias": 24.4946231842041, "kernel": 1376.0777587890625}, "output_dense": {"bias": 20.806180953979492, "kernel": 1302.1185302734375}}, "final_layer_norm": {"bias": 32.5095329284668, "scale": 141.84854125976562}, "layer_norm": {"bias": 7.295251846313477, "scale": 45.60984420776367}}}, "pos_conv_embed": {"conv": {"bias": 15.245718002319336, "weight_g": 21.039236068725586, "weight_v": 213.549072265625}}}, "feature_extractor": {"conv_layers": {"0": {"conv": {"bias": 0.5982058644294739, "kernel": 8.08896541595459}, "layer_norm": {"bias": 10.069783210754395, "scale": 10.451257705688477}}, "1": {"conv": {"bias": 4.74075174331665, "kernel": 90.8435287475586}, "layer_norm": {"bias": 6.922820091247559, "scale": 19.5467586517334}}, "2": {"conv": {"bias": 6.7732415199279785, "kernel": 146.13897705078125}, "layer_norm": {"bias": 9.044225692749023, "scale": 19.424888610839844}}, "3": {"conv": {"bias": 5.224758148193359, "kernel": 159.10508728027344}, "layer_norm": {"bias": 8.319666862487793, "scale": 17.64743423461914}}, "4": {"conv": {"bias": 4.434978008270264, "kernel": 157.35813903808594}, "layer_norm": {"bias": 9.193974494934082, "scale": 15.562357902526855}}, "5": {"conv": {"bias": 5.297643661499023, "kernel": 131.1835174560547}, "layer_norm": {"bias": 10.735219955444336, "scale": 13.812533378601074}}, "6": {"conv": {"bias": 5.615579128265381, "kernel": 136.41822814941406}, "layer_norm": {"bias": 12.515308380126953, "scale": 11.152680397033691}}}}, "feature_projection": {"layer_norm": {"bias": 9.315728187561035, "scale": 27.725435256958008}, "projection": {"bias": 4.307735443115234, "kernel": 88.24262237548828}}, "masked_spec_embed": 26.247730255126953}}, "train/learning_rate": 6.065858542569913e-05, "train/loss": 0.04709053412079811, "train/param_norm": 2551.262939453125, "_timestamp": 1661775899, "_runtime": 99132, "_step": 19975, "eval/loss": 0.6138997077941895, "eval/wer": 0.05543913826697548, "eval/cer": 0.039964500651745845, "eval/step_10k": {"_type": "table-file", "sha256": "8b44e8a00a036a18ffdf81b4d076c8bf849ea6649001c69e94fa439b14f110ee", "size": 26434, "artifact_path": "wandb-client-artifact://18m0dj4hts3yiat04x5pvmncavkjapd5wb8bznb37vw8c0lqna3m2yjd1wtdrfstuoo7ejt2sphvjo0zuw1e5ne5d3qbkd7c1fylclfggig6us5tsmsj2uum5pchx48n:latest/eval/step_10k.table.json", "_latest_artifact_path": "wandb-client-artifact://18m0dj4hts3yiat04x5pvmncavkjapd5wb8bznb37vw8c0lqna3m2yjd1wtdrfstuoo7ejt2sphvjo0zuw1e5ne5d3qbkd7c1fylclfggig6us5tsmsj2uum5pchx48n:latest/eval/step_10k.table.json", "path": "media/table/eval/step_10k_10000_8b44e8a00a036a18ffdf.table.json", "ncols": 7, "nrows": 50}}
+ {"train/decoder_grad_norm": 0.17611896991729736, "train/decoder_param_norm": 1062.4339599609375, "train/encoder_grad_norm": 0.1704455018043518, "train/encoder_param_norm": 2322.47119140625, "train/grad_norm": 0.24509090185165405, "layer_grad_norm/": {"decoder": {"model": {"decoder": {"embed_positions": {"embedding": 0.006716505624353886}, "embed_tokens": {"embedding": 0.0642298087477684}, "layernorm_embedding": {"bias": 0.002760232426226139, "scale": 0.002035953802987933}, "layers": {"FlaxBartDecoderLayers": {"encoder_attn": {"k_proj": {"bias": 4.814476142200874e-06, "kernel": 0.009996717795729637}, "out_proj": {"bias": 0.007056824862957001, "kernel": 0.035733651369810104}, "q_proj": {"bias": 0.00045872520422562957, "kernel": 0.010119627229869366}, "v_proj": {"bias": 0.014002962969243526, "kernel": 0.028897128999233246}}, "encoder_attn_layer_norm": {"bias": 0.010591115802526474, "scale": 0.01078032236546278}, "fc1": {"bias": 0.004002838861197233, "kernel": 0.09873048961162567}, "fc2": {"bias": 0.010417253710329533, "kernel": 0.09346094727516174}, "final_layer_norm": {"bias": 0.02925114333629608, "scale": 0.025131691247224808}, "self_attn": {"k_proj": {"bias": 1.495998731115833e-06, "kernel": 0.009474096819758415}, "out_proj": {"bias": 0.016716178506612778, "kernel": 0.03129136934876442}, "q_proj": {"bias": 0.0007107162964530289, "kernel": 0.00879898015409708}, "v_proj": {"bias": 0.01823665015399456, "kernel": 0.04575955122709274}}, "self_attn_layer_norm": {"bias": 0.006624831352382898, "scale": 0.0074623520486056805}}}}}}, "encoder": {"adapter": {"layers": {"0": {"conv": {"bias": 0.024443458765745163, "kernel": 0.04797496274113655}}, "1": {"conv": {"bias": 0.018976185470819473, "kernel": 0.03335544094443321}}, "2": {"conv": {"bias": 0.02025711163878441, "kernel": 0.05017269030213356}}}}, "encoder": {"layer_norm": {"bias": 0.09079575538635254, "scale": 0.0370769128203392}, "layers": {"FlaxWav2Vec2EncoderLayers": {"attention": {"k_proj": {"bias": 2.546356427046703e-06, "kernel": 0.018089286983013153}, "out_proj": {"bias": 0.002156489295884967, "kernel": 0.034899111837148666}, "q_proj": {"bias": 0.0026696091517806053, "kernel": 0.017847701907157898}, "v_proj": {"bias": 0.009628637693822384, "kernel": 0.03381096571683884}}, "feed_forward": {"intermediate_dense": {"bias": 0.004856355953961611, "kernel": 0.04390779137611389}, "output_dense": {"bias": 0.001986933406442404, "kernel": 0.03847503662109375}}, "final_layer_norm": {"bias": 0.02509871870279312, "scale": 0.028251897543668747}, "layer_norm": {"bias": 0.043583061546087265, "scale": 0.03362990543246269}}}, "pos_conv_embed": {"conv": {"bias": 0.0006979878526180983, "weight_g": 0.002805550117045641, "weight_v": 0.011207042261958122}}}, "feature_extractor": {"conv_layers": {"0": {"conv": {"bias": 0.0, "kernel": 0.0}, "layer_norm": {"bias": 0.0, "scale": 0.0}}, "1": {"conv": {"bias": 0.0, "kernel": 0.0}, "layer_norm": {"bias": 0.0, "scale": 0.0}}, "2": {"conv": {"bias": 0.0, "kernel": 0.0}, "layer_norm": {"bias": 0.0, "scale": 0.0}}, "3": {"conv": {"bias": 0.0, "kernel": 0.0}, "layer_norm": {"bias": 0.0, "scale": 0.0}}, "4": {"conv": {"bias": 0.0, "kernel": 0.0}, "layer_norm": {"bias": 0.0, "scale": 0.0}}, "5": {"conv": {"bias": 0.0, "kernel": 0.0}, "layer_norm": {"bias": 0.0, "scale": 0.0}}, "6": {"conv": {"bias": 0.0, "kernel": 0.0}, "layer_norm": {"bias": 0.0, "scale": 0.0}}}}, "feature_projection": {"layer_norm": {"bias": 0.0032904886174947023, "scale": 0.003584273625165224}, "projection": {"bias": 0.0010678826365619898, "kernel": 0.03113219328224659}}, "masked_spec_embed": 0.0}}, "layer_param_norm/": {"decoder": {"model": {"decoder": {"embed_positions": {"embedding": 58.647159576416016}, "embed_tokens": {"embedding": 628.4832763671875}, "layernorm_embedding": {"bias": 2.4181270599365234, "scale": 13.876160621643066}, "layers": {"FlaxBartDecoderLayers": {"encoder_attn": {"k_proj": {"bias": 47.96188735961914, "kernel": 331.3609313964844}, "out_proj": {"bias": 6.137172698974609, "kernel": 227.73728942871094}, "q_proj": {"bias": 20.867891311645508, "kernel": 338.29461669921875}, "v_proj": {"bias": 3.5909812450408936, "kernel": 231.93850708007812}}, "encoder_attn_layer_norm": {"bias": 10.96327018737793, "scale": 57.53877639770508}, "fc1": {"bias": 26.014301300048828, "kernel": 345.999755859375}, "fc2": {"bias": 7.84496545791626, "kernel": 248.36624145507812}, "final_layer_norm": {"bias": 3.921175003051758, "scale": 63.50761413574219}, "self_attn": {"k_proj": {"bias": 59.543113708496094, "kernel": 279.90435791015625}, "out_proj": {"bias": 3.776594638824463, "kernel": 132.7462158203125}, "q_proj": {"bias": 32.19286346435547, "kernel": 283.0003662109375}, "v_proj": {"bias": 2.568603992462158, "kernel": 141.117431640625}}, "self_attn_layer_norm": {"bias": 8.895291328430176, "scale": 84.70443725585938}}}}}}, "encoder": {"adapter": {"layers": {"0": {"conv": {"bias": 1.1401727199554443, "kernel": 62.00471878051758}}, "1": {"conv": {"bias": 1.3068256378173828, "kernel": 59.942413330078125}}, "2": {"conv": {"bias": 1.5166486501693726, "kernel": 59.78715133666992}}}}, "encoder": {"layer_norm": {"bias": 0.2932465970516205, "scale": 4.228818893432617}, "layers": {"FlaxWav2Vec2EncoderLayers": {"attention": {"k_proj": {"bias": 19.381807327270508, "kernel": 553.2210083007812}, "out_proj": {"bias": 16.85451316833496, "kernel": 705.1119995117188}, "q_proj": {"bias": 40.90638732910156, "kernel": 545.735107421875}, "v_proj": {"bias": 15.575094223022461, "kernel": 696.6250610351562}}, "feed_forward": {"intermediate_dense": {"bias": 24.463790893554688, "kernel": 1377.1727294921875}, "output_dense": {"bias": 20.80949592590332, "kernel": 1303.4677734375}}, "final_layer_norm": {"bias": 32.52007293701172, "scale": 141.95835876464844}, "layer_norm": {"bias": 7.280069828033447, "scale": 45.696510314941406}}}, "pos_conv_embed": {"conv": {"bias": 15.224443435668945, "weight_g": 21.051162719726562, "weight_v": 213.89393615722656}}}, "feature_extractor": {"conv_layers": {"0": {"conv": {"bias": 0.5982058644294739, "kernel": 8.08896541595459}, "layer_norm": {"bias": 10.069783210754395, "scale": 10.451257705688477}}, "1": {"conv": {"bias": 4.74075174331665, "kernel": 90.8435287475586}, "layer_norm": {"bias": 6.922820091247559, "scale": 19.5467586517334}}, "2": {"conv": {"bias": 6.7732415199279785, "kernel": 146.13897705078125}, "layer_norm": {"bias": 9.044225692749023, "scale": 19.424888610839844}}, "3": {"conv": {"bias": 5.224758148193359, "kernel": 159.10508728027344}, "layer_norm": {"bias": 8.319666862487793, "scale": 17.64743423461914}}, "4": {"conv": {"bias": 4.434978008270264, "kernel": 157.35813903808594}, "layer_norm": {"bias": 9.193974494934082, "scale": 15.562357902526855}}, "5": {"conv": {"bias": 5.297643661499023, "kernel": 131.1835174560547}, "layer_norm": {"bias": 10.735219955444336, "scale": 13.812533378601074}}, "6": {"conv": {"bias": 5.615579128265381, "kernel": 136.41822814941406}, "layer_norm": {"bias": 12.515308380126953, "scale": 11.152680397033691}}}}, "feature_projection": {"layer_norm": {"bias": 9.262188911437988, "scale": 27.640396118164062}, "projection": {"bias": 4.317654132843018, "kernel": 88.17610931396484}}, "masked_spec_embed": 26.247730255126953}}, "train/learning_rate": 4.045656169182621e-05, "train/loss": 0.02121199481189251, "train/param_norm": 2553.945556640625, "_timestamp": 1661823749, "_runtime": 146982, "_step": 29975, "eval/loss": 1.021510124206543, "eval/wer": 0.05054961214661226, "eval/cer": 0.0362100285658818, "eval/step_10k": {"_type": "table-file", "sha256": "8b44e8a00a036a18ffdf81b4d076c8bf849ea6649001c69e94fa439b14f110ee", "size": 26434, "artifact_path": "wandb-client-artifact://18m0dj4hts3yiat04x5pvmncavkjapd5wb8bznb37vw8c0lqna3m2yjd1wtdrfstuoo7ejt2sphvjo0zuw1e5ne5d3qbkd7c1fylclfggig6us5tsmsj2uum5pchx48n:latest/eval/step_10k.table.json", "_latest_artifact_path": "wandb-client-artifact://18m0dj4hts3yiat04x5pvmncavkjapd5wb8bznb37vw8c0lqna3m2yjd1wtdrfstuoo7ejt2sphvjo0zuw1e5ne5d3qbkd7c1fylclfggig6us5tsmsj2uum5pchx48n:latest/eval/step_10k.table.json", "path": "media/table/eval/step_10k_10000_8b44e8a00a036a18ffdf.table.json", "ncols": 7, "nrows": 50}, "eval/step_20k": {"_type": "table-file", "sha256": "a0a50c5d8793ca99e4646f70c3624f8742c5285825bc1c59ab4083ac4de9d6e3", "size": 26657, "artifact_path": "wandb-client-artifact://13ri9hnxp93kf7dsdol2hs1j0v7bpkwwvujpi27awdck0fjm6vfog0dun9k9toif5xrt3cijlotddakikiw0bnbo3go679b4d2spq9c0w865vq0k9auiszkkbvev62fc:latest/eval/step_20k.table.json", "_latest_artifact_path": "wandb-client-artifact://13ri9hnxp93kf7dsdol2hs1j0v7bpkwwvujpi27awdck0fjm6vfog0dun9k9toif5xrt3cijlotddakikiw0bnbo3go679b4d2spq9c0w865vq0k9auiszkkbvev62fc:latest/eval/step_20k.table.json", "path": "media/table/eval/step_20k_20000_a0a50c5d8793ca99e464.table.json", "ncols": 7, "nrows": 50}}
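
Comparing the two summary blobs above: between the 20k and 30k checkpoints, eval/wer improved from ~0.0554 to ~0.0505 and eval/cer from ~0.0400 to ~0.0362, while eval/loss rose from ~0.614 to ~1.022, so beam-search quality kept improving even as the eval loss climbed. A small sketch for pulling those fields out of the file (path taken from this repo's layout):

    import json

    with open("wandb/run-20220828_085247-2hx8pk65/files/wandb-summary.json") as f:
        summary = json.load(f)
    for key in ("_step", "train/loss", "eval/loss", "eval/wer", "eval/cer"):
        print(key, summary[key])
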
wandb/run-20220828_085247-2hx8pk65/logs/debug-internal.log CHANGED
The diff for this file is too large to render. See raw diff
 
wandb/run-20220828_085247-2hx8pk65/run-2hx8pk65.wandb CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b64db33311cf3882e158023908e58b8d0a72a15e4fe5d8fa86d30c5ba22a41e2
- size 8691435
+ oid sha256:dbd52b7487584ca71d80fed0e3182d1bd195ab787ea787199645db598d730074
+ size 12917901
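
As with flax_model.msgpack, what is versioned here is a Git LFS pointer, not the binary itself; the run file grew from 8,691,435 to 12,917,901 bytes, consistent with roughly 10k more steps of metrics being appended between the 20k and 30k commits. A pointer file is just three "key value" lines, which makes it trivial to parse:

    # Parse the Git LFS pointer shown in the diff above.
    lines = [
        "version https://git-lfs.github.com/spec/v1",
        "oid sha256:dbd52b7487584ca71d80fed0e3182d1bd195ab787ea787199645db598d730074",
        "size 12917901",
    ]
    fields = dict(line.split(" ", 1) for line in lines)
    print(fields["oid"], fields["size"])  # sha256:dbd5... 12917901
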