bert-base-uncased-finetuned-math_punctuation

This model is a fine-tuned version of bert-base-uncased. The training dataset is not specified (the auto-generated card records it as None). It achieves the following results on the evaluation set:

  • Loss: 0.1688
  • Cls Report (values rounded to four decimals; overall accuracy: 0.9640 on 100737 tokens):
      • EMPTY: precision 0.9817, recall 0.9870, f1-score 0.9843, support 92149
      • '.': precision 0.7988, recall 0.7989, f1-score 0.7988, support 4834
      • ',': precision 0.6801, recall 0.5518, f1-score 0.6092, support 2666
      • '?': precision 0.7534, recall 0.7528, f1-score 0.7531, support 1088
      • macro avg: precision 0.8035, recall 0.7726, f1-score 0.7864, support 100737
      • weighted avg: precision 0.9624, recall 0.9640, f1-score 0.9630, support 100737
  • Precision (macro avg): 0.8035
  • Recall (macro avg): 0.7726
  • F Score (macro avg of per-class f1-score): 0.7864
  • AUC: 0.9378
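
The headline Precision, Recall and F Score match the macro avg entries of the Cls Report, that is, the unweighted mean over the four classes. The sketch below recomputes the macro and support-weighted averages from the per-class numbers reported above. The variable and function names are illustrative only and are not part of the original training code.

```python
# Per-class metrics copied from the Cls Report above:
# (precision, recall, f1-score, support)
report = {
    "EMPTY": (0.9816733583732677, 0.9870318722937851, 0.9843453228066947, 92149),
    ".": (0.7987590486039297, 0.7989242863053372, 0.7988416589099182, 4834),
    ",": (0.6800739713361073, 0.5517629407351838, 0.6092358666390557, 2666),
    "?": (0.7534498620055198, 0.7527573529411765, 0.7531034482758621, 1088),
}


def macro_avg(index):
    """Unweighted mean of one metric across classes."""
    values = [row[index] for row in report.values()]
    return sum(values) / len(values)


def weighted_avg(index):
    """Mean of one metric across classes, weighted by class support."""
    total_support = sum(row[3] for row in report.values())
    return sum(row[index] * row[3] for row in report.values()) / total_support


macro_precision = macro_avg(0)
macro_recall = macro_avg(1)
macro_f1 = macro_avg(2)
weighted_recall = weighted_avg(1)

print(f"macro precision: {macro_precision:.4f}")
print(f"macro recall:    {macro_recall:.4f}")
print(f"macro f1-score:  {macro_f1:.4f}")
print(f"weighted recall: {weighted_recall:.4f}")
```

Because EMPTY accounts for 92149 of the 100737 evaluated tokens, the weighted averages and the accuracy sit near 0.96, while the macro averages expose the weaker performance on ',' in particular.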

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 12
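
To make the linear scheduler concrete, here is a minimal sketch of how the learning rate would decay from 5e-05 to zero over training. It rests on two assumptions that the card does not state: no warmup steps, and a total step count estimated from the training results table (step 9500 is logged at epoch 11.66). The function name `linear_lr` is hypothetical.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Estimate (assumption): steps per epoch inferred from the logged step/epoch pair.
steps_per_epoch = round(9500 / 11.66)
total_steps = steps_per_epoch * 12  # num_epochs: 12

for step in (0, total_steps // 2, total_steps):
    print(step, linear_lr(step, total_steps))
```

With train_batch_size 16, the estimated steps per epoch would correspond to roughly 13,000 training examples on a single device, though the card does not confirm the dataset size or device count.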

Training results

Training Loss Epoch Step Validation Loss Cls Report Precision Recall F Score Auc
0.0872 0.61 500 0.0870 {'EMPTY': {'precision': 0.9790177368052186, 'recall': 0.9853498138883764, 'f1-score': 0.9821735697210293, 'support': 92149}, '.': {'precision': 0.7199926021823562, 'recall': 0.8053371948696731, 'f1-score': 0.7602773166682941, 'support': 4834}, ',': {'precision': 0.647912885662432, 'recall': 0.40172543135783945, 'f1-score': 0.49594813614262556, 'support': 2666}, '?': {'precision': 0.7650214592274678, 'recall': 0.6553308823529411, 'f1-score': 0.7059405940594059, 'support': 1088}, 'accuracy': 0.9577017381895431, 'macro avg': {'precision': 0.7779861709693687, 'recall': 0.7119358306172076, 'f1-score': 0.7360849041478388, 'support': 100737}, 'weighted avg': {'precision': 0.9555141484124913, 'recall': 0.9577017381895431, 'f1-score': 0.9556742202198848, 'support': 100737}} 0.7780 0.7119 0.7361 0.9357
0.0649 1.23 1000 0.0828 {'EMPTY': {'precision': 0.9791280947255113, 'recall': 0.9871078362217712, 'f1-score': 0.9831017730438966, 'support': 92149}, '.': {'precision': 0.7804928131416837, 'recall': 0.7863053371948697, 'f1-score': 0.7833882934872218, 'support': 4834}, ',': {'precision': 0.6437346437346437, 'recall': 0.4913728432108027, 'f1-score': 0.557328228036588, 'support': 2666}, '?': {'precision': 0.7918454935622318, 'recall': 0.6783088235294118, 'f1-score': 0.7306930693069307, 'support': 1088}, 'accuracy': 0.9610173024807172, 'macro avg': {'precision': 0.7988002612910177, 'recall': 0.7357737100392139, 'f1-score': 0.7636278409686592, 'support': 100737}, 'weighted avg': {'precision': 0.9586974152176492, 'recall': 0.9610173024807172, 'f1-score': 0.9595240617676797, 'support': 100737}} 0.7988 0.7358 0.7636 0.9422
0.0616 1.84 1500 0.0819 {'EMPTY': {'precision': 0.9811004732986104, 'recall': 0.9852847019501025, 'f1-score': 0.9831881358593543, 'support': 92149}, '.': {'precision': 0.7373263236950808, 'recall': 0.8123707074886223, 'f1-score': 0.7730314960629922, 'support': 4834}, ',': {'precision': 0.661874334398296, 'recall': 0.4662415603900975, 'f1-score': 0.5470950704225351, 'support': 2666}, '?': {'precision': 0.7820383451059536, 'recall': 0.7123161764705882, 'f1-score': 0.7455507455507455, 'support': 1088}, 'accuracy': 0.9603025700586676, 'macro avg': {'precision': 0.7905848691244852, 'recall': 0.7440532865748527, 'f1-score': 0.7622163619739069, 'support': 100737}, 'weighted avg': {'precision': 0.9588043882358697, 'recall': 0.9603025700586676, 'f1-score': 0.9589957260210037, 'support': 100737}} 0.7906 0.7441 0.7622 0.9428
0.0442 2.45 2000 0.0880 {'EMPTY': {'precision': 0.9839689595583041, 'recall': 0.9824740366146133, 'f1-score': 0.9832209298537127, 'support': 92149}, '.': {'precision': 0.79424216765453, 'recall': 0.7761688043028547, 'f1-score': 0.7851014856664574, 'support': 4834}, ',': {'precision': 0.585063752276867, 'recall': 0.6024006001500375, 'f1-score': 0.5936056181851784, 'support': 2666}, '?': {'precision': 0.6926131850675139, 'recall': 0.8014705882352942, 'f1-score': 0.7430762675756285, 'support': 1088}, 'accuracy': 0.9605606678777411, 'macro avg': {'precision': 0.7639720161393038, 'recall': 0.7906285073257, 'f1-score': 0.7762510753202443, 'support': 100737}, 'weighted avg': {'precision': 0.9611608981973231, 'recall': 0.9605606678777411, 'f1-score': 0.9608090930244636, 'support': 100737}} 0.7640 0.7906 0.7763 0.9480
0.0403 3.07 2500 0.1003 {'EMPTY': {'precision': 0.9812742167010242, 'recall': 0.9866412006641417, 'f1-score': 0.9839503901472927, 'support': 92149}, '.': {'precision': 0.7884338341222473, 'recall': 0.7925113777410012, 'f1-score': 0.7904673475704116, 'support': 4834}, ',': {'precision': 0.6488584474885845, 'recall': 0.5330082520630157, 'f1-score': 0.5852553542009885, 'support': 2666}, '?': {'precision': 0.7584541062801933, 'recall': 0.7215073529411765, 'f1-score': 0.7395195478097033, 'support': 1088}, 'accuracy': 0.9624566941640113, 'macro avg': {'precision': 0.7942551511480123, 'recall': 0.7584170458523338, 'f1-score': 0.7747981599320991, 'support': 100737}, 'weighted avg': {'precision': 0.9608165980480563, 'recall': 0.9624566941640113, 'f1-score': 0.9614744503226722, 'support': 100737}} 0.7943 0.7584 0.7748 0.9393
0.0309 3.68 3000 0.0944 {'EMPTY': {'precision': 0.981867888441049, 'recall': 0.9860660452093891, 'f1-score': 0.9839624889004396, 'support': 92149}, '.': {'precision': 0.7817371937639198, 'recall': 0.7987174182871328, 'f1-score': 0.790136089225417, 'support': 4834}, ',': {'precision': 0.659784138901924, 'recall': 0.5273818454613654, 'f1-score': 0.5861997081509277, 'support': 2666}, '?': {'precision': 0.7339857651245552, 'recall': 0.7582720588235294, 'f1-score': 0.7459312839059674, 'support': 1088}, 'accuracy': 0.9624765478424016, 'macro avg': {'precision': 0.7893437465578619, 'recall': 0.7676093419453542, 'f1-score': 0.7765573925456879, 'support': 100737}, 'weighted avg': {'precision': 0.9610631910159828, 'recall': 0.9624765478424016, 'f1-score': 0.9615638633652216, 'support': 100737}} 0.7893 0.7676 0.7766 0.9450
0.0188 4.29 3500 0.1144 {'EMPTY': {'precision': 0.9802141226142778, 'recall': 0.9876070277485377, 'f1-score': 0.9838966879827886, 'support': 92149}, '.': {'precision': 0.7962216090002122, 'recall': 0.7759619362846504, 'f1-score': 0.7859612362493452, 'support': 4834}, ',': {'precision': 0.6711409395973155, 'recall': 0.5251312828207052, 'f1-score': 0.5892255892255893, 'support': 2666}, '?': {'precision': 0.7253649635036497, 'recall': 0.7306985294117647, 'f1-score': 0.728021978021978, 'support': 1088}, 'accuracy': 0.962436840485621, 'macro avg': {'precision': 0.7932354086788638, 'recall': 0.7548496940664144, 'f1-score': 0.7717763728699254, 'support': 100737}, 'weighted avg': {'precision': 0.9604529146981596, 'recall': 0.962436840485621, 'f1-score': 0.9611899882855223, 'support': 100737}} 0.7932 0.7548 0.7718 0.9384
0.0201 4.91 4000 0.1079 {'EMPTY': {'precision': 0.9820999805317009, 'recall': 0.9853932218472257, 'f1-score': 0.9837438450329619, 'support': 92149}, '.': {'precision': 0.7719614921780987, 'recall': 0.7962350020686801, 'f1-score': 0.7839103869653768, 'support': 4834}, ',': {'precision': 0.6680672268907563, 'recall': 0.5367591897974494, 'f1-score': 0.5952579034941764, 'support': 2666}, '?': {'precision': 0.7176368375325803, 'recall': 0.7591911764705882, 'f1-score': 0.7378293881196962, 'support': 1088}, 'accuracy': 0.9620000595610352, 'macro avg': {'precision': 0.7849413842832841, 'recall': 0.7693946475459859, 'f1-score': 0.7751853809030529, 'support': 100737}, 'weighted avg': {'precision': 0.9608490332780492, 'recall': 0.9620000595610352, 'f1-score': 0.961217331581472, 'support': 100737}} 0.7849 0.7694 0.7752 0.9455
0.0125 5.52 4500 0.1295 {'EMPTY': {'precision': 0.9802362170783223, 'recall': 0.9871186882114836, 'f1-score': 0.9836654140420125, 'support': 92149}, '.': {'precision': 0.7850408548082967, 'recall': 0.7751344642118329, 'f1-score': 0.7800562090142604, 'support': 4834}, ',': {'precision': 0.6598960793575814, 'recall': 0.5240060015003751, 'f1-score': 0.5841522057286221, 'support': 2666}, '?': {'precision': 0.7573739295908658, 'recall': 0.7316176470588235, 'f1-score': 0.7442730247779334, 'support': 1088}, 'accuracy': 0.9619305716866693, 'macro avg': {'precision': 0.7956367702087666, 'recall': 0.7544692002456288, 'f1-score': 0.773036713390707, 'support': 100737}, 'weighted avg': {'precision': 0.9599847170618124, 'recall': 0.9619305716866693, 'f1-score': 0.9607363211567075, 'support': 100737}} 0.7956 0.7545 0.7730 0.9359
0.011 6.13 5000 0.1329 {'EMPTY': {'precision': 0.9830135739743687, 'recall': 0.9847203984850622, 'f1-score': 0.9838662459746934, 'support': 92149}, '.': {'precision': 0.7825380710659898, 'recall': 0.7972693421597021, 'f1-score': 0.789835024080336, 'support': 4834}, ',': {'precision': 0.629092416079569, 'recall': 0.5693923480870218, 'f1-score': 0.5977554636739515, 'support': 2666}, '?': {'precision': 0.7495412844036697, 'recall': 0.7509191176470589, 'f1-score': 0.7502295684113865, 'support': 1088}, 'accuracy': 0.962208523184133, 'macro avg': {'precision': 0.7860463363808993, 'recall': 0.7755753015947113, 'f1-score': 0.7804215755350918, 'support': 100737}, 'weighted avg': {'precision': 0.9615053869223467, 'recall': 0.962208523184133, 'f1-score': 0.9618136240240696, 'support': 100737}} 0.7860 0.7756 0.7804 0.9398
0.0089 6.75 5500 0.1353 {'EMPTY': {'precision': 0.9801606962066217, 'recall': 0.9875744717794007, 'f1-score': 0.9838536176653423, 'support': 92149}, '.': {'precision': 0.7881303174932559, 'recall': 0.7856847331402566, 'f1-score': 0.7869056251942402, 'support': 4834}, ',': {'precision': 0.6794234592445328, 'recall': 0.5127531882970743, 'f1-score': 0.5844377939290295, 'support': 2666}, '?': {'precision': 0.7528301886792453, 'recall': 0.7334558823529411, 'f1-score': 0.7430167597765363, 'support': 1088}, 'accuracy': 0.9625758162343528, 'macro avg': {'precision': 0.800136165405914, 'recall': 0.7548670688924182, 'f1-score': 0.7745534491412871, 'support': 100737}, 'weighted avg': {'precision': 0.9605316034538982, 'recall': 0.9625758162343528, 'f1-score': 0.961231148432892, 'support': 100737}} 0.8001 0.7549 0.7746 0.9347
0.0058 7.36 6000 0.1412 {'EMPTY': {'precision': 0.9816272115893218, 'recall': 0.9868256844892511, 'f1-score': 0.9842195837346985, 'support': 92149}, '.': {'precision': 0.8054963783553473, 'recall': 0.782167976830782, 'f1-score': 0.7936607892527289, 'support': 4834}, ',': {'precision': 0.6578260869565218, 'recall': 0.567516879219805, 'f1-score': 0.6093435360451067, 'support': 2666}, '?': {'precision': 0.7368896925858951, 'recall': 0.7490808823529411, 'f1-score': 0.7429352780309937, 'support': 1088}, 'accuracy': 0.963340182852378, 'macro avg': {'precision': 0.7954598423717714, 'recall': 0.7713978557231947, 'f1-score': 0.782539796765882, 'support': 100737}, 'weighted avg': {'precision': 0.9619626924275458, 'recall': 0.963340182852378, 'f1-score': 0.9625483201446381, 'support': 100737}} 0.7955 0.7714 0.7825 0.9414
0.0062 7.98 6500 0.1473 {'EMPTY': {'precision': 0.981030745505721, 'recall': 0.9872055041291821, 'f1-score': 0.9841084390787439, 'support': 92149}, '.': {'precision': 0.7817779565567177, 'recall': 0.8040959867604468, 'f1-score': 0.7927799306547011, 'support': 4834}, ',': {'precision': 0.6930792377131394, 'recall': 0.5183795948987246, 'f1-score': 0.5931330472103004, 'support': 2666}, '?': {'precision': 0.761996161228407, 'recall': 0.7297794117647058, 'f1-score': 0.7455399061032865, 'support': 1088}, 'accuracy': 0.9632309876212315, 'macro avg': {'precision': 0.8044710252509962, 'recall': 0.7598651243882648, 'f1-score': 0.778890330761758, 'support': 100737}, 'weighted avg': {'precision': 0.9614830487384137, 'recall': 0.9632309876212315, 'f1-score': 0.9620035027760906, 'support': 100737}} 0.8045 0.7599 0.7789 0.9366
0.0042 8.59 7000 0.1567 {'EMPTY': {'precision': 0.9827906976744186, 'recall': 0.9860009332711153, 'f1-score': 0.9843931982296761, 'support': 92149}, '.': {'precision': 0.8073005093378608, 'recall': 0.7869259412494828, 'f1-score': 0.7969830295411691, 'support': 4834}, ',': {'precision': 0.6395010395010395, 'recall': 0.5768942235558889, 'f1-score': 0.6065864720962334, 'support': 2666}, '?': {'precision': 0.7170940170940171, 'recall': 0.7711397058823529, 'f1-score': 0.7431355181576615, 'support': 1088}, 'accuracy': 0.9633004754955975, 'macro avg': {'precision': 0.7866715659018341, 'recall': 0.7802402009897099, 'f1-score': 0.782774554506185, 'support': 100737}, 'weighted avg': {'precision': 0.9624147902364303, 'recall': 0.9633004754955975, 'f1-score': 0.9627957529689442, 'support': 100737}} 0.7867 0.7802 0.7828 0.9384
0.0036 9.2 7500 0.1570 {'EMPTY': {'precision': 0.9826980264936469, 'recall': 0.9861745651065122, 'f1-score': 0.9844332264494322, 'support': 92149}, '.': {'precision': 0.7967748604506926, 'recall': 0.7972693421597021, 'f1-score': 0.7970220246096578, 'support': 4834}, ',': {'precision': 0.6540284360189573, 'recall': 0.5693923480870218, 'f1-score': 0.6087828353719671, 'support': 2666}, '?': {'precision': 0.7427536231884058, 'recall': 0.7536764705882353, 'f1-score': 0.7481751824817517, 'support': 1088}, 'accuracy': 0.963568500153866, 'macro avg': {'precision': 0.7940637365379256, 'recall': 0.7766281814853679, 'f1-score': 0.7846033172282022, 'support': 100737}, 'weighted avg': {'precision': 0.9624865329644247, 'recall': 0.963568500153866, 'f1-score': 0.9629467969930973, 'support': 100737}} 0.7941 0.7766 0.7846 0.9388
0.0026 9.82 8000 0.1661 {'EMPTY': {'precision': 0.9815270404419699, 'recall': 0.9871403921909082, 'f1-score': 0.9843257135127824, 'support': 92149}, '.': {'precision': 0.8012995179207714, 'recall': 0.7908564335953662, 'f1-score': 0.7960437272254035, 'support': 4834}, ',': {'precision': 0.6715935334872979, 'recall': 0.5453863465866466, 'f1-score': 0.6019457669219622, 'support': 2666}, '?': {'precision': 0.7368888888888889, 'recall': 0.7619485294117647, 'f1-score': 0.7492092182557614, 'support': 1088}, 'accuracy': 0.9635982806714514, 'macro avg': {'precision': 0.7978272451847321, 'recall': 0.7713329254461714, 'f1-score': 0.7828811064789774, 'support': 100737}, 'weighted avg': {'precision': 0.9620340152149095, 'recall': 0.9635982806714514, 'f1-score': 0.96263173010883, 'support': 100737}} 0.7978 0.7713 0.7829 0.9351
0.0023 10.43 8500 0.1675 {'EMPTY': {'precision': 0.9818959095767076, 'recall': 0.9870318722937851, 'f1-score': 0.984457192336833, 'support': 92149}, '.': {'precision': 0.7958974358974359, 'recall': 0.8026479106330161, 'f1-score': 0.7992584200226595, 'support': 4834}, ',': {'precision': 0.6728025770823746, 'recall': 0.5483870967741935, 'f1-score': 0.6042570779086587, 'support': 2666}, '?': {'precision': 0.7608695652173914, 'recall': 0.7398897058823529, 'f1-score': 0.750232991612302, 'support': 1088}, 'accuracy': 0.9639060126865004, 'macro avg': {'precision': 0.8028663719434774, 'recall': 0.7694891463958369, 'f1-score': 0.7845514204701133, 'support': 100737}, 'weighted avg': {'precision': 0.9624032096863155, 'recall': 0.9639060126865004, 'f1-score': 0.9629784873841293, 'support': 100737}} 0.8029 0.7695 0.7846 0.9363
0.0023 11.04 9000 0.1693 {'EMPTY': {'precision': 0.9820605039475532, 'recall': 0.9867497205612649, 'f1-score': 0.9843995279801664, 'support': 92149}, '.': {'precision': 0.7992526468756488, 'recall': 0.7964418700868846, 'f1-score': 0.7978447829240494, 'support': 4834}, ',': {'precision': 0.6733121884911645, 'recall': 0.5573893473368342, 'f1-score': 0.6098912374307408, 'support': 2666}, '?': {'precision': 0.7384341637010676, 'recall': 0.7628676470588235, 'f1-score': 0.7504520795660037, 'support': 1088}, 'accuracy': 0.9638365248121346, 'macro avg': {'precision': 0.7982648757538585, 'recall': 0.7758621462609517, 'f1-score': 0.7856469069752401, 'support': 100737}, 'weighted avg': {'precision': 0.962485951913241, 'recall': 0.9638365248121346, 'f1-score': 0.9630093777465815, 'support': 100737}} 0.7983 0.7759 0.7856 0.9380
0.0019 11.66 9500 0.1688 {'EMPTY': {'precision': 0.9816733583732677, 'recall': 0.9870318722937851, 'f1-score': 0.9843453228066947, 'support': 92149}, '.': {'precision': 0.7987590486039297, 'recall': 0.7989242863053372, 'f1-score': 0.7988416589099182, 'support': 4834}, ',': {'precision': 0.6800739713361073, 'recall': 0.5517629407351838, 'f1-score': 0.6092358666390557, 'support': 2666}, '?': {'precision': 0.7534498620055198, 'recall': 0.7527573529411765, 'f1-score': 0.7531034482758621, 'support': 1088}, 'accuracy': 0.9639556468824761, 'macro avg': {'precision': 0.8034890600797061, 'recall': 0.7726191130688707, 'f1-score': 0.7863815741578827, 'support': 100737}, 'weighted avg': {'precision': 0.9624492510113833, 'recall': 0.9639556468824761, 'f1-score': 0.9630189215746797, 'support': 100737}} 0.8035 0.7726 0.7864 0.9378
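
The table shows validation loss and F Score moving in different directions as training continues. The sketch below picks out the step with the lowest validation loss and the step with the highest F Score, using values copied from the Step, Validation Loss and F Score columns. The card does not say which criterion was used to select the final checkpoint; the headline results simply match the last logged row (step 9500).

```python
# (step, validation_loss, f_score) copied from the training results table.
history = [
    (500, 0.0870, 0.7361), (1000, 0.0828, 0.7636), (1500, 0.0819, 0.7622),
    (2000, 0.0880, 0.7763), (2500, 0.1003, 0.7748), (3000, 0.0944, 0.7766),
    (3500, 0.1144, 0.7718), (4000, 0.1079, 0.7752), (4500, 0.1295, 0.7730),
    (5000, 0.1329, 0.7804), (5500, 0.1353, 0.7746), (6000, 0.1412, 0.7825),
    (6500, 0.1473, 0.7789), (7000, 0.1567, 0.7828), (7500, 0.1570, 0.7846),
    (8000, 0.1661, 0.7829), (8500, 0.1675, 0.7846), (9000, 0.1693, 0.7856),
    (9500, 0.1688, 0.7864),
]

# Lowest validation loss and highest macro F Score across logged steps.
best_by_loss = min(history, key=lambda row: row[1])
best_by_f = max(history, key=lambda row: row[2])

print("lowest validation loss:", best_by_loss)
print("highest F Score:", best_by_f)
```

Validation loss roughly doubles after its minimum while training loss keeps falling, which is a typical overfitting pattern in terms of loss, yet the F Score continues to improve slightly through the final step.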

Framework versions

  • Transformers 4.25.1
  • PyTorch 2.0.0.dev20230111
  • Datasets 2.8.0
  • Tokenizers 0.13.2