AwAppp/benchmarks_8bit_batch_size25

AwAppp committed on Mar 6, 2024

Commit 1d8dc5f (verified), 1 parent: e218f6d

Upload TextGenerationReport

Files changed (2)

README.md +199 -0
benchmark_report.json +187 -0

README.md ADDED

	@@ -0,0 +1,199 @@

+---
+library_name: transformers
+tags: []
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]

benchmark_report.json ADDED

	@@ -0,0 +1,187 @@

+{
+    "prefill": {
+        "memory": {
+            "unit": "MB",
+            "max_ram": 3610.86976,
+            "max_vram": 13706.985472,
+            "max_reserved": 13214.154752,
+            "max_allocated": 11027.453952
+        },
+        "latency": {
+            "unit": "s",
+            "mean": 0.4016479968261719,
+            "stdev": 0.00450307941805889,
+            "values": [
+                0.4229984130859375,
+                0.4061747131347656,
+                0.4007014465332031,
+                0.4005621643066406,
+                0.4003113098144531,
+                0.4004822998046875,
+                0.4004290466308594,
+                0.4008570861816406,
+                0.40047308349609373,
+                0.40001739501953126,
+                0.4006717529296875,
+                0.4008058776855469,
+                0.40065023803710936,
+                0.4005826416015625,
+                0.40037786865234376,
+                0.40050381469726565,
+                0.40064306640625,
+                0.40063385009765623,
+                0.40045669555664065,
+                0.39976141357421874,
+                0.40033383178710935,
+                0.4008027648925781,
+                0.40047308349609373,
+                0.4009308166503906,
+                0.40056524658203124
+            ]
+        },
+        "throughput": {
+            "unit": "tokens/s",
+            "value": 995.8969126220113
+        },
+        "energy": null,
+        "efficiency": null
+    },
+    "decode": {
+        "memory": {
+            "unit": "MB",
+            "max_ram": 3630.583808,
+            "max_vram": 13736.3456,
+            "max_reserved": 13243.51488,
+            "max_allocated": 11166.721536
+        },
+        "latency": {
+            "unit": "s",
+            "mean": 28.560530487060543,
+            "stdev": 0,
+            "values": [
+                28.560530487060543
+            ]
+        },
+        "throughput": {
+            "unit": "tokens/s",
+            "value": 86.65805423751874
+        },
+        "energy": null,
+        "efficiency": null
+    },
+    "per_token": {
+        "memory": null,
+        "latency": {
+            "unit": "s",
+            "mean": 0.2884902069400055,
+            "stdev": 0.005574553256331573,
+            "values": [
+                0.30573773193359377,
+                0.3014727783203125,
+                0.29969100952148436,
+                0.29973300170898437,
+                0.29858203125,
+                0.298608642578125,
+                0.29833624267578124,
+                0.2978570251464844,
+                0.29740133666992186,
+                0.29682278442382815,
+                0.29672549438476564,
+                0.2960639953613281,
+                0.2946611328125,
+                0.2954721374511719,
+                0.2954915771484375,
+                0.2945587158203125,
+                0.29500518798828124,
+                0.2944102478027344,
+                0.29504205322265625,
+                0.29280154418945314,
+                0.2928322448730469,
+                0.29225778198242186,
+                0.2919669799804688,
+                0.29329202270507815,
+                0.2921922607421875,
+                0.29252301025390626,
+                0.2918021240234375,
+                0.29138430786132813,
+                0.29127166748046873,
+                0.2904176635742188,
+                0.2911313781738281,
+                0.2917529602050781,
+                0.2908026733398438,
+                0.29039410400390625,
+                0.2901688232421875,
+                0.28997222900390623,
+                0.28976434326171874,
+                0.2902865905761719,
+                0.28897689819335937,
+                0.28997222900390623,
+                0.2897592468261719,
+                0.28981964111328123,
+                0.28855502319335935,
+                0.2880440368652344,
+                0.2889666442871094,
+                0.28792318725585936,
+                0.28774911499023437,
+                0.2884403076171875,
+                0.2875832214355469,
+                0.28695040893554685,
+                0.2873180236816406,
+                0.2875217895507812,
+                0.2857482299804687,
+                0.28675787353515625,
+                0.2872637329101563,
+                0.28659710693359375,
+                0.2864322509765625,
+                0.28636672973632815,
+                0.28585574340820313,
+                0.28576461791992186,
+                0.2865428466796875,
+                0.2852802429199219,
+                0.2851758117675781,
+                0.284943359375,
+                0.28499661254882813,
+                0.28454400634765625,
+                0.28392242431640624,
+                0.2840729675292969,
+                0.28438323974609375,
+                0.28357427978515626,
+                0.28338687133789064,
+                0.28420913696289063,
+                0.28362240600585936,
+                0.2831769714355469,
+                0.2835486755371094,
+                0.28436376953125,
+                0.28338381958007813,
+                0.2831790161132812,
+                0.28350567626953127,
+                0.28333056640625,
+                0.28241510009765625,
+                0.2824325256347656,
+                0.2824560546875,
+                0.2829588623046875,
+                0.2829619140625,
+                0.28229632568359375,
+                0.2825932922363281,
+                0.28196762084960936,
+                0.2830182495117187,
+                0.28210791015625,
+                0.2823055419921875,
+                0.28156927490234374,
+                0.2821396484375,
+                0.2817843322753906,
+                0.28104806518554687,
+                0.28229119873046876,
+                0.2816358337402344,
+                0.281427978515625,
+                0.2809241638183594
+            ]
+        },
+        "throughput": {
+            "unit": "tokens/s",
+            "value": 86.65805423751874
+        },
+        "energy": null,
+        "efficiency": null
+    }
+}
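
The throughput figures in the report can be cross-checked against its latency figures. The sketch below is a reader-side sanity check, not part of the commit. It assumes a batch size of 25 (taken from the repository name) and a 16-token prompt per sequence (inferred, since the reported prefill throughput times the mean prefill latency comes to about 400 tokens). The number of decode steps, 99, is the count of entries in `per_token.latency.values`. The constant names and the `throughput` helper are hypothetical.

```python
import math

# Values copied from benchmark_report.json above.
PREFILL_MEAN_LATENCY_S = 0.4016479968261719
PREFILL_THROUGHPUT = 995.8969126220113
DECODE_LATENCY_S = 28.560530487060543
DECODE_THROUGHPUT = 86.65805423751874
PER_TOKEN_MEAN_LATENCY_S = 0.2884902069400055

# Assumptions, none of which are stated in the report itself.
BATCH_SIZE = 25      # from the repository name "benchmarks_8bit_batch_size25"
PROMPT_TOKENS = 16   # inferred: about 400 prefill tokens / batch of 25
DECODE_STEPS = 99    # number of entries in per_token.latency.values


def throughput(tokens: int, latency_s: float) -> float:
    """Return tokens processed per second for a pass lasting latency_s."""
    return tokens / latency_s


# Prefill processes every prompt token of every sequence in one pass.
prefill_tps = throughput(BATCH_SIZE * PROMPT_TOKENS, PREFILL_MEAN_LATENCY_S)

# Decode emits one token per sequence per step.
decode_tps = throughput(BATCH_SIZE * DECODE_STEPS, DECODE_LATENCY_S)

# Total decode latency should equal mean per-token latency times step count.
decode_latency_from_steps = PER_TOKEN_MEAN_LATENCY_S * DECODE_STEPS

print(f"prefill: {prefill_tps:.2f} tokens/s (reported {PREFILL_THROUGHPUT:.2f})")
print(f"decode:  {decode_tps:.2f} tokens/s (reported {DECODE_THROUGHPUT:.2f})")
print(f"decode latency from steps: {decode_latency_from_steps:.3f} s "
      f"(reported {DECODE_LATENCY_S:.3f} s)")
```

Under these assumptions the recomputed values agree with the reported ones. This also explains why the `decode` and `per_token` sections carry the same throughput value: both describe the same 99 decode steps.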