"_name_or_path": "/content/drive/MyDrive/model_2/model/checkpoints/quantized_gpt2_ex_4bits_with-fixed-dataset",
"activation_function": "gelu_new",
"architectures": [
"attn_pdrop": 0.1,
"bos_token_id": 50256,
"embd_pdrop": 0.1,
"eos_token_id": 50256,
"initializer_range": 0.02,
"layer_norm_epsilon": 1e-05,
"model_type": "gpt2",
"n_ctx": 1024,
"n_embd": 768,
"n_head": 12,
"n_inner": null,
"n_layer": 12,
"n_positions": 1024,
"quantization_config": {
"batch_size": 1,
"bits": 4,
"block_name_to_quantize": null,
"cache_block_outputs": true,
"damp_percent": 0.1,
"dataset": [
"Question: Calculate the relative occupancies of the \u03b1 and \u03b2 spin energy levels for a radical species with g = 2.05, at L- and W-band frequencies (take TS = 300 K).\n\nOptions:\nA. N\u03b1/N\u03b2 = 0.9800 at L-band; N\u03b1/N\u03b2 = 0.9509 at W-band\nB. N\u03b1/N\u03b2 = 0.9950 at L-band; N\u03b1/N\u03b2 = 0.9609 at W-band\nC. N\u03b1/N\u03b2 = 0.9910 at L-band; N\u03b1/N\u03b2 = 0.9709 at W-band\nD. N\u03b1/N\u03b2 = 0.9850 at L-band; N\u03b1/N\u03b2 = 0.9809 at W-band\n\nAnswer:",
"Question: Which of the following is more appropriate to do feature selection?\n\nOptions:\nA. Ridge\nB. Lasso\nC. both (a) and (b)\nD. neither (a) nor (b)\n\nAnswer:",
"Question: Exact solutions of the Schr\u00f6dinger equation CANNOT be obtained for a?\n\nOptions:\nA. simple harmonic oscillator\nB. particle in a one-dimensional box\nC. rigid rotor\nD. helium atom\n\nAnswer:",
"Question: What is the dimensionality of the null space of the following matrix? A = [[3, 2, \u22129], [\u22126, \u22124, 18], [12, 8, \u221236]]?\n\nOptions:\nA. 0\nB. 1\nC. 2\nD. 3\n\nAnswer:",
"Question: In plants, proton pumps are involved in the process of loading sugars into the phloem for transport. Which of the following is true about this process?\n\nOptions:\nA. It is passive.\nB. It depends on DNA.\nC. It requires ATP.\nD. It translocates starch.\n\nAnswer:",
"Question: The soils of which of the following biomes has the highest rate of leaching and cycling of nutrients?\n\nOptions:\nA. Tropical rain forest\nB. Tundra\nC. Taiga\nD. Desert\n\nAnswer:",
"Question: The amplitude of a free induction decay drops to 25% of its initial intensity after 1.0 s. Assuming exponential relaxation and \u03a9 = 0, determine the value of T2.\n\nOptions:\nA. 0.721 s\nB. 0.750 s\nC. 1.386 s\nD. 1.661 s\n\nAnswer:",
"Question: During the mammalian cardiac cycle, a volume of blood equivalent to ventricular stroke volume is transferred from the more compliant venous side to the less compliant arterial side of the circulation. In terms of pressures within the venous and arterial compartments, this transfer results in?\n\nOptions:\nA. no change in pressure in either compartment\nB. no effect on venous pressure and a small increase in arterial pressure\nC. an increase in venous pressure and an equal but opposite decrease in arterial pressure\nD. little effect on venous pressure and a large increase in arterial pressure\n\nAnswer:",
"Question: Computational complexity of Gradient descent is,?\n\nOptions:\nA. linear in D\nB. linear in N\nC. polynomial in D\nD. dependent on the number of iterations\n\nAnswer:",
"Question: For polynomial regression, which one of these structural assumptions is the one that most affects the trade-off between underfitting and overfitting:?\n\nOptions:\nA. The polynomial degree\nB. Whether we learn the weights by matrix inversion or gradient descent\nC. The assumed variance of the Gaussian noise\nD. The use of a constant-term unit input\n\nAnswer:",
"Question: Hybrids between some related species of plants are sterile because the parent plants had different chromosome numbers. Occasionally the chromosome number of such a hybrid plant doubles spontaneously. Which of the following best describes the descendants of those plants with the double chromosome number?\n\nOptions:\nA. The plant with the double chromosome number would be genetically defective and have no descendants.\nB. The descendants would be at a selective advantage because of the increased ability to introgress.\nC. The descendants would be reproductively successful because they could backcross with the parental species.\nD. The descendants would regain the ability to reproduce sexually because chromosomes could pair normally.\n\nAnswer:",
"Question: Mammals are homeostatic for all of the following EXCEPT?\n\nOptions:\nA. body temperature\nB. blood glucose concentration\nC. blood pH\nD. metabolic rate\n\nAnswer:",
"Question: The sight organs of crustaceans and insects contain ommatidia, which make up the individual visual units of the?\n\nOptions:\nA. eyespot\nB. simple eye\nC. compound eye\nD. binocular eye\n\nAnswer:",
"Question: Which of the following best explains why enzymes are effective in facilitating chemical reactions?\n\nOptions:\nA. They raise the temperature of the reaction mixture, thereby speeding up the conversion of reactants to products.\nB. They alter the equilibrium constant of a reaction (K_eq) so that more reactant can be converted to product.\nC. They increase the maximal rate of the chemical reaction (V_max).\nD. They lower the activation energy, thereby speeding up the conversion of reactants to products.\n\nAnswer:",
"Question: The normal modes of a carbon dioxide molecule that are infrared-active include which of the following?\nI. Bending\nII. Symmetric stretching\nIII. Asymmetric stretching?\n\nOptions:\nA. I only\nB. II only\nC. III only\nD. I and III only\n\nAnswer:",
"Question: Which of the following sources makes the greatest contribution to the dry mass of organic matter that comprises an oak tree?\n\nOptions:\nA. Organic molecules from decaying matter in the soil that are taken up by the roots\nB. Mineral nutrients dissolved in groundwater that are taken up by the roots\nC. Water that is taken up by the roots and carbon dioxide from the air\nD. Endosperm located in the cotyledons of the acorn\n\nAnswer:",
"Question: Which is a characteristic unique to angiosperms?\n\nOptions:\nA. Wind-borne pollen\nB. A dominant sporophyte life cycle\nC. Alternation of generations\nD. Double fertilization\n\nAnswer:",
"Question: Which among the following prevents overfitting when we perform bagging?\n\nOptions:\nA. The use of sampling with replacement as the sampling technique\nB. The use of weak classifiers\nC. The use of classification algorithms which are not prone to overfitting\nD. The practice of validation performed on every classifier trained\n\nAnswer:",
"Question: The numerical output of a sigmoid node in a neural network:?\n\nOptions:\nA. Is unbounded, encompassing all real numbers.\nB. Is unbounded, encompassing all integers.\nC. Is bounded between 0 and 1.\nD. Is bounded between -1 and 1.\n\nAnswer:",
"Question: The concept of punctuated equilibrium refers to?\n\nOptions:\nA. oscillating ecological successional stages\nB. ecological succession arrested by sudden environmental changes, e.g., fire\nC. persistent predator-prey relationships in relatively stable environments\nD. bursts of speciation followed by relatively unchanging lineages\n\nAnswer:",
"Question: In one taxonomic classification, Archaea, Eukarya, and Bacteria represent the three major domains of life. Eukarya utilize the general transcription factors TBP (TATA-binding protein) and TFIIB in transcription, whereas Bacteria do not. At least one member of Archaea has a protein similar to TBP and a protein similar to TFIIB. Based on this observation, which of the following scenarios is most likely?\n\nOptions:\nA. Archaea and Eukarya diverged after their common ancestor diverged from Bacteria.\nB. Archaea and Bacteria diverged after their common ancestor diverged from Eukarya.\nC. Bacteria and Eukarya diverged after their common ancestor diverged from Archaea.\nD. Archaea, Eukarya, and Bacteria diverged simultaneously from a common ancestor.\n\nAnswer:",
"Question: Suppose we like to calculate P(H|E, F) and we have no conditional independence information. Which of the following sets of numbers are sufficient for the calculation?\n\nOptions:\nA. P(E, F), P(H), P(E|H), P(F|H)\nB. P(E, F), P(H), P(E, F|H)\nC. P(H), P(E|H), P(F|H)\nD. P(E, F), P(E|H), P(F|H)\n\nAnswer:",
"Question: The Henry\u2019s law constant for CO2 dissolved in water at 25\u00b0C is 30.0 atm M^\u22121. The concentration of dissolved CO2 in a vessel pressurized with 2.0 atm of CO2 is?\n\nOptions:\nA. 1.5 M\nB. 0.15 M\nC. 0.067 M\nD. 0.015 M\n\nAnswer:",
"Question: Which of the following statements about mitochondria and chloroplasts is generally true?\n\nOptions:\nA. Plants have chloroplasts but no mitochondria; animals have mitochondria but no chloroplasts.\nB. Plants have chloroplasts but no mitochondria; fungi have mitochondria but no chloroplasts.\nC. Plants and fungi have chloroplasts but no mitochondria; animals have only mitochondria.\nD. Plants have both chloroplasts and mitochondria; animals and fungi have only mitochondria.\n\nAnswer:",
"Question: Among primates, a high degree of sexual dimorphism in a species usually indicates intense competition between?\n\nOptions:\nA. males in order to obtain individual food resources\nB. males in order to obtain mates\nC. females in order to obtain individual food resources\nD. females in order to obtain mates\n\nAnswer:",
"Question: Which of the following statements about fungi is NOT true?\n\nOptions:\nA. They all are eukaryotic.\nB. They all have rigid cell walls.\nC. Most are filamentous.\nD. Some are photosynthetic.\n\nAnswer:",
"Question: Another term for out-of-distribution detection is?\n\nOptions:\nA. anomaly detection\nB. one-class detection\nC. train-test mismatch robustness\nD. background detection\n\nAnswer:",
"Question: Predict the hyperfine value for the EPR spectrum of fully deuteriated benzene radical anion C6D6\u2022-.\n\nOptions:\nA. 0.375 mT\nB. 3.75 G\nC. 2.35 mT\nD. 0.58 G\n\nAnswer:",
"Question: The X-band (9.5 GHz) EPR spectrum of a matrix isolated Na atom reveals four hyperfine lines with resonant field positions of 3074 G, 3174 G, 3274 G and 3374 G. Calculate the g value of the atom.\n\nOptions:\nA. g = 2.002\nB. g = 1.950\nC. g = 2.250\nD. g = 2.005\n\nAnswer:",
"Question: Which of the following is a clustering algorithm in machine learning?\n\nOptions:\nA. Expectation Maximization\nB. CART\nC. Gaussian Na\u00efve Bayes\nD. Apriori\n\nAnswer:",
"Question: Statement 1| After mapped into feature space Q through a radial basis kernel function, 1-NN using unweighted Euclidean distance may be able to achieve better classification performance than in original space (though we can\u2019t guarantee this). Statement 2| The VC dimension of a Perceptron is smaller than the VC dimension of a simple linear SVM.\n\nOptions:\nA. True, True\nB. False, False\nC. True, False\nD. False, True\n\nAnswer:",
"Question: Calculate the magnetic moment (\u03bcI) of a 13C nucleus.\n\nOptions:\nA. 6.1445 x 10^-27 J T-1\nB. 3.1445 x 10^-27 J T-1\nC. 9.1445 x 10^-27 J T-1\nD. 1.1445 x 10^-28 J T-1\n\nAnswer:",
"Question: Statement 1| L2 regularization of linear models tends to make models more sparse than L1 regularization. Statement 2| Residual connections can be found in ResNets and Transformers.\n\nOptions:\nA. True, True\nB. False, False\nC. True, False\nD. False, True\n\nAnswer:",
"Question: A species of small rodent eats seeds from only one species of pine. In normal years, a pair of these rodents will have a litter of two or three. It is unusual for small rodents to have such small litter sizes. The rodents are most likely to exhibit which other characteristic?\n\nOptions:\nA. Moderate sexual size dimorphism\nB. High parental investment\nC. Precocial young\nD. Frequent extrapair matings\n\nAnswer:",
"Question: Calculate the equilibrium polarization of 13C nuclei in a 20.0 T magnetic field at 300 K.\n\nOptions:\nA. 10.8 x 10^-5\nB. 4.11 x 10^-5\nC. 3.43 x 10^-5\nD. 1.71 x 10^-5\n\nAnswer:",
"Question: Suppose that the 13C nuclei in a molecule in a 600 MHz spectrometer can be 100% polarized (p = 1). If T1 = 5.0 s, how long does it take for p to reach a value equal to twice the thermal equilibrium polarization at 298 K?\n\nOptions:\nA. [The polarization relaxes exponentially: p(t) = [p(0) - peq]exp(-t/T1) + peq.]\nB. 72.0 s\nC. 56.6 s\nD. 12.7 s\n\nAnswer:",
"Question: A quote from a natural resources text states: \"Whenever the original ecosystem becomes restructured by man, it tends to become simplified, with a resultant disruption of the stabilizing influences of density-dependent regulatory factors.\" This implies that in a disturbed ecosystem?\n\nOptions:\nA. there exist large populations of a low number of species\nB. population levels of a species are kept at equilibrium through natural regulatory mechanisms\nC. a given prey organism is subject to higher predation rates by more diverse predators\nD. a given prey organism is less likely to undergo a population surge\n\nAnswer:",
"Question: Which of the following is classified as a conjugate acid-base pair?\n\nOptions:\nA. HCl / NaOH\nB. H3O+ / H2O\nC. O2 / H2O\nD. H+ / Cl\u2212\n\nAnswer:",
"Question: Statement 1| Word2Vec parameters were not initialized using a Restricted Boltzmann Machine. Statement 2| The tanh function is a nonlinear activation function.\n\nOptions:\nA. True, True\nB. False, False\nC. True, False\nD. False, True\n\nAnswer:",
"Question: Statement 1| The back-propagation algorithm learns a globally optimal neural network with hidden layers. Statement 2| The VC dimension of a line should be at most 2, since I can find at least one case of 3 points that cannot be shattered by any line.\n\nOptions:\nA. True, True\nB. False, False\nC. True, False\nD. False, True\n\nAnswer:",
"Question: In fluorescence spectroscopy, the quantum yield (\u03a6_f) is best defined as the?\n\nOptions:\nA. rate of fluorescence emission\nB. number of photons emitted\nC. number of photons emitted, divided by the number of photons absorbed\nD. number of excitation photons impinging on the sample, divided by the number of photons absorbed\n\nAnswer:",
"Question: Brood parasites such as the cuckoo successfully trick other species of birds into rearing their young by exploiting the host birds' instinctive response to the loud begging cues of a fledgling in their nest. The genes that allow the host bird species to be duped into rearing the cuckoo fledglings to the detriment of their own offspring most likely remain in the gene pool of the population because?\n\nOptions:\nA. on average, the host birds' response allows them to rear their own young efficiently by feeding only those who indicate they are hungry\nB. the maximum fitness of the duped bird is not compromised when the bird rears an interloper of another species\nC. on average, little energy is spent on rearing a fledgling bird, whether it is an interloper or one's own\nD. the maximum fitness of the cuckoo would then be reduced\n\nAnswer:",
"Question: Which of the following is true of a convolution kernel?\n\nOptions:\nA. Convolving an image with $\\begin{bmatrix}1 & 0 & 0\\\\ 0 & 1 & 0 \\\\ 0 & 0 & 1 \\end{bmatrix}$ would not change the image\nB. Convolving an image with $\\begin{bmatrix}0 & 0 & 0\\\\ 0 & 1 & 0 \\\\ 0 & 0 & 0 \\end{bmatrix}$ would not change the image\nC. Convolving an image with $\\begin{bmatrix}1 & 1 & 1\\\\ 1 & 1 & 1 \\\\ 1 & 1 & 1 \\end{bmatrix}$ would not change the image\nD. Convolving an image with $\\begin{bmatrix}0 & 0 & 0\\\\ 0 & 0 & 0 \\\\ 0 & 0 & 0 \\end{bmatrix}$ would not change the image\n\nAnswer:",
"Question: Given two Boolean random variables, A and B, where P(A) = 1/2, P(B) = 1/3, and P(A | \u00acB) = 1/4, what is P(A | B)?\n\nOptions:\nA. 1/6\nB. 1/4\nC. 3/4\nD. 1\n\nAnswer:",
"Question: Calculate the magnetic field responsible for the polarization of 2.5 x 10^-6 for 13C at 298 K.\n\nOptions:\nA. 0.5 T\nB. 1.2 T\nC. 2.9 T\nD. 100 T\n\nAnswer:",
"Question: When doing least-squares regression with regularisation (assuming that the optimisation can be done exactly), increasing the value of the regularisation parameter \u03bb?\n\nOptions:\nA. will never decrease the training error.\nB. will never increase the training error.\nC. will never decrease the testing error.\nD. will never increase the testing error.\n\nAnswer:",
"Question: Statement 1| We learn a classifier f by boosting weak learners h. The functional form of f\u2019s decision boundary is the same as h\u2019s, but with different parameters. (e.g., if h was a linear classifier, then f is also a linear classifier). Statement 2| Cross validation can be used to select the number of iterations in boosting; this procedure may help reduce overfitting.\n\nOptions:\nA. True, True\nB. False, False\nC. True, False\nD. False, True\n\nAnswer:",
"Question: Which of the following metal ions cannot be used as a paramagnetic quencher?\n\nOptions:\nA. Ti3+\nB. Cr3+\nC. Fe3+\nD. Zn2+\n\nAnswer:",
"Question: Which one sentence explains most accurately why spin trapping is often used to detect free radical intermediates?\n\nOptions:\nA. spin trapping provides more structural information than direct detection by EPR\nB. spin trapping makes it easy to quantify free radical intermediates\nC. steady state concentration of free radical intermediates is often too low to enable direct detection by EPR\nD. detection of spin adducts requires lower power than direct detection of radical intermediates\n\nAnswer:",
"Question: Mobile regions of DNA capable of inserting themselves into an existing genome are?\n\nOptions:\nA. prions\nB. cistrons\nC. introns\nD. transposons\n\nAnswer:",
"Question: What is the NMR frequency of 31P in a 20.0 T magnetic field?\n\nOptions:\nA. 54.91 MHz\nB. 239.2 MHz\nC. 345.0 MHz\nD. 2167 MHz\n\nAnswer:",
"Question: Suppose we would like to perform clustering on spatial data such as the geometrical locations of houses. We wish to produce clusters of many different sizes and shapes. Which of the following methods is the most appropriate?\n\nOptions:\nA. Decision Trees\nB. Density-based clustering\nC. Model-based clustering\nD. K-means clustering\n\nAnswer:",
"Question: For a neural network, which one of these structural assumptions is the one that most affects the trade-off between underfitting (i.e. a high bias model) and overfitting (i.e. a high variance model):?\n\nOptions:\nA. The number of hidden nodes\nB. The learning rate\nC. The initial choice of weights\nD. The use of a constant-term unit input\n\nAnswer:",
"Question: All proteins absorb electromagnetic radiation of wavelength around 190 nm, which corresponds to a \u03c0 \u2192 \u03c0* excitation in the protein molecule. In which region of the spectrum is this wavelength found?\n\nOptions:\nA. X-ray\nB. Ultraviolet\nC. Visible\nD. Infrared\n\nAnswer:",
"Question: A female fruit fly bearing linked genes that produce the phenotype gray body and normal wings mates with a male fruit fly of phenotype black body and vestigial wings. The presence of gray-bodied, vestigial-winged flies among the progeny is best explained by?\n\nOptions:\nA. crossing over\nB. independent assortment\nC. segregation of alleles\nD. penetrance\n\nAnswer:",
"Question: Annelids and arthropods are similar to each other in that members of both phyla?\n\nOptions:\nA. have segmented bodies\nB. have a closed circulatory system\nC. conduct gas exchange by diffusion through a moist membrane\nD. have well-developed sense organs\n\nAnswer:",
"Question: Calculate the spin angular momentum of 43Ca. [I = 7\u20442]?\n\nOptions:\nA. 2.166 x 10^-34 J s\nB. 3.691 x 10^-34 J s\nC. 4.185 x 10^-34 J s\nD. 5.493 x 10^-34 J s\n\nAnswer:",
"Question: Which of the following plant cells undergoes programmed cell death to become functional?\n\nOptions:\nA. Phloem sieve tube member\nB. Xylem vessel member\nC. Stomatal guard cell\nD. Root cap cell\n\nAnswer:",
"Question: Statement 1| RELUs are not monotonic, but sigmoids are monotonic. Statement 2| Neural networks trained with gradient descent with high probability converge to the global optimum.\n\nOptions:\nA. True, True\nB. False, False\nC. True, False\nD. False, True\n\nAnswer:",
"Question: Chlorofluorocarbons (CFCs) such as F3CCCl3 are implicated in the decomposition of stratospheric ozone. Which of the following methods would be best suited to measurement of trace amounts (sub-ppb) of CFCs in an air sample?\n\nOptions:\nA. Gas chromatographic separation of the air sample on a capillary column followed by electron capture detection\nB. Gas chromatographic separation of the air sample on a packed column followed by thermal conductivity detection\nC. Gas chromatographic separation of the air sample on a capillary column followed by flame ionization detection\nD. Conversion of the sample of the chlorinated compounds to chloride ions, followed by titration with Ag+\n\nAnswer:",
"Question: A behavioral response called a fixed action pattern shown by animals?\n\nOptions:\nA. occurs the second time an animal is exposed to the correct stimulus at the appropriate time in its life\nB. occurs in the absence of sensory feedback\nC. is a motor response which once released may be terminated spontaneously\nD. is triggered by a number of sensory signals in the animal's environment\n\nAnswer:",
"Question: Which of the following adaptations would limit pollination by bees and promote hummingbird pollination?\n\nOptions:\nA. Patterns of ultraviolet color on the petals\nB. Modified petals to provide a landing space\nC. Pendant (hanging) red-colored flowers\nD. Nectar with high sugar concentration produced in limited amounts\n\nAnswer:",
"Question: A rise in intracellular free calcium in the sea urchin oocyte causes the release of proteolytic enzymes which act to prevent polyspermy. The events just described entail the?\n\nOptions:\nA. zona reaction\nB. acrosomal reaction\nC. cortical reaction\nD. fertilization reaction\n\nAnswer:",
"Question: Nerve outgrowth from a developing neuron begins at the growth cone, located at the tip of the axon. Microspikes of the growth cone extend and retract in order to move the growth cone forward. Exposure of the neuron to cytochalasin B at this stage of development causes?\n\nOptions:\nA. microtubules in the axon to undergo reversible dissociation\nB. microtubules in the axon to undergo irreversible dissociation\nC. microfilaments in the microspike to undergo reversible depolymerization\nD. microfilaments in the microspike to undergo irreversible depolymerization\n\nAnswer:",
"Question: Existential risks posed by AI are most commonly associated with which of the following professors?\n\nOptions:\nA. Nando de Freitas\nB. Yann LeCun\nC. Stuart Russell\nD. Jitendra Malik\n\nAnswer:",
"Question: Which one of the following is equal to P(A, B, C) given Boolean random variables A, B and C, and no independence or conditional independence assumptions between any of them?\n\nOptions:\nA. P(A | B) * P(B | C) * P(C | A)\nB. P(C | A, B) * P(A) * P(B)\nC. P(A, B | C) * P(C)\nD. P(A | B, C) * P(B | A, C) * P(C | A, B)\n\nAnswer:",
"Question: Statement 1| The Stanford Sentiment Treebank contained movie reviews, not book reviews. Statement 2| The Penn Treebank has been used for language modeling.\n\nOptions:\nA. True, True\nB. False, False\nC. True, False\nD. False, True\n\nAnswer:",
"Question: During a delay, spins with an offset frequency \u03a9 = 250 rad s-1 precess through an angle of 60\u00b0. How long is the delay?\n\nOptions:\nA. 4.19 ms\nB. 26.3 ms\nC. 240 ms\nD. 1510 ms\n\nAnswer:",
"Question: Which of the following is typically NOT found in normal somatic cells of a human male?\n\nOptions:\nA. The entire genetic information possessed by the original zygote\nB. An inactivated X chromosome\nC. Forty-four autosomes\nD. A diploid nucleus\n\nAnswer:",
"Question: A Cu(II) metal ion (giso = 2.12) produces four lines with a separation of 500 MHz between each line. Express the hyperfine splitting in field units of mT and the hyperfine coupling in units of wavenumbers.\n\nOptions:\nA. 500 MHz = 0.185 mT = 0.29842 cm-1\nB. 500 MHz = 16.850 mT = 0.01667 cm-1\nC. 500 MHz = 32.953 mT = 0.76298 cm-1\nD. 500 MHz = 45.672 mT = 2.86329 cm-1\n\nAnswer:",
"Question: Neural networks:?\n\nOptions:\nA. Optimize a convex objective function\nB. Can only be trained with stochastic gradient descent\nC. Can use a mix of different activation functions\nD. None of the above\n\nAnswer:",
"Question: Which of the following statements concerning a sarcomere of a striated muscle (such as skeletal muscle) is correct?\n\nOptions:\nA. During contraction H zones become elongated.\nB. In the relaxed position tropomyosin impedes myosin's access to the binding site of actin.\nC. Each myosin helical tail contains an actin-binding site and an ATP-hydrolyzing site.\nD. The proteins troponin and tropomyosin constitute the thick and thin filaments, respectively.\n\nAnswer:",
"Question: In Sweden, the red fox (Vulpes vulpes) severely limits populations of its prey, including hares. However, red fox populations are sometimes attacked by a fatal parasite, the mange mite. As mite population sizes increase at a given site, how are hare and fox populations most likely to respond at the same site? (Assume that hares have no major predators at this site other than foxes.)?\n\nOptions:\nA. Both fox and hare populations will decrease.\nB. Both fox and hare populations will increase.\nC. Fox populations will decrease and hare populations will increase.\nD. Fox populations will increase and hare populations will decrease.\n\nAnswer:",
"Question: For a Gaussian Bayes classifier, which one of these structural assumptions is the one that most affects the trade-off between underfitting and overfitting:?\n\nOptions:\nA. Whether we learn the class centers by Maximum Likelihood or Gradient Descent\nB. Whether we assume full class covariance matrices or diagonal class covariance matrices\nC. Whether we have equal class priors or priors estimated from the data\nD. Whether we allow classes to have different mean vectors or we force them to share the same mean vector\n\nAnswer:",
"Question: Statement 1| VGGNets have convolutional kernels of smaller width and height than AlexNet's first-layer kernels. Statement 2| Data-dependent weight initialization procedures were introduced before Batch Normalization.\n\nOptions:\nA. True, True\nB. False, False\nC. True, False\nD. False, True\n\nAnswer:",
"Question: For a Gaussian Bayes classifier, which one of these structural assumptions is the one that most affects the trade-off between underfitting and overfitting:?\n\nOptions:\nA. Whether we learn the class centers by Maximum Likelihood or Gradient Descent\nB. Whether we assume full class covariance matrices or diagonal class covariance matrices\nC. Whether we have equal class priors or priors estimated from the data.\nD. Whether we allow classes to have different mean vectors or we force them to share the same mean vector\n\nAnswer:",
"Question: Statement 1| Density estimation (using say, the kernel density estimator) can be used to perform classification. Statement 2| The correspondence between logistic regression and Gaussian Naive Bayes (with identity class covariances) means that there is a one-to-one correspondence between the parameters of the two classifiers.\n\nOptions:\nA. True, True\nB. False, False\nC. True, False\nD. False, True\n\nAnswer:",
"Question: High entropy means that the partitions in classification are?\n\nOptions:\nA. pure\nB. not pure\nC. useful\nD. useless\n\nAnswer:",
"Question: You are training a linear regression model for a simple estimation task, and notice that the model is overfitting to the data. You decide to add in $\\ell_2$ regularization to penalize the weights. As you increase the $\\ell_2$ regularization coefficient, what will happen to the bias and variance of the model?\n\nOptions:\nA. Bias increase ; Variance increase\nB. Bias increase ; Variance decrease\nC. Bias decrease ; Variance increase\nD. Bias decrease ; Variance decrease\n\nAnswer:",
"Question: Statement 1| The softmax function is commonly used in multiclass logistic regression. Statement 2| The temperature of a nonuniform softmax distribution affects its entropy.\n\nOptions:\nA. True, True\nB. False, False\nC. True, False\nD. False, True\n\nAnswer:",
"Question: A and B are two events. If P(A, B) decreases while P(A) increases, which of the following is true?\n\nOptions:\nA. P(A|B) decreases\nB. P(B|A) decreases\nC. P(B) decreases\nD. All of above\n\nAnswer:",
"Question: What are support vectors?\n\nOptions:\nA. The examples farthest from the decision boundary.\nB. The only examples necessary to compute f(x) in an SVM.\nC. The data centroid.\nD. All the examples that have a non-zero weight \u03b1k in a SVM.\n\nAnswer:",
"Question: The wings of a bat and the wings of a butterfly are?\n\nOptions:\nA. homologous structures\nB. analogous structures\nC. vestigial structures\nD. dissimilar in form and function\n\nAnswer:",
"Question: Antibiotics that affect bacterial cells interfere with all of the following EXCEPT?\n\nOptions:\nA. peptidoglycan synthesis\nB. protein synthesis\nC. DNA synthesis\nD. reverse transcriptase\n\nAnswer:",
"Question: The solid-state structures of the principal allotropes of elemental boron are made up of which of the following structural units?\n\nOptions:\nA. B12 icosahedra\nB. B8 cubes\nC. B6 octahedra\nD. B4 tetrahedra\n\nAnswer:",
"Question: A single line is seen in the 31P spectrum of a solution of sodium phosphate. The 31P chemical shifts of H2PO4\u203e and HPO42\u2013 are 3.42 ppm and 5.82 ppm respectively. What is the chemical shift when the pH of the solution equals the pKa of H2PO4\u203e?\n\nOptions:\nA. 3.41 ppm\nB. 3.98 ppm\nC. 4.33 ppm\nD. 4.62 ppm\n\nAnswer:",
"Question: Which of the following compounds has a 1H resonance approximately 1.55 kHz away from TMS on a spectrometer with a 12.0 T magnet?\n\nOptions:\nA. CH3F\nB. CH3Cl\nC. CH3Br\nD. CH3I\n\nAnswer:",
"Question: A silyl radical bearing an Si-H\u00b7 fragment has a g value of 2.0033 and a pair of lines separated by 15.5 MHz. Express the splitting in units of mT, Gauss and cm-1.\n\nOptions:\nA. 15.5 MHz = 11.104 mT = 27.201 Gauss = 0.862 x 10^-4 cm-1\nB. 15.5 MHz = 7.352 mT = 10.104 Gauss = 18.39 x 10^-4 cm-1\nC. 15.5 MHz = 1.55 mT = 0.562 Gauss = 31.0 x 10^-4 cm-1\nD. 15.5 MHz = 0.553 mT = 5.530 Gauss = 5.17 x 10^-4 cm-1\n\nAnswer:",
"Question: Which of the following is always true of a spontaneous process?\n\nOptions:\nA. The process is exothermic.\nB. The process does not involve any work.\nC. The entropy of the system increases.\nD. The total entropy of the system plus surroundings increases.\n\nAnswer:",
"Question: Which of the following is the joint probability of H, U, P, and W described by the given Bayesian Network H -> U <- P <- W? [note: as the product of the conditional probabilities]?\n\nOptions:\nA. P(H, U, P, W) = P(H) * P(W) * P(P) * P(U)\nB. P(H, U, P, W) = P(H) * P(W) * P(P | W) * P(W | H, P)\nC. P(H, U, P, W) = P(H) * P(W) * P(P | W) * P(U | H, P)\nD. None of the above\n\nAnswer:",
"Question: Two xylem plant cell types that provide support and conduct water and minerals are the?\n\nOptions:\nA. collenchyma and sclerenchyma\nB. sieve tube members and companion cells\nC. tracheids and vessel elements\nD. vessel elements and companion cells\n\nAnswer:",
"Question: A machine learning problem involves four attributes plus a class. The attributes have 3, 2, 2, and 2 possible values each. The class has 3 possible values. How many maximum possible different examples are there?\n\nOptions:\nA. 12\nB. 24\nC. 48\nD. 72\n\nAnswer:",
"Question: Predict the number of lines in the EPR spectrum of a solution of dimethylnitroxide (CH3)2NO\u2022 assuming the lines do not overlap.\n\nOptions:\nA. 21\nB. 3\nC. 7\nD. 24\n\nAnswer:",
"Question: Statement 1| The values of the margins obtained by two different kernels K1(x, x0) and K2(x, x0) on the same training set do not tell us which classifier will perform better on the test set. Statement 2| The activation function of BERT is the GELU.\n\nOptions:\nA. True, True\nB. False, False\nC. True, False\nD. False, True\n\nAnswer:",
"Question: Which of the following statements is true of air as compared to water?\n\nOptions:\nA. Air provides more physical support.\nB. Air has a higher O2 concentration.\nC. Air offers more resistance to motion.\nD. Air has more thermal inertia.\n\nAnswer:",
"Question: Cancer cells grown in culture are similar to normal cells grown in culture in that they?\n\nOptions:\nA. divide an indefinite number of times\nB. do not display contact inhibition\nC. require a surface for attachment in order to grow\nD. proliferate to the same cell density\n\nAnswer:",
"Question: Proteins were shown to move about in a plane of the plasma membrane when mouse cellsurface proteins and human cell-surface proteins were observed to integrate along a fused mouse-human cell plasma membrane. Which of the following cell culture techniques was most likely employed in order to yield these results?\n\nOptions:\nA. Producing a heterokaryon\nB. Producing a hybrid cell\nC. Isolating an immortal variant cell from culture and using it to create a cell line\nD. Inserting a tumor-inducing virus into a normal cell to initiate transformation\n\nAnswer:",
"Question: A substance that is NOT generally considered to be a toxic pollutant in water is?\n\nOptions:\nA. carbonic acid\nB. a halogenated hydrocarbon\nC. lead\nD. mercury\n\nAnswer:",
"Question: Which of the following must be true in order for evolution to have occurred?\n\nOptions:\nA. The frequencies of some alleles in a population's gene pool has changed over successive generations.\nB. The frequencies of some alleles in a population's gene pool has changed during the organisms' lifetimes.\nC. The frequencies of each allele in a population's gene pool has remained constant over successive generations.\nD. The frequencies of each allele in an organism's genotype has remained constant within the organism's lifetime.\n\nAnswer:",
"Question: An organism belonging to the nekton is which one of the following?\n\nOptions:\nA. Whale\nB. Barnacle\nC. Cyanobacterium\nD. Protist\n\nAnswer:",
"Question: Statement 1| The maximum margin decision boundaries that support vector machines construct have the lowest generalization error among all linear classifiers. Statement 2| Any decision boundary that we get from a generative model with classconditional Gaussian distributions could in principle be reproduced with an SVM and a polynomial kernel of degree less than or equal to three.\n\nOptions:\nA. True, True\nB. False, False\nC. True, False\nD. False, True\n\nAnswer:",
"Question: If your training loss increases with number of epochs, which of the following could be a possible issue with the learning process?\n\nOptions:\nA. Regularization is too low and model is overfitting\nB. Regularization is too high and model is underfitting\nC. Step size is too large\nD. Step size is too small\n\nAnswer:",
"Question: Statement 1| RoBERTa pretrains on a corpus that is approximate 10x larger than the corpus BERT pretrained on. Statement 2| ResNeXts in 2018 usually used tanh activation functions.\n\nOptions:\nA. True, True\nB. False, False\nC. True, False\nD. False, True\n\nAnswer:",
"Question: In building a linear regression model for a particular data set, you observe the coefficient of one of the features having a relatively high negative value. This suggests that?\n\nOptions:\nA. This feature has a strong effect on the model (should be retained)\nB. This feature does not have a strong effect on the model (should be ignored)\nC. It is not possible to comment on the importance of this feature without additional information\nD. Nothing can be determined.\n\nAnswer:",
"Question: If N is the number of instances in the training dataset, nearest neighbors has a classification run time of?\n\nOptions:\nA. O(1)\nB. O( N )\nC. O(log N )\nD. O( N^2 )\n\nAnswer:",
"Question: Which PyTorch 1.8 command(s) produce $10\\times 5$ Gaussian matrix with each entry i.i.d. sampled from $\\mathcal{N}(\\mu=5,\\sigma^2=16)$ and a $10\\times 10$ uniform matrix with each entry i.i.d. sampled from $U[-1,1)$?\n\nOptions:\nA. \\texttt{5 + torch.randn(10,5) * 16} ; \\texttt{torch.rand(10,10,low=-1,high=1)}\nB. \\texttt{5 + torch.randn(10,5) * 16} ; \\texttt{(torch.rand(10,10) - 0.5) / 0.5}\nC. \\texttt{5 + torch.randn(10,5) * 4} ; \\texttt{2 * torch.rand(10,10) - 1}\nD. \\texttt{torch.normal(torch.ones(10,5)*5,torch.ones(5,5)*16)} ; \\texttt{2 * torch.rand(10,10) - 1}\n\nAnswer:",
"Question: The fact that the infrared absorption frequency of deuterium chloride (DCl) is shifted from that of hydrogen chloride (HCl) is due to the differences in their?\n\nOptions:\nA. electron distribution\nB. dipole moment\nC. force constant\nD. reduced mass\n\nAnswer:",
"Question: Which of the following would increase the rate at which a gas diffuses between the alveoli of the lung and the blood within a pulmonary capillary?\n\nOptions:\nA. Decreasing the partial pressure gradient of the gas\nB. Decreasing the solubility of the gas in water\nC. Increasing the total surface area available for diffusion\nD. Decreasing the rate of blood flow through the pulmonary capillary\n\nAnswer:",
"Question: Statement 1| Besides EM, gradient descent can be used to perform inference or learning on Gaussian mixture model. Statement 2 | Assuming a fixed number of attributes, a Gaussian-based Bayes optimal classifier can be learned in time linear in the number of records in the dataset.\n\nOptions:\nA. True, True\nB. False, False\nC. True, False\nD. False, True\n\nAnswer:",
"Question: Which of the following is a benefit that mycorrhizal fungi confer to many plants?\n\nOptions:\nA. They protect plant roots from desiccation in extremely dry habitats.\nB. They fix nitrogen, which is particularly important for plants in nitrogen-limited habitats.\nC. They provide access to phosphorus, an essential element that is limited in many kinds of soils.\nD. They provide carbon to plants in exchange for fixed nitrogen.\n\nAnswer:",
"Question: _ refers to a model that can neither model the training data nor generalize to new data.\n\nOptions:\nA. good fitting\nB. overfitting\nC. underfitting\nD. all of the above\n\nAnswer:",
"Question: Which of the following is lower for argon than for neon?\n\nOptions:\nA. Melting point\nB. Boiling point\nC. Polarizability\nD. First ionization energy\n\nAnswer:",
"Question: The equation \u0394H = \u0394U + P\u0394V is applicable?\n\nOptions:\nA. always\nB. only for constant pressure processes\nC. only for constant temperature processes\nD. only for constant volume processes\n\nAnswer:",
"Question: Statement 1| The original ResNets and Transformers are feedforward neural networks. Statement 2| The original Transformers use self-attention, but the original ResNet does not.\n\nOptions:\nA. True, True\nB. False, False\nC. True, False\nD. False, True\n\nAnswer:",
"Question: Considering 0.1 M aqueous solutions of each of the following, which solution has the lowest pH?\n\nOptions:\nA. Na2CO3\nB. Na3PO4\nC. Na2S\nD. NaCl\n\nAnswer:",
"Question: Statement 1| Overfitting is more likely when the set of training data is small. Statement 2| Overfitting is more likely when the hypothesis space is small.\n\nOptions:\nA. True, True\nB. False, False\nC. True, False\nD. False, True\n\nAnswer:",
"Question: Which of the following is false?\n\nOptions:\nA. Semantic segmentation models predict the class of each pixel, while multiclass image classifiers predict the class of entire image.\nB. A bounding box with an IoU (intersection over union) equal to $96\\%$ would likely be considered at true positive.\nC. When a predicted bounding box does not correspond to any object in the scene, it is considered a false positive.\nD. A bounding box with an IoU (intersection over union) equal to $3\\%$ would likely be considered at false negative.\n\nAnswer:",
"Question: A frameshift mutation is created when?\n\nOptions:\nA. telomeric sequences are removed from DNA\nB. a codon's nucleotide sequence changes so that it calls for production of a different amino acid than the original one\nC. a base pair is either inserted or deleted in a gene\nD. a codon's nucleotide sequence is changed so that instead of coding for a given amino acid it acts to terminate translation\n\nAnswer:",
"Question: Statement 1| ImageNet has images of various resolutions. Statement 2| Caltech-101 has more images than ImageNet.\n\nOptions:\nA. True, True\nB. False, False\nC. True, False\nD. False, True\n\nAnswer:",
"Question: Statement 1| The training error of 1-nearest neighbor classifier is 0. Statement 2| As the number of data points grows to infinity, the MAP estimate approaches the MLE estimate for all possible priors. In other words, given enough data, the choice of prior is irrelevant.\n\nOptions:\nA. True, True\nB. False, False\nC. True, False\nD. False, True\n\nAnswer:",
"Question: When an influenza virus enters a cell, it immediately starts to do which of the following?\n\nOptions:\nA. Incorporate viral DNA into the host cell\u2019s chromosome\nB. Destroy the host cell\u2019s transcriptional machinery\nC. Replicate its genetic material and synthesize viral proteins\nD. Use a viral copy of reverse transcriptase to manufacture viral DNA\n\nAnswer:",
"Question: The structures that act as sites of gas exchange in a woody stem are the?\n\nOptions:\nA. lenticels\nB. terminal buds\nC. nodes\nD. internodes\n\nAnswer:",
"Question: The magnetic moment (\u03bcI) of an unknown nuclide is 2.884 x 10^-27 J T-1. Given the nuclear spin is known to be 1, identify the unknown nuclide.\n\nOptions:\nA. 14N\nB. 2H\nC. 19F\nD. 6Li\n\nAnswer:",
"Question: Of the following ions, which has the smallest radius?\n\nOptions:\nA. K+\nB. Ca2+\nC. Sc3+\nD. Rb+\n\nAnswer:",
"Question: Which of the following are the spatial clustering algorithms?\n\nOptions:\nA. Partitioning based clustering\nB. K-means clustering\nC. Grid based clustering\nD. All of the above\n\nAnswer:",
"Question: Statement 1| CIFAR-10 classification performance for convolution neural networks can exceed 95%. Statement 2| Ensembles of neural networks do not improve classification accuracy since the representations they learn are highly correlated.\n\nOptions:\nA. True, True\nB. False, False\nC. True, False\nD. False, True\n\nAnswer:",
"Question: Cobalt-60 is used in the radiation therapy of cancer and can be produced by bombardment of cobalt-59 with which of the following?\n\nOptions:\nA. Neutrons\nB. Alpha particles\nC. Beta particles\nD. X-rays\n\nAnswer:",
"Question: We are training fully connected network with two hidden layers to predict housing prices. Inputs are $100$-dimensional, and have several features such as the number of square feet, the median family income, etc. The first hidden layer has $1000$ activations. The second hidden layer has $10$ activations. The output is a scalar representing the house price. Assuming a vanilla network with affine transformations and with no batch normalization and no learnable parameters in the activation function, how many parameters does this network have?\n\nOptions:\nA. 111021\nB. 110010\nC. 111110\nD. 110011\n\nAnswer:"
],
"desc_act": false,
"exllama_config": {
"version": 1
},
"group_size": 128,
"max_input_length": null,
"model_seqlen": null,
"module_name_preceding_first_block": null,
"modules_in_block_to_quantize": null,
"pad_token_id": null,
"quant_method": "gptq",
"sym": true,
"tokenizer": null,
"true_sequential": true,
"use_cuda_fp16": false,
"use_exllama": false
},
"reorder_and_upcast_attn": false,
"resid_pdrop": 0.1,
"scale_attn_by_inverse_layer_idx": false,
"scale_attn_weights": true,
"summary_activation": null,
"summary_first_dropout": 0.1,
"summary_proj_to_labels": true,
"summary_type": "cls_index",
"summary_use_proj": true,
"task_specific_params": {
"text-generation": {
"do_sample": true,
"max_length": 50
}
},
"torch_dtype": "float16",
"transformers_version": "4.41.2",
"use_cache": true,
"vocab_size": 50257