---
license: bigcode-openrail-m
---

Note: The adapter and related GLaDOS code are licensed under Apache 2.0; however, the base model is licensed under bigcode-openrail-m. Since this adapter utilizes the base model, you must still adhere to the OpenRAIL license. As such, I have marked openrail as the license for this model, since it _effectively_ is.

# GLaDOS speaks Markdown!

## Usage

To use this model, first navigate to the bigcode starcoder model and accept its license, then create an access token for your account and update the code below with it (a sketch that reads the token from an environment variable instead is included at the end of this card).

```
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

# Setup Model
path = "JamesConley/glados_starcoder"
token = "PUT YOUR TOKEN HERE"  # your Hugging Face access token
config = PeftConfig.from_pretrained(path)
base_model_path = config.base_model_name_or_path
model = AutoModelForCausalLM.from_pretrained(base_model_path, torch_dtype=torch.float16, device_map="auto", use_auth_token=token)
model = PeftModel.from_pretrained(model, path, device_map="auto")

# Setup Tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_path, truncation_side="left", use_auth_token=token)

# Encode Tokens
text = """User :\nWrite a python function that trains a classifier on data loaded from a csv.\nGLaDOS :\n"""
input_ids = tokenizer(text, return_tensors="pt", truncation=True, max_length=2048).input_ids

# Move tokens to GPU
# input_ids = input_ids.to("cuda")

# Perform Inference
with torch.no_grad():
    with torch.cuda.amp.autocast():
        gen_tokens = model.generate(
            input_ids=input_ids,
            max_new_tokens=256
        )

# Decode Tokens
gen_text = tokenizer.batch_decode(gen_tokens)
```

## Result

Note: The backslash (`\`) characters are not actually in the model output; I have added them to prevent the model card from rendering the code block incorrectly.

```
User :
Write a python function that trains a classifier on data loaded from a csv.
GLaDOS :
Here is an example of a function that trains a classifier on data loaded from a CSV file:
\``` python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

def train_classifier(data_file, target_column, test_size=0.2):
    # Load the data from the CSV file
    data = pd.read_csv(data_file)

    # Split the data into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(data.drop(target_column, axis=1), data[target_column], test_size=test_size)

    # Train the classifier
    clf = RandomForestClassifier()
    clf.fit(X_train, y_train)

    # Return the trained classifier and the test set predictions
    return clf, clf.predict(X_test)
\```
This function takes in the following arguments:

* `data_file`: the path to the CSV file containing the data
* `target_column`: the name of the column in the CSV file that contains the target variable
```

## Markdown rendered output

User :
Write a python function that trains a classifier on data loaded from a csv.
GLaDOS :
Here is an example of a function that trains a classifier on data loaded from a CSV file:

``` python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

def train_classifier(data_file, target_column, test_size=0.2):
    # Load the data from the CSV file
    data = pd.read_csv(data_file)

    # Split the data into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(data.drop(target_column, axis=1), data[target_column], test_size=test_size)

    # Train the classifier
    clf = RandomForestClassifier()
    clf.fit(X_train, y_train)

    # Return the trained classifier and the test set predictions
    return clf, clf.predict(X_test)
```

This function takes in the following arguments:

* `data_file`: the path to the CSV file containing the data
* `target_column`: the name of the column in the CSV file that contains the target variable
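
The usage snippet above hard-codes the access token as a string. As an alternative, here is a minimal sketch of reading it from an environment variable instead; the variable name `HF_TOKEN` is an arbitrary choice for this example, not something the model or libraries require.

```
import os

# Read the Hugging Face access token from an environment variable
# (HF_TOKEN is an arbitrary name chosen for this example).
token = os.environ.get("HF_TOKEN")
if token is None:
    raise RuntimeError("Set the HF_TOKEN environment variable to your Hugging Face access token.")
```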
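
Also note that `tokenizer.batch_decode(gen_tokens)` in the usage snippet decodes the prompt together with the completion, which is why the result above echoes the `User :` text. If you only want the newly generated text, a minimal sketch (assuming the `gen_tokens` and `input_ids` variables from the snippet above) is:

```
# Slice off the prompt tokens so only the newly generated tokens are decoded.
new_tokens = gen_tokens[:, input_ids.shape[1]:]
gen_text = tokenizer.batch_decode(new_tokens, skip_special_tokens=True)
print(gen_text[0])
```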