license: mit
Ok so this guy offers this challenge and I don't actually have a lot going on in my life right now. So I'm like fine. Your idea looks interesting. I have no idea why you're spamming it. It does not appear you make any money from this. Why would you offer to pay for our fine-tuning if we don't like the results after fine-tuning on your data? Does this thing trojan horse in some crazy thing that lets you control all robots later even though it improves performance now? I dunno. I don't even know if I'm doing this right. It says fine-tune your model on it. But I don't know if that means make my model first and then fine-tune using his thing or if I can just sprinkle it into mine and cross my fingers? I'm just going to sprinkle in his data and just cross my fingers.
Now, every time I ever learn a new tech that can even conceivably be used to predict the stock market, I try to apply it to it. I fail every single time. It's fine. It's hard to predict the future. I'm going to try to make a bot that tells me things about this pre-revenue space company I'm totally gambling on. I don't know what I hope to achieve from the bot itself. Probably to try to guess the future and predict the stock market duh. The actual point of the bot doesn't matter. He just said if we're not happy with it or quality doesn't improve or something, he'll refund us the training fees we spent. Which to me means I can just trudge into this with no defined goals other than if I feel like I like the outcome and definitely be guaranteed that this thing will be able to tell the future. I didn't see any small print on his offer. Nor did I check. A deal is a deal.
I pulled the data for the company from various places and managed to get myself banned from the official AST Spacemobile website (the company I'm obsessed with) for trying to scrape it (sorry!). I hope that automatically expires at some point. Oh well. It's kinda embarrassing. I own part of the company. And I'm banned from the site. I don't have much from their site obviously but I grabbed a bunch of news and financial data. Maybe not a bunch. About maybe November-ish on. I did the dataset prep for that to turn it into a bot (I know! I know! You all asked for guides in my last post on how to build a dataset and fine-tune for model performance past just format. I PROMISE that's on the way. I'm writing it!) and then converted his dataset CSV into the Alpaca instruct/response form and just kinda manually copy/pasted chunks in-between my data. The internet seems to insist the order doesn't matter, but in my experience the loss can explode if the data differs too much in too large of chunks. You need to randomize a bit if you're training on a flat file like I tend to do. Also his data was in a parquet file and that was a whole thing so here's the code to turn that into the Alpaca format:
```python
import pandas as pd

# Read the Parquet file
df = pd.read_parquet('0000.parquet')

# Open the output file
with open('pfaf.txt', 'w', encoding='utf-8') as f:
    # Iterate through each row in the DataFrame
    for index, row in df.iterrows():
        # Write the instruction and response to the file
        f.write("### Instruction:\n")
        f.write(row['Prompt'] + '\n\n')
        f.write("### Response:\n")
        f.write(row['Response'] + '\n\n')
```
The CSV version had my parsers all vomiting so I had to make that.
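For the "randomize a bit" part, here's a rough sketch of how the mixing could be scripted instead of copy/pasted by hand. To be clear, this is not what I actually ran (I did it manually). The function names `split_records` and `mix_records` and the fixed seed are made up for illustration, and it assumes every record in both flat files starts with a `### Instruction:` line.

```python
import random

MARKER = "### Instruction:\n"

def split_records(text):
    """Split a flat Alpaca-style file into one string per instruction/response pair."""
    parts = text.split(MARKER)
    # parts[0] is whatever comes before the first marker (usually empty), so skip it
    return [MARKER + p.strip() + "\n\n" for p in parts[1:] if p.strip()]

def mix_records(mine, theirs, seed=0):
    """Shuffle both sets of records together so neither dataset sits in one big chunk."""
    combined = list(mine) + list(theirs)
    random.Random(seed).shuffle(combined)  # seeded so the mix is reproducible
    return combined
```

You'd read `pfaf.txt` and your own flat file into strings, run each through `split_records`, pass both lists to `mix_records`, and write the joined result back out. A plain shuffle is the laziest option. If your own data needs to stay in order, you'd want something smarter than this.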
I honestly don't expect this to go well. I'm kinda just doing this as a nerd joke/challenge. I'm not doing anything to improve the chances of success of this data but I think that'll be the best test right? I think? Unless you're supposed to fine-tune on it after you're done. But that'd be bizarre? You'd have all sorts of catastrophic forgetting. I've got a couple of these SME bots on the leaderboard so I'm just going to see how it does there other than I guess just playing with it. If it increases my GSM8K or whatever score it was, I'll be paying attention. My SMEs are crashing and burning on that score for some reason. At least that gives me some sort of hard metric. I submitted it. We'll see. You can check for yourself whenever it finishes. I don't have any of the benchmarks locally. I just dump to the leaderboard as my benchmark. They said they don't mind in one of their posts. The quality is going to be a little questionable since I can't grab their website info. Sounds like a guy-offering-the-guarantee's problem, though. And I fine-tuned on the GPTQ model instead of the FP16 model loaded in 4-bit mode/bf16. Not because there was a technical reason. The GPTQ model just loaded faster. Not my problem if it's a worse idea to train on that. That's a problem for PFAF moneybags over there.
Here. It's done.
I talked to it. It's ok I guess. I'm a little suspicious of its ability to literally tell the future. I'm still not rich and I don't know when I will be. I was expecting to be able to tell the future and not worry about a highly risky investment and all I got was a bot that might score better on objective benchmarks. And I don't even get to find that out until probably tomorrow. Maybe days if the leaderboard breaks again. I'm gonna take the refund I'm thinking. I need the money. Predicting the stock market failed once again. I'm willing to split the liability a little, though. I mean I didn't even ask the guy any of the parameters. I just started doing it. Some of that data was mine. Let's just meet in the middle. Let's figure out the cost structure:
I trained from my workstation. I have 2x 3090's and an AMD 5900x. Chicago power is 15¢/kWh. Each 3090 draws about 350 watts and the rest of the system probably draws maybe 200 watts or so. But then my room gets hot and I have to turn on the overhead fan and kick on the HVAC vent fan with the windows open or else my place gets really hot even in the middle of winter. We'll call it a kilowatt even since we're not billing wear and tear on the cards. I think you have to depreciate those by time anyway and not usage. At least for tax purposes. Anyway, dataset prep and training took about 3 hours in total. Looking at raw data sizes, the PFAF data was about 500 KB and my data around 2.1 MB. So if we calculate that out, we get 3 hours * 1 kW * $0.15/kWh * (500/(2100+500)) = $0.0865 as the portion of the fine-tuning attributable to PFAF (someone check my math. I'm stoned.). I feel like this guy owes me 9 cents, but I'm not gonna be petty about it. You can't give fractions of a penny. We'll call it 8 cents. If the scores don't improve.
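Since the math needed checking, here's a tiny sketch that redoes it. The function name `pfaf_share_usd` is made up, and the inputs are just the rough numbers above (1 kW flat draw, 3 hours, 15¢/kWh, 500 KB vs 2100 KB), so treat it as the same back-of-the-envelope estimate and not real accounting.

```python
def pfaf_share_usd(hours, kw, usd_per_kwh, pfaf_kb, own_kb):
    """Electricity cost of the run, scaled by PFAF's share of the raw training data."""
    energy_cost = hours * kw * usd_per_kwh      # total power bill for the run
    fraction = pfaf_kb / (pfaf_kb + own_kb)     # PFAF's share by raw file size
    return energy_cost * fraction

share = pfaf_share_usd(hours=3, kw=1.0, usd_per_kwh=0.15, pfaf_kb=500, own_kb=2100)
print(f"${share:.4f}")  # prints $0.0865
```

So the math holds up: the whole run cost about 45 cents in power, and a bit under a fifth of that goes on the PFAF tab.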
(We'll see probably tomorrow or so if the leaderboard updates if this dataset does anything worth exploring just by dumping it in as suggested by the guy. Compare it to TacoBeLLM and Palworld-SME-13b on the leaderboard for bots I made similar ways.)