Adventures with the Anxiety Machine Pt. 2
So, last week we updated the Anxiety Machine - aka the experimental LLM that we are running at the university and that I bully from time to time.
The good news? The Anxiety Machine is now at close to 100% when it comes to getting papers right that were published in English up until Q3 of 2024, and in most cases it can even produce the DOI. That is really fucking good. I cannot emphasize enough how good that is. This is a massive help in research. Being able to sit there and go: "Hey, I have been wondering about this and that, are there papers on that?" and have it give you both papers and possible keywords to search for? Yeah, fuck. That is great.
We also found one way to almost certainly send the Anxiety Machine into one of its anxiety attacks: letting it guess and giving it only one chance to guess right. It is fine as soon as you tell it "three chances". It will do okay with that. But as soon as it only has one guess? Yeah, that machine is going to spiral into a loop of "AAAAAAAAAAAH! I HAVE TO GET IT RIGHT! I AM SO INTELLIGENT!!!!" I crashed several instances of it by trying to push it to make a single guess.
Something that is also a bit frustrating with it is that we have one training package that we just purchased - and that very clearly includes pretty much all the content from the fandom.com wikis, AO3, and Reddit. Right now that machine (which is running locally on the university's servers) has 4 data sets in it: the first is just Creative Commons and public domain stuff that is offered by Google, I think; the second is one that was purchased by the uni in November last year (and clearly includes a lot of crawled content); the third is an academic set that has been put out by Stanford, I think; and the last one is one that the research team at the uni created for themselves.
I mean, it is kinda funny to play around and poke the thing. And as noted, the machine is really fucking good by now at getting us the bloody papers. That is genuinely a really good use case for it.



















