rovr beat me to it below. Here are more links: https://jacobsgill.es/phdobtained (fun fact: because my thesis contains published papers, I am in breach of a few journals' copyright by uploading my own thesis PDF, but fuck 'em).
The LLM approaches were evaluated on my own time and not published (I left research after obtaining my PhD).
> because my thesis contains published papers, ..., but f 'em
Excluding the part in the middle because I don't want to repost the potentially problematic bit for you. I just wanted to comment that that is terrible. People often talk about the siloed nature of research in industry without considering that academia supports the draconian publishing system. I understand IP protection, but IP protection doesn't have to mean no access. This is such a huge issue in the bio world (biostats, genetics, etc.).
I don't know your circumstances, but you often retain the right to distribute a "postprint", i.e. the final text as published but without the journal formatting. A dissertation should fit that definition.
This is indeed often the case; however, my university reviews each thesis and deemed that mine can only become open access in 2026 (five years after the defense).
I think this is the default policy here for theses based on publications.
Thank you for the link! And congratulations on obtaining your PhD.
I have skimmed through it, and it's truly amazing how good annotation of the dataset can lead to impressive results.
I apologise in advance if the question seems ignorant: the blog post talked about fine-tuning models online. Given that BERT models can run comfortably even on iPhone hardware, were you able to fine-tune your models locally, or did you have to do it online too? If so, are there any products that you recommend?
Thanks! The fine-tunes were done in 2019-21 on a 4xV100 server with hyperparameter search, so thousands of individual fine-tuned models were trained in the end. I used Weights & Biases for dashboarding the hyperparameter search experiments, but the hardware was our own GPU server (no cloud service used).
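For anyone curious what that kind of setup looks like in practice, here is a minimal sketch, not my actual code: a Hugging Face Trainer fine-tune looped over a small grid, with each run logged to Weights & Biases. The checkpoint, dataset, and hyperparameter values are placeholders.

```python
# Sketch of a grid-style hyperparameter search over BERT fine-tunes,
# logging each run to Weights & Biases. Values below are illustrative only.
import itertools

import wandb
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "bert-base-uncased"   # placeholder checkpoint
dataset = load_dataset("imdb")     # placeholder classification dataset
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

dataset = dataset.map(tokenize, batched=True)

# Tiny illustrative grid; a real search would cover far more combinations.
grid = itertools.product([2e-5, 3e-5, 5e-5], [16, 32])

for lr, batch_size in grid:
    run = wandb.init(project="bert-finetune-sweep",
                     config={"lr": lr, "batch_size": batch_size})
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
    args = TrainingArguments(
        output_dir=f"runs/lr{lr}-bs{batch_size}",
        learning_rate=lr,
        per_device_train_batch_size=batch_size,
        num_train_epochs=3,
        report_to="wandb",          # stream metrics to the W&B dashboard
    )
    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=dataset["train"],
        eval_dataset=dataset["test"],
        tokenizer=tokenizer,        # enables dynamic padding via the default collator
    )
    trainer.train()
    run.finish()
```

Each loop iteration is one fully trained model; with a larger grid and multiple GPUs this is how the run count climbs into the thousands.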
I doubt you can fine-tune BERT-large on a phone. A quantized, inference-optimised pipeline can be leaps and bounds more efficient, and it is not comparable to the Hugging Face training pipelines on full models that I ran at the time. For non-adapter-based training you ideally need GPUs.
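To illustrate the adapter point: with adapter-style methods such as LoRA you only train a small set of injected weights rather than the full model, which is what makes lighter hardware plausible. A minimal sketch using the PEFT library follows; the checkpoint and LoRA settings are examples, not my setup.

```python
# Sketch of adapter-style (LoRA) fine-tuning with PEFT: only the low-rank
# adapter weights are trainable, not the full BERT parameters.
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,   # sequence classification task
    r=8,                          # rank of the injected adapter matrices
    lora_alpha=16,
    lora_dropout=0.1,
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically a small fraction of the full model

# The wrapped model can then be passed to the same Trainer loop as a full fine-tune.
```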
This is really cool -- thanks for posting it! I'll have to skim through it at some point, since a lot of my work is in classification models and mirrors the results you've seen.