
>By fine-tuning only the adapter layers, the original parameters of the base pre-trained model remain unchanged, preserving the general knowledge of the model while tailoring the adapter layers to support specific tasks.

From my ML-noob understanding of this, does this mean that the final matrix is regularly fine-tuned instead of fine-tuning the main model? Is this similar to how ChatGPT now remembers memory[1]?

[1] https://help.openai.com/en/articles/8590148-memory-faq



The base model is frozen; only the smaller adaptor matrices are fine-tuned on new data. During inference, the weights from the adaptor matrices "shadow" the weights in the base model. Since the adaptor matrices are much smaller, fine-tuning them is quite efficient.

The advantage of the adaptor matrices is you can have different sets of adaptor matrices for different tasks, all based on the same base model.
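To make the "shadowing" concrete, here's a minimal NumPy sketch of a LoRA-style layer. The dimensions and rank are made up for illustration: the frozen weight W stays untouched, and the adaptor's low-rank product is simply added on top during the forward pass.

```python
import numpy as np

d_in, d_out, r = 1024, 1024, 8  # hypothetical layer size and LoRA rank

rng = np.random.default_rng(0)
W = rng.standard_normal((d_in, d_out))     # frozen base weight, never updated
A = rng.standard_normal((d_in, r)) * 0.01  # trainable adaptor matrix
B = np.zeros((r, d_out))                   # B starts at zero, so the adaptor is a no-op at init

x = rng.standard_normal((1, d_in))

# Base forward pass vs. adapted forward pass: the low-rank delta A @ B
# "shadows" W without modifying it.
y_base = x @ W
y_adapted = x @ W + x @ A @ B

# Trainable parameters: r * (d_in + d_out) instead of d_in * d_out.
print(W.size)           # 1048576 frozen base parameters
print(A.size + B.size)  # 16384 adaptor parameters (~1.6%)
```

Only A and B would receive gradients during fine-tuning, which is why training them is so much cheaper than updating the full weight matrix.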


ChatGPT memory is just a database with everything you told it to remember.

Low Rank Adaptors (LoRA) are a way of changing the function of a model by only having to load a delta for a tiny percentage of the weights rather than all the weights for an entirely new model.
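A back-of-envelope calculation shows how tiny the delta is. All numbers here are hypothetical (a 7B-parameter model at fp16, rank-16 LoRA applied to 4 projection matrices of size 4096x4096 in each of 32 layers), not Apple's actual configuration:

```python
# Storage for one extra full model copy vs. one LoRA delta.
full_model_bytes = 7_000_000_000 * 2  # 7B params at 2 bytes each (fp16)

# Each adapted weight gets two small matrices: A (4096 x 16) and B (16 x 4096).
lora_params = 32 * 4 * 2 * 4096 * 16
lora_bytes = lora_params * 2

print(full_model_bytes / 1e9)  # ~14.0 GB per extra full model
print(lora_bytes / 1e6)        # ~33.6 MB per LoRA
```

So dozens of task-specific LoRAs cost roughly as much as a few percent of one extra model.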

No fine-tuning is going to happen on Apple computers or phones at any point. They are just swapping out Apple's pre-made LoRAs so that they can store one LLM and dozens of LoRAs in a fraction of the space it would take to store dozens of LLMs.
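Swapping pre-made LoRAs amounts to picking a different pair of small matrices at inference time while the base weights stay resident. A toy sketch (the task names and dimensions are invented for illustration):

```python
import numpy as np

d, r = 512, 4
rng = np.random.default_rng(1)
W = rng.standard_normal((d, d))  # one shared frozen base weight

# Hypothetical pre-made adaptors, one per task; each is just two small matrices.
adapters = {
    "summarize": (rng.standard_normal((d, r)) * 0.01,
                  rng.standard_normal((r, d)) * 0.01),
    "proofread": (rng.standard_normal((d, r)) * 0.01,
                  rng.standard_normal((r, d)) * 0.01),
}

def forward(x, task):
    # Selecting a task swaps in its adaptor; the base model never changes.
    A, B = adapters[task]
    return x @ W + x @ A @ B

x = rng.standard_normal((1, d))
out_summarize = forward(x, "summarize")
out_proofread = forward(x, "proofread")
```

Loading a different adaptor means moving a few small matrices, not reloading billions of base weights, which is what makes storing one LLM plus many LoRAs practical on-device.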



