Dynamic LoRAs -- Crazy idea?

BrainSlugs83 · 1 year ago

Dynamic LoRAs -- Crazy idea?

FPham · 1 year ago

Are you saying you want a model that will spit out LoRAs? Like, “Please generate me a LoRA that will make you totally amazing”?

If so, this is more in the realm of the Star Trek food replicator, a.k.a. it works amazingly on a TV screen.

If not, then sorry.

The closest thing to this would be a model that picks the correct LoRA needed to reply. Adapters can easily be switched on the fly, so a model could be built that calls a function to select the correct adapter. Maybe this is how ChatGPT works, maybe not.
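
For illustration, here is a minimal sketch of that "call a function to select the adapter" idea. Everything in it is an assumption made up for the example: the `Adapter` class, the adapter names, and the keyword-overlap routing rule are hypothetical, and a real system would more likely use a learned classifier or the model's own function call to pick the adapter.

```python
from dataclasses import dataclass


@dataclass
class Adapter:
    name: str
    keywords: tuple  # hypothetical routing hints, not part of any real LoRA format


# Hypothetical adapters; in practice each would point at a set of LoRA weights.
ADAPTERS = [
    Adapter("code", ("python", "function", "bug")),
    Adapter("medical", ("symptom", "dose", "diagnosis")),
]
DEFAULT = Adapter("general", ())


def select_adapter(prompt: str) -> Adapter:
    """Pick the adapter whose keywords overlap the prompt the most."""
    words = set(prompt.lower().split())
    best, best_score = DEFAULT, 0
    for adapter in ADAPTERS:
        score = len(words & set(adapter.keywords))
        if score > best_score:
            best, best_score = adapter, score
    return best  # falls back to DEFAULT when nothing matches
```

The serving code would then activate whichever adapter `select_adapter` returns before generating the reply.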

BrainSlugs83 · 1 year ago

No, I’m not advocating for creating a text-to-LoRA model. That would be a neat project, but I think you’d have a monumental training task on your hands, and really, it just doesn’t seem that practical. Fine-tuning isn’t expensive enough to merit trying to train or build that network anyway, so “the juice wouldn’t be worth the squeeze”.

Picking the correct LoRA for a response is basically what an MoE (Mixture of Experts) system does.

What I’m proposing is training a regular LLM to occasionally spit out tokens that signal another ML network to run, which then makes minor runtime adjustments to the current LoRA to keep it “on track”.

Like a thousand tiny micro-adjustments over the course of a long conversation. These could be used to shift the current latent space into one where the model has an “intuitive” or “latent” understanding of much of what is currently in the context, so that the actual context and attention tokens could be freed up for later use.

Basically, if the network is already in the optimal LoRA, the ML network would just spit out an identity tensor for the LoRA (a no-op adjustment) so that it never changes.

But when the LLM realizes it’s no longer in the realm of its current latent space, it spits out a special “think-harder” token, which signals the ML network to run.

The ML network takes the current context, pushes it into a weighted, vectorized embedding that is representative of the current “state”, and spits out a tensor that makes micro-adjustments to the LoRA / PEFT adapter.

That was one application of this that I was proposing.
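
The loop described above can be sketched roughly as follows. This is a toy under stated assumptions, not a working method: the `<think-harder>` token string, the `HyperNet` class (a fixed random projection standing in for a trained network), and the `embed_context` hashing trick are all hypothetical. The adjustment is written as an additive nudge to the LoRA `B` matrix, so in this formulation the "no change" output corresponds to an all-zero nudge.

```python
import random

THINK_HARDER = "<think-harder>"  # hypothetical trigger token


def matmul(x, y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


class LoRA:
    """Low-rank update: effective weight = W + B @ A."""

    def __init__(self, d_out, d_in, rank, rng):
        self.A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]  # zero init: starts as a no-op

    def delta(self):
        return matmul(self.B, self.A)


class HyperNet:
    """Toy stand-in for the adjusting network: state vector -> small nudge for B."""

    def __init__(self, d_state, d_out, rank, rng, step=0.01):
        self.step, self.shape = step, (d_out, rank)
        self.proj = [[rng.gauss(0, 1) for _ in range(d_state)]
                     for _ in range(d_out * rank)]

    def __call__(self, state):
        flat = [self.step * sum(w * s for w, s in zip(row, state)) for row in self.proj]
        d_out, rank = self.shape
        return [flat[i * rank:(i + 1) * rank] for i in range(d_out)]


def embed_context(tokens, d_state):
    """Hypothetical context summary: a hashed bag-of-tokens vector."""
    vec = [0.0] * d_state
    for tok in tokens:
        vec[sum(map(ord, tok)) % d_state] += 1.0
    return vec


def maybe_adjust(lora, hypernet, tokens, d_state):
    """Apply one micro-adjustment only if the last token is the trigger."""
    if not tokens or tokens[-1] != THINK_HARDER:
        return False  # no trigger: the adapter is left unchanged
    nudge = hypernet(embed_context(tokens[:-1], d_state))
    lora.B = [[b + n for b, n in zip(b_row, n_row)]
              for b_row, n_row in zip(lora.B, nudge)]
    return True
```

Calling `maybe_adjust` after every generated token would leave the adapter alone most of the time and nudge it only when the trigger appears. Training the adjusting network so those nudges are actually useful is the hard part, and this sketch does not address it.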

lans_throwaway · 1 year ago

Isn’t that just a hypernetwork? It’s been done before, e.g. for Stable Diffusion.

BrainSlugs83 · 1 year ago

Neat, that’s really similar to what I was thinking of. I know SD is transformer-based, but has anyone done this with LLMs?