ShareGPT4V - New multi-modal model, improves on LLaVA (sharegpt4v.github.io)
Posted by CradawxB to LocalLLaMA@poweruser.forum · English · 1 year ago · 17 comments
GeraltOfRigaB · English · 1 year ago
This is kinda nuts (my first time trying an LLM + vision).
Tried it with a first-person shooter screenshot with an enemy on screen. I asked it to give me the 2D coordinates of the enemy, and it did, precisely.