kryptkpr to LocalLLaMA@poweruser.forum · 2 years ago, replying in "SQLCoder-34b beats GPT-4 at Text-to-SQL":
DeepSeek is not based on any Llama training; it is their own 2T-token pretrain, with 16k context. All of this info is at the top of their model card.
kryptkpr posted to LocalLLaMA@poweruser.forum · 2 years ago:
GoLLIE: Guideline-following Large Language Model for Information Extraction (hitz-zentroa.github.io)