kryptkpr to LocalLLaMA@poweruser.forum · 1 year ago, commenting on "SQLCoder-34b beats GPT-4 at Text-to-SQL":
DeepSeek is not based on any Llama training; it is a 2T-token pretrain of their own, with 16k context. All of this info is at the top of their model card.
kryptkpr to LocalLLaMA@poweruser.forum · 1 year ago
GoLLIE: Guideline-following Large Language Model for Information Extraction (hitz-zentroa.github.io)