Hi,

So I'm learning to build a RAG system with Llama 2 and local embeddings. I have a big CSV of data on books: each row is a book, and the columns are author(s), genres, publisher(s), release date, rating, plus one column holding a brief summary of the book.
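To make the shape of the data concrete, a single row looks roughly like this (a made-up example just for illustration; the real column names differ a bit):

```python
# one made-up row, only to show the shape of the data (not a real entry)
example_row = {
    "author": "Jane Doe",
    "genres": "fantasy; young adult",
    "publisher": "ABC Press",
    "release_date": "2022-05-01",
    "rating": 4.2,
    "summary": "A farm girl discovers an abandoned dragon egg and ...",
}
```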

I'm trying to build an agent to answer questions about this CSV, ranging from basic lookups like

‘what books were published in the last two years?’,

‘give me 10 books from publisher ABC with a rating higher than 3’

to more involved queries that need to read the free-text summary column, like:

‘what books have a girl as the main character?’

‘what books feature dragons? compare their plots’
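The first kind is essentially a structured filter over the table; in pandas that would be something like the sketch below (the file path and column names are just placeholders):

```python
import pandas as pd

df = pd.read_csv("books.csv")  # placeholder path; column names approximate

# "10 books from publisher ABC with a rating higher than 3"
hits = df[(df["publisher"] == "ABC") & (df["rating"] > 3)].head(10)

# "books published in the last two years"
recent = df[pd.to_datetime(df["release_date"]) >= pd.Timestamp.now() - pd.DateOffset(years=2)]
```

It's the second kind, which has to actually read the summaries, where I want the LLM and embeddings to do the work.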

I believe I have the general framework in place, but when I tried running it I hit a token limit error; it seems the file is simply too big to fit into the model's context in one go. I'd love to hear your advice on strategies to get around this. I thought about chunking the CSV, but how to recombine the answers from the individual chunks isn't clear to me.
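For context, the core of what I have is roughly the sketch below (heavily simplified; I'm showing llama-cpp-python as a stand-in for my local Llama 2 setup, and the paths and prompt are placeholders):

```python
import pandas as pd
from llama_cpp import Llama

df = pd.read_csv("books.csv")  # placeholder path

# naive approach: serialize the whole table and stuff it into a single prompt
context = df.to_csv(index=False)

prompt = (
    "Use the book data below to answer the question.\n\n"
    f"{context}\n\n"
    "Question: what books feature dragons? compare their plots\nAnswer:"
)

llm = Llama(model_path="./llama-2-7b-chat.Q4_K_M.gguf")  # placeholder model file
out = llm(prompt, max_tokens=512)  # this is where the token/context limit blows up
print(out["choices"][0]["text"])
```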

Thanks a ton! Cheers :D