Hi everyone, I thought I'd write a post here to bounce some ideas around.

I have multiple clauses, and I want to extract the object of interest from each one. For example, this is a clause: “Exit staircases shall be constructed of non-combustible materials to comply with the provisions of Cl.3.10.1.” Obviously, the object of interest here is ‘exit staircases’ or ‘staircases’, so I want that to be extracted. Here is another clause: “No structure or building shall be constructed within a sewer.” Now, there are multiple objects in that clause (i.e. structure, building, and sewer), but it is also obvious that the object of interest is ‘sewer’.

I ran this through GPT-3.5, and it works: the model returns the object of interest pretty accurately. However, is it possible to generate the responses in bulk for my huge list of clauses, instead of entering the prompt by hand for each one? In other words, given a list of clauses, how can I use an LLM to get back the object of interest for each clause without prompting it manually?
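
To make the question concrete, here is a rough sketch of the kind of batch loop I have in mind. It is only an illustration: `ask_llm`, `fake_llm`, and `PROMPT_TEMPLATE` are names I made up, and `fake_llm` is a stand-in that would have to be replaced by a real call to whichever LLM API is used.

```python
from typing import Callable, Dict, List

# Hypothetical prompt wording; adjust to whatever worked in the manual tests.
PROMPT_TEMPLATE = (
    "Extract the single object of interest from the clause below. "
    "Reply with the object only.\n\nClause: {clause}"
)


def extract_objects(
    clauses: List[str], ask_llm: Callable[[str], str]
) -> Dict[str, str]:
    """Send one prompt per clause and map each clause to the model's reply."""
    results: Dict[str, str] = {}
    for clause in clauses:
        prompt = PROMPT_TEMPLATE.format(clause=clause.strip())
        # Normalise the reply a little so later grouping is easier.
        results[clause] = ask_llm(prompt).strip().lower()
    return results


def fake_llm(prompt: str) -> str:
    # Stand-in for a real API call (assumption): returns canned answers
    # so the loop can be run without network access.
    return "Sewer" if "sewer" in prompt else "Exit staircases"


clauses = [
    "Exit staircases shall be constructed of non-combustible materials "
    "to comply with the provisions of Cl.3.10.1.",
    "No structure or building shall be constructed within a sewer.",
]
objects = extract_objects(clauses, fake_llm)
for clause, obj in objects.items():
    print(obj, "<-", clause)
```

Passing the model call in as a function keeps the loop independent of any particular provider, so swapping `fake_llm` for a real client should not require changing the rest.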

Also, is this the correct/ideal way to extract the object of interest from a huge list of clauses? My final goal is to cluster similar objects of interest and see which clauses each of them is linked to (via some kind of RAG approach), and then build some kind of knowledge graph from that. Do you think my method is the right approach?
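
For the clustering step, this is roughly what I picture, as a minimal sketch only. It assumes the objects of interest have already been extracted, and it uses plain string similarity from Python's `difflib` as a stand-in for whatever embedding-based clustering would be used in practice. The function name `cluster_objects` and the 0.8 threshold are arbitrary choices of mine.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def cluster_objects(
    clause_to_object: Dict[str, str], threshold: float = 0.8
) -> Dict[str, List[str]]:
    """Greedily group clauses whose objects of interest look alike.

    Returns a mapping from a representative object to the clauses linked to it.
    String similarity is only a placeholder for a semantic similarity measure.
    """
    clusters: Dict[str, List[str]] = {}
    for clause, obj in clause_to_object.items():
        key = obj.strip().lower()
        for representative in clusters:
            score = SequenceMatcher(None, key, representative).ratio()
            if score >= threshold:
                clusters[representative].append(clause)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            clusters[key] = [clause]
    return clusters


extracted = {
    "clause A": "staircase",
    "clause B": "Staircases",
    "clause C": "sewer",
}
graph = cluster_objects(extracted)
for obj, linked in graph.items():
    print(obj, "->", linked)
```

Each key in the result would become a node in the knowledge graph, with edges to the clauses listed under it.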

Thanks!