Chat with PDFs
Hi all!
I tried to write a chatbot for asking questions about an uploaded PDF. If the PDF is too long (e.g. 70 pages) I get this error:
BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 16385 tokens. However, your messages resulted in 28870 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}
The error arises after I have created the FAISS DB, when I ask the first question.
I am using Streamlit and LangChain.
Here is my function for chunking:
-------
from PyPDF2 import PdfReader
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS


def get_pdf_text(pdf):
    pdf_reader = PdfReader(pdf)
    text = ""
    for page in pdf_reader.pages:
        text += page.extract_text()
    return text


def get_text_chunks(text):
    # Note: the separator must be "\n", not "/n" -- "/n" never occurs in the
    # text, so nothing is split and the whole PDF becomes one oversized chunk.
    text_splitter = CharacterTextSplitter(
        separator="\n", chunk_size=1000, chunk_overlap=200
    )
    chunks = text_splitter.split_text(text)
    return chunks


def create_db(text, embedding=OpenAIEmbeddings()):
    # Convert the document chunks to embeddings and save them to the vector store
    vectordb1 = FAISS.from_texts(text, embedding)
    return vectordb1
-----------
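For context on why the separator matters so much: a `CharacterTextSplitter` splits on the separator first and only then packs pieces into chunks, so a separator that never appears in the text yields a single giant chunk. Below is a rough, library-free sketch of that packing logic (`simple_split` is my own illustration, not LangChain code):

```python
# Illustrative stand-in for CharacterTextSplitter (not the real library code):
# split on the separator, then greedily pack pieces into chunks under a limit.
def simple_split(text, separator="\n", chunk_size=1000):
    pieces = text.split(separator)
    chunks, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # pieces longer than chunk_size stay as oversized chunks,
            # much as the real splitter only warns about them
            current = piece
    if current:
        chunks.append(current)
    return chunks


doc = "\n".join("line %d of a long document" % i for i in range(500))
print(len(simple_split(doc, separator="\n", chunk_size=1000)))  # many small chunks
print(len(simple_split(doc, separator="/n", chunk_size=1000)))  # prints 1: one giant chunk
```

With `"/n"` the splitter finds nothing to split on, and that single huge chunk is what later blows past the model's context window.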
Here is my retrieval chain:
from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory


def get_conversation_chain(vectorstore):
    llm = ChatOpenAI()
    # llm = HuggingFaceHub(repo_id="google/flan-t5-xxl",
    #                      model_kwargs={"temperature": 0.5, "max_length": 512})
    memory = ConversationBufferMemory(
        memory_key='chat_history', return_messages=True)
    conversation_chain = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=vectorstore.as_retriever(),
        memory=memory
    )
    return conversation_chain
How can I reduce the number of tokens?
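One lever is capping how much the retriever returns: LangChain's `as_retriever` accepts `search_kwargs` (e.g. `vectorstore.as_retriever(search_kwargs={"k": 3})`), so only the top-k chunks plus the chat history go into the prompt. Below is a rough, library-free sketch of the underlying token-budget idea; `select_chunks_for_budget` and the ~4-characters-per-token estimate are my own illustrative assumptions, not LangChain APIs:

```python
CONTEXT_LIMIT = 16385  # gpt-3.5-turbo context window, per the error message
CHARS_PER_TOKEN = 4    # rough rule of thumb for English text


def estimate_tokens(text):
    # Crude character-based estimate; a tokenizer such as tiktoken gives exact counts.
    return len(text) // CHARS_PER_TOKEN + 1


def select_chunks_for_budget(chunks, budget_tokens):
    # Hypothetical helper: keep retrieved chunks (assumed already ranked by
    # relevance) until the running token estimate would exceed the budget.
    selected, used = [], 0
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget_tokens:
            break
        selected.append(chunk)
        used += cost
    return selected


retrieved = ["x" * 4000] * 10  # ten chunks of roughly 1000 tokens each
kept = select_chunks_for_budget(retrieved, budget_tokens=3000)
print(len(kept))  # prints 2: only the chunks that fit the budget
```

The same budgeting applies to the chat history: `ConversationBufferMemory` keeps the full transcript, so long conversations also eat into the limit.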
Elena Guzzon
Data Alchemy
skool.com/data-alchemy