RAG Disappointment, and RAG Success with Azure OpenAI
Recently, I have been working on programming small modules for Retrieval-Augmented Generation (RAG) using OpenAI. I also recently completed a Coursera class on advanced RAG and have taken in several videos and posts on the topic. Working from samples in those sources, I tested various LLMs at generating simple Python to perform RAG with OpenAI. In general, I have been disappointed with the outcomes, until I tried Azure OpenAI with Azure AI Search's semantic search.
Disappointment
My general disappointment has come from the self-coded RAG attempts on a single PDF. The basic approach was:
- take a single PDF (not a small one: about 43,000 tokens when uploaded whole to the OpenAI API) and extract the text using PyPDF
- chunk the text, at best with a recursive text splitter and/or sentence-transformer-based splitting; sometimes just a naive character-count split
- embed the chunks, trying Chroma's default embedding function or OpenAI's text-embedding-3-small
- query the collection using Chroma or FAISS, and in one instance also assemble a simple augmented prompt
- call the LLM with the initial prompt plus the context returned by the embedding query
- the prompt asked for a list of 10 principles in the document, all of which were outlined in a single paragraph
It's nothing sophisticated by any stretch, but it matched the examples I had available; a minimal sketch of the pipeline follows.
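For concreteness, here is a minimal sketch of that kind of pipeline, assuming the pypdf, chromadb, and openai packages and an OPENAI_API_KEY in the environment. The file name, chunk sizes, retrieval count, and the gpt-4o-mini model are illustrative assumptions; my actual attempts varied these details.

```python
# Minimal single-PDF RAG sketch: extract, chunk, embed, retrieve, generate.
from pypdf import PdfReader
import chromadb
from openai import OpenAI

client = OpenAI()

# 1. Extract the text from the PDF ("doc.pdf" is a placeholder name).
reader = PdfReader("doc.pdf")
text = "\n".join(page.extract_text() or "" for page in reader.pages)

# 2. Naive fixed-size character chunking (the simplest variant I tried).
chunk_size, overlap = 1000, 200
chunks = [text[i : i + chunk_size] for i in range(0, len(text), chunk_size - overlap)]

# 3. Embed the chunks with text-embedding-3-small and store them in Chroma.
def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [d.embedding for d in resp.data]

collection = chromadb.Client().create_collection("pdf_chunks")
collection.add(
    ids=[str(i) for i in range(len(chunks))],
    documents=chunks,
    embeddings=embed(chunks),
)

# 4. Retrieve the nearest chunks for the question.
question = "List the 10 principles outlined in the document."
results = collection.query(query_embeddings=embed([question]), n_results=5)
context = "\n\n".join(results["documents"][0])

# 5. Call the LLM with the question plus the retrieved context.
answer = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(answer.choices[0].message.content)
```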
The results were abysmal. I'm not really surprised; I'm not sure how such an approach could do very well with simple chunking, embeddings, and what amounts to nearest-neighbor similarity lookup. But since these were supposedly working examples of RAG, I expected better results. There were only two times I received good results. One of those I'll outline below; the other was when I didn't parse or embed at all and just passed the entire document as context. Of course, the latter worked well, but that was not the exercise I was after.
Success
My successful attempt came when I didn't write the code directly but used the Microsoft Azure OpenAI playground. I have no doubt that coding it up would have worked just as well, since it relies on the Azure AI infrastructure and the code would be little more than passing a prompt to the Azure LLM instance and getting back the results (a sketch of that call follows below). Here is what it consisted of:
- set up an Azure OpenAI instance using the gpt-4o-mini model and, I think, the text-embedding-3-small embedding model
- set up an Azure AI Search instance with semantic search enabled, with indexing pointed at a blob storage container holding a single PDF
I think that was it. I then went to the Azure OpenAI Studio playground, grounded the model with the search instance, provided my simple prompt, and got back the desired results. Was it the semantic search that made it work well? I suspect it helped a lot. I need to try it without semantic search and see what happens. Sorry, I forgot to try that scenario.
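For the curious, here is roughly what the coded version would look like, using the Azure OpenAI chat completions API with the "on your data" Azure AI Search extension. The environment variable names, index name, and semantic configuration name are placeholder assumptions; I did not run this exact code.

```python
# Sketch of grounding an Azure OpenAI chat call with Azure AI Search.
# Endpoints, key variables, the index name, and the semantic
# configuration name below are assumptions, not my actual setup.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-15-preview",
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # the deployment name, not the base model name
    messages=[{"role": "user", "content": "List the 10 principles in the document."}],
    extra_body={
        "data_sources": [
            {
                "type": "azure_search",
                "parameters": {
                    "endpoint": os.environ["AZURE_SEARCH_ENDPOINT"],
                    "index_name": "single-pdf-index",
                    "authentication": {
                        "type": "api_key",
                        "key": os.environ["AZURE_SEARCH_KEY"],
                    },
                    # Semantic ranking; switch to "simple" to test without it.
                    "query_type": "semantic",
                    "semantic_configuration": "default",
                },
            }
        ]
    },
)
print(response.choices[0].message.content)
```

Flipping `query_type` between "semantic" and "simple" would be the quick way to run the comparison I skipped.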
Recap
All in all, I was very disappointed with the RAG results as I coded them, especially since they were based on published examples or AI-generated single-document RAG code. But I was very pleased with the Azure test, and I think the semantic search made all the difference.