# Starter - GENAI

## Business Requirement

I am a student and I want answers from my textbooks easily so that I can understand the subject better.

## Goal

1.  Understand Vector Database
    
2.  Understand RAG
    
3.  Understand how to productionize a GENAI app to a certain degree
    

## Ingestion

1.  Get all PDFs from [Class 11 Biology PDFs](https://ncert.nic.in/textbook.php?kebo1=0-19)
    
2.  Create a Qdrant vector database
    
3.  Write Python Script to upload the PDFs into the database
    
4.  Do not use frameworks such as Langchain / LLamaIndex
    
5.  Insert Metadata such as Chapter , Page Number when you are inserting data into the Database
    

[Github for Qdrant Code](https://github.com/ambarishg/JUWORKSHOP_JAN_2026/tree/main/QDRANT)

This directory collects helper scripts, configs, and notebooks for working with a Qdrant vector store, ingesting data, and prototyping RAG workflows.

| File Name | File Description |
| --- | --- |
| `.env` | Stores endpoint URLs, API keys, and model settings that `config_qdrant.py` loads for consistent configuration across notebooks. |
| 01\. `config_qdrant.py` | Reads the shared environment variables and exposes a configured `QdrantClient` plus embedding/model metadata. |
| 02\. `connect.ipynb` | Minimal notebook that imports `QdrantClient` and validates the hosted Qdrant connection using the shared config. |
| 03.`create_collection.ipynb` | Defines the BEES collection schema so that later ingestion and search notebooks can store vectors with metadata. |
| 04\. `documents_extraction.ipynb` | Uses LangChain’s `PyPDFLoader` helpers to pull text from PDFs, clean it, and prepare it for embedding. |
| 05\. `ingest.ipynb` | Illustrates iterating over local data sources and pushing documents plus embeddings into the configured Qdrant collection. |
| 06\. `advanced_rag_qdrant.ipynb` | Walks through a multi-step RAG pipeline, combining ingestion, Qdrant vector search, and OpenAI completions. |
| 07\. `hybrid_search_create_collection.ipynb` | Combines the collection creation steps with the hybrid search flow for an all-in-one run. |
| 08\. `hybrid_search.ipynb` | Demonstrates a hybrid vector/text search flow built on the shared Qdrant setup. |
| 09.`universal_hybrid_search_create_collection.ipynb` | Builds a universal collection and immediately runs the universal hybrid search |
| 10.`universal_hybrid_search.ipynb` | Shows universal hybrid search examples that can generalize beyond the Netflix/BEES datasets. |
| 11\. `netflix_hybrid_search_create_collection.ipynb` | Builds the Netflix collection before running a hybrid search scenario tailored to that data. |
| 12\. `netflix.ipynb` | Samples Netflix-specific prompts and retrieval logic against the provided title dataset. |
| `netflix_titles.csv` | Public Netflix title metadata that fuels the Netflix notebooks; includes genres, descriptions, and other columns. |

## Search

1.  User asks a question
    
2.  Use the question to search the Database
    
3.  Use simple text search to get results
    
4.  Use semantic search to get results
    
5.  Use hybrid search to get results
    
6.  Understand the difference between text search / semantic search / hybrid search
    
7.  In the search results show the meta data associated with the result
    
8.  Advanced \[ Implement RRF \]
    

## LLM

1.  User asks a question
    
2.  Understand RAG
    
3.  Use the search results and the LLM to get the answer
    
4.  In the answer , show the portions of the text used to frame the answer \[ the Chapter , Page Number \]
    
5.  Understand how effectively the LLM and RAG is answering the question
    

## UI

1.  Create screens to upload more documents
    
2.  Create screens to have the user ask a question
    
3.  Create a chatbot
    
4.  Create a chatbot with memory
    

## FastAPI

1.  Use FASTAPI to expose the function
    

## Docker

1.  Make a docker image of the FASTAPI
    

## Advanced

* * *

## Langraph

1.  Understand Langraph using the repo [https://github.com/ambarishg/langchain\_and\_langraph](https://github.com/ambarishg/langchain_and_langraph)
    
2.  Inspiration from the **LangGraph Complete Course for Beginners – Complex AI Agents with Python**
    

## Agent Framework

1.  Understand the Agent Framework using the repo [https://github.com/ambarishg/agent-framework](https://github.com/ambarishg/agent-framework)
    
2.  Inspiration from Microsoft Agent framework samples
