What can be more satisfying than a personal AI setup with my own data, answering my questions! Especially on a machine with limited CPU and memory resources.
This isn't a full article, just a quick tip! Use open models such as Mixedbread (mxbai) or Nomic Embed for your embeddings, choose a fast vector database (keep looking for a better one as new ones come out; you can transport all your vectors and reattach them, or craft a custom store), and bingo: start communicating through the API. If you are a business owner and still haven't even entered this room, do it now; it isn't too late. After all, what you need is a machine: the better the GPU, the better the performance and the agents you can build out.
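To give you a taste of that "communicate through the API" step, here is a minimal Python sketch. It assumes Ollama is already running locally on its default port (11434) with nomic-embed-text pulled; the prompt text is just an example.

```python
# Quick check: ask the local Ollama server for one embedding.
# Assumes `ollama serve` is running and `ollama pull nomic-embed-text` was done.
import requests

resp = requests.post(
    "http://localhost:11434/api/embeddings",  # Ollama's default local endpoint
    json={"model": "nomic-embed-text", "prompt": "Hello, local AI!"},
)
resp.raise_for_status()
vector = resp.json()["embedding"]  # a plain list of floats
print(len(vector), vector[:5])     # dimension, plus a peek at the values
```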
I will write more soon, with practical examples. Let me know your opinions, suggestions, and recommendations on local AI servers, and more specifically on embedding models you might have tested with Ollama.
Simple steps are:
- Install Ollama on your local machine
- Pull a good model that supports a larger context and is built specifically for embeddings, e.g.:
- mxbai-embed-large 334M
- nomic-embed-text 137M
- all-minilm 23M
- Install a vector database server
- Call the embeddings API on localhost
- Store the obtained vectors (see the sketch after this list)
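Here is a minimal end-to-end sketch of those last two steps in Python, assuming the default Ollama port and the /api/embeddings endpoint; the chunk texts and helper names are purely illustrative. In a real setup you would hand the vectors to your vector database rather than a Python list, but the flow is the same.

```python
# Sketch: embed text chunks via a local Ollama server, then search them
# by cosine similarity. Illustrative only; chunks and names are made up.
import numpy as np
import requests

OLLAMA_URL = "http://localhost:11434/api/embeddings"  # default local endpoint

def embed(text: str, model: str = "nomic-embed-text") -> np.ndarray:
    """Fetch one embedding vector from the local Ollama server."""
    resp = requests.post(OLLAMA_URL, json={"model": model, "prompt": text})
    resp.raise_for_status()
    return np.asarray(resp.json()["embedding"], dtype=np.float32)

# "Store obtained vectors": here, just a list of (text, vector) pairs.
chunks = [
    "Ollama serves models on localhost:11434 by default.",
    "Embeddings turn text into vectors you can compare.",
    "A vector database indexes those vectors for fast search.",
]
store = [(c, embed(c)) for c in chunks]

def search(query: str, top_k: int = 2):
    """Rank stored chunks by cosine similarity to the query."""
    q = embed(query)
    q /= np.linalg.norm(q)
    scored = [(text, float(v @ q / np.linalg.norm(v))) for text, v in store]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:top_k]

print(search("Where does Ollama listen?"))
```

The list-based store is fine for a few thousand vectors; swap it for a real vector database once the brute-force cosine scan becomes the bottleneck.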
Medium has an article on this if your programmer is having difficulty implementing it:
https://medium.com/@ankitmaurya1994/url-shortener-with-vector-embeddings-ollama-8904cb227186
Learn more about free models here: https://ollama.com/blog/embedding-models
See you all soon. Let me know if you want me to write about something specific or explain it in simple terms.