As an experiment for this week, I decided this morning to get a DeepSeek R1 model running on my local dev machine. It was a whim: I went slowly, step by step, through whatever I felt like implementing, with only around 7 hours of quick research along the way to resolve small issues I found in my logic. Running DeepSeek R1 8b locally, just for fun, was an interesting experience I enjoyed!
One thing to note: everything fits together through manual user input, so the model doesn’t have access to the code project you have open. You manually add the information it needs to your prompt, which is why the tool doesn’t clear your prompt text between requests.
For example, you could keep a piece of text at the start of your prompt that you never delete:
The project is built in React, and I want you to give me a shorthand tutorial as a lesson for the next step I ask.
Sample Prompt
The next step is <e.g. setting up a request using an HTML form and JavaScript to the Ollama API I’m running locally, so I can build this idea into a responsive web app>
Now, my desktop hardware is on the modest side, so a complex prompt like this takes around 2 minutes to answer. I’m okay with that, as I’m more comfortable with the answers from 8b than from 1.5b or 7b, though they are close.
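The kind of request described in the sample prompt can be sketched roughly like this in JavaScript. This is just a minimal sketch, not the post’s actual code: I’m assuming Ollama’s default port (11434) and the model tag deepseek-r1:8b, so adjust both for your own setup.

```javascript
// Minimal sketch of posting a prompt to a local Ollama server.
// Assumptions: Ollama is listening on its default port 11434, and the
// model was pulled with the tag "deepseek-r1:8b".
function buildGenerateRequest(model, prompt) {
  return {
    url: "http://localhost:11434/api/generate",
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // stream: false asks for one JSON object instead of a chunk stream
      body: JSON.stringify({ model, prompt, stream: false }),
    },
  };
}

// Wiring it up from an HTML form handler (browser or Node 18+):
async function askOllama(prompt) {
  const { url, options } = buildGenerateRequest("deepseek-r1:8b", prompt);
  const res = await fetch(url, options);
  const data = await res.json();
  return data.response; // the model's full answer text
}
```

With `stream: false`, a 2-minute generation means the fetch simply resolves when the whole answer is ready, which is fine for a simple local tool.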

The in-depth answer is a pretty good tutorial to read as is, which already makes this a useful tool. Then, with the thinking section shared below, it’s a simple way to get a 1-on-1 Q&A GPT-style interface you can use locally. I did see people sharing methods on YouTube; I just prefer WSL2 for my solutions when it comes to LLMs, and usually start from scratch in my own way.

The project should be easy enough to run and build if you know your way around tech. I kept it short and simple, assuming you know how to do NPM installs, for instance.
The README on my GPT-systems repo covers getting Ollama running in steps 1–6. Step 7 is for an error that shows up on port 11434, even though that part still worked in testing afterwards.
Step 7 was in WSL and worked. Step 8 was needed to make requests from Windows 11 directly to WSL, and then the Python test in step 9 (which had already worked on WSL) and the C# test in step 10 both worked on my Windows machine.
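Once requests from Windows reach the WSL-hosted server, the streamed reply still needs assembling. When `stream` is left on (Ollama’s default), `/api/generate` sends back newline-delimited JSON chunks, each carrying a partial `response` field and a final chunk marked `done: true`. A rough sketch of assembling that stream, assuming chunks arrive as whole lines (a real reader should buffer partial lines across network reads):

```javascript
// Sketch: assemble Ollama's newline-delimited JSON stream into one answer.
// Assumption: the input contains only complete lines; partial lines from
// the network would need buffering before calling this.
function assembleStream(ndjsonText) {
  let text = "";
  let done = false;
  for (const line of ndjsonText.split("\n")) {
    if (!line.trim()) continue; // skip blank lines between chunks
    const chunk = JSON.parse(line);
    text += chunk.response ?? ""; // concatenate the partial answer
    if (chunk.done) done = true;  // final chunk signals completion
  }
  return { text, done };
}
```

Streaming like this is what makes the console scripts feel responsive: you can print each `response` fragment as it lands instead of waiting the full 2 minutes.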
In steps 11 and 12, I roughed together the VS Code extension to use for local requests instead of the console scripts/apps. The steps to build it for yourself locally are there.
All in all, a simple, fun project: using DeepSeek R1 8b as an experiment in VS Code, fully local, so no data is tracked. A win, I do believe.
I even saw a TikTok accessing the API directly from Godot GDScript, which was a fun thought. I didn’t feel I wanted it in Godot like this, though, so not for today.