Small guide to run Llama.cpp on windows with discrete AMD GPU

fatboy93@lemm.ee · 26 days ago

Or the ever classic: launch one version behind the current Android version. Provide a security update once a year and then tout it as an OS update.

fatboy93@lemm.ee · 1 month ago

Samsung A9+ goes on sale for about $150 every once in a while.

Kids FireHD tablets are generally lower than that. There’s not really any difference between the adult and kids version tbh.

fatboy93@lemm.ee · 3 months ago

Why do people sleep on KDE Connect? It does a lot of things really well and is OS agnostic.

fatboy93@lemm.ee · 3 months ago

Does this support Android Auto? That’s the only reason I use maps.

fatboy93@lemm.ee · 5 months ago

Oh absolutely. I wear socks with sandals because my soles sweat and make my sandals sticky.

But yeah, wear proper attire for the work you do!

fatboy93@lemm.ee · 8 months ago

ThinkPad T450s (my old laptop)

OS: Arch Linux

DE: Plasma

Services:

Arr stack with gluetun, Sonarr, Radarr and Jackett

Jellyfin for videos

Gonic for audio

All 3 of them are run using docker compose

fatboy93@lemm.ee · 8 months ago

This has the same energy as my spouse yelling at me because jellyfin went down

fatboy93@lemm.ee · 8 months ago

This and the no questions asked two year replacement policy is amazing if you have a toddler.

The bundled foamy case also is really great.

We slapped in a 256 GB SD card, and have it almost full of videos that he watches when we travel.

fatboy93@lemm.ee · 9 months ago

For $10M, I might just retire, buy a house and raise my kid.

Sure beats having to struggle at work.

fatboy93@lemm.ee · 10 months ago

This. They just need you for follow-up visits, since they get graded on how completely the procedure was done.

Unfortunately, dental work is one of those things where everything takes multiple sittings.

fatboy93@lemm.ee · 10 months ago

Not really an open-source approach, but I found that Iriun Webcam is generally a lot better if you’re just wanting to use your phone as one.

For some reason scrcpy just doesn’t work well for me.

fatboy93@lemm.ee · 1 year ago

If you have to ask the question, then the answer is always NO

fatboy93@lemm.ee · 1 year ago

Iirc, the business line (ThinkPads) was not affected by these, but who knows.

But yes, my oldest laptop is a ThinkPad and I love it very dearly!

fatboy93@lemm.ee · 1 year ago

I use gonic with sonixd on my laptops, but I might move from sonixd to supersonic.

On my phone, Tempo is really awesome!

fatboy93@lemm.ee · 1 year ago

Not sure if this would be useful, but my university uses ThinLinc. We can use the desktop and other stuff in the browser.

fatboy93@lemm.ee · 1 year ago

Did you try openrgb? Just curious!

https://openrgb-wiki.readthedocs.io/en/latest/Logitech-Keyboards/

fatboy93@lemm.ee · 1 year ago

Just don’t get a Xiaomi if you’re in the US. I had one when I moved from India last year and the majority of the bands were unsupported, so I was stuck on 2G-3G speeds.

fatboy93@lemm.ee · 1 year ago

I fell in love with the webapp first and now I like the android native app!

fatboy93@lemm.ee · 1 year ago

I’m just going to cheat here a bit and use ChatGPT to summarize this, since I don’t want to do the calculation wrong. Hope it makes sense. I’m just excited to share this!

########## Integrated GPU #########

Total inference time = Load time + Sample time + Prompt eval time + Eval time

Total inference time = 26205.90 ms + (6.34 ms/sample * 103 samples) + 29234.08 ms + 118847.32 ms

Total inference time = 26205.90 ms + 653.02 ms + 29234.08 ms + 118847.32 ms

Total inference time = 174940.32 ms

So, the total inference time is approximately 174940.32 ms.

########## Discrete GPU 6800M #########

Total inference time = Load time + Sample time + Prompt eval time + Eval time

Total inference time = 60188.90 ms + (3.58 ms/sample * 103 samples) + 7133.18 ms + 13003.63 ms

Total inference time = 60188.90 ms + 368.74 ms + 7133.18 ms + 13003.63 ms

Total inference time = 80694.45 ms

So, the total inference time is approximately 80694.45 ms.

#####################################

Taking the difference Integrated - Discrete: 94245.87 ms.

So the discrete GPU takes about 54% less time (roughly 2.2x faster), which is about 1.5 minutes saved. The integrated GPU takes close to 175 seconds and the discrete finishes in about 81 seconds.

I do think adding more RAM at some point could help with the loading times, since the laptop currently has about 16 GB of RAM.
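
For anyone who wants to recheck the sums, here is a minimal Python sketch of the same formula (total = load + sample + prompt eval + eval), fed with the timings quoted above. The function and variable names are mine, purely for illustration; nothing here comes from llama.cpp itself.

```python
def total_inference_ms(load_ms, ms_per_sample, n_samples, prompt_eval_ms, eval_ms):
    """Sum the four timing components, all in milliseconds."""
    sample_ms = ms_per_sample * n_samples  # sample time = per-sample cost * sample count
    return load_ms + sample_ms + prompt_eval_ms + eval_ms


# Timings copied from the two runs in this comment (103 samples each).
integrated = total_inference_ms(26205.90, 6.34, 103, 29234.08, 118847.32)
discrete = total_inference_ms(60188.90, 3.58, 103, 7133.18, 13003.63)

saved_ms = integrated - discrete           # time saved by the discrete GPU
saved_fraction = saved_ms / integrated     # share of the integrated run time saved
speedup = integrated / discrete            # how many times faster the discrete run is

print(f"integrated: {integrated:.2f} ms")
print(f"discrete:   {discrete:.2f} ms")
print(f"saved:      {saved_ms:.2f} ms ({saved_fraction:.1%} less time, {speedup:.2f}x)")
```

Swapping in the numbers from your own run is just a matter of changing the arguments. Note that load time is included in the total, so a slow model load can hide a good chunk of the GPU gain.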

fatboy93@lemm.ee · 1 year ago

I did post this on reddit first, since this community never pops up on my feed and I was unsure if it’s inactive. But here it goes as well!

fatboy93@lemm.ee · 1 year ago

Small guide to run Llama.cpp on windows with discrete AMD GPU
