[Paper] Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in SOTA Large Language Models

rufus@discuss.tchncs.de · 5 months ago

I think they don’t take inspiration from Photoshop. Either it’s been a clone of a different product at some time or they developed it themselves. Hence the differences. I mean the whole UI doesn’t really resemble Photoshop.

rufus@discuss.tchncs.de · edit-2 5 months ago

Plug it into a computer and see what the computer says.

I usually use Linux for that because it offers good error messages and I know the tools. But other operating systems might help, too.

And if you start writing to the card or executing recovery tools, make a backup / image first.
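To sketch the "make an image first" step: the following is a minimal Python example, assuming the card shows up as a readable block device. The function name `image_device` and the paths in the usage note are placeholders I made up for illustration. It only reads from the source and never writes to it. It does not handle read errors, so for a card with bad sectors a dedicated tool such as GNU ddrescue is probably the better choice.

```python
import hashlib


def image_device(source_path, image_path, chunk_size=4 * 1024 * 1024):
    """Copy source_path byte-for-byte into image_path.

    Returns (bytes_copied, sha256_hex) so the image can be verified later.
    """
    digest = hashlib.sha256()
    copied = 0
    # The source is opened read-only, so nothing is ever written to the card.
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break  # end of device or file reached
            dst.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
    return copied, digest.hexdigest()
```

Usage would look like `image_device("/dev/sdX", "card.img")`, where `/dev/sdX` stands for whatever device name your system assigns to the card (reading it usually needs root). After that, run the recovery tools on `card.img` instead of the card itself.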

If the files are very important, maybe don’t tamper with it and ask for help. Like a repair shop, your local Linux community or any trustworthy computer expert friend.

The biggest enemy is probably encryption, if it’s encrypted. The files are definitely still there if you just ripped it out. In the old days you could just run a recovery program and get everything back.

rufus@discuss.tchncs.de · 5 months ago

FYI: There’s also AnLinux, Linux Deploy, Termux, tainer, UserLAnd, …

Some of them aren’t maintained anymore. And they don’t necessarily have hardware acceleration. But they don’t all require root and system patches.

rufus@discuss.tchncs.de · edit-2 5 months ago

I think most people use something like exllamav2 or vllm or use GGUF to do inference, and it seems none of those projects has properly implemented multimodality or this specific model architecture yet.

You might just be at the forefront of things and there isn’t yet any beaten path you could follow.

The easiest thing you could do is just use something that already exists, e.g. 4-bit models, wait a few weeks and then upgrade. And I mean you can also always quantize models yourself and set the parameters however you like, if you have some inference framework that supports your model including the adapters for vision and has the quantization levels you’re interested in…
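To give a rough idea of what "4bit" means here: below is a toy Python sketch of block-wise round-to-nearest quantization. This is an assumption-laden illustration, not what exllamav2, vllm or the GGUF tooling actually do (as far as I know their schemes are more elaborate). The function names and the block size of 32 are made up for the example.

```python
def quantize_4bit(weights, block_size=32):
    """Symmetric round-to-nearest quantization.

    Each block is stored as one float scale plus integers in [-8, 7],
    which is the range a signed 4-bit value can hold.
    """
    blocks = []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        max_abs = max(abs(w) for w in block)
        # Map the largest magnitude in the block to 7; avoid dividing by zero.
        scale = max_abs / 7 if max_abs > 0 else 1.0
        quantized = [max(-8, min(7, round(w / scale))) for w in block]
        blocks.append((scale, quantized))
    return blocks


def dequantize(blocks):
    """Turn (scale, ints) blocks back into approximate float weights."""
    return [scale * q for scale, qs in blocks for q in qs]
```

The reconstruction error per weight is at most half a scale step, so blocks with one large outlier lose the most precision. That is one reason why smaller blocks or smarter schemes tend to give better quality at the same bit width.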

rufus@discuss.tchncs.de · edit-2 5 months ago

Well, I’d say there is information in language. That’s kinda the point of it and why we use it. And language is powerful. We can describe and talk about a lot of things. (And it’s an interesting question what can not be described with language.)

I don’t think the stochastic parrot thing is a proper debate. It’s just that lots of people don’t know what AI is and what it can and cannot do. And it’s neither easy to understand nor are the consequences always that obvious.

Training LLMs involves some clever trickery, like limiting their size etc., so they can’t just memorize everything but instead are forced to learn the concepts behind those texts.

I think they form models of the world inside of them. At least of things they’ve learned from the dataset. That’s why they can for example translate text. They have some concept of a cat stored inside of them and can apply that to a different language that uses entirely different characters to name that animal.

I wouldn’t say they are “tools to learn more aspects about nature”. They aren’t a sensor or something. And they can infer things, but not ‘measure’ things like an X-ray.

rufus@discuss.tchncs.de · edit-2 5 months ago

I’m currently reading the paper. I occasionally debate here on Lemmy whether LLMs are just stochastic parrots, or if they actually grasp the concepts they’re talking about. There’s also evidence for that.

Ultimately I wonder if and when we’ll get LLMs that address ‘hallucinations’ and expose a setting to adjust the factuality of the answer. I suppose that’s somewhere in the model or at least possible to learn for the model. But certainly not controlled or factored in in the current generation of LLMs.

rufus@discuss.tchncs.de · edit-2 5 months ago

[Paper] Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in SOTA Large Language Models

rufus@discuss.tchncs.de · edit-2 5 months ago

Thanks for taking the time to explain it to me. The GitHub issue is also very helpful. Seems that’s exactly my answer to “Why do I need a fourth store in addition to F-Droid, AuroraStore and Obtainium” 😉

Have a nice day, thanks for the STT keyboard! I didn’t really engage in the discussion because I’m exactly in the same situation as other people here. I already have the FUTO one and Sayboard… But eventually I’d like to replace FUTO software with free software alternatives. I don’t like their licensing. So this is very welcome.

rufus@discuss.tchncs.de · 5 months ago

Thanks.

rufus@discuss.tchncs.de · edit-2 5 months ago

Sure. There isn’t any paragraph on how it compares to other appstores or why the author started the project in the first place despite several other stores being available.

So I’m looking for the selling point. (Aside from your App being available there.)

rufus@discuss.tchncs.de · edit-2 5 months ago

Can someone enlighten me about the specifics of the accrescent.app appstore?

I guess it’s somewhat like Obtainium in that it fetches releases packed by the original developers, just plus an index, metadata and signing, thus more convenient and secure? I guess it’s open-source and everything? What are the unique benefits?

rufus@discuss.tchncs.de · edit-2 5 months ago

services.tabby.enable = true;
services.tabby.acceleration = "cuda";

? Could be another way.

rufus@discuss.tchncs.de · edit-2 5 months ago

I’m pretty sure he did this out of his own motivation because he thinks/thought it’s a fascinating topic. So, sure this doesn’t align with popularity. But it’s remarkable anyways, you’re right. And I always like to watch the progression. As far as I remember the early videos lacked the professional audio and video standards that are nowadays the norm on YouTube. At some point he must have bought better equipment, but his content has been compelling since the start of his YouTube ‘career’. 😊

And I quite like the science content on YouTube. There are lots of people making really good videos, both professional video producers and scientists (or hobbyists) who just share their insight and interesting perspective.

rufus@discuss.tchncs.de · edit-2 5 months ago

And maybe have a look at his YouTube channel and the older videos, too. Lots of them are a bit more philosophical and not too technical for the average person. I think he’s quite inspiring and conveys very well what AI safety is about, and what kinds of problems that field of science is concerned with.

rufus@discuss.tchncs.de · edit-2 5 months ago

Yeah, doesn’t really work. I mean it has a rough idea that it needs to go east. And I’m surprised that it knows which interstates are in an area and a few street names in the cities. I’m really surprised. But I told it to get me from Houston to Montgomery as in your example. And in Houston it just tells random street names that aren’t even connected and in different parts of the city. Then it drives north on the I-45 and somehow ends up in the south on the I-610-E and finally the I-10-E. But then it makes up some shit, somehow drives to New Orleans, then a bit back and zig-zags its way back onto the I-10. Then some more instructions I didn’t fact check and it gets that it needs to go through Mobile and then north on the I-65.

I’ve tested ChatGPT on Germany. And it also gets which Autobahn is connected to the next. It still does occasional zig-zags and in between it likes to do an entire loop of 50km (30 miles) that ends up 2 cities back where it came from… Drives east again and on the second try takes a different exit.

However: I’m really surprised by the level of spatial awareness. I wouldn’t have expected it to come up with mostly correct cardinal directions and interstates that are actually connected and run through the mentioned cities. And like cities in between.

I don’t think I need to try “phi”. Small models have very limited knowledge stored inside of them. They’re too small to remember lots of things.

So, you were right. Consider me impressed. But I don’t think there is a real-world application for this unless your car has a teleporter built in to deal with the inconsistencies.

rufus@discuss.tchncs.de · edit-2 5 months ago

Which model(s) did you try? I’m willing to test it later. Downside is, I mainly use smaller LLMs, live in Germany, in an urban region with lots of streets and different Autobahnen and it’s kind of a hassle to deal with textual driving instructions anyways. 😆

rufus@discuss.tchncs.de · edit-2 6 months ago

Quite some AI questions coming up in selfhosted in the last few days…

Here’s some more communities I’m subscribed to:

And a few inactive ones on lemmy.intai.tech

I’m using koboldcpp and ollama. KoboldCpp is really awesome. In terms of hardware it’s an old PC with lots of RAM but no graphics card, so it’s quite slow for me. I occasionally rent a cloud GPU instance on runpod.io. Not doing anything fancy, mainly role play, recreational stuff and I occasionally ask it to give me creative ideas for something, translate something or re-word or draft an unimportant text / email.

Have tried coding, summarizing and other stuff, but the performance of current AI isn’t enough for my everyday tasks.

rufus@discuss.tchncs.de · edit-2 6 months ago

What’s that got to do with AI?

Edit: Ah. Probably the search bar from the screenshot.

rufus@discuss.tchncs.de · 6 months ago

Isn’t that very similar to what TikTok does? Just with a different algorithm and maybe other content than just videos?

rufus@discuss.tchncs.de · edit-2 6 months ago

Hmmh. We’ve seen all kinds of claims and hype regarding AI. I’d like to see and judge for myself. Guess I’ll have to wait a few days.

Edit 2024-05-18: And yesterday it showed up in the web interface. How do I get the talking and the emotions? Is that not available yet? Or do I need a phone app for that?