> mathematically "correct" sounding output
It's hard to say because that's a rather ambiguous description ("correct" could mean anything), but it is a valid way of describing the mechanism.
"Correct" in the context of LLMs would mean a token that is likely to follow the preceding sequence of tokens. The model computes a probability for every possible token, takes a random sample according to that distribution* to choose the next token, and repeats until some termination condition. The training side of this is what we call maximum likelihood estimation (MLE) in machine learning (ML): we learn a distribution that makes the training data as likely as possible, and sampling at generation time just draws from that learned distribution. MLE is indeed the basis of a lot of ML, but not all of it.
*An oversimplification.
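The generation loop described above can be sketched in a few lines of Python. The vocabulary and the fixed logits here are made up for illustration; a real model would compute context-dependent scores over tens of thousands of tokens:

```python
import math
import random

def softmax(logits):
    # Turn raw scores into a probability distribution over tokens.
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy vocabulary; "<eos>" marks end of sequence.
vocab = ["the", "cat", "sat", "<eos>"]

def next_token_logits(context):
    # Stand-in for a real model: here the scores ignore the context entirely.
    return [2.0, 1.0, 0.5, 0.1]

def generate(max_len=10):
    tokens = []
    while len(tokens) < max_len:
        probs = softmax(next_token_logits(tokens))
        # Sample the next token according to the distribution.
        token = random.choices(vocab, weights=probs)[0]
        if token == "<eos>":  # termination condition
            break
        tokens.append(token)
    return tokens

print(generate())
```

Each loop iteration is one "compute a distribution, sample, append" step; temperature, top-k, and similar tricks modify `probs` before sampling, which is part of why the distribution claim above is an oversimplification.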