Is the 4k context length of Llama 2 for real?

actually-a-cat@sh.itjust.works · 1 year ago


Sims@lemmy.ml · 1 year ago

I was unaware that the smaller-context models exhibited the same effect. It does seem logical that we naturally put broad, important information and conclusions at the ends of a sentence. I haven’t read the paper yet, but I wonder whether the training set - our communication - also contains more information at the ends, so that the effect is caused by the data rather than by the algorithm. I’ll give the paper a read, thx…