Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

https://www.youtube.com/watch?v=r_UBBfTPcF0

Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

This work introduces an efficient method to scale Transformer-based Large Language Models (LLMs) to infinitely long inputs with bounded memory and computation. A key component in our proposed approach is a new attention technique dubbed Infini-attention. T

arxiv.org

2404.07143v1 ko.pdf

1.01MB

1. The architecture they designed

Put more simply, it is just a scheme that keeps the earlier data and the current data from becoming imbalanced.
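To make the "balance old data against current data" idea concrete, here is a minimal single-head NumPy sketch of one Infini-attention step as I understand it from the paper: local causal softmax attention over the current segment, a linear-attention read from a fixed-size memory of earlier segments, and a learned gate that mixes the two. The function names (`elu_plus_one`, `infini_attention_segment`), the scalar gate `beta`, the `1e-6` stabilizer, and the `sqrt(d_k)` scaling are my own assumptions for illustration, not the authors' code.

```python
import numpy as np

def elu_plus_one(x):
    # sigma(x) = ELU(x) + 1, which keeps every feature strictly positive
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def infini_attention_segment(Q, K, V, M, z, beta):
    """One head, one segment (illustrative sketch).

    Q, K: (n, d_k), V: (n, d_v)  current segment
    M: (d_k, d_v), z: (d_k,)     memory built from earlier segments
    beta: scalar gate parameter (assumed scalar here for simplicity)
    """
    n, d_k = Q.shape

    # 1) Current data: ordinary causal softmax attention inside the segment.
    scores = Q @ K.T / np.sqrt(d_k)
    future = np.triu(np.ones((n, n), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    A_dot = softmax(scores) @ V

    # 2) Earlier data: linear-attention read from the compressive memory.
    sQ = elu_plus_one(Q)
    A_mem = (sQ @ M) / ((sQ @ z)[:, None] + 1e-6)

    # 3) Gate: mix the memory read with the local attention output.
    g = 1.0 / (1.0 + np.exp(-beta))
    A = g * A_mem + (1.0 - g) * A_dot

    # 4) Fold the current segment into memory; its size never changes.
    sK = elu_plus_one(K)
    M_new = M + sK.T @ V
    z_new = z + sK.sum(axis=0)
    return A, M_new, z_new
```

Calling this in a loop over segments, starting from `M = np.zeros((d_k, d_v))` and `z = np.zeros(d_k)`, carries the memory forward. Note that only step 1 is nonlinear in the scores; the memory path in steps 2 and 4 is linear attention.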

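Stacking this several layers deep gives the full model.

Doesn't something seem off?

Namely, that the Infini-Transformer is linear?

In other words, this is an approach that ultimately will not see real use. It relies on softmax to force a linear formulation into a nonlinear one.

This is the formula in question.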
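For reference, these are the Infini-attention equations as I recall them from the paper (arXiv 2404.07143). The notation is reproduced from memory and should be checked against the paper: $\sigma$ is ELU + 1, $M_{s-1}$ and $z_{s-1}$ are the memory matrix and normalization term carried over from the previous segment, $N$ is the segment length, and $\beta$ is a learned gating scalar per head.

```latex
\begin{align}
A_{\text{dot}} &= \operatorname{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_{\text{model}}}}\right) V \\
A_{\text{mem}} &= \frac{\sigma(Q)\, M_{s-1}}{\sigma(Q)\, z_{s-1}} \\
M_{s} &= M_{s-1} + \sigma(K)^{\top} V, \qquad
z_{s} = z_{s-1} + \sum_{t=1}^{N} \sigma(K_{t}) \\
A &= \operatorname{sigmoid}(\beta) \odot A_{\text{mem}}
   + \bigl(1 - \operatorname{sigmoid}(\beta)\bigr) \odot A_{\text{dot}}
\end{align}
```

The second and third lines are the linear part: the memory is read and written without any softmax. The softmax appears only in the local term $A_{\text{dot}}$, and the last line mixes the two.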

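This kind of approach was already tried in the 1990s.

It is listed in the paper's references. And then it disappeared, which means it did not work out.

The reported performance looks good, but the length is 500K tokens. Should we really call that infinite?

500,000 tokens is far less than I expected. Isn't that about one book? Gemini and other work have already shown that attention mechanisms handle this range well.

If this is being stacked on top of something that is already unstable because of its linearity, I am skeptical.

I read the paper assuming the context window itself was infinite, so finding out that it is not made it all the more disappointing.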
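As a rough sanity check on what "infinite" means here, the sketch below compares how many numbers one attention head has to store: a standard KV cache grows with the number of tokens, while a compressive memory of the kind the paper describes stays at a fixed size. The head dimensions (`d_k = d_v = 128`) and the token counts are hypothetical values chosen for illustration, and the count ignores the keys and values of the current segment, which local attention still needs.

```python
def kv_cache_elements(n_tokens, d_k, d_v):
    # Standard attention keeps every past key and value: linear in n_tokens.
    return n_tokens * (d_k + d_v)

def compressive_memory_elements(d_k, d_v):
    # Fixed-size state: memory matrix M (d_k x d_v) plus normalizer z (d_k).
    return d_k * d_v + d_k

d_k = d_v = 128  # hypothetical per-head dimensions
for n_tokens in (32_000, 500_000, 1_000_000):
    print(n_tokens,
          kv_cache_elements(n_tokens, d_k, d_v),
          compressive_memory_elements(d_k, d_v))
```

So the bound is on the memory state, not on the window: what the model attends to with full softmax attention is still a single finite segment, and everything older is squeezed into that fixed-size matrix.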

For a better write-up than mine, see the link below.

https://ostin.tistory.com/513



JunHan's AI Factory
