
Artificial Intelligence

The Pile: An 800GB Dataset of Diverse Text for Language Modeling
https://arxiv.org/abs/2101.00027
Summary: Recent work has demonstrated that increased training-dataset diversity improves general cross-domain knowledge and downstream generalization for large-scale language models. With this in mind, the authors present the Pile, an 825 GiB English text corpus targeted at training large-scale language models.
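The released Pile shards are zstandard-compressed JSON Lines files in which, as far as I know, each record carries a "text" field and a "meta" dictionary with the source subset name. Below is a minimal sketch of streaming one shard under that assumption; the shard filename and field layout are taken from the standard release format, not from this post.

    import io
    import json

    import zstandard  # third-party: pip install zstandard

    def iter_pile_documents(path):
        """Yield (text, subset_name) pairs from one .jsonl.zst Pile shard."""
        with open(path, "rb") as fh:
            stream = zstandard.ZstdDecompressor().stream_reader(fh)
            for line in io.TextIOWrapper(stream, encoding="utf-8"):
                record = json.loads(line)
                yield record["text"], record.get("meta", {}).get("pile_set_name")

    # Illustrative shard filename; substitute a real downloaded shard.
    for text, subset in iter_pile_documents("00.jsonl.zst"):
        print(subset, text[:80])
        break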
GLM: General Language Model Pretraining with Autoregressive Blank Infilling
https://arxiv.org/abs/2103.10360
Summary: Pretraining architectures include autoencoding models (e.g., BERT), autoregressive models (e.g., GPT), and encoder-decoder models (e.g., T5), but none of these frameworks performs best across all of natural language understanding, unconditional generation, and conditional generation. GLM addresses this by pretraining a single model with autoregressive blank infilling.
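As a concrete illustration of autoregressive blank infilling, the sketch below corrupts a token sequence GLM-style: sampled spans are replaced by [MASK] placeholders in Part A, and the spans themselves, shuffled and each opened with [START], form Part B for the model to generate left to right. This is a simplified toy version that ignores GLM's 2D positional encodings and exact special-token conventions.

    import random

    def blank_infill(tokens, num_spans=2, span_len=3, seed=0):
        """Corrupt `tokens` GLM-style: Part A keeps the text with [MASK]
        placeholders, Part B lists the masked spans in shuffled order,
        each prefixed by [START]; Part B is what the model generates."""
        rng = random.Random(seed)
        starts = sorted(rng.sample(range(0, len(tokens) - span_len), num_spans))
        kept = []
        for s in starts:                       # keep only non-overlapping spans
            if not kept or s >= kept[-1] + span_len:
                kept.append(s)

        part_a, spans, cursor = [], [], 0
        for s in kept:
            part_a.extend(tokens[cursor:s])
            part_a.append("[MASK]")
            spans.append(tokens[s:s + span_len])
            cursor = s + span_len
        part_a.extend(tokens[cursor:])

        rng.shuffle(spans)                     # spans are predicted in random order
        part_b = []
        for span in spans:
            part_b.extend(["[START]"] + span)
        return part_a, part_b

    toks = "the pile is a large diverse english text corpus for language models".split()
    a, b = blank_infill(toks)
    print(a)
    print(b)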
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
https://arxiv.org/abs/2403.03206
Summary: Diffusion models create data from noise by inverting the forward paths of data towards noise, and have emerged as a powerful generative modeling technique for high-dimensional perceptual data such as images and videos. Rectified flow is a recent generative formulation that connects data and noise along a straight line.
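The straight-line formulation translates into a very simple training objective: interpolate linearly between a data sample and Gaussian noise, and regress the constant velocity of that line. The sketch below shows that objective on 2-D toy data with a small MLP; the paper itself uses a large multimodal transformer, so everything here is illustrative.

    import torch

    model = torch.nn.Sequential(torch.nn.Linear(3, 64), torch.nn.SiLU(),
                                torch.nn.Linear(64, 2))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    for step in range(100):
        x0 = torch.randn(256, 2)               # stand-in for real data samples
        eps = torch.randn_like(x0)             # Gaussian noise endpoint
        t = torch.rand(256, 1)                 # uniform time in [0, 1]
        xt = (1 - t) * x0 + t * eps            # straight-line interpolation
        target_v = eps - x0                    # constant velocity of that line
        pred_v = model(torch.cat([xt, t], dim=1))
        loss = (pred_v - target_v).pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()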
GraphCast: Learning skillful medium-range global weather forecasting
https://arxiv.org/abs/2212.12794
Summary: Global medium-range weather forecasting is critical to decision-making across many social and economic domains. Traditional numerical weather prediction improves forecast accuracy by spending more compute, but cannot directly use historical weather data to improve the underlying model; GraphCast is a machine-learning forecaster trained directly from reanalysis data.
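A point worth making explicit is that GraphCast produces medium-range forecasts autoregressively: a learned model advances the atmospheric state by 6 hours, and its outputs are fed back in to reach 10 days (40 steps). The sketch below shows that rollout pattern with a toy MLP standing in for the paper's graph neural network over an icosahedral mesh; the feature size is arbitrary.

    import torch

    n_features = 8      # illustrative: a few gridded variables flattened together
    one_step = torch.nn.Sequential(torch.nn.Linear(n_features, 64), torch.nn.SiLU(),
                                   torch.nn.Linear(64, n_features))

    def rollout(state, steps):
        """Iterate the learned 6-hour step to build a medium-range forecast."""
        trajectory = []
        for _ in range(steps):
            state = state + one_step(state)    # predict the increment (residual step)
            trajectory.append(state)
        return torch.stack(trajectory)

    forecast = rollout(torch.randn(1, n_features), steps=40)   # 40 x 6 h = 10 days
    print(forecast.shape)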
SAM 2: Segment Anything in Images and Videos
https://ai.meta.com/research/publications/sam-2-segment-anything-in-images-and-videos/
Neural General Circulation Models for Weather and Climate
https://arxiv.org/abs/2311.07222
Summary: General circulation models (GCMs) are the foundation of weather and climate prediction. They are physics-based simulators that combine a numerical solver for large-scale dynamics with tuned representations of small-scale processes such as cloud formation; NeuralGCM pairs a differentiable dynamical core with learned physics components.
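The hybrid idea can be sketched as a single time step: a physics-based dynamical core advances the resolved state, and a learned network adds a tendency for unresolved small-scale processes. Both components below are toy stand-ins, not NeuralGCM's actual solver or training setup.

    import torch

    class HybridGCMStep(torch.nn.Module):
        def __init__(self, n_state):
            super().__init__()
            self.learned_physics = torch.nn.Sequential(
                torch.nn.Linear(n_state, 64), torch.nn.Tanh(),
                torch.nn.Linear(64, n_state))

        def dynamical_core(self, state, dt):
            # placeholder for a numerical solver of the large-scale dynamics
            return state + dt * torch.tanh(state)

        def forward(self, state, dt=0.1):
            resolved = self.dynamical_core(state, dt)
            correction = self.learned_physics(state)   # small-scale tendency
            return resolved + dt * correction

    step = HybridGCMStep(n_state=16)
    state = torch.randn(1, 16)
    for _ in range(5):                                  # short toy rollout
        state = step(state)
    print(state.shape)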
Conformer: Convolution-augmented Transformer for Speech Recognition
https://arxiv.org/abs/2005.08100
Summary: Transformer- and CNN-based models have recently shown promising results in automatic speech recognition (ASR), outperforming recurrent neural networks. Transformers capture content-based global interactions well, while CNNs exploit local features effectively; the Conformer combines the two in a single block.
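A rough sketch of the block structure: two half-step (macaron) feed-forward modules sandwich multi-head self-attention and a depthwise-convolution module, all with residual connections. The dimensions below and the simplified convolution module (batch norm omitted) are illustrative rather than the paper's exact configuration.

    import torch
    import torch.nn as nn

    class ConvModule(nn.Module):
        def __init__(self, dim, kernel_size=31):
            super().__init__()
            self.norm = nn.LayerNorm(dim)
            self.pointwise1 = nn.Conv1d(dim, 2 * dim, 1)
            self.depthwise = nn.Conv1d(dim, dim, kernel_size,
                                       padding=kernel_size // 2, groups=dim)
            self.pointwise2 = nn.Conv1d(dim, dim, 1)

        def forward(self, x):                   # x: (batch, time, dim)
            y = self.norm(x).transpose(1, 2)    # -> (batch, dim, time)
            y = nn.functional.glu(self.pointwise1(y), dim=1)
            y = nn.functional.silu(self.depthwise(y))
            return self.pointwise2(y).transpose(1, 2)

    class ConformerBlock(nn.Module):
        def __init__(self, dim=144, heads=4):
            super().__init__()
            self.ff1 = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, 4 * dim),
                                     nn.SiLU(), nn.Linear(4 * dim, dim))
            self.attn_norm = nn.LayerNorm(dim)
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.conv = ConvModule(dim)
            self.ff2 = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, 4 * dim),
                                     nn.SiLU(), nn.Linear(4 * dim, dim))
            self.out_norm = nn.LayerNorm(dim)

        def forward(self, x):                   # x: (batch, time, dim)
            x = x + 0.5 * self.ff1(x)           # macaron half-step FFN
            a = self.attn_norm(x)
            x = x + self.attn(a, a, a, need_weights=False)[0]
            x = x + self.conv(x)                # depthwise conv module
            x = x + 0.5 * self.ff2(x)
            return self.out_norm(x)

    block = ConformerBlock()
    print(block(torch.randn(2, 50, 144)).shape)  # (2, 50, 144)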
The Llama 3 Herd of Models
https://ai.meta.com/research/publications/the-llama-3-herd-of-models/
Summary: Modern AI systems are powered by foundation models. This paper presents Llama 3, a herd of language models that natively support multilinguality, coding, reasoning, and tool usage.