Artificial Intelligence

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
https://arxiv.org/abs/2105.15203
Abstract (excerpt): We present SegFormer, a simple, efficient yet powerful semantic segmentation framework which unifies Transformers with lightweight multilayer perceptron (MLP) decoders. SegFormer has two appealing features: 1) SegFormer comprises a novel hierarchically str…

Better & Faster Large Language Models via Multi-token Prediction
https://arxiv.org/abs/2404.19737
Abstract (excerpt): Large language models such as GPT and Llama are trained with a next-token prediction loss. In this work, we suggest that training language models to predict multiple future tokens at once results in higher sample efficiency. More specifically, at each posi…

BioCLIP: A Vision Foundation Model for the Tree of Life
https://arxiv.org/abs/2311.18803
Abstract (excerpt): Images of the natural world, collected by a variety of cameras, from drones to individual phones, are increasingly abundant sources of biological information. There is an explosion of computational methods and tools, particularly computer vision, for extra…

Mip-Splatting: Alias-free 3D Gaussian Splatting
https://arxiv.org/abs/2311.16493
Abstract (excerpt): Recently, 3D Gaussian Splatting has demonstrated impressive novel view synthesis results, reaching high fidelity and efficiency. However, strong artifacts can be observed when changing the sampling rate, e.g., by changing focal length or camera distance. We…
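The multi-token prediction idea above can be sketched minimally: instead of one output head that predicts only the next token, attach several independent heads to a shared trunk, with head k predicting the token at offset k+1. This is an illustrative NumPy toy (the names `trunk`, `heads`, and all shapes are assumptions, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

vocab, d_model, n_heads = 100, 32, 4   # n_heads = number of future tokens predicted

trunk = rng.normal(size=(d_model, d_model))         # shared representation (computed once)
heads = rng.normal(size=(n_heads, vocab, d_model))  # one output matrix per future offset

def predict_future_tokens(h):
    """Given a hidden state h at position t, return greedy token guesses
    for positions t+1 .. t+n_heads, one from each head."""
    z = trunk @ h                  # shared trunk pass, amortized across heads
    logits = heads @ z             # (n_heads, vocab): one distribution per offset
    return logits.argmax(axis=-1)  # greedy token id per future position

h = rng.normal(size=(d_model,))
preds = predict_future_tokens(h)
print(preds.shape)  # (4,): one predicted token id per future offset
```

The point of the sketch is the sample-efficiency intuition from the abstract: one forward pass through the shared trunk yields training signal for several future positions at once.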
Rich Human Feedback for Text-to-Image Generation
https://arxiv.org/abs/2312.10240
Abstract (excerpt): Recent Text-to-Image (T2I) generation models such as Stable Diffusion and Imagen have made significant progress in generating high-resolution images based on text descriptions. However, many generated images still suffer from issues such as artifacts/impla…

Generative Image Dynamics
https://generative-dynamics.github.io/
Abstract (excerpt): We present an approach to modeling an image-space prior on scene motion. Our prior is learned from a collection of motion trajectories extracted from real video sequences depicting natural, oscillatory dynamics such as trees, flowers, candles, and clothes…

Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild
https://arxiv.org/abs/2401.13627
Abstract (excerpt): We introduce SUPIR (Scaling-UP Image Restoration), a groundbreaking image restoration method that harnesses generative prior and the power of model scaling up. Leveraging multi-modal techniques and advanced generative prior, SUPIR marks a significant advan…

LoRA: Low-Rank Adaptation of Large Language Models
https://arxiv.org/abs/2106.09685
Abstract (excerpt): An important paradigm of natural language processing consists of large-scale pre-training on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes le…
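The core mechanic LoRA proposes in place of full fine-tuning can be sketched in a few lines: freeze the pretrained weight W and learn only a rank-r update B·A, scaled by alpha/r. A minimal NumPy illustration, assuming the conventional zero-init for B and small random init for A (dimensions here are arbitrary toy values):

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r, alpha = 64, 64, 4, 8

W = rng.normal(size=(d_out, d_in))     # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable, small random init
B = np.zeros((d_out, r))               # trainable, zero init: update starts at exactly 0

def lora_forward(x):
    # Frozen path plus scaled low-rank path: (W + (alpha/r) * B @ A) @ x
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=(d_in,))
# Because B is zero-initialized, the adapted model initially matches the base model.
assert np.allclose(lora_forward(x), W @ x)

# Trainable-parameter savings vs. full fine-tuning of this one matrix:
print(W.size, A.size + B.size)  # 4096 vs 512
```

With r much smaller than the matrix dimensions, the trainable parameter count drops from d_out·d_in to r·(d_out + d_in), which is what makes adapting very large pretrained models tractable.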