
Artificial Intelligence

Scalable watermarking for identifying large language model outputs (https://huggingface.co/blog/synthid-text). Link preview, "Introducing SynthID Text": Do you find it difficult to tell if text was written by a human or generated by AI? Being able to identify AI-generated content is essential to promoting trust in information, and helping to address problems such as misattribution.. Summary: Large language models (LLMs) can generate, in large volumes, high-quality synthetic text that is hard to distinguish from human writing ..
X-Portrait 2: Highly Expressive Portrait Animation (https://byteaigc.github.io/X-Portrait2/). Link preview: "Portrait animation technology provides a ultra-low cost and highly effective way to creating expressive, realistic character animations and video footages: users only need to provide a static portrait image and a driving performance video, and the model ca.." Portrait animation technology: Portrait animation technology, expressive and ..
TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio Motion Embedding and Diffusion Interpolation (https://arxiv.org/abs/2410.04221). Link preview: We present TANGO, a framework for generating co-speech body-gesture videos. Given a few-minute, single-speaker reference video and target speech audio, TANGO produces high-fidelity videos with synchronized body gestures. TANGO builds on Gesture Video Ree..
Simplifying, Stabilizing and Scaling Continuous-Time Consistency Models (https://arxiv.org/abs/2410.11081). Link preview: Consistency models (CMs) are a powerful class of diffusion-based generative models optimized for fast sampling. Most existing CMs are trained using discretized timesteps, which introduce additional hyperparameters and are prone to discretization errors. Wh.. Summary: Consistency models (Consistency Models,..
Pyramidal Flow Matching for Efficient Video Generative Modeling (https://arxiv.org/abs/2410.05954). Link preview: Video generation requires modeling a vast spatiotemporal space, which demands significant computational resources and data usage. To reduce the complexity, the prevailing approaches employ a cascaded architecture to avoid direct training with full resoluti.. Abstract: Video generation requires modeling a vast spatiotemporal space, which ..
Oasis: A Universe in a Transformer (https://github.com/etched-ai/open-oasis). Link preview: etched-ai/open-oasis, inference script for Oasis 500M. We are very excited to announce Oasis. Oasis is the first playable, real-time, open-world AI model: an interactive video game generated frame by frame. Oasis takes the user's keyboard and mouse input and generates gameplay in real time, internally simulating physics, game rules, graphics, and so on ..
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion (https://arxiv.org/abs/2411.04928). Link preview: In this paper, we introduce DimensionX, a framework designed to generate photorealistic 3D and 4D scenes from just a single image with video diffusion. Our approach begins with the insight that both the spatial structure of a 3D scene and the temp.. Summary: This paper ..
Language models generalize beyond natural proteins (https://www.biorxiv.org/content/10.1101/2022.12.21.521521v1.full; code: https://github.com/facebookresearch/esm). Link preview: facebookresearch/esm, Evolutionary Scale Modeling (esm): Pretrained language models for proteins. Summary: Learning design patterns from protein sequences produced by evolution may open a path toward generative protein design. However, natural proteins ..