AI Models

All Models

Complete list of AI models and foundation models, sorted by newest first

DeepSeek-R1
DeepSeek-R1 by Hangzhou DeepSeek Corporation
0

DeepSeek-R1 is a high-performance AI reasoning model developed by Hangzhou DeepSeek Corporation. It is designed to match the official release of OpenAI's o1 on tasks such as math, coding, and natural language reasoning. The model relies on large-scale reinforcement learning and achieves strong performance with very little labeled data. DeepSeek-R1 is open-sourced under the MIT License and permits model distillation, so its outputs can be used to train other models.

AI Reasoning, Reinforcement Learning, Model Distillation, Natural Language Processing, Math Reasoning, Coding, Open Source, MIT License, Chain-of-Thought Reasoning, API Integration
language, production, Open Source
Wan2.1
Wan2.1 by Alibaba Cloud
0

Wan2.1 is an open-source AI video generation model developed by Alibaba Cloud. It supports text-to-video and image-to-video tasks and comes in two sizes: a 14B-parameter professional version that excels at complex motion generation and physical modeling, and a 1.3B-parameter fast version that runs on consumer-grade GPUs with low VRAM requirements, which suits it to secondary development and academic research. Wan2.1 is built on a causal 3D VAE and a video Diffusion Transformer architecture, enabling efficient spatiotemporal compression and long-range dependency modeling. The 14B version scores 86.22% on the VBench evaluation, ranking first and ahead of models such as Sora, Luma, and Pika. It is released under the Apache 2.0 license, supports multiple mainstream frameworks, and is available on GitHub, Hugging Face, and the ModelScope community.
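
To give a rough sense of why the smaller model fits on consumer hardware, the sketch below estimates the memory needed just to hold the weights of each size. It assumes 16-bit weights (2 bytes per parameter), which is an assumption and not a figure from the description; the function name is hypothetical, and the estimate ignores activations and any auxiliary components, so it is a lower bound and not a measured VRAM requirement.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Estimate memory for model weights only, in gigabytes (10^9 bytes).

    bytes_per_param=2 assumes 16-bit weights; this is an illustrative
    assumption, not a published requirement for Wan2.1.
    """
    return num_params * bytes_per_param / 1e9


# Parameter counts taken from the description above.
fast_version_gb = weight_memory_gb(1_300_000_000)
professional_version_gb = weight_memory_gb(14_000_000_000)

print(f"1.3B weights: about {fast_version_gb:.1f} GB")
print(f"14B weights:  about {professional_version_gb:.1f} GB")
```

Under this assumption the weights of the larger model alone take roughly ten times the memory of the smaller one, which is consistent with only the 1.3B version being aimed at consumer-grade GPUs.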

AI Video Generation, Open Source, Text-to-Video, Image-to-Video, Complex Motion Generation, Physical Law Simulation, Multi-Style Generation, Text Effects Generation, Alibaba Cloud, Consumer-Grade GPUs
multimodal, production, Open Source
Grok-1
Grok-1 by xAI
0

Grok-1 is a large language model developed by xAI, the AI startup founded by Elon Musk. It is a Mixture of Experts (MoE) model with 314 billion parameters, which made it one of the largest open-weight language models at the time of its release. Its weights and network architecture are publicly available under the Apache 2.0 license, which allows free use, modification, and distribution for both personal and commercial purposes.
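
For readers unfamiliar with the Mixture of Experts idea, the following is a minimal sketch of top-k expert routing, in which only a few experts run for each input. It is a toy illustration and not Grok-1's actual code: the expert count, the value of k, the scalar "experts", and all function names are hypothetical choices made for the example.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    # Renormalize the scores of the chosen experts so their weights sum to 1.
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_forward(x, experts, router_logits, k):
    """Run only the selected experts and mix their outputs by router weight."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# Hypothetical toy setup: 8 scalar experts, 2 active per input.
NUM_EXPERTS, TOP_K = 8, 2
experts = [lambda x, scale=s: scale * x for s in range(1, NUM_EXPERTS + 1)]
router_logits = [0.1, 2.0, -1.0, 0.3, 1.5, 0.0, -0.5, 0.2]

routes = top_k_route(router_logits, TOP_K)
output = moe_forward(3.0, experts, router_logits, TOP_K)
print(routes, output)
```

The point of the design is that the total parameter count can be very large while the compute per input depends only on the experts that are actually selected.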

Large Language Model, Open Source, Natural Language Processing, Mixture of Experts, Transformer, AI Research, Machine Learning, Language Understanding, Text Generation, OpenAI Alternative
language, production, Open Source
Loopy
Loopy by ByteDance
0

Loopy is an audio-driven AI video generation model developed by ByteDance. It animates a static photo by synchronizing facial expressions and head movements with a supplied audio track, producing a realistic video. Built on diffusion model technology, Loopy captures long-term motion information without requiring additional spatial signals, which suits it to applications in entertainment, education, and similar fields.

AI Video Generation, Audio-Driven, ByteDance, Diffusion Model, Facial Animation, Video Synthesis, Entertainment, Education, Multimedia, AI Research
multimodal, experimental
CosyVoice2
CosyVoice2 by Alibaba Group
0

CosyVoice 2.0 is an upgraded speech generation model developed by Alibaba's Tongyi Lab. It improves codebook utilization with finite scalar quantization, simplifies the text-to-speech architecture, and introduces a chunk-aware causal flow matching model to support diverse synthesis scenarios. The model improves pronunciation accuracy, timbre consistency, rhythm, and audio quality, with the reported MOS rising from 5.4 to 5.53. It supports streaming inference and reduces first-packet synthesis latency to 150 ms, making it suitable for real-time speech synthesis applications.
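
The scalar quantization mentioned above can be illustrated with a small sketch of the general technique usually called finite scalar quantization (FSQ). This is not CosyVoice code: the function names and level counts are hypothetical, and a real implementation would operate on tensors and pass gradients through the rounding step. Each component is squashed into a bounded range and rounded to a small set of integers, so the codebook is simply the grid of all combinations.

```python
import math


def fsq_quantize(z, levels):
    """Quantize each component of z to one of levels[i] integer values.

    Each component is bounded with tanh and then rounded, so the implied
    codebook has prod(levels) entries and needs no learned lookup table.
    Odd level counts are assumed so the grid is symmetric around zero.
    """
    codes = []
    for value, n in zip(z, levels):
        half = (n - 1) / 2  # e.g. 5 levels -> integers in [-2, 2]
        codes.append(round(half * math.tanh(value)))
    return codes


def code_to_index(codes, levels):
    """Map per-dimension codes to a single codebook index (mixed radix)."""
    index = 0
    for code, n in zip(codes, levels):
        index = index * n + (code + (n - 1) // 2)
    return index


levels = [5, 5, 3]  # hypothetical: 5 * 5 * 3 = 75 codebook entries
codes = fsq_quantize([0.2, -3.0, 1.4], levels)
index = code_to_index(codes, levels)
print(codes, index)
```

Because every grid point is a valid code by construction, no entry of the codebook is structurally unreachable, which is the intuition behind the utilization claim.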

Speech Synthesis, AI Model, Real-time Processing, Multilingual Support, Text-to-Speech, Streaming Inference, High Accuracy, Timbre Consistency, Natural Experience, Low Latency
Speech Generation, production, Open Source
CogVideoX
CogVideoX by Zhipu AI
0

CogVideoX is an open-source AI video generation model developed by Zhipu AI. It generates 6-second videos from English text prompts at a resolution of 720×480 and 8 frames per second. Inference requires 7.8 to 26 GB of VRAM, and the model uses a 3D causal VAE for video reconstruction. The project also provides CLI and web demos, an online demo, API examples, and fine-tuning guides.
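
As a quick worked example of what those output figures imply, the sketch below computes the nominal frame count of one clip and its raw, uncompressed size. The 3 bytes per pixel (8-bit RGB) is an assumption for illustration; the description does not state a pixel format, and the exact frame count a given release emits may differ slightly from the nominal value.

```python
WIDTH, HEIGHT = 720, 480   # resolution from the description
FPS, DURATION_S = 8, 6     # frame rate and clip length from the description
BYTES_PER_PIXEL = 3        # assumption: uncompressed 8-bit RGB

# Nominal number of frames in one generated clip.
frame_count = FPS * DURATION_S

# Raw size of all frames before any video encoding.
raw_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL * frame_count

print(f"{frame_count} frames, about {raw_bytes / 1e6:.1f} MB uncompressed")
```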

AI Video Generation, Open Source, Deep Learning, Text-to-Video, 3D Causal VAE, Video Reconstruction, Zhipu AI, Inference, Fine-Tuning, API
multimodal, production, Open Source
ChatTTS
ChatTTS by 2noise
0

ChatTTS is an open-source text-to-speech (TTS) model optimized for dialogue scenarios, supporting both Chinese and English. Trained on approximately 100,000 hours of data, it produces high-quality, natural-sounding speech. The model offers fine-grained control over prosodic features such as laughter and pauses, supports multiple speakers, and is well suited to conversational tasks. Its developers report that it surpasses most open-source TTS models in fluency and naturalness.

Text-to-Speech, Dialogue, Speech Synthesis, Open Source, Natural Language Processing, Multi-Language Support, Prosody Control, Real-Time Speech, Voice Role Selection, AI Model
Text-to-Speech, production, Open Source
Stable Diffusion
Stable Diffusion by Stability AI
40000

An open-source text-to-image model capable of generating detailed images from text descriptions, with a strong community and multiple deployment options.

AI Art, Text-to-Image, Open Source, Local Deployment
vision, production, Open Source
600000 views
LLaMA 2
LLaMA 2 by Meta
45000

Meta's open-source large language model family, offering strong performance across various tasks with different model sizes.

LLM, Open Source, Meta AI, Foundation Model
language, production, Open Source
700000 views
GPT-4
GPT-4 by OpenAI
50000

GPT-4 is OpenAI's most advanced large language model, demonstrating human-level performance on various academic and professional tests.

LLM, NLP, AI, ChatGPT, OpenAI
language, production
1000000 views