AI Frameworks

AI Frameworks Page 4 of 5

All Frameworks: a complete list of AI frameworks, sorted by newest first.

MimicMotion by Tencent

MimicMotion, developed by Tencent, is an advanced AI framework for generating high-quality human motion videos. It utilizes confidence-aware pose guidance to ensure smooth transitions and detailed hand movements. The framework employs region loss amplification and hand region enhancement to reduce image distortion and improve visual quality. MimicMotion can generate long videos with high temporal coherence using a progressive latent fusion strategy, making it ideal for applications in dance, sports, and daily activities.
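
The idea of fusing overlapping segments into one long video can be illustrated with a simple cross-fade. This is only a minimal sketch under stated assumptions: each frame's latent is reduced to a single float, the position-based linear weights are an illustrative choice, and the function name `fuse_segments` is hypothetical rather than part of MimicMotion's code. In a diffusion pipeline, a fusion of this kind would typically be applied to latents during denoising rather than once at the end.

```python
def fuse_segments(segments, overlap):
    """Join segments whose last/first `overlap` frames cover the same time span.

    Each segment is a list of per-frame latents (plain floats here, for
    illustration). Overlapping frames are blended with a weight that shifts
    gradually from the earlier segment to the later one.
    """
    fused = list(segments[0])
    for seg in segments[1:]:
        start = len(fused) - overlap
        for i in range(overlap):
            w = (i + 1) / (overlap + 1)  # weight given to the later segment
            fused[start + i] = (1 - w) * fused[start + i] + w * seg[i]
        fused.extend(seg[overlap:])      # frames covered by the later segment only
    return fused


# Two 4-frame segments sharing 2 frames produce a 6-frame sequence.
print(fuse_segments([[0.0] * 4, [1.0] * 4], overlap=2))
```

The gradual weights avoid an abrupt jump at segment boundaries, which is the intuition behind using fusion to keep long videos temporally coherent.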

Tags: AI Video Generation, Human Motion, Tencent, Pose Guidance, Latent Diffusion, Temporal Coherence, Hand Movement, Progressive Fusion, Confidence-Aware

Status: production, Open Source

PuLID by ByteDance

PuLID is an open-source framework developed by ByteDance for personalized text-to-image generation. It uses contrastive alignment and fast sampling to customize a subject's identity (ID) without fine-tuning the model. Users can produce realistic face-swapping effects that keep high ID fidelity while minimizing interference with the original image's style and background. PuLID also supports personalized editing through simple text prompts, which suits it to art creation, virtual avatar customization, and film production.

Tags: Text-to-Image, Face-Swapping, Personalization, AI Art, Image Generation, Open-Source, Contrastive Alignment, Fast Sampling, Virtual Avatars, Film Production

Status: production, Open Source

Agent Q by MultiOn

Agent Q is a self-supervised agent reasoning and search framework developed by MultiOn in collaboration with Stanford University. It integrates techniques such as guided Monte Carlo Tree Search (MCTS), AI self-criticism, and Direct Preference Optimization (DPO) to enable AI models to self-improve through iterative fine-tuning and reinforcement learning based on human feedback. Agent Q has demonstrated exceptional performance in web navigation and multi-step task execution, significantly improving success rates in real-world tasks like OpenTable reservations.
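
For readers unfamiliar with Direct Preference Optimization, the standard DPO loss for a single preference pair can be written in a few lines. This is a minimal sketch of the general DPO objective, not Agent Q's training code: the function name, the argument names, and the default `beta=0.1` are assumptions, and how Agent Q builds its preferred and rejected trajectories from tree search is not shown.

```python
import math


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    All four inputs are sequence log-probabilities: two from the policy being
    trained and two from a frozen reference model. `beta` controls how far the
    policy may drift from the reference (0.1 is an assumed, illustrative value).
    """
    margin = beta * ((policy_chosen - ref_chosen) - (policy_rejected - ref_rejected))
    # loss = -log(sigmoid(margin)), computed in an overflow-safe form
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The loss drops when the policy favors the chosen response more than the reference does.
print(dpo_loss(-4.0, -6.0, -5.0, -5.0) < dpo_loss(-6.0, -4.0, -5.0, -5.0))
```

Because the loss only needs log-probabilities of paired outcomes, it can be trained on pairs of successful and unsuccessful attempts without fitting a separate reward model.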

Tags: AI Agent, Self-Learning, Reinforcement Learning, Monte Carlo Tree Search, Web Navigation, Multi-Step Reasoning, Iterative Fine-Tuning, Zero-Shot Learning, AI Self-Criticism, Direct Preference Optimization

Status: beta

MovieDreamer by Zhejiang University, Alibaba

MovieDreamer is an AI framework for long-video generation developed by Zhejiang University in collaboration with Alibaba. It combines autoregressive models with diffusion rendering to generate long videos with complex plots and high visual quality. Multimodal scripts enrich the scene descriptions and help keep characters and scenes consistent across the video, which significantly extends the duration of content that can be generated automatically.

Tags: AI Video Generation, Long Videos, Autoregressive Models, Diffusion Rendering, Multimodal Script, Character Consistency, Visual Fidelity, Video Production, Film Making, Virtual Reality

Status: experimental, Open Source

LongRAG by Tsinghua University, CAS, ZhiPu

LongRAG is a robust retrieval-augmented generation (RAG) framework developed by Tsinghua University, the Chinese Academy of Sciences (CAS), and ZhiPu. It is specifically designed for long-context question answering (LCQA), addressing challenges in global context understanding and factual detail recognition. The framework includes a hybrid retriever, an LLM-enhanced information extractor, a CoT-guided filter, and an LLM-enhanced generator. LongRAG outperforms baseline models on multiple datasets and offers an automated fine-tuning data construction pipeline to enhance domain adaptability and instruction-following capabilities.
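
One plausible way to wire the four components together is sketched below. Everything here is an assumption made for illustration: the function `long_rag_answer`, the helper `shared_words`, and the toy stand-ins are hypothetical, the real components are model-based, and the actual data flow in LongRAG may differ.

```python
def long_rag_answer(question, retrieve, extract_global, keep_chunk, generate):
    """Pass a question through four caller-supplied stages (all hypothetical)."""
    chunks = retrieve(question)                                # hybrid retriever
    global_info = extract_global(question, chunks)             # LLM-enhanced information extractor
    details = [c for c in chunks if keep_chunk(question, c)]   # CoT-guided filter
    return generate(question, global_info, details)            # LLM-enhanced generator


# Toy stand-ins based on word overlap, only to make the data flow runnable.
DOCS = [
    "Paris is the capital of France.",
    "The Seine flows through Paris.",
    "Bananas are yellow.",
]


def shared_words(question, text):
    q = set(question.lower().strip("?").split())
    t = set(text.lower().rstrip(".").split())
    return len(q & t)


result = long_rag_answer(
    "Which river flows through Paris?",
    retrieve=lambda q: [d for d in DOCS if shared_words(q, d) >= 1],
    extract_global=lambda q, chunks: " ".join(chunks),
    keep_chunk=lambda q, c: shared_words(q, c) >= 2,
    generate=lambda q, info, details: {"context": info, "evidence": details},
)
print(result["evidence"])
```

The split mirrors the two challenges named above: the extractor contributes broad context, while the filter narrows the retrieved chunks to those carrying the relevant factual detail.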

Tags: Retrieval-Augmented Generation, Long-Context Question Answering, LLM, Natural Language Processing, AI Research, Hybrid Retriever, Chain of Thought, Information Extraction, Automated Fine-Tuning, Global Context Understanding

Status: production, Open Source

ClearerVoice-Studio by Alibaba DAMO Academy

ClearerVoice-Studio is an open-source voice processing framework developed by Alibaba DAMO Academy's Tongyi Lab. It integrates voice enhancement, voice separation, and speaker extraction from audio and video. The framework is based on complex-domain deep learning algorithms that remove background noise while preserving voice clarity and minimizing distortion. It provides pre-trained models and training scripts to support researchers and developers in voice processing tasks.

Tags: Voice Processing, AI Framework, Open Source, Voice Enhancement, Speaker Extraction, Audio Processing, Video Processing, Deep Learning, Pre-trained Models, Developer Tools

Status: production, Open Source

DeepGEMM by DeepSeek

DeepGEMM is an open-source library by DeepSeek for FP8 (8-bit floating point) general matrix multiplication (GEMM). It supports both regular GEMMs and Mixture of Experts (MoE) grouped GEMMs, and compiles its kernels at runtime using Just-In-Time (JIT) compilation. Optimized for NVIDIA Hopper Tensor Cores, DeepGEMM uses fine-grained scaling and two-level accumulation on CUDA cores to address FP8 precision issues, and uses Hopper's Tensor Memory Accelerator (TMA) to move data more efficiently. Its design is lightweight, with a core of only about 300 lines of code, yet it matches or outperforms expert-tuned libraries across various matrix shapes.
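
Why fine-grained scaling helps with FP8 precision can be seen in a small CPU-side simulation. This is a sketch under stated assumptions, not DeepGEMM's kernel code: `round_to_fp8_like` is a simplified model of the E4M3 format (4 significant bits, maximum 448, subnormal step of 2^-9, no NaN handling), the block size of 128 is an illustrative choice, and all function names are hypothetical. Two-level accumulation is not shown.

```python
import math

FP8_E4M3_MAX = 448.0


def round_to_fp8_like(x):
    """Simplified E4M3 rounding: clamp to +/-448, keep 4 significant bits."""
    x = max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, x))
    if abs(x) < 2.0 ** -6:               # subnormal range: fixed step of 2**-9
        return round(x * 512) / 512
    m, e = math.frexp(x)                 # x = m * 2**e with 0.5 <= |m| < 1
    return math.ldexp(round(m * 16) / 16, e)


def quantize_blocks(values, block=128):
    """Quantize with one scale per block, mapping each block's max to 448."""
    quantized, scales = [], []
    for i in range(0, len(values), block):
        chunk = values[i:i + block]
        amax = max(abs(v) for v in chunk) or 1.0   # avoid a zero scale
        scale = amax / FP8_E4M3_MAX
        scales.append(scale)
        quantized.extend(round_to_fp8_like(v / scale) for v in chunk)
    return quantized, scales


def dequantize_blocks(quantized, scales, block=128):
    return [q * scales[i // block] for i, q in enumerate(quantized)]


values = [0.001] * 128 + [1000.0] * 128
fine = dequantize_blocks(*quantize_blocks(values, block=128), block=128)
coarse = dequantize_blocks(*quantize_blocks(values, block=256), block=256)
print(fine[0], coarse[0])  # per-block scale keeps the small value; one shared scale loses it
```

With a single scale for all 256 values, the large entries set the scale and the small entries underflow to zero. Giving each block its own scale lets every block use the full FP8 range.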

Tags: FP8, Matrix Multiplication, NVIDIA Hopper, JIT Compilation, High-Performance Computing, CUDA, Deep Learning, Open-Source, AI Libraries, Mixture of Experts

Status: production, Open Source

Diffutoon by Alibaba, East China Normal University (ECNU)

Diffutoon is an AI framework developed by researchers from Alibaba and East China Normal University (ECNU) that transforms realistic videos into anime style. It uses diffusion-based editable toon shading to render high-resolution, long-duration videos. The framework also supports content editing: users can adjust video details with text prompts while the output keeps high visual quality and consistency.

Tags: AI Video Editing, Cartoon Animation, Diffusion Models, High-Resolution Video, Video Stylization, Content Editing, Frame Consistency, Automatic Coloring, Structure Preservation, Video Processing

Status: experimental, Open Source

AniPortrait by Tencent

AniPortrait is an open-source framework developed by Tencent that transforms audio and a portrait image into high-quality, lip-synced animated videos. It operates in two stages: extracting 3D facial features from audio and converting them into 2D facial landmarks, then using a diffusion model and motion module to generate realistic animations. The framework excels in producing natural, diverse animations with precise lip-syncing and facial expressions, offering flexibility for editing and customization.

Tags: AI Video Generation, Lip Syncing, Open Source, Facial Animation, Audio-Driven Animation, Diffusion Models, 3D Facial Landmarks, Video Editing, Photorealistic Animation, Tencent

Status: production, Open Source

3FS by DeepSeek

3FS (Fire-Flyer File System) is a high-performance distributed file system developed by DeepSeek and optimized for AI training and inference workloads. It uses modern SSDs and RDMA networking to aggregate the throughput of thousands of SSDs and the network bandwidth of hundreds of storage nodes, delivering up to 6.6 TiB/s of read throughput. 3FS provides strong consistency and a standard file interface, so users do not need to learn a new storage API. It suits large-scale data processing, inference optimization, and high-throughput parallel checkpointing.
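
To put the aggregate figure in perspective, the per-node share can be worked out with simple arithmetic. The node count used below is an assumption chosen for illustration, since this listing only says "hundreds of storage nodes", and the function name is hypothetical.

```python
def per_node_read_gib_s(aggregate_tib_s, storage_nodes):
    """Average read throughput per storage node, in GiB/s (1 TiB = 1024 GiB)."""
    return aggregate_tib_s * 1024 / storage_nodes


ASSUMED_STORAGE_NODES = 180  # assumption for illustration, not stated in this listing

print(round(per_node_read_gib_s(6.6, ASSUMED_STORAGE_NODES), 1))
```

Under that assumption, each node would need to sustain roughly 37.5 GiB/s of reads, which is why the design depends on both fast SSDs and RDMA networking rather than on either alone.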

Tags: Distributed File System, AI Training, AI Inference, High Performance, RDMA, Large-Scale Data Processing, Checkpoint Support, KVCache, Scalability, Strong Consistency

Status: production, Open Source
