A user-friendly AI chatbot that can help with writing, analysis, answering questions, and creative tasks.
A design platform with powerful AI features for creating graphics, presentations, and marketing materials with ease.
The industry-standard image editing software now enhanced with powerful AI features for generative fill, neural filters, and creative editing.
An AI-powered writing assistant that helps improve grammar, style, tone, and clarity in real time.
An AI-powered assistant integrated with Microsoft 365 apps, helping users create content, analyze data, and boost productivity.
A user-friendly video editing app with AI-powered features for automatic editing, effects, and content creation.
An AI writing assistant integrated into Notion's workspace, helping users write, edit, and organize content more efficiently.
An AI content creation platform that helps create marketing copy, blog posts, social media content, and more.
An AI meeting assistant that provides real-time transcription, summarization, and collaboration features for meetings and conversations.
An AI-powered video and audio editing platform that makes content creation as easy as editing a document.
SadTalker is an open-source AI digital human project developed by Xi'an Jiaotong University, Tencent AI Lab, and Ant Group. It creates realistic talking face animations from a single face image and audio by leveraging 3D motion coefficients. The tool uses advanced techniques like ExpNet for facial expression learning and PoseVAE for head movement synthesis, enabling high-quality, stylized video animations. It supports multiple languages and datasets, making it versatile for various applications such as virtual assistants, video production, and language learning.
Crayo AI is an AI-powered short video generation tool designed to help content creators quickly produce engaging videos for platforms like Douyin and TikTok. Leveraging natural language processing and computer vision technologies, Crayo AI allows users to generate video drafts automatically by simply providing a topic and parameters. The tool includes features like text, music, and visual effects, along with editing functions and optimization suggestions, streamlining the video creation process and allowing creators to focus on creativity and storytelling.
SciSpace is an AI-based literature reading and analysis tool designed to streamline academic research. It integrates a powerful search engine and intelligent filtering functions to help users quickly locate and organize relevant academic papers. Users can upload literature for in-depth analysis, including understanding paper content, formulas, and tables, as well as adding personal notes and tags. SciSpace supports multiple languages, offers a Chinese interface, and facilitates sharing and collaboration among users.
ViewCrafter is an advanced video diffusion model developed by Peking University, CUHK, and Tencent. It synthesizes high-fidelity novel views from single or few images by combining the generative capabilities of video diffusion models with point-based 3D representation. This allows for precise control over camera poses to generate high-quality video frames. Through iterative view synthesis strategies and camera trajectory planning, ViewCrafter gradually expands 3D cues to generate a broader range of novel views. It has demonstrated strong generalization and performance across multiple datasets, offering new possibilities for immersive real-time rendering and scene-level text-to-3D generation applications.
Melty is an open-source AI coding assistant that enhances developers' coding efficiency and code quality. It follows developers' programming activities in real time, from terminal operations to GitHub interactions, offering intelligent collaboration and code generation. Melty learns the developer's style, assists in writing production-level code, and integrates with compilers, debuggers, and other tools. It also supports advanced features like refactoring, creating web applications, and navigating large codebases, making it a capable assistant for improving programming workflows.
Readtheirlips, developed by Symphonic Labs, is AI software that transcribes spoken content by analyzing lip movements in videos. It is particularly useful in scenarios where audio is unavailable or unclear. The software detects faces, extracts geometric features of the lips, and analyzes how the lip movements change over time, matching these features against training data to recognize what was said. Accuracy can suffer when the speaker is not facing the camera directly or is speaking too quickly; the development team is working to address these limitations and to ease the current constraints on video processing time.
CSGO (Content-Style Composition in Text-to-Image Generation) is a collaborative research project developed by Nanjing University of Science and Technology, Xiaohongshu, and other institutions. It introduces an innovative data construction process for generating and cleaning stylized data triplets, building a large-scale style transfer dataset called IMAGStyle. The CSGO framework achieves image-driven style transfer, text-driven stylized synthesis, and text-editing-driven stylized synthesis through end-to-end training, significantly enhancing style control in image generation.
Seed-Music, developed by ByteDance, is an advanced AI music generation model that converts a 10-second audio clip into a full music composition. It leverages autoregressive language models and diffusion methods to create high-quality, style-controllable music based on multimodal inputs such as style descriptions, audio references, sheet music, and sound cues. Designed to simplify music creation, Seed-Music is accessible to both beginners and professional musicians. It also offers music editing features, enabling users to personalize the generated music.
Claude Dev is an AI programming assistant integrated into Visual Studio Code, leveraging Anthropic's Claude 3.5 Sonnet model. It automates complex programming tasks such as file reading/writing, project creation, and terminal command execution, enhancing development efficiency. With features like real-time tracking, smart permission management, and an interactive development interface, Claude Dev makes coding and project management intuitive and secure.
PopShort.AI is an AI-powered platform designed for creating short dramas with immersive interactive experiences. It features weekly updates of one-minute episodes, making it ideal for modern, fast-paced lifestyles. Users can engage with virtual characters, explore exclusive storylines, and access a vast library of over 1000 hours of AI-generated content. The platform also allows users to become the protagonist of their own stories, offering a personalized and engaging experience.
Avaturn is an AI-based 3D avatar generation platform that enables users to create highly realistic 3D avatars and full-body models by simply uploading photos. The platform leverages deep learning algorithms to simplify the process of personalized 3D content creation, offering extensive customization options such as facial features, hairstyles, clothing, and accessories. Users can fine-tune every detail of their avatars, making them suitable for various applications including gaming, social media, virtual meetings, and more. Avaturn also supports exporting avatars as 3D models for use in popular 3D environments like Blender, Unity, and Unreal Engine. With its focus on accessibility and customization, Avaturn aims to empower users to develop their digital identities and enhance virtual interactions.
Bytespider, developed by ByteDance and released in April 2024, is a high-speed web crawler designed to gather internet data for training and enhancing AI models, especially large language models (LLMs). It reportedly crawls 25 times faster than OpenAI's GPTBot and 3000 times faster than Anthropic's ClaudeBot, making it one of the most aggressive crawlers in operation. Bytespider is used for web crawling, data collection, index construction, and content analysis, and the data it gathers supports language model training and other AI applications.
Kapwing is an AI-integrated online video editing platform designed to streamline the video creation process. It offers a range of features, including AI video generation, document-to-video conversion, and text-to-speech, enabling users to quickly generate and edit video content. The platform provides rich editing tools and templates, allowing for deep customization such as adding voiceovers, background music, and personal video clips. Kapwing also supports team collaboration, enabling members to edit video projects together in real time.
MARS5-TTS is an open-source AI voice cloning tool developed by CAMB.AI, featuring breakthrough realistic prosody and support for over 140 languages. It can handle complex prosody scenarios such as sports commentary and anime AI dubbing. With 1.2 billion parameters and over 150,000 hours of training data, MARS5-TTS uses simple text markers to guide prosody, supporting both quick and deep cloning techniques to optimize speech output quality.
Musicfy AI is an AI-powered music creation platform that streamlines the music production process. Users can upload voice samples to create personalized AI voice models, generate music with virtual singers, and convert text into melodies. The platform offers features like AI voice imitation, text-to-music conversion, and original song creation, making it accessible for both professional producers and music enthusiasts.
Oasis, developed by Decart and Etched, is billed as the world's first game generated by AI in real time. It renders interactive video content at 20 frames per second directly through AI models, eliminating the need for a game engine. Players can freely move, jump, and pick up items, experiencing a game world shaped in real time by AI. Based on the Transformer architecture, Oasis combines ViT and DiT technologies to achieve low-latency real-time interaction. The code and model weights are open-source, encouraging community contributions and technological innovation.
MusicFX DJ, developed by Google DeepMind, is an AI-powered tool that enables users to generate music in real time by blending text prompts. Users can input music concepts such as style and instruments, and the tool produces unique compositions. It supports mixing multiple prompts, allowing users to adjust the importance of each prompt to fine-tune the music style. The tool offers intuitive controls for instrument arrangement, texture adjustment, and rhythm control, and streams high-quality 48 kHz stereo audio in real time. Users can also share and download their creations, making it suitable for both music enthusiasts and professionals.
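One way to picture adjusting the importance of each prompt is as a weighted average of prompt embeddings. The following is a minimal sketch under that assumption only; it does not describe how MusicFX DJ is actually implemented, and the function name `blend_prompts` and the toy two-dimensional vectors are hypothetical.

```python
from typing import List, Sequence


def blend_prompts(embeddings: Sequence[Sequence[float]],
                  weights: Sequence[float]) -> List[float]:
    """Return the weighted average of prompt embedding vectors.

    Hypothetical illustration: each prompt is assumed to already be
    encoded as a fixed-length vector, and each weight expresses how
    strongly that prompt should influence the blend.
    """
    if not embeddings or len(embeddings) != len(weights):
        raise ValueError("need one weight per embedding, and at least one embedding")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must have the same length")
    # Normalise by the weight total so the result stays on the same scale
    # as the inputs regardless of how large the raw weights are.
    return [
        sum(w * e[i] for e, w in zip(embeddings, weights)) / total
        for i in range(dim)
    ]


# Toy example: two prompts, with the first weighted three times as heavily.
jazz = [1.0, 0.0]
techno = [0.0, 1.0]
mix = blend_prompts([jazz, techno], [3, 1])
```

Raising one weight relative to the others pulls the blended vector toward that prompt, which is the intuition behind turning a prompt up or down in the mix.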
FaceSwap is an open-source AI software designed for creating deepfake videos and images. It uses deep learning technology to replace one person's face with another's in videos or images. The software supports multiple operating systems, including Windows, macOS, and Linux, and can run on both CPU and GPU. It is maintained and updated by an active community, offering detailed installation and usage guides and tutorials. FaceSwap emphasizes its free and open-source nature, encouraging users to use it within the bounds of legal and ethical guidelines.
PersonaTalk is a two-stage, attention-based framework developed by ByteDance for high-fidelity, personalized visual dubbing. It synthesizes videos with precise lip-sync to the target audio while preserving the speaker's unique speaking style and facial details. The first stage performs style-aware audio encoding and lip-sync geometry generation, and the second stage uses a dual-attention facial renderer to texture the target geometry. PersonaTalk outperforms existing technologies (including Wav2Lip, VideoReTalking, DINet, and IP_LAP) in visual quality, lip-sync accuracy, and personality retention, and as a general framework it achieves results comparable to person-specific methods.
SeedEdit, developed by ByteDance's Doubao team, is a versatile image editing model that leverages natural language instructions for tasks such as retouching, style transfer, beautification, and adding or removing elements. It excels in balancing image reconstruction and regeneration, ensuring high-quality and precise edits. As the first productized general image editing model in China, SeedEdit supports zero-shot learning and multi-round editing, simplifying the image editing process.