Shulex is an AI SaaS platform that specializes in Voice of Customer (VOC) analysis. It helps businesses understand consumer needs and market trends by analyzing data from e-commerce platforms and social media. The platform offers features like customer review analysis, competitor monitoring, and intelligent customer service. It provides insights into consumer profiles, product experiences, and purchase motivations, enabling businesses to optimize products, develop market strategies, and improve customer satisfaction. The AI customer service robot supports multi-language responses and handles various e-commerce scenarios, reducing manual customer service costs. Category-specific tagging helps quickly locate key issues and provides deep insights for better decision-making.
cobalt is an open-source media download tool that provides a clean, ad-free experience. It supports video, audio, and image downloads from mainstream video websites, social media platforms, and music platforms. cobalt offers personalized settings with multiple theme options, allows downloading videos up to 8K resolution, and automatically extracts subtitles. It is easy to use, requires no registration, and supports both web use and self-hosted deployment via Docker.
Outfit Anyone is an open-source, high-quality virtual try-on project launched by Alibaba's Intelligent Computing Research Institute. It lets users or models preview how clothing looks without physically trying it on. The project employs a dual-stream conditional diffusion model that processes models, clothing, and text prompts, using clothing images as control factors to achieve more realistic virtual try-on effects. This makes it easier for users to explore and select suitable clothing, while also giving fashion designers and retailers new creative and marketing tools.
PictureThis is an AI-based plant identification app that enables users to quickly obtain detailed information about plants, including their names, species, and care advice, by taking photos. It can recognize over 17,000 plants, diagnose plant diseases, and provide treatment plans. Additionally, it warns users about toxic plants to ensure their safety. The app offers personalized care guides and features a community function for plant enthusiasts to share and exchange knowledge.
Tailor is a free and open-source AI video editing tool that leverages advanced technologies such as face recognition and speech recognition. It provides three core functionalities: video editing, generation, and optimization. Key features include face editing, voice editing, speech generation, subtitle and color generation, background replacement, and smoothness and clarity optimization. The latest version introduces voice-driven speech generation and a model self-check and repair mechanism, enhancing the user experience and making video creation more efficient.
Seed-VC is a zero-shot voice conversion technology based on in-context learning, achieving high-quality audio output and timbre similarity. No model-specific training is required; users only need to provide a 1- to 30-second reference voice sample to achieve voice cloning and conversion. This technology is particularly suitable for voice conversion research, entertainment, media production, and speech synthesis. Seed-VC supports zero-shot singing voice conversion, transforming speech into singing while maintaining the original voice's timbre characteristics. It provides command-line tools and a Gradio web interface, making it easy for users to perform voice conversions.
Animate Anyone is an open-source framework developed by Alibaba's Intelligent Computing Research Institute that transforms static images of characters or people into dynamic animations. Built on a diffusion model, it incorporates technologies like ReferenceNet, Pose Guider, and temporal generation modules to ensure consistency, controllability, and stability in the output videos. The framework has gained nearly 13,000 stars on GitHub and has sparked widespread discussion both in China and abroad. Alibaba's AI chatbot, Tongyi Qianwen, features a "Tongyi Dance King" function based on this technology, enabling characters in photos to perform dances like "Subject 3," "Shoulder Shake," and "Shuffle."
Agent TARS is an open-source multimodal AI agent developed by ByteDance. It visually interprets web content and seamlessly integrates with browsers, command lines, and file systems to plan and execute complex tasks. The tool offers a desktop client that showcases multimodal elements and conversational workflows, making it a powerful solution for AI-assisted task execution and research. Currently in technical preview, it supports macOS and is designed to optimize development processes through intelligent agent-driven workflows.
Bolt.new is an AI-powered full-stack web programming tool that simplifies web development by automatically writing, running, editing, and deploying applications directly in the browser. Leveraging WebContainers technology, it runs a full Node.js environment without requiring local installation or configuration. Users can generate code through simple prompts, test it immediately in the browser, and deploy it with one click to cloud services like Netlify. Bolt.new also features automatic error detection and repair, making it accessible even to non-technical users.
EchoMimicV2, developed by Alibaba's Ant Group, is an advanced digital human project designed to create high-quality animation videos. It uses reference images, audio clips, and hand pose sequences to generate synchronized upper body movements. Building on its predecessor, EchoMimicV1, which focused on head animations, EchoMimicV2 extends its capabilities to half-body (upper body) animations, supporting both Chinese and English speech. The project employs techniques such as Audio-Pose Dynamic Harmonization, Head Partial Attention, and Phase-specific Denoising Loss to enhance animation quality and reduce redundancy.
XingliuAI is a comprehensive AI image generation platform developed by LiblibAI, leveraging the self-developed Star-3 Alpha general image generation model. It integrates the world's largest LoRA enhancement model library and advanced AI image control technologies. Designed to enhance productivity for designers, photographers, and visual creators, XingliuAI offers features like high-precision image generation, intelligent recommendations, color control, regional redrawing, intelligent image expansion, and detail restoration. It supports various applications, including e-commerce, advertising, and artistic creation, providing diverse styles and exceptional aesthetic quality.
Buzz is an offline speech-to-text tool built on OpenAI's Whisper model, designed for Windows, macOS, and Linux systems. It converts microphone input in real time, or audio/video files, into text, and supports output formats such as TXT, SRT, and VTT. Buzz offers fast conversion speeds, high accuracy, multi-language recognition, and the ability to translate results into English, all while operating offline to protect user privacy.
Deep-Live-Cam is an open-source AI tool that enables real-time face swapping in videos using just one image. It supports multiple hardware platforms including CPU, NVIDIA CUDA, Apple Silicon, and Core ML to ensure smooth video processing. The software includes anti-abuse mechanisms, adheres to legal and ethical standards, and reminds users to obtain consent from the person whose face is being swapped.
EchoMimic is an open-source AI digital human project launched by Alibaba's Ant Group, designed to bring static images to life with voice and expressions. By combining deep learning models with audio and facial landmarks, it creates highly realistic dynamic portrait videos. It can generate videos from audio alone, from facial landmarks alone, or from both combined for more natural and smooth lip-syncing. EchoMimic supports both Chinese and English and handles a range of scenarios, including singing. It is used in entertainment, education, and virtual reality.
Pi (Presentation Intelligence) is an AI-native platform designed to streamline the creation and sharing of presentations. It supports various content generation methods, including one-sentence generation, file import, and URL import. The platform features an AI-native editor for intelligent editing and dynamic layout, ensuring multi-terminal adaptation. Ideal for business presentations, education, and training, Pi helps users create professional-level presentations with ease.
An AI-powered video and audio editing platform that makes content creation as easy as editing a document.
A user-friendly AI chatbot that can help with writing, analysis, answering questions, and creative tasks.
The industry-standard image editing software now enhanced with powerful AI features for generative fill, neural filters, and creative editing.
An AI writing assistant integrated into Notion's workspace, helping users write, edit, and organize content more efficiently.
An AI content creation platform that helps create marketing copy, blog posts, social media content, and more.