DeepCoder: A Fully Open-Source 14B Coder at O3-mini Level
DeepCoder-14B-Preview, a fully open-source 14-billion-parameter coding model, demonstrates remarkable performance on coding benchmarks, rivaling, and in some cases surpassing, leading proprietary models. Here are the key highlights:
DeepCoder-14B was fine-tuned from DeepSeek-R1-Distill-Qwen-14B using distributed reinforcement learning (RL). The training dataset consists of 24,000 high-quality, verifiable coding problems curated from sources such as TACO Verified, PrimeIntellect's SYNTHETIC-1, and LiveCodeBench. Rigorous filtering ensured data quality, including programmatic verification of reference solutions, test filtering, and deduplication.
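A minimal sketch of what such a filtering pipeline could look like, assuming each problem record carries a statement, a reference solution, and a list of input/output tests (the field names and the minimum-test threshold below are illustrative assumptions, not taken from the released code):

```python
import hashlib
import subprocess
import sys

MIN_TESTS = 5  # assumed threshold for "enough tests"; not the released cutoff


def run_solution(solution_code: str, stdin_text: str, timeout: float = 5.0) -> str:
    """Run a reference solution as a standalone script and capture its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", solution_code],
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout.strip()


def verify(problem: dict) -> bool:
    """Programmatic verification: the reference solution must pass every test."""
    return all(
        run_solution(problem["solution"], t["input"]) == t["expected_output"].strip()
        for t in problem["tests"]
    )


def filter_problems(problems: list[dict]) -> list[dict]:
    """Apply the three filters described above: test filtering, dedup, verification."""
    seen_hashes = set()
    kept = []
    for p in problems:
        if len(p["tests"]) < MIN_TESTS:  # drop problems with too few tests
            continue
        digest = hashlib.sha256(p["statement"].encode()).hexdigest()
        if digest in seen_hashes:  # deduplicate on the problem statement
            continue
        if not verify(p):  # keep only problems whose solution is verifiable
            continue
        seen_hashes.add(digest)
        kept.append(p)
    return kept
```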
The team introduced verl-pipeline, an optimized extension of the verl RLHF library, which accelerates end-to-end RL training by up to 2.5x. Techniques like one-off pipelining fully mask trainer and reward computation times, significantly improving training efficiency.
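A rough illustration of the one-off pipelining idea, under the assumption that rollout sampling, reward computation, and the trainer step are separate stages that can be overlapped across batches; the function names here are placeholders, not the verl-pipeline API:

```python
from concurrent.futures import ThreadPoolExecutor


def generate_rollouts(batch):
    """Sample completions for a batch of prompts (placeholder)."""
    return [f"rollout for {prompt}" for prompt in batch]


def compute_rewards(rollouts):
    """Score rollouts, e.g. by running unit tests (placeholder)."""
    return [float(len(r)) for r in rollouts]


def train_step(rollouts, rewards):
    """Update the policy on the scored rollouts (placeholder)."""
    return sum(rewards) / len(rewards)


def pipelined_training(batches):
    """Overlap sampling of batch i+1 with reward computation and training on batch i.

    In a naive sequential loop the sampler sits idle while rewards are computed and
    the trainer runs; working one batch ahead hides that latency, which is where the
    end-to-end speedup comes from.
    """
    with ThreadPoolExecutor(max_workers=1) as sampler:
        future = sampler.submit(generate_rollouts, batches[0])
        for next_batch in batches[1:] + [None]:
            rollouts = future.result()
            if next_batch is not None:
                # kick off sampling for the next batch before scoring this one
                future = sampler.submit(generate_rollouts, next_batch)
            rewards = compute_rewards(rollouts)
            train_step(rollouts, rewards)
```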
DeepCoder-14B is fully open-source, with the model weights, training datasets, code, training logs, and systems optimizations publicly available. This transparency allows the community to reproduce and build upon the work, democratizing RL training for LLMs.
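For example, assuming the released checkpoint is hosted on Hugging Face under an identifier such as agentica-org/DeepCoder-14B-Preview (check the repository for the actual model ID), it can be loaded with the standard transformers API:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face model ID; verify against the official repo/blog.
model_id = "agentica-org/DeepCoder-14B-Preview"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

prompt = "Write a Python function that checks whether a string is a palindrome."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```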
DeepCoder-14B-Preview represents a significant milestone in open-source coding models, achieving performance comparable to leading proprietary models like o3-mini and o1. Its comprehensive open-source approach and innovative training techniques make it a valuable resource for advancing AI-assisted coding and reasoning tasks.
For more details, visit the DeepCoder blog or explore the open-source repository.