Bielik v3 Small: Technical Report

Krzysztof Ociepa, Łukasz Flis, Remigiusz Kinas, Krzysztof Wróbel, Adrian Gwoździej
Release Date: May 13, 2025
Abstract

We introduce Bielik v3, a series of parameter-efficient generative text models (1.5B and 4.5B) optimized for Polish language processing. These models demonstrate that smaller, well-optimized architectures can achieve performance comparable to much larger counterparts while requiring substantially fewer computational resources. Our approach incorporates several key innovations: a custom Polish tokenizer (APT4) that significantly improves token efficiency, Weighted Instruction Cross-Entropy Loss to balance learning across instruction types, and Adaptive Learning Rate that dynamically adjusts based on training progress. Trained on a meticulously curated corpus of 292 billion tokens spanning 303 million documents, these models excel across multiple benchmarks, including the Open PL LLM Leaderboard, Complex Polish Text Understanding Benchmark, Polish EQ-Bench, and Polish Medical Leaderboard. The 4.5B parameter model achieves results competitive with models 2-3 times its size, while the 1.5B model delivers strong performance despite its extremely compact profile. These advances set a new standard for parameter-efficient language modeling in less-represented languages, making high-quality Polish language AI more accessible for resource-constrained applications.
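The abstract names Weighted Instruction Cross-Entropy Loss without giving its formulation. As a rough illustration only, the minimal PyTorch sketch below shows one way such a loss could weight examples by instruction type: the function name, the per-example averaging, and the `instruction_weights` tensor are assumptions for this sketch, not the authors' exact method.

```python
import torch
import torch.nn.functional as F

def weighted_instruction_ce_loss(logits, targets, instruction_weights,
                                 ignore_index=-100):
    """Cross-entropy over a batch of instruction examples, where each
    example's loss is scaled by a per-instruction weight before averaging.

    logits:              (batch, seq_len, vocab_size) model outputs
    targets:             (batch, seq_len), ignore_index on non-target tokens
    instruction_weights: (batch,) assumed weight per example, e.g. by
                         instruction type (hypothetical weighting scheme)
    """
    batch, seq_len, vocab = logits.shape
    # Token-level cross-entropy, unreduced, so it can be aggregated per example.
    token_loss = F.cross_entropy(
        logits.view(-1, vocab), targets.view(-1),
        ignore_index=ignore_index, reduction="none",
    ).view(batch, seq_len)
    # Mask out positions marked with ignore_index (prompt tokens, padding).
    mask = (targets != ignore_index).float()
    # Mean loss per example over its supervised tokens.
    per_example = (token_loss * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)
    # Weighted average across the batch, normalized by the total weight.
    return (instruction_weights * per_example).sum() / instruction_weights.sum()
```

Under this reading, up-weighting rare or hard instruction types keeps them from being drowned out by abundant ones in the batch average, which matches the abstract's stated goal of balancing learning across instruction types.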