This paper describes the implementation of methods and algorithms for automatic 
speech recognition based on composing words from acoustic phoneme 
models. Such a design of the speech-to-text decoder is conventional 
and most productive for Western languages. The aim is to explore this approach 
as applied to Ukrainian, a highly inflected language with relatively 
free word order. We use data-driven methods to estimate parameters 
for both acoustic and linguistic components of the mathematical model.
The grapheme-to-phoneme conversion procedure takes into account word stress 
and features of spontaneous continuous speech. The basic speech-to-text system 
can handle a 100k-word vocabulary in real time. Prospects for 
dictionary and domain extension and for improved parameter estimation, 
as well as ergonomic issues, are discussed.
Index Terms: speech recognition, spontaneous continuous speech, 
generative model, real-time.