This paper focuses on the processing of direct speech in Belarusian electronic texts 
for the purpose of audiobook creation. Audiobooks are usually produced using 
synthesis with only one voice. Processing direct speech opens up the prospect 
of multi-voice text-to-speech synthesis, which would bring audiobooks closer 
to representing characters’ unique speech features.