Look at these images. See the music.
The description hints that music is embedded in the images. Let’s open one of them in StegSolve and extract its data as a binary file.
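For readers who would rather script this step, here is a minimal sketch of what a bit-plane extraction does. The writeup does not say which channels or bit planes were selected in StegSolve, so the choice of the least significant bit and the MSB-first packing are assumptions, and `extract_bit_plane` is a hypothetical helper name.

```python
def extract_bit_plane(pixel_bytes: bytes, bit: int = 0) -> bytes:
    """Take one bit from every input byte and pack the bits MSB-first.

    `bit` selects the plane: 0 is the least significant bit (an assumption,
    the original writeup does not state which plane was used).
    """
    out = bytearray()
    acc = 0      # bits collected so far for the current output byte
    count = 0    # how many bits are in `acc`
    for value in pixel_bytes:
        acc = (acc << 1) | ((value >> bit) & 1)
        count += 1
        if count == 8:
            out.append(acc)
            acc = 0
            count = 0
    if count:
        # Pad the final partial byte with zero bits on the right.
        out.append(acc << (8 - count))
    return bytes(out)
```

The input can be any flat buffer of channel values. With Pillow installed, `Image.open(path).convert("RGB").tobytes()` would be one way to obtain it, where `path` stands for one of the challenge images.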

The binary can then be imported into Audacity as raw data. The data plays as a melody, which confirms that it is an audio stream.
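As an alternative to the raw import, the extracted bytes can be wrapped in a WAV header so that any player opens them. This is only a sketch: the writeup does not give the sample format, so 8-bit unsigned mono at 44100 Hz is an assumption to adjust by ear, and `raw_to_wav` is a hypothetical helper name.

```python
import wave


def raw_to_wav(raw: bytes, wav_path: str, sample_rate: int = 44100,
               sample_width: int = 1, channels: int = 1) -> None:
    """Write headerless PCM bytes to a WAV file.

    The defaults (8-bit unsigned mono, 44100 Hz) are guesses, not values
    taken from the challenge.
    """
    with wave.open(wav_path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)   # bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(raw)
```

If the result sounds too fast, too slow, or like noise, the sample rate or sample width is probably wrong, which is the same trial and error as in Audacity's raw import dialog.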
Switching to spectrogram view reveals the flag:

The flag is VolgaCTF{SOUND_IS_3D_LIKE_IMAGE}.
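The spectrogram view can also be approximated in a script. The sketch below computes a short-time Fourier transform with a naive DFT from the standard library. It is slow and meant only to illustrate the idea: the frame size, hop and Hann window are arbitrary choices, not Audacity's settings.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of magnitudes (bins 0..frame_size//2) per frame."""
    # Periodic Hann window to reduce leakage between frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        chunk = [samples[start + n] * window[n] for n in range(frame_size)]
        mags = []
        for k in range(frame_size // 2 + 1):
            # Naive DFT of the windowed frame at frequency bin k.
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            mags.append(abs(acc))
        frames.append(mags)
    return frames
```

Plotting the frames as columns of an image, with brightness proportional to magnitude, gives the time-frequency picture in which drawn text such as the flag becomes visible.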