That's more or less what I'm doing, and it's not tricky, it's just cumbersome if you have to do a lot of it.
What I mean by "manually" is that first I have to note the exact length of the voice segment. Then I have to go to my starting point in the music file, select 45 secs (or whatever the voice length is), attenuate it by 15 dB, do the fade-in and fade-out at the ends of the attenuated segment, and THEN mix in the voice.
Some of the high-priced apps do this automatically (I used to use Sound Forge). With the voice segment on the clipboard, you just pick the starting point in the music file, choose the drop-in mix option (I don't remember what they called it), set the fade-out, fade-in, and target attenuation parameters in a dialog box, and Sound Forge does the rest. It's worlds easier than doing the four manual operations GW requires for the same result.
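
For anyone who would rather script it, here is a rough Python sketch of the same idea: attenuate the music under the voice, ramp the level at both ends of the attenuated stretch, and mix the voice in, all in one pass. This is only an illustration, not how Sound Forge or GW does it internally. The function name `duck_and_mix` and its parameters (`start`, `atten_db`, `fade`) are made up for the example, and it assumes mono audio already loaded as lists of float samples at the same sample rate, with simple linear ramps.

```python
def duck_and_mix(music, voice, start, atten_db=15.0, fade=0):
    """Return a copy of `music` with `voice` mixed in at sample index `start`.

    Hypothetical helper for illustration only.
    music, voice : lists of float samples (mono, same sample rate, assumed)
    start        : sample index in `music` where the voice begins
    atten_db     : how far to drop the music under the voice, in dB
    fade         : length in samples of the ramp at each end of the ducked region
    """
    gain = 10 ** (-atten_db / 20)              # dB to linear amplitude factor
    end = min(start + len(voice), len(music))  # truncate voice if it runs past the music
    n = max(end - start, 0)                    # length of the ducked region
    fade = min(fade, n // 2)                   # ramps cannot overlap
    out = list(music)

    for i in range(n):
        if i < fade:
            # ramp down from full level toward the attenuated level
            g = 1 + (gain - 1) * (i / fade)
        elif i >= n - fade:
            # ramp back up from the attenuated level to full level
            g = 1 + (gain - 1) * ((n - 1 - i) / fade)
        else:
            g = gain                           # hold at the attenuated level
        out[start + i] = music[start + i] * g + voice[i]
    return out
```

With a 44.1 kHz file, something like `duck_and_mix(music, voice, start=30 * 44100, atten_db=15, fade=44100)` would drop the music 15 dB starting 30 seconds in, with one-second ramps. The length of the voice segment sets the length of the ducked region, so there is nothing to measure by hand.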