Yeah, I think both assumptions you make are right. It probably does just use the video.
Also, I think it more or less has to use the initial exposure, since it has no way to predict the exposures you will sweep through and work out an average ahead of time.
But for casual panoramics, especially in even lighting, it looks to be a handy feature, and one that many people enjoy playing with.
As you say, though, it is no replacement for doing it yourself and getting an average exposure, or doing some kind of HDR.