We explore a novel RF-FSO dual-path UAV network for aerial capture and streaming of scalable 360° video of remote scenes, to enable future virtual human teleportation. One UAV captures the 360° video viewpoint and constructs a scalable tiling representation of the data comprising a base layer and an enhancement layer. The UAV sends the base layer to a ground-based remote server over a direct RF link, and relays the enhancement layer to the server over a multi-hop path of directed UAV-to-UAV FSO links. The server then integrates the viewport-specific content from the two layers to construct high-fidelity content to stream to a remote VR user. The dual-path connectivity ensures both reliability and high-fidelity remote immersion. We formulate an optimization problem to maximize the delivered immersion fidelity, which depends on the content capture rate, the FSO and RF link rates, effective routing path selection, and fast UAV deployment. The problem is a mixed-integer program, and we develop an optimization framework that captures the optimal solution at lower complexity. Our experimental results demonstrate a gain of up to 6 dB in delivered immersion fidelity over a state-of-the-art method and, for the first time, enable 12K-120fps 360° video streaming at high fidelity.
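To make the role of routing path selection for the enhancement layer concrete, the following is a minimal illustrative sketch, not the optimization framework proposed here. It assumes that a route over the directed FSO links is chosen to maximize its bottleneck (minimum) link rate, which is one plausible criterion among several. The function name `widest_path`, the node names, and the link rates are all hypothetical.

```python
import heapq


def widest_path(links, src, dst):
    """Return (bottleneck_rate, path) for the max-bottleneck route src -> dst.

    links: dict mapping node -> {neighbor: link_rate} for directed links.
    Returns (0.0, []) if dst cannot be reached from src.
    """
    best = {src: float("inf")}  # best known bottleneck rate to each node
    prev = {}                   # predecessor on the best route found so far
    heap = [(-best[src], src)]  # max-heap emulated with negated rates

    while heap:
        neg_width, node = heapq.heappop(heap)
        width = -neg_width
        if width < best.get(node, 0.0):
            continue  # stale heap entry, a better route was already found
        if node == dst:
            break
        for nbr, rate in links.get(node, {}).items():
            cand = min(width, rate)  # bottleneck if we extend through nbr
            if cand > best.get(nbr, 0.0):
                best[nbr] = cand
                prev[nbr] = node
                heapq.heappush(heap, (-cand, nbr))

    if dst not in best:
        return 0.0, []

    # Walk predecessors back from the destination to recover the route.
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return best[dst], path[::-1]


# Hypothetical FSO topology; rates are assumed values in arbitrary units.
fso_links = {
    "capture_uav": {"relay_a": 8.0, "relay_b": 3.0},
    "relay_a": {"relay_c": 2.0, "server": 1.5},
    "relay_b": {"server": 3.0},
    "relay_c": {"server": 6.0},
}

bottleneck, route = widest_path(fso_links, "capture_uav", "server")
print(bottleneck, route)
```

In this toy topology the route through `relay_b` is selected even though the first hop to `relay_a` is faster, because the slowest link on a route caps the rate at which the enhancement layer can reach the server. The actual problem also couples this choice with the capture rate, the RF link rate, and UAV deployment, which this sketch does not model.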