This paper outlines a novel approach to user navigation in complex virtual 3D city models on mobile devices. Users navigate within the virtual 3D city model by sketching navigation commands in the perspective view on the mobile client. The sketches are sent to the server, which reprojects them onto the 3D scene, interprets them as navigation commands, and sends the resulting video-encoded image stream to the mobile client. This approach allows us to provide interactivity for complex virtual 3D city models on resource- and bandwidth-limited mobile clients. A high degree of usability is achieved because users can trigger complex navigation commands in a task- and goal-oriented way, taking advantage of the navigation properties and affordances inherent to elements of geovirtual environments.
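
To make the server-side step more concrete, the following is a minimal sketch of how a 2D sketch stroke might be reprojected onto the scene and mapped to a navigation command. It is an illustration under strong assumptions, not the system described in this paper: the scene is reduced to a flat ground plane at z = 0, the camera is a simple pinhole model, and the names `Camera`, `ground_point`, `interpret_sketch`, the command labels `fly_to` and `follow_path`, and the `min_path_length` threshold are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Camera:
    """Hypothetical pinhole camera; world frame is right-handed with z up."""
    x: float
    y: float
    z: float        # height above the ground plane, assumed > 0
    yaw: float      # heading in radians, 0 = looking along +x
    pitch: float    # radians, negative = looking down
    fov_y: float    # vertical field of view in radians
    width: int      # viewport size in pixels
    height: int


def ground_point(cam, px, py):
    """Cast a ray through pixel (px, py) and intersect it with the plane z = 0.

    Returns the hit point, or None if the ray does not descend to the ground.
    """
    half = math.tan(cam.fov_y / 2.0)
    # Offsets on the image plane at unit distance; pixel origin is top-left.
    sx = (2.0 * px / cam.width - 1.0) * half * (cam.width / cam.height)
    sy = (1.0 - 2.0 * py / cam.height) * half
    cp, sp = math.cos(cam.pitch), math.sin(cam.pitch)
    cy, sn = math.cos(cam.yaw), math.sin(cam.yaw)
    forward = (cp * cy, cp * sn, sp)
    right = (sn, -cy, 0.0)
    up = (-sp * cy, -sp * sn, cp)
    d = tuple(f + sx * r + sy * u for f, r, u in zip(forward, right, up))
    if d[2] >= -1e-9:
        return None  # ray is level or points upward: it never reaches z = 0
    t = -cam.z / d[2]
    return (cam.x + t * d[0], cam.y + t * d[1], 0.0)


def interpret_sketch(cam, stroke, min_path_length=1.0):
    """Map a stroke (list of pixel coordinates) to a hypothetical command.

    Assumed rule: a stroke whose reprojected length is below the threshold
    is treated as a target point, anything longer as a path to follow.
    """
    hits = [p for p in (ground_point(cam, px, py) for px, py in stroke)
            if p is not None]
    if not hits:
        return ("none", [])
    length = sum(math.dist(a, b) for a, b in zip(hits, hits[1:]))
    if length < min_path_length:
        return ("fly_to", [hits[-1]])
    return ("follow_path", hits)
```

In a real city model the ray would be intersected with scene geometry such as roads, buildings, and terrain rather than a single plane, and the interpretation would depend on the type of object that was hit. The mobile client would only need to transmit the stroke's pixel coordinates and display the returned video stream.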