vector-based language identification