We trained a convolutional neural network (CNN) to map raw pixels from a sin- gle front-facing camera directly to steering commands. This end-to-end approach proved surprisingly powerful. With minimum training data from humans the sys- tem learns to drive in traffic on local roads with or without lane markings and on highways. It also operates in areas with unclear visual guidance such as in parking lots and on unpaved roads.
The system automatically learns internal representations of the necessary process- ing steps such as detecting useful road features with only the human steering angle as the training signal. We never explicitly trained it to detect, for example, the out- line of roads.