The way AI works is that whatever input you have is converted into numbers, some maths is done on those numbers, and the resulting numbers are converted back into an output. The training data is used to make the equations doing this as accurate as possible: each training example pairs an input with the output expected for it, and the equations are adjusted until their results match those expected outputs as closely as they can. This is a big oversimplification, of course, but it is essentially what happens. So yes, the images are encoded into numbers.
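To make that concrete, here is a rough toy sketch in Python. Everything in it is made up for illustration: the 2x2 "images", the `encode`, `predict`, `decode` and `train` functions, and the "bright"/"dark" labels are all assumptions, and a real image model has millions of adjustable numbers rather than two. The shape of the process is the same, though: encode to numbers, do maths, decode, and use training examples to tune the maths.

```python
# Toy illustration only: not how any real image model is built.

def encode(image):
    """Turn a 2x2 grayscale image (pixel values 0-255) into numbers in 0..1."""
    return [pixel / 255 for row in image for pixel in row]

def predict(weight, bias, features):
    """The 'maths': average the input numbers, then apply one tunable equation."""
    average = sum(features) / len(features)
    return weight * average + bias

def decode(score):
    """Turn the output number back into something readable."""
    return "bright" if score > 0.5 else "dark"

def train(examples, steps=2000, learning_rate=0.1):
    """Nudge weight and bias so predictions move towards the expected outputs."""
    weight, bias = 0.0, 0.0
    for _ in range(steps):
        for image, expected in examples:
            features = encode(image)
            average = sum(features) / len(features)
            error = predict(weight, bias, features) - expected
            weight -= learning_rate * error * average
            bias -= learning_rate * error
    return weight, bias

# Training data: each input image is paired with its expected output
# (1.0 means "bright", 0.0 means "dark").
examples = [
    ([[255, 255], [255, 255]], 1.0),
    ([[0, 0], [0, 0]], 0.0),
    ([[230, 240], [250, 220]], 1.0),
    ([[10, 20], [30, 5]], 0.0),
]

weight, bias = train(examples)

# An image the model has not seen before.
new_image = [[200, 210], [190, 220]]
label = decode(predict(weight, bias, encode(new_image)))
print(label)  # prints "bright"
```

Note that the image never reaches the maths as a picture. By the time `predict` runs, it is just a list of four numbers between 0 and 1.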