It seems as though AI vision is expected to perceive a whole scene in one go and identify the important things in it. I remember when I first saw scanners that could "read" text from an image and copy it into an editable document, and thought "Wow!". And of course ANPR cameras do the same thing with the number plates of speeding cars. And I thought at the time that, just as our visual cortex has nerve cells that trigger when they see specific things, building a whole series of little systems and joining them together would be the most sensible way to build AI. All those logon systems where we are asked to "identify all pictures with bicycles" - OK, but we aren't asked to actually point out the bicycles in the pictures. So how is the AI expected to learn which set of pixels is a representation of a bike? I can imagine some process identifying two circles side-by-side... but ah-ha! They are not circles but ovals! And the one on the left is bigger than the one on the right! This could be a bike, and it may be converging with my route...
And don't ride a penny-farthing on the same road as a 'smart' car?
I don't think you're wrong (or blind). I love AI's deep learning, but I'm looking forward to Artificial General Intelligence, which, I'm guessing, will mimic how the brain infers by "considering" every input it has available. The brain is fascinating, and still not completely understood.
"Or am I wrong?" Not wrong, but blind. Just about everything you write on this subject heads toward active inference without all the deep learning you assume must be necessary, yet you reject active inference for reasons I haven't yet been able to fathom.