Follow me: / georgewangyu

Witness an AI learning to play the classic Atari game Breakout. The end result is jaw-dropping, and it changed the course of AI forever. The goal of the game is to move a paddle to bounce a ball off a wall of bricks. Each time the ball hits a brick, the brick disappears and your score goes up.

The primary aim of this research was to develop a deep learning model that could achieve human-level performance across a broad range of challenging tasks. Specifically, the model was designed to play Atari 2600 video games directly from high-dimensional sensory inputs (the pixels on the screen) and improve its performance to human level or beyond on those games.

In AI we need to talk about the input we give and the output we want, so:

*Input Variables:* The raw pixels from the game screen of various Atari 2600 video games served as the input to the model.

*Output Variables:* The output was the action values (Q-values) for each possible action in the game, indicating the expected cumulative reward that could be obtained by taking each action.

At 10 minutes, the AI tries to hit the ball back, but is too clumsy. At 120 minutes, the AI already plays like an expert! Two hours later is where the magic happens. Before I show you, comment below if you think AI will surpass us in all forms of gaming in the future. At 240 minutes, it comes up with an insane strategy: digging a tunnel through the wall is the most effective way to beat the game.

The DQN achieved superhuman performance in several of the tested Atari 2600 games, demonstrating that it could learn optimal strategies directly from high-dimensional sensory inputs. This breakthrough is important because it means the AI we create can become not just human-capable but superhuman-capable, meaning it surpasses us, its creators.
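To make the Q-value idea concrete, here is a minimal sketch of the underlying Q-learning update. This is a toy illustration, not DeepMind's actual implementation: a real DQN maps raw screen pixels to Q-values with a deep convolutional network, while here a small table stands in for that network, and the state/action sizes and learning parameters are arbitrary assumptions.

```python
import numpy as np

N_STATES, N_ACTIONS = 5, 3   # toy state/action space (assumption, not from the paper)
ALPHA, GAMMA = 0.1, 0.99     # learning rate and discount factor (typical values)

# Q[s, a] = estimated cumulative reward for taking action a in state s
Q = np.zeros((N_STATES, N_ACTIONS))

def q_update(state, action, reward, next_state):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a').
    DQN performs this same update in function-approximation form,
    with a neural network instead of a table."""
    target = reward + GAMMA * Q[next_state].max()
    Q[state, action] += ALPHA * (target - Q[state, action])

def best_action(state):
    """Acting greedily: pick the action with the highest Q-value."""
    return int(Q[state].argmax())

# e.g. taking action 1 in state 0 earned reward 1.0 and led to state 2
q_update(0, 1, 1.0, 2)
print(Q[0, 1])        # 0.1 -- one small step toward the target
print(best_action(0)) # 1  -- action 1 now looks best in state 0
```

Repeating this update over millions of frames of gameplay is, in essence, how the agent's play improves from clumsy paddle movements to tunnel-digging strategies.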
In fact, I learned this from the book Life 3.0 by Max Tegmark, where he says: “There was a human-like feature to this that I found somewhat unsettling: I was watching an AI that had a goal and learned to get ever better at achieving it, eventually outperforming its creators.”

Before everyone goes AI fearmongering, let me list out some limitations.

A similar example can be found in OpenAI's work training their AI to play the game "Dota 2" at a competitive level, detailed in their paper "OpenAI Five" (not the exact same study, but it offers insight into the scale of resources required for advanced deep reinforcement learning projects). They required *256 GPUs* (NVIDIA Tesla P100s or V100s) working in parallel to train their models over the course of weeks. The system was trained on thousands of years' worth of in-game experience, playing the equivalent of 180 years of games every day.

So the first issue is: our current approach requires a large amount of computational resources and data (gameplay experiences) for training. This also means we are constrained by GPUs and data, not people. In my previous video, I talk about how SORA only needed 13 engineers. This limitation remains as true today as it did 10 years ago.

Second, if you just move the paddle by a few pixels, the AI is screwed; it needs to learn all over again. It may not generalize well to tasks that are significantly different from Atari games, or to environments that are much less predictable. It is not yet able to adapt and generalize as well as humans do.

These findings indicate that deep reinforcement learning can be a powerful tool for developing AI systems capable of human-level decision-making in complex environments. However, we are at the stage of specialized intelligence; the models need to improve if we want to reach general intelligence. I suspect the issue is that the goal we give the AI is very specific. We need to give it a more general goal.
However, giving it a more general goal raises serious issues of its own. That's outside the scope of this video, but if you want to learn more about why, please comment below.