• 4 Posts
  • 15 Comments
Joined 1 year ago
Cake day: July 9th, 2023



  • one problem ive seen with these game ai projects is that you have to constantly tweak it and reset training because it eventually ends up in a loop of bad habits and doesnt progress

    you’re correct that this is a recurring problem with a lot of machine learning projects, but it’s more a problem with some evolutionary algorithms (simulating evolution to create better-performing neural networks), where the randomness of evolution usually leads to unintended behaviour and an eventual lack of progression. this project uses deep Q-learning instead.
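for reference, the core idea of Q-learning is a bootstrapped target: the reward plus the discounted best value of the next state. below is a minimal sketch of that target, not this project's actual code; the function name `q_target`, the `gamma` default and the idea that an episode ends when the player is hit are all assumptions made for illustration.

```python
import numpy as np

def q_target(reward, next_q_values, done, gamma=0.99):
    """Standard Q-learning target: r + gamma * max_a' Q(s', a').

    next_q_values: the network's Q estimates for every action in the next state.
    done: True if the episode ended (assumed here: the player was hit),
          in which case there is no future value to add.
    """
    if done:
        return float(reward)
    return float(reward + gamma * np.max(next_q_values))
```

in deep Q-learning the network is trained to move its prediction for the chosen action towards this target, which is a different mechanism from mutating and selecting whole networks.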

    the neural network is scored based on its total distance from every bullet, so while it doesn’t perform well in-game, it does actually score very well (better than me in most attempts).

    so is it even possible to complete such a project with this kind of approach as it seems to take too much time to get anywhere without insane server farms?

    the vast majority of these kinds of projects - including mine - aren’t created to solve a problem. they just investigate the potential of such an algorithm as a learning experience and for others to learn from.

    the only practical applications for this project would be to replace the “CPU” in 2 player bullet hell games and maybe to automatically gauge a game’s difficulty. programs already exist to play bullet hell games automatically, so the application is quite limited.


  • I always find it interesting to see how optimization algorithms play games and to see how their habits can change how we would approach the game.

    me too! there aren’t many attempts at machine learning in this type of game so I wasn’t really sure what to expect.

    Humans would usually try to find the safest area on the screen and leave generous amounts of space in their dodges, whereas the AI here seems happy to make minimal motions and cut dodges as closely as possible.

    yeah, the NN did this as well in the training environment. most likely it just doesn’t understand these tactics as well as it could, so it’s less aware of the risk (and therefore more comfortable) when making smaller, riskier dodges.

    I also wonder if the AI has any concept of time or ability to predict the future.

    this was one of its main weaknesses. the input and output data both span 0.1 seconds - meaning it sees 0.1 seconds into the past to perform moves for 0.1 seconds into the future - and that amount of time is only really suitable for quick, last-minute dodges, not complex sequences of moves to dodge several bullets at a time.
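a rough sketch of what a fixed 0.1 second input window can look like as a rolling buffer. the frame rate (`FPS = 60`) and the `FrameWindow` class are assumptions for illustration only; the post doesn't say how the window is actually stored.

```python
from collections import deque

FPS = 60               # assumed frame rate, not stated in the post
WINDOW_SECONDS = 0.1   # timespan mentioned in the post
WINDOW_FRAMES = round(FPS * WINDOW_SECONDS)  # 6 frames under this assumption

class FrameWindow:
    """Keeps only the most recent WINDOW_FRAMES observations."""

    def __init__(self, size=WINDOW_FRAMES):
        # A deque with maxlen silently drops the oldest frame when full.
        self.frames = deque(maxlen=size)

    def push(self, observation):
        self.frames.append(observation)

    def is_full(self):
        return len(self.frames) == self.frames.maxlen

    def as_input(self):
        # Oldest first, so the frames are in time order.
        return list(self.frames)
```

anything older than the window is simply gone, which is why a window this short only supports reactive dodges rather than planning several moves ahead.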

    If not, I imagine it could get cornered easily if it dodges into an area where all of its escape routes are about to get closed off.

    the method used to input data meant it couldn’t see the bounds of the game window so it does frequently corner itself. I am working on a different method that prevents this issue, luckily.



  • yeah, the training environment was a basic bullet hell “game” (really just bullets being fired at the player and in random directions) to teach the neural network basic bullet dodging skills

    • the white dot with 2 surrounding squares is the player and the red dots are bullets
    • the data input from the environment is at the top-left and the confidence levels for each key (green = pressed) are at the bottom-left
    • the scoring system is basically the total of all bullet distances
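a minimal sketch of a "total of all bullet distances" score, assuming 2D positions and straight-line (Euclidean) distance from the player to each bullet. the function name and the coordinate format are made up for illustration and may not match the project.

```python
import math

def distance_score(player, bullets):
    """Sum of Euclidean distances from the player to each bullet.

    player: (x, y) tuple; bullets: iterable of (x, y) tuples.
    """
    px, py = player
    return sum(math.hypot(bx - px, by - py) for bx, by in bullets)

# Two bullets, each 5 units away from the player.
print(distance_score((0, 0), [(3, 4), (0, 5)]))  # 10.0
```

note that a sum like this rewards being far from bullets on average and doesn't single out one bullet that is about to hit, so a high score and good survival aren't necessarily the same thing.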

    • this was one of the training sessions
    • the fitness does improve but stops improving pretty quickly
    • the increase in validation error (while training error decreased) indicates overfitting
      • it’s kinda hard to explain here but basically the neural network performs well with the data it was trained on but doesn’t perform well with data it wasn’t trained on (which it should also be good at)
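the overfitting pattern described above (validation error rising while training error keeps falling) can be checked for with something like the sketch below. the function name, the `patience` parameter and the example numbers are all hypothetical, not taken from the project's training logs.

```python
def overfitting_started(train_errors, val_errors, patience=3):
    """Return the last epoch before validation error rose for `patience`
    consecutive epochs while training error kept falling, or None."""
    streak = 0
    for epoch in range(1, min(len(train_errors), len(val_errors))):
        train_down = train_errors[epoch] < train_errors[epoch - 1]
        val_up = val_errors[epoch] > val_errors[epoch - 1]
        # Count consecutive epochs where the two curves diverge.
        streak = streak + 1 if (train_down and val_up) else 0
        if streak >= patience:
            return epoch - patience  # last epoch before the divergence began
    return None

# Made-up error curves: validation bottoms out at epoch 2, then climbs.
train = [1.0, 0.8, 0.6, 0.5, 0.4, 0.3]
val = [1.0, 0.9, 0.8, 0.85, 0.9, 0.95]
print(overfitting_started(train, val))  # 2
```

a common response is to stop training around that epoch and keep the weights from it (early stopping), though whether that would help here is a guess.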







  • this change was made by the Tory government, btw.

    here’s the main part of the article copy-pasted (with important parts in bold):

    Today, Minister for the Constitution and Devolution Chloe Smith has announced measures to apply the tried and tested system of First Past the Post to the election of council and ‘metro’ mayors across England, and to Police and Crime Commissioners across England and Wales.

    In this May’s London Mayoral elections, the Supplementary Vote system saw hundreds of thousands of void, wasted or blank votes cast, reflecting voter confusion and the complex system. Supplementary Vote also means that a ‘loser’ candidate can win on second preferences. In 1931, Winston Churchill described transferable voting as “the decision is to be determined by the most worthless votes given for the most worthless candidates.”

    First Past the Post is the world’s most widely used electoral system. The change to First Past the Post will further strengthen the accountability of elected mayors and PCCs to their electorate, making it easier for voters to express a clear choice. The person chosen to represent a local area should be the one who directly receives the most votes.

    Chloe Smith, Minister for the Constitution, said:

    Britain’s long-standing national electoral system of First Past the Post ensures clearer accountability, and allows voters to kick out the politicians who don’t deliver. First Past the Post is fair and simple – the person with the most votes wins.

    Kit Malthouse, Minister for Policing, said:

    We are strengthening the accountability and role of Police and Crime Commissioners, to help cut crime and deliver on the people’s priorities.

    Luke Hall, Minister for Local Government, said:

    Elected mayors can provide strong leadership, and must be held to account at the ballot box. The supplementary vote is an anomaly which confuses the public and is out of step with other elections in England, both local and national. Moving to First Past the Post will make it easier for voters to express a clear choice.

    edit: added link