Deconstructing The Roguelike
In simplest form, a roguelike is a game of probability management.
Becoming proficient in any roguelike, classic or modern, is a process of understanding the game's statistics and associated risks. As players improve and execute with a higher success rate, they develop an intuitive understanding of the game in which each decision can be reduced to a few layers of probability. Arguably, a learning algorithm could be applied to these statistical calculations and play the game with a high theoretical success rate.
For the sake of relatability, I'll stick with a more modern roguelite, Faster Than Light (FTL). Though not a pure roguelike, it still preserves the spirit of the genre, but in a conventional environment. If you haven't experienced this game yet, I highly recommend you burn a few hours watching your crew die horrible deaths. FTL will teach humility at completely new levels.
We can start with a theoretical run with the best ship in the game, the Red-Tail. If this is not what you consider to be the best ship in the game, please get out of my office. The Red-Tail is balling out of control. It has an amazing room layout and is armed with the Quad-Laser. Although the ship doesn't have the highest starting base value, its low reliance on early key upgrades makes it the most potent in capable hands.
Starting out, one human will pilot, the other human will sit on weapons, the Zoltan will cook the engine room, and the Mantis will mash shield controls as best it can. Through the first sector and the first half of the second sector, the Red-Tail will be looking for banked scrap and replacement lasers. You'll plan the routes most likely to be scrap-heavy and choose the combat resolutions that yield the most cash money. The power of the Quad-Laser is time-limited, and it eventually needs to be upgraded into a Quad-Glazer. Mid-game, lasers are still your ideal weapon, as the ship only starts with 5 missiles and 0 drones. You're still stockpiling at this point for the endgame and cannot afford to waste supplies.
Early combat can, in most cases, be summarized as a list of priorities. Knocking out systems capable of causing you serious harm (missiles, bombs, megalasers, and pierce-capable beams) is most important. After reducing the incoming damage, you can procedurally stomp level 2+ shields, or engines and cockpit. Reducing incoming damage nets a lot of scrap in the long run, allowing you to pool money for laser upgrades. Killing level 2+ shields allows you to more reliably target systems as needed. And burning the engines and cockpit reduces the chances of some rebel coward peacing out and then ratting on you. Bomb, missile, and drone weapons should only be adopted in cases of necessity or extreme abundance.
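The priority list above can be sketched as a small function. This is only an illustration: the EnemyShip fields, the weapon-type strings, and the system names are hypothetical labels made up for the example, not anything read out of FTL's actual data.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical labels for the weapon types named above as serious threats.
THREAT_WEAPONS = {"missile", "bomb", "megalaser", "piercing_beam"}


@dataclass
class EnemyShip:
    # Illustrative fields only; a real ship has far more state than this.
    weapons: List[str]      # weapon types still online
    shield_level: int       # current shield level
    engines_online: bool
    cockpit_online: bool


def target_order(enemy: EnemyShip) -> List[str]:
    """Return the systems worth shooting, highest priority first."""
    order = []
    # 1. Anything capable of causing serious harm goes first.
    if any(w in THREAT_WEAPONS for w in enemy.weapons):
        order.append("weapons")
    # 2. Level 2+ shields, so later shots land where they are aimed.
    if enemy.shield_level >= 2:
        order.append("shields")
    # 3. Engines and cockpit, so nobody jumps away to rat on you.
    if enemy.engines_online:
        order.append("engines")
    if enemy.cockpit_online:
        order.append("cockpit")
    return order
```

A ship with a missile launcher and two shield layers would get weapons first, then shields, then engines and cockpit. A ship with nothing scary and a single shield layer skips straight to the escape systems.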
This setup was designed with the intention of configuring resources in a flexible manner to increase your overall success rate. Each decision, small and large, is influenced by the build. This is a learned behavior; as players uncover the mechanics of the game, they tend to converge on these strategies. It is a classic problem for any learning algorithm.
To apply a benevolent AI overlord to FTL, reinforcement learning would probably be the first step in teaching the AI how to maximize scrap and resources while restricting hull loss and resource consumption in a combat environment. This methodology would be most useful in teaching SHODAN how to handle each ship type in terms of system target priority. States would be defined on the basis of the ship's status against the enemy vessel's perceived status, including: hull, missiles, drones, scrap, systems, upgrades, and weapons. Since this is a preliminary approach to combat, navigation and upgrade choices have yet to be considered. They are also far more complex, as they require the most pre-conditioning regarding the game's rules and have some very fuzzy success measurements.
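As a rough sketch of what that first step might look like, here is a tabular Q-learning agent with a reward that pays out for scrap and charges for hull loss and spent supplies. Everything in it is an assumption made for illustration: the CombatState fields are a heavily trimmed version of the list above, the action names are invented, and the reward weights are arbitrary placeholders. Nothing here talks to the actual game.

```python
import random
from collections import defaultdict
from typing import NamedTuple


class CombatState(NamedTuple):
    # Coarse, hypothetical features; systems, upgrades, and weapons are omitted.
    own_hull: int
    enemy_hull: int
    missiles: int
    drones: int
    scrap: int


# Invented action names: which enemy system to fire on this step.
ACTIONS = ("target_weapons", "target_shields", "target_engines", "target_cockpit")


def reward(prev: CombatState, cur: CombatState) -> float:
    """Pay for scrap gained; charge for hull lost and supplies spent.

    The weights are placeholders, not tuned values.
    """
    return (
        1.0 * (cur.scrap - prev.scrap)
        - 2.0 * (prev.own_hull - cur.own_hull)
        - 1.0 * (prev.missiles - cur.missiles)
        - 1.0 * (prev.drones - cur.drones)
    )


class QAgent:
    """Tabular Q-learning with epsilon-greedy action selection."""

    def __init__(self, alpha: float = 0.1, gamma: float = 0.95, epsilon: float = 0.1):
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.alpha = alpha           # learning rate
        self.gamma = gamma           # discount factor
        self.epsilon = epsilon       # exploration rate

    def choose(self, state: CombatState) -> str:
        # Explore with probability epsilon, otherwise take the best known action.
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state: CombatState, action: str, r: float, next_state: CombatState) -> None:
        # Standard Q-learning step: move the estimate toward r + gamma * max Q(next).
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_target = r + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])
```

A table like this would fall over quickly once systems, upgrades, and weapons are added to the state, so in practice some form of function approximation would likely be needed. The point is only to show how the scrap-versus-hull trade-off turns into a number the agent can chase.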
Eventually FTL HAL 9000 would awaken and start spacing obsolescent crew members.