Complex / Compound tasks

A while ago (about 6 months? Maybe a little longer?) at work, I spun up a VM to tinker with Rasa and expand my AI knowledge. As a test case for myself, I integrated it into MS Teams (using Selenium). If I remember correctly, I had started with the Python library ChatterBot to handle simple one-off requests, but if someone typed a message starting with the prefix "!bb" (bb = BrainBot), it would bypass the ChatterBot library and send the incoming message to Rasa for much more sophisticated processing.
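The routing described above could be sketched roughly like this. This is a hypothetical reconstruction, not the actual bot code, and the function names are invented for illustration:

```python
# Hypothetical sketch of the "!bb" routing: prefixed messages bypass
# ChatterBot and go to Rasa; everything else goes to ChatterBot.
# rasa_reply / chatterbot_reply stand in for the real client calls.

PREFIX = "!bb"

def route_message(text, chatterbot_reply, rasa_reply):
    """Dispatch to Rasa when the prefix is present, ChatterBot otherwise."""
    if text.startswith(PREFIX):
        # Strip the prefix before handing the message to Rasa.
        return rasa_reply(text[len(PREFIX):].strip())
    return chatterbot_reply(text)
```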

I had gotten, I feel, familiar enough with the nlu, domain, and stories YAML files. I had even started implementing custom actions. However, I want to take it a step further. What I'd like to do is have it start being able to generate code or handle much more complex things.

An example of the latter would be something along the lines of:

“!bb Give me a list of devices that were connected to on that also were an Acer Spin 714 Chromebook”

An example of the former would be something along the lines of:

“!bb Write me some Python code that can enumerate the list of users on the domain that are in the group as well as ”

For the coding example, I started thinking about the custom action in terms of atomic operations. That is, operations / tasks that are defined in a generic sense. For example, a routine that can construct and return a for loop, a routine that can return a comment, a routine that can make an assignment to a variable, etc., and then cobble all of those together. However, while this seems very methodical, I'm feeling like there's gotta be an easier way that existing AIs like ChatGPT, Copilot, and others are doing it. They can't possibly be writing atomic operations for each and every language, right? That seems… finite… discrete… static… AGI is generic by definition. So implementing what I'm looking to do with something this specific seems counter-productive.
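The atomic-operations idea above could be sketched as a handful of tiny builders that each emit one construct and then get cobbled together. This is a minimal toy sketch of that approach, with invented names, not a claim about how any existing AI works:

```python
# Each "atomic operation" returns one Python construct as a string.

def make_comment(text):
    return f"# {text}"

def make_assignment(name, value):
    # repr() so strings come out quoted in the generated code
    return f"{name} = {value!r}"

def make_for_loop(var, iterable, body_lines):
    body = "\n".join(f"    {line}" for line in body_lines)
    return f"for {var} in {iterable}:\n{body}"

def cobble(*fragments):
    # Stitch the atomic pieces into one snippet.
    return "\n".join(fragments)

snippet = cobble(
    make_comment("sum the first three integers"),
    make_assignment("total", 0),
    make_for_loop("item", "range(3)", ["total += item"]),
)
```

Running the generated `snippet` with `exec()` yields `total == 3`, which at least demonstrates the cobbling works, even if it shows how quickly the combinatorics of real languages would get out of hand.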

For the device connectivity example, I started thinking about how compilers, specifically parsers, handle similarly modeled statements in code. Trees are constructed, and parent nodes are typically the operators. In this case, it'd be the word "and". Each sub-part would be evaluated, and as the tree is walked upwards it would hit either sibling nodes and evaluate them, or go to the parent and evaluate what to do with the children's results. This seems methodical, but it also seems like the dataset would get WILDLY enormous, since so many things can be a sub-child of English conjunctions. And again, it feels like existing AIs are doing this in a more generic, better way.
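The tree-walking idea above can be sketched as a toy evaluator: "and" becomes a parent node, each child clause is evaluated to a set of matching devices, and the parent combines the children's results with set intersection. The device data and predicates here are invented for illustration; the leaf predicates stand in for whatever real NLU would extract:

```python
# Toy device inventory (invented data).
DEVICES = [
    {"name": "dev-1", "model": "Acer Spin 714", "connected": True},
    {"name": "dev-2", "model": "Acer Spin 714", "connected": False},
    {"name": "dev-3", "model": "ThinkPad X1", "connected": True},
]

def evaluate(node):
    """Walk the tree bottom-up; 'and' parent nodes intersect child results."""
    if node[0] == "and":
        left, right = evaluate(node[1]), evaluate(node[2])
        return left & right
    # Leaf node: a predicate applied over the device inventory.
    pred = node[1]
    return {d["name"] for d in DEVICES if pred(d)}

# "devices that were connected AND are an Acer Spin 714"
tree = ("and",
        ("leaf", lambda d: d["connected"]),
        ("leaf", lambda d: d["model"] == "Acer Spin 714"))

# evaluate(tree) → {"dev-1"}
```

The combinatorial worry in the paragraph above shows up here too: the tree walk itself is tiny and generic, but every kind of leaf clause needs its own predicate, which is where the dataset balloons.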

Are both of my approaches to what I'm looking to do completely wrong? Or are they right, and that's why any AI doing any kind of generalization requires datacenters for the processing? Thanks!