How Excel AI Agents Actually Work: Helping Financial Modelers Understand LLMs & Tools, with Tim Jacks
In this episode of The ModSquad, hosts Paul Barnhurst, Ian Schnoor, and Giles Male welcome Tim Jacks, founder of Taglo, for an insightful discussion on the integration of AI in financial modeling. Tim’s expertise bridges the worlds of financial modeling and AI, and in this episode, he shares his journey and discusses how AI is reshaping the financial modeling landscape.
Tim Jacks is the founder of Taglo, a company dedicated to improving financial modeling with AI technology. His career journey spans financial consulting and software development, including building financial modeling tools. Over time, Tim's interest in artificial intelligence grew, and he delved into how AI, particularly Large Language Models (LLMs), could be used to enhance financial modeling processes.
Expect to Learn
How AI is revolutionizing financial modeling and the specific ways it’s being used today.
The technical components behind AI agents and how they differ from simple chatbots.
The importance of context and system prompts when working with LLMs in financial tasks.
Insights into the memory limitations of LLMs and how agents work around this challenge.
Here are a few quotes from the episode:
"If you're using AI for Excel modeling, you need to remind it to follow good financial modeling principles, like the FAST Standard." – Tim Jacks
"The beauty of LLMs is that you can go back and change the conversation, they're stateless, so it's like resetting the clock." – Tim Jacks
Tim Jacks provided valuable insights into the integration of AI in financial modeling, particularly how LLMs and agents are transforming workflows. While AI can significantly enhance efficiency, human expertise remains essential for applying financial modeling principles. Understanding the technical workings of these tools helps users leverage them effectively. The future of financial modeling will be human-led, AI-assisted.
Follow Tim:
LinkedIn: https://www.linkedin.com/in/timjacks/
Follow Ian:
LinkedIn - https://www.linkedin.com/in/ianschnoor/
Follow Giles Male:
LinkedIn - https://www.linkedin.com/in/giles-male-30643b15/
In today’s episode:
[00:05] - Intro & Hosts
[01:33] - Guest Introduction: Tim Jacks
[02:42] - Tim's Background in Modelling & AI
[04:16] - What Are LLMs Really?
[09:55] - ChatGPT vs. LLMs Explained
[12:09] - LLMs Have No Memory
[15:02] - How Tools Add Context to AI
[19:35] - What Is an AI Agent?
[22:35] - How Excel Agents Work
[30:08] - Demo: Tools in Action
[35:03] - Defining an Agent: LLM + Tools + Prompts
[38:49] - Key Takeaway for Modellers
Full Show Transcript
Host: Paul Barnhurst (01:13):
Welcome to another episode of the Mod Squad on the Financial Modeler's Corner. We're super excited for this episode to have our first guest with us. We're going to do an interview with the guest versus just testing a tool. You'll get these from time to time. So let me just start by welcoming Tim Jacks. Tim, welcome to the show.
Guest: Tim Jacks (01:33):
Hello. Hello everyone.
Host: Paul Barnhurst (01:35):
We're really excited to have you. And obviously I have here with me one of my two co-hosts; the other one should join us here shortly. We'll mock him when he joins. I mean, we'll welcome him to the show. Giles, welcome.
Co-Host 1: Giles Male (01:46):
Pleasure to be back and pleased to see that my punctuality record is perfect and we will mention that when he is here. I'm joking. He's obviously got something that's just delayed him, but yeah, very, very happy
Host: Paul Barnhurst (01:57):
To have him as well. So Giles can kind of rib me as well on that, just so people know.
Co-Host 1: Giles Male (02:01):
Yeah, both of you. But yeah, excited to have you on, Tim. Obviously I can talk a little bit about how we've got to this, but yeah, good to have you. Yeah,
Host: Paul Barnhurst (02:07):
So why don't you start by maybe just introducing Tim to our audience a little bit and then Tim can add anything he wants and we'll go from there.
Co-Host 1: Giles Male (02:14):
Sure. So Tim, I think I met you at the Global Excel Summit last year now, or the first—
Guest: Tim Jacks (02:20):
Yeah.
Co-Host 1: Giles Male (02:20):
Last February
Guest: Tim Jacks (02:21):
Nearly a—
Co-Host 1: Giles Male (02:21):
Year, right? Yeah. So I remember you were presenting. You are the founder of Taglo and essentially, well, I'll let you talk about the details, but you built a modelling tool. And then I saw you recently at an FMI networking event in London, and we were chatting a lot about AI. And I just saw that the three of us, myself, Paul and Ian, have been saying through this whole series that we're at this level where we want to explore, but we don't consider ourselves technical in the AI space at all. I'm not putting the pressure on you to now be the world-leading expert in AI, but I just know that you have been testing these things and starting to kind of lift up that hood in a way that we haven't. So yeah, I kind of reached out, and I'm very, very happy you said yes to joining us.
Guest: Tim Jacks (03:07):
My background was as a consultant, where I did lots of financial modelling. I then left to start a software company, like you mentioned, Taglo, where I built some software around financial modelling, but it turns out no one wanted to use that software in the state it was in. I went on an interesting journey in becoming much more of a software developer as well as a financial modeller. And I was ready to go full steam ahead into growing a lovely financial modelling consultancy when AI came along. And I thought, hold on a second, this is interesting. And then I sort of went down a rabbit hole of spending a year learning everything I could about AI, and then specialising in the overlapping bit of the Venn diagram where financial modelling meets AI, which is I guess where we are now.
Host: Paul Barnhurst (03:56):
Perfect. Thanks for that background. Appreciate it, and we really appreciate you carving out some time. So maybe we'll start with just talking a little bit about what you've seen from LLMs and modelling. It feels like when LLMs first came out, we all heard they were bad at math. Nobody used them for that; remember the errors. They've got better and better, and now we've seen Excel agents launched and we're seeing a lot of hype around what they could do. So I'd love to just get a little bit of your thoughts on how that journey's progressed. Any thoughts you have there? I know you've been kind of checking out AI for a while.
Guest: Tim Jacks (04:29):
I guess by now lots of people are quite familiar with LLMs, but there are still, I think for a lot of people who use generative AI, lots of different terms that get used quite interchangeably, and people aren't necessarily sure what they're talking about or what LLMs really are. So I think it's always good to think about them as a subset of what used to be called AI. So AI used to be this broad thing with lots of different ways of doing artificial intelligence. It's now sort of become synonymous with LLMs, but LLMs are really only a part of that, and they belong in the space within AI that used to be called, and still is called for people who work in it, machine learning. And the reason it was different from previous approaches is that it's done in a way where you are teaching a model to learn patterns about the thing that you want it to be good at.
(05:21):
And the key thing, what made the real breakthrough with LLMs, is basically the first part of LLM, which is large. So they're large language models. Well, what happens if we give it all the data in the world? So it's like, well, okay, here is all the language in the world, and this is how it started: what happens if we give all of that to the model and then train it on that? And they basically discovered it was pretty revolutionary. So that's the beginning of LLMs. And one of the important things to note is that when a lot of people think about LLMs, well, they've just learned the internet, so they can spew out sort of word-by-word pieces of literature or poems, et cetera, and people sort of think of it as having memorised the internet, effectively. But memorisation doesn't work for that. You basically have to have generalisation instead of memorisation. So it's all about learning patterns instead of memorising everything. And so it sometimes looks a little bit, when LLMs are spewing stuff out, as if they've just memorised a load of stuff, but actually they're just very, very, very good at recognising patterns in speech and words, and that's become very powerful. So that's LLMs, and mostly these have got more and more powerful as they go, and so their ability to generate useful responses to their input has become increasingly
Co-Host 1: Giles (06:43):
Powerful. Can I ask you a question on that point? Sorry to derail you a bit. When you say they're more powerful, and this is a thing that I just don't have that technical knowledge on, is that because the coding under the hood is better, or is it genuinely that there is more power? So there's more compute power to do more analysis and pattern recognition than there was before.
Guest: Tim Jacks (07:07):
So there are a few ways in which they've become more powerful. One way is just making the models bigger. They started out, they made them large. I exaggerated initially when I said they trained them on all the data in the world. Of course they didn't, because it wasn't feasible; they trained them on a lot of data, more than had ever been done before. But there was still more data out there to train on. If you just make the model a bit bigger, have a model with 3 billion parameters instead of 1 billion parameters, and you don't really need to worry about what that 1 billion or 3 billion means, just that 3 billion is bigger than 1 billion, it's better at pattern recognition now because it's just bigger. So that's one way of doing it. There's sort of a limit to that process, though, because you'll run out of data eventually.
(07:49):
So then the question is, can you artificially create more data? Which isn't necessarily going to get you improvements, because how do you artificially create that data? And that all becomes a bit dodgy. But there are other things you can do as well. So once you've trained your model and created your model, you can basically tune the model a bit. You are changing its inbuilt, and I'm trying to avoid using technical words, but you change the inbuilt weights in the model such that it gets better at a particular task you want it to be good at. But that's basically it. And of course there are improvements in the background algorithms as well, and they introduce things like reasoning, which is how you now see models doing their thinking, right? And when they say thinking, it's not really thinking. It's sort of important to know that they're not really thinking; these models aren't sitting there thinking in the background. When they say they're thinking, all they're doing is outputting some response, they're processing, basically, which is enclosed in some tags which say, I'm thinking now. So it just outputs some response, which is its thoughts, and then it's like, okay, I've done thinking, I'll output some more response, which is my response.
Co-Host 1: Giles (08:56):
Is that where, because you were saying before we started recording it, I've heard other people describe it this way. Is that where it might come up with 10 different solutions in the background and then pick what seems the most accurate or appropriate? Is that what you were referring to before?
Guest: Tim Jacks (09:12):
No.
Co-Host 1: Giles (09:12):
Okay, cool. I'm here to ask all the stupid questions
Guest: Tim Jacks (09:16):
And I'm not sure what that would be referring to. Maybe we'll get there later. But let's talk about chatbots, which are quite different from LLMs. And that's again where those two terms often get conflated. Or rather, if I were to ask most people, what is ChatGPT? Maybe I should ask you, Giles: what is ChatGPT?
Co-Host 1: Giles (09:34):
I mean, my knowledge is that it is an LLM essentially, isn't it? Or it's an interface?
Guest: Tim Jacks (09:39):
No, yeah, it isn't an LLM basically.
Co-Host 1: Giles (09:42):
Okay. It leans on an LLM; it's an interface?
Guest: Tim Jacks (09:44):
It isn't even an interface. Well, it sort of is an interface to an LLM, but not really. So these companies like OpenAI, they built these LLMs, and when they built these things, all they were doing is, so you give it some text and you say, what comes next? And the LLM outputs whatever it thinks comes next, and that's what it does. And then it was basically like, well, this is amazing. It's really good at saying what comes next. What can we do with this? What's a useful application? So an obvious application is, well, you can have a conversation with it. So you give it some input, like, here is a message, give me a response. And that's what it does. But in order to have a conversation, you need to be able to basically build a loop around that. So you need to be able to send, here is our conversation, to the LLM: what's the next part of the conversation?
(10:33):
And when you have something like ChatGPT, all it's doing is running a loop like that, and it's quite different from how people assume it's working, because essentially it's like talking to someone who's got no memory. If you remember Guy Pearce in Memento, it'd be like having a conversation with him where he can only, well, it's more extreme: he never remembers anything you've said, basically. So what happens in this conversation? You've got ChatGPT, and you've got the LLM, the LLM being GPT-5. The LLM never remembers anything you tell it. It's just this model. What happens with the conversation is we have to send it a bunch of stuff. So when you're running those chat interfaces, what you do is you send it a system prompt. That's the thing where you say, you are ChatGPT, you are a helpful assistant, don't swear, be nice, and a load of other stuff. There'll be a huge prompt that goes in, and then it says, here is a conversation that you've been having with a user. The user said this, you said this, the user said this, you said this, the user said this, you said this, and all of that. So system prompt, conversation, it all goes into the LLM, and the LLM says, this is what comes next.
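The loop Tim describes can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's actual API: `call_llm` is a stand-in for a real model call, and it simply reports how much context it was handed, to show that the system prompt plus the whole history goes in on every single turn.

```python
# Minimal sketch of the loop a chat app runs around a stateless LLM.
SYSTEM_PROMPT = "You are a helpful assistant. Be nice."

def call_llm(messages):
    # Placeholder for a real API call. A real LLM sees only `messages`
    # and retains nothing between calls.
    return f"(reply after seeing {len(messages)} messages)"

def chat_turn(history, user_message):
    history.append({"role": "user", "content": user_message})
    # The system prompt plus the ENTIRE conversation is sent every time.
    full_context = [{"role": "system", "content": SYSTEM_PROMPT}] + history
    reply = call_llm(full_context)
    history.append({"role": "assistant", "content": reply})
    return reply

history = []
print(chat_turn(history, "Hello"))        # model is sent 2 messages
print(chat_turn(history, "What's next?")) # model is sent 4 messages
```

Each new turn makes the payload bigger, which is also where the context-window warnings Giles asks about next come from.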
Co-Host 1: Giles (11:49):
Is that where we get these context warnings? Where essentially every time you add another layer to your prompting, it's got to go back and go through: you said this, we said this, this, this.
Guest: Tim Jacks (12:00):
Yeah, that's right. So every time you add a new thing to the conversation, you send the system prompt and the entire conversation, plus your new message, back to the LLM, and it outputs everything again. But the LLM, practically speaking, has no memory of your previous conversation. It's like your friend with no memory: as if you're having a conversation with them, but every time you sent 'em a text, you had to remind them of everything you'd already said before that as well. Hello, Ian!
Host: Paul Barnhurst (12:36):
A great point to bring Ian in on.
Co-Host 2: Ian Schnoor (12:39):
Yeah, how are you? I apologise: heavy, heavy, heavy traffic prevented my on-time arrival, but it is great to be here and to be listening to your great insights. So thank you. So will you repeat the entire... yeah, because there's no memory. I haven't been here, so there's no memory at all.
Host: Paul Barnhurst (13:00):
That perfectly covers the example we just gave. He has no memory of it, so we need to give him all the context of the prior conversation so he can output his answer.
Guest: Tim Jacks (13:08):
Exactly.
Co-Host 2: Ian Schnoor (13:11):
But I'm surprised by that. Why is there no memory?
Guest: Tim Jacks (13:12):
So the memory that exists is in the thing that you build around the LLM. Okay, so think of the LLM as, and this is a terrible analogy I thought of today, like a pasta machine. Have you ever made pasta yourself?
Host: Paul Barnhurst (13:26):
I have a pasta maker and I have never used it.
Guest: Tim Jacks (13:28):
Okay, so you get your ball of pasta, whatever, and you feed it into the pasta maker, you wind the thing round, right? And it all comes out the other end. But basically, once it's out the other end, your pasta maker just sits there. It's not doing anything. It's basically off, to all intents and purposes. So it is basically just a function you are calling, in computer terms. It is like an Excel function: here's the input, give me the output. Okay, so the LLM is just sitting there; we'd call it stateless in programming. You put something in, you get something out, and then that's it. So it's up to the people who build software around LLMs to save that conversation. But the LLM itself, the model, the model that's been trained on all this data, isn't learning anything. Basically, once you've set those weights, then the model is the model. That's just what it is. And so any time you see an application have memory or this kind of thing, that's all built around it. That memory is basically someone storing parts of your previous conversations or your existing conversations somewhere else, somewhere in their own systems.
Host: Paul Barnhurst (14:37):
So they're feeding that to the model as it needs it for that next answer.
Co-Host 1: Giles (14:42):
Again, I don't want to rush ahead too far, but is that what these third party tools are doing technically in some way?
Guest: Tim Jacks (14:51):
Yes. I mean, everyone is doing that. So in a way it's very simple to think about, actually. Your whole job as someone who's building a product around an LLM is to build up this context. So when people are talking about context, that's what they're talking about. It's: what do I feed in to the LLM to get the most useful response from the LLM?
Guest: Tim Jacks (15:16):
Well, I need to feed in the system prompt, which, like we said, gives guidance: who do I want you to be today, LLM? I want you to be an Excel agent today. Today you need to know all about Excel. Here's some information about the application you're working in, which gives you a bit more sense of the kind of stuff we should be talking about. If the user asks you to do this, tell 'em you can't do it, because you are only designed to be an Excel agent. That kind of thing. So you have this big system prompt, and then it's, well, what else can I give it that's useful for its response? You might say, well, let's add on a section about things we know about the user. By the way, this user is Giles, he's made 20 requests in the last day, he appears to not be that intelligent, that kind of thing. Helpful stuff for the LLM. You could do that. Most people probably wouldn't do that kind of thing, but you can add whatever you want to it. So you might add memories. ChatGPT has memories, lots of software has memories, and those memories are just appended. So it's: here's your system prompt, here are some memories, here's some useful stuff to know about previous conversations, and here is the current conversation.
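The layering Tim describes, system prompt, then memories, then the conversation, then the new message, can be sketched as a simple context builder. The field names and format here are purely illustrative; real products each have their own wire format.

```python
# Sketch of how a product assembles the context on every request.
def build_context(system_prompt, memories, conversation, new_message):
    parts = [f"SYSTEM: {system_prompt}"]
    for m in memories:                      # appended memories
        parts.append(f"MEMORY: {m}")
    for speaker, text in conversation:      # the conversation so far
        parts.append(f"{speaker.upper()}: {text}")
    parts.append(f"USER: {new_message}")    # the newest message last
    return "\n".join(parts)

context = build_context(
    system_prompt="You are an Excel agent. Decline non-Excel requests.",
    memories=["User prefers formulas over hard-coded values."],
    conversation=[("user", "Open my model."), ("assistant", "Done.")],
    new_message="Add a totals row.",
)
print(context)
```

Everything the model "knows" in a given reply is exactly what this string contains, nothing more.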
Co-Host 1: Giles (16:27):
And just to layer on again on that, this is the crux of what I just don't quite understand. When we look at a Shortcut or a Tab AI, what they I guess are trying to feed in, if they were targeting financial modelling, is they will be finding more data on financial models, more information to feed in to add to that context. Is that wrong?
Guest: Tim Jacks (16:49):
No, they probably aren't. On the assumption that they're just using the commercial LLMs that are out there, like the ones from OpenAI, from Anthropic, from Google, then they kind of have to use the LLM as it comes. You can sort of do a weird thing called fine-tuning a model, where you can get your own version of a model and pay maybe hundreds of thousands of pounds to train it on a very niche thing that you want to train it on, but it is very expensive. And then someone will bring out a new model which is better than the model you were using anyway.
Co-Host 2: Ian Schnoor (17:22):
Why would it be that
Guest: Tim Jacks (17:22):
Expensive? Why would it be that expensive? It's a heavy computational thing that someone
Host: Paul Barnhurst (17:27):
Has to run. So it's the servers, the cost to process it, is that the big cost? When you say power, right, it's computing; the hardware is very expensive.
Guest: Tim Jacks (17:37):
And so no, they're not feeding in lots of different models. They will have their own prompt, where they might give some specific guidance. So they might say, well, when you are building a financial model in Excel, you need to pay attention to these particular things. So for instance, all of these models will already know, in inverted commas, about, say, the FAST Standard. You could go onto any AI chatbot today and say, tell me about the FAST Standard of financial modelling, and they'll be able to tell you about it. So in some sense, all the models people are using already know about these things, but if you want them to actually pay attention to that when they're working in your Excel, you kind of need to remind them. So you might have in your system prompt: by the way, it's really important for you to pay attention to good financial modelling principles like you see in the FAST Standard, or whatever other standards
(18:34):
you want to talk about, standards that you are interested in. You don't necessarily have to, then, because it sort of already knows about that stuff in its own way. You don't necessarily have to put all of the FAST Standard into your prompt, but you do need to steer it in the right direction. I mean, maybe now is a good time, then. We've talked about what a chatbot is and how it's not the same as an LLM but software you build around an LLM. It's probably good now to talk about what an agent is. All you're doing when you're creating an agent is saying, well, now I want my chatbot to be able to do stuff. I want it to be able to take action. So in our case, we want it to be able to manipulate Excel. The question is, how do you enable an LLM to do stuff?
(19:20):
And the answer basically is, well, what can LLMs do? LLMs can output text, we know that, and that's basically all they can do, pretty much: they output text. So you say, well, all I need to do is get it to tell me what it wants to do, and then it's my job as a programmer to turn that request into the thing that it wants to do. You could almost do it, like, imagine, say I wanted to enable my chatbot, my agent, to be able to open my front door. This is a very silly example, and I'm sorry that I thought of it. I could say, well, you have the ability to open my front door. All you need to do is output the text: Tim, go and open your front door, and it will happen. And then I'll message it and I'll say, oh, can you open my front door please? And the LLM will say, Tim, go and open your front door. And I'll say, okay, I'll go and open my front door. I'll come back, I'll enter: the front door is open. And it'll say, hey Tim, I successfully opened your door. The LLM doesn't know how that door was opened. It just knows that it requested the door to be opened, and then someone came back and told it, yes, the door was successfully opened.
Host: Paul Barnhurst (20:29):
So somebody wrote the code in the backend. So the instructions are there, but the LLM isn't actually doing it.
Guest: Tim Jacks (20:35):
Right. So in that case, I was the software that opened the door. All you do is you give the LLM a structured way of doing these things, called tool calls. So, calling a tool.
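Tim's front-door example, made structured, is essentially a tool-calling loop: the model only ever outputs text, the program reads that text, runs the matching tool, and appends the result to the context. In this sketch `fake_llm` is a scripted stand-in for a real model, and the JSON tool-call format is illustrative, not any vendor's actual schema.

```python
import json

def open_front_door():
    # Here the program, not the LLM, actually "opens the door".
    return "door opened"

TOOLS = {"open_front_door": open_front_door}

def fake_llm(context):
    # A real model would decide this; here the behaviour is scripted.
    if "TOOL_RESULT" in context:
        return "I successfully opened your door."
    return json.dumps({"tool": "open_front_door", "args": {}})

def run_agent(user_message):
    context = f"USER: {user_message}"
    while True:
        output = fake_llm(context)
        try:
            call = json.loads(output)          # model asked for a tool?
        except json.JSONDecodeError:
            return output                      # plain text: final answer
        result = TOOLS[call["tool"]](**call["args"])
        context += f"\nTOOL_RESULT: {result}"  # tell the model what happened

print(run_agent("Can you open my front door please?"))
# -> "I successfully opened your door."
```

The model never sees the door; it only sees the text `TOOL_RESULT: door opened` appended to its context.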
Co-Host 1: Giles (20:46):
Can I just use the example of where we've seen things like Claude and ChatGPT get to the stage where they can actually return Excel workbooks or PDFs? Is that an advancement in the tools in the background alongside the LLM? Whereas a year ago they would've said, we can't do it, but I can give you the VBA code and you can enter it into Excel and hit run and it'll do it. Now I'm assuming they have advanced to the point where there is an "I can build an Excel model" tool. So you are prompting the LLM, build me this Excel model for X, Y, Z, and the LLM goes, I'm going to call on my Excel model tool. It goes away, does it, comes back, goes, it's done, and it feeds it back. Is that right?
Guest: Tim Jacks (21:30):
That's basically right. So one way to do it is they allow the LLM to call a Python code tool, which sort of operates on its own computer, if you like. The agent almost has its own computer which it can run code on. So if it can run Python code that can generate PowerPoint, or can generate Excel, or can generate something, it can just write that code and basically go and do it on its own little computer. And the Excel file it creates, it doesn't ever need to really touch that or see it in any way. It just needs to be able to create it and tell you it's there, and then you access it through the software that's been built around it. Yeah.
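The "code tool" idea can be sketched like this: the agent writes a program, a sandbox runs it, and the agent is only told what files were produced; it never sees the file itself. CSV is used here purely to keep the sketch dependency-free (a real tool would typically write .xlsx via a spreadsheet library), and the bare `exec` stands in for a proper sandbox.

```python
import csv, os, tempfile

def run_generated_code(code, workdir):
    # In real systems this runs in an isolated sandbox; bare exec()
    # is only acceptable in a toy sketch like this one.
    exec(code, {"csv": csv, "os": os, "WORKDIR": workdir})
    return sorted(os.listdir(workdir))  # report back what was produced

# The "code the LLM wrote": build a tiny file in its working directory.
generated = """
with open(os.path.join(WORKDIR, "model.csv"), "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Year", "Revenue"])
    writer.writerow([2025, 100])
"""

with tempfile.TemporaryDirectory() as d:
    files = run_generated_code(generated, d)
print(files)  # the agent is simply told: "model.csv was created"
```

The listing that comes back is the only thing added to the model's context, which is exactly Tim's point: the LLM requests, the surrounding software does.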
Host: Paul Barnhurst (22:15):
If I could ask a question here. I think I'm getting this, but let's take a couple of tools out there. We tested Trace Light. Trace Light would only do a graph via Python. It had built its tool to call a Python script and come back; the LLM said, hey, run this, and it would do that. Where I've seen others, they'll go into the Excel object model, whether it's running Office JS or different things. So that's some of the choices the different tools make: the call to process it that they're going to feed back to the LLM. Is that a right way to think about this?
Guest: Tim Jacks (22:44):
Yeah, that's exactly right. So, very good segue. We've talked about agents in general; Excel agents are really exactly the same thing. Every agent is pretty much the same in how it's built. The difference is: what tools do you make available to it? That's the only thing. And you can be as sophisticated or as simple as you want when you're building those tools. So some of them, Microsoft's own agent, has a very simple approach, which says: we've already come up with a way of automating Excel, which is these Office JS scripts. You can create a script and run it, and it will do so. Well, we already have this way, like the Excel object model you talked about. We already have this way of automating Excel.
(23:29):
So all we need to do is tell the LLM: if you want to manipulate something in Excel, output a script, and we'll run that script for you, and we'll tell you whether it succeeded or not and give you the result that you wanted. If your script made a load of changes and then read the whole sheet, we'll give you the thing that it read from the sheet, and that will get added to your context and come back to you, and you'll see everything that's been done. So that's a simple way.
Host: Paul Barnhurst (23:57):
Can I ask two questions real quick? The first is: you mentioned it's the tools that get called, and I'm assuming there's also a step before the LLM. Many of these tools will pick which is the best model, right? Some tell you to pick, but some of these tools might have ten different models. So they've connected to Claude and ChatGPT and Gemini and Meta and all that. I'm assuming they're writing code so that when that prompt comes in, it says, send that question to this LLM and let it decide what tool to use. Am I thinking about that right? Because a lot of 'em may say, hey, we've made the decision, we're going to select which model is best at processing a task for you, versus you selecting the model. But they also gave you the ability to select as well, for the most part. But I think Cab AI was one that didn't, or it may have been El Car as well; there are one or two that didn't, and they just select for you. I'm assuming that's just programming they've written up at the front, and they're taking that question and then feeding it to the LLM. Is that how it works?
Guest: Tim Jacks (24:54):
Yeah, that's at the front. So, back to the analogy, that's just like: which pasta maker do you want to use today? And that's your choice as the user, and it comes down to personal preference. They'll cost different amounts. But remember, these LLMs have no memory, so you could get one LLM to do 80% of the task that you're doing and then decide, well, actually, I'm going to switch LLMs now. I was using Gemini; now I'm going to use Opus. And it doesn't matter, because the models have no memory. So you send the entire thing that you've been doing with Gemini to Opus, and to Opus it just looks like, well, this is the entire conversation. It doesn't know it wasn't part of the earlier conversation.
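Because the models are stateless, switching mid-conversation really is just sending the same history to a different function. The two functions below are toy stand-ins for two providers' models, not real APIs.

```python
def gemini(messages):
    # Stand-in for one provider's model.
    return f"gemini reply (saw {len(messages)} messages)"

def opus(messages):
    # Stand-in for a different provider's model.
    return f"opus reply (saw {len(messages)} messages)"

history = ["USER: Start the model.", "ASSISTANT: Started."]

# First part of the task with one model...
history.append("USER: Add the P&L.")
history.append("ASSISTANT: " + gemini(history))

# ...then hand the SAME history to a different model. To Opus it just
# looks like a conversation it has (apparently) been having all along.
history.append("USER: Now add the balance sheet.")
print(opus(history))  # sees the full 5-message history
```

Neither function keeps any state between calls, so neither can tell whether it wrote the earlier assistant turns.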
Host: Paul Barnhurst (25:47):
So in theory they could switch to a different LLM mid conversation and
Guest: Tim Jacks (25:52):
That's basically it. So really it's all about: what tools do you give the model? What tools are helpful? I like to think about it in terms of abstraction. If you have a financial model and you want the LLM to understand that model, there's sort of a question about, well, if you just give it raw tools to read, the only way it can read that sheet is basically by seeing the value or the formula that's in every single cell on the sheet, including all the empty cells. And then it has to go and look at every sheet. There's no appreciation of the fact that maybe columns E to AA are identical, that every formula in them is the same. It's being given a lot of context, most of which isn't very useful to it. Whereas if we are looking at it as a human, we are reading it; we're looking at the relationships between the different lines.
(26:42):
So you might say, well, I'm going to build my tool to be a specialist in financial modelling, and that means I'm going to give it a special tool, which is like: don't ever try and read the raw Excel. I'd prefer it if you called this tool called Analyse Model. And then you might run your own code, which analyses the model and gives a verbal description back and says, well, it looks like this is a model with a P&L and a balance sheet, and it's over 10 years, quarterly. And that's useful information you can give the LLM in sort of a more friendly way than it just reading the raw data that's on the sheet. So that's the kind of design decision these guys have to make when they're building these agents.
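The Analyse Model idea can be sketched as a tool that summarises a sheet in words instead of dumping every raw cell into the context. The dict-based sheet format below is entirely made up for illustration; a real tool would read actual workbook objects.

```python
def analyse_model(sheet):
    # Summarise structure instead of listing every cell.
    rows = len(sheet["data"])
    cols = max(len(r) for r in sheet["data"])
    # Spot rows where one formula is copied across every period,
    # like columns E to AA all being identical.
    repeated = sum(1 for row in sheet["formulas"]
                   if len(set(row)) == 1 and row[0])
    return (f"Sheet '{sheet['name']}': {rows} line items over {cols} periods; "
            f"{repeated} rows use one formula copied across all periods.")

sheet = {
    "name": "P&L",
    "data": [[100, 110, 121], [50, 55, 61], [50, 55, 60]],
    "formulas": [["=B1*1.1"] * 3, ["=B2*1.1"] * 3, ["", "", ""]],
}
print(analyse_model(sheet))
```

One short sentence of description replaces hundreds of raw cell values in the context, which is exactly the abstraction trade-off Tim describes.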
Host: Paul Barnhurst (27:27):
So if I can give an example to walk through this, and I'm sure Giles and Ian have some questions and then I'll be quiet. What I'm thinking about: so I've got my Excel sheet and I ask a question, I'm using an agent, and there's going to be some kind of parsing of that data, so to speak, that they may use before they even send it to the LLM. Many of 'em do, because the LLM by itself naturally trying to read a big, huge spreadsheet is a mess; I know these have been difficult. So I've heard many of 'em have built their own markup language or different ways to pass the spreadsheet data to the LLM. Then the LLM is going to read what it's been passed. It's going to look at that and decide what tool to send the request to. The tool will say, I've processed that. Then the LLM will get that back, so it has that whole conversation, and respond to you. Is that kind of a typical way to think about this?
Guest: Tim Jacks (28:14):
Yeah, probably. I mean, you don't know. So remember, again, we've talked about this build-up of the context: system prompts, tools, conversation. We don't know what else they've snuck in there. Every time they send the thing to the LLM, they might include some useful information about the spreadsheet. That might be one way of making things better, but we don't really know if they've done that. So I've noticed recently that the Claude agent for Excel has added, and you can see that this has happened, that every time you make changes to the spreadsheet and then send another message, it knows about the changes you've made. And all that means is that in the background, their add-in has been taking a note of what changes you make, and then, when it sends a message to the LLM, it adds something at the end. So it adds a bit of conversation that you don't know about, that you can't see, which says: by the way, since your last message, the user made changes to these cells.
(29:12):
And so yeah, there are all sorts of creative ways that you can analyse the spreadsheet in the background. You could search for errors. If you are using an Excel audit tool, you could have that running as well, constantly auditing, and always append to the end of your conversation: by the way, we found these errors. The number of things you can do there is limitless, but at the end of the day it's always the same: it's just adding more information to that context that you send to the LLM, which helps it do something more useful afterwards.
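Tim's mechanism, the add-in silently appending hidden context (cell edits since the last message, audit findings) before each request, can be sketched like this. The message shape loosely mimics common chat-completion APIs; the function and its parameters are hypothetical.

```python
# Hypothetical sketch: before sending the user's message to the LLM, the
# add-in appends extras the user never sees - changed cells, audit errors.

def build_messages(history, user_message, changed_cells=None, audit_errors=None):
    """Return the message list actually sent to the LLM, with hidden extras."""
    extras = []
    if changed_cells:
        extras.append(
            "By the way, since your last message the user changed these cells: "
            + ", ".join(changed_cells)
        )
    if audit_errors:
        extras.append("Our audit tool found these errors: " + "; ".join(audit_errors))
    content = user_message + ("\n\n" + "\n".join(extras) if extras else "")
    return history + [{"role": "user", "content": content}]

msgs = build_messages(
    history=[{"role": "user", "content": "Build a revenue forecast."}],
    user_message="Now add a cost line.",
    changed_cells=["B4", "B5"],
    audit_errors=["#REF! in C10"],
)
print(msgs[-1]["content"])
```

From the LLM's side this is indistinguishable from the user having typed the extra sentences themselves, which is exactly why, as Tim says, it's "just giving more information to that context."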
Co-Host 1: Giles (29:47):
I have loads of questions that will probably be stupid. So again, just to repeat this journey: you've got an agent, and you are now prompting essentially through that agent, but it may add stuff to the language of the prompt at the end that gets fed through to the LLM. And in our Excel agent world, what we're saying is there is then a tool that the LLM will call upon to build the model, in a way that the LLM doesn't really need to know about.
Guest: Tim Jacks (30:20):
I wonder if I can share my screen briefly and show you what some of these tools look like; it might help visualise it in your minds. So this was just a little test I did with Claude for Excel, a really simple test to illustrate some of the reasons why LLMs might want to use different tools in different situations. I've literally said: here, enter the numbers one to a hundred in column A, trying to get this very simple result in column A. And I actually told it, because I know what tools it has access to, to use the Office JS tool, which, like we talked about earlier, is just running a script. And if we click on that, we can basically see what it's done, which is that it has written a script, which is what you'd expect.
(31:06):
You can basically see it's run a loop here where it takes the values between one and a hundred, puts those into an array, and then puts those values into the sheet. So that's quite an efficient way of putting the values one to a hundred in a column; it's quite good to run a script in that situation. Then in the second sheet I asked it to do the same thing, but use different tools: so don't use the script tool. We're not sure what tools it's going to use, but we see how it does it, and this one
Host: Paul Barnhurst (31:46):
Different language in this case.
Guest: Tim Jacks (31:48):
So this one is not writing a script, it's writing these structured outputs that we talked about. This is what tool calls really look like. It's basically saying: well, in range A1, I want a value of one, and I want to copy that value to A2 to A100. And it actually didn't work. You can't see this here, but it thought it was going to do a fill-down type thing: take the one, then fill everything down to a hundred. And what actually happened, as you'd probably understand as an Excel user, is that if you just put one in A1 and then copy that down to A100, you get a hundred ones, one in every cell. And that's what happened. A very simple example, but because it didn't really understand how Excel works, it made some assumptions about how this tool would work and it got it wrong. And you'll notice when it does this one again, it says: okay, well, let's just put formulas in all of those cells. Now it did it correctly, these formulas are all correct, but that's a really, really inefficient way of doing that piece of work. Right.
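The three approaches Tim walks through can be sketched side by side. The tool names and call shapes below are invented for illustration; the point is the trade-off: a script fills the column in one pass, the copy-based attempt misunderstands Excel and produces a hundred ones, and the cell-by-cell structured calls are correct but wasteful.

```python
# Hypothetical sketches of the three routes to "numbers 1 to 100 in column A".

def via_script():
    """Script tool: build the whole array in one loop, one write to the sheet."""
    values = list(range(1, 101))
    return {"tool": "execute_script", "writes": 1, "values": values}

def via_copy_fill():
    """The mistaken attempt: put 1 in A1, then COPY it down to A100.
    Copying a constant just repeats it, so every cell ends up as 1."""
    return [1] * 100

def via_cell_writes():
    """Structured tool calls: one write per cell. Correct, but 100 calls."""
    return [{"tool": "write_cell", "range": f"A{i}", "value": i}
            for i in range(1, 101)]

print(via_script()["values"][-1], via_copy_fill()[-1], len(via_cell_writes()))
```

This is why the choice of tool matters even when the end result looks identical: one write versus a hundred tool calls is a big difference in speed and token cost.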
Co-Host 1: Giles (32:55):
So you mentioned the tool. I know there's something else you've got that might be quite helpful. All of these LLMs have this set of tools. Can you show more about that for Claude?
Guest: Tim Jacks (33:07):
Yeah, though rather we should say that every Excel agent will have a different set of tools, and they're all based on the design decisions that the people building these agents have made.
Host: Paul Barnhurst (33:18):
It's where the differentiation is between the tools. Yeah,
Guest: Tim Jacks (33:21):
Absolutely. And some of them try to keep those tools quite secret; they think that's an important part of the intellectual property of how their tool works. For someone like Claude, their intellectual property is all about the LLM itself. That's what they're more worried about, which means they're happy to tell you what tools it uses. So we can go in here and ask: what tools do you have available? And these are the things that it's calling. It's told upfront in the context: you have access to these tools; you have tools for reading, you have tools for writing, and you have these code execution tools. And it'll be given some guidance, like: try to use these read and write tools if possible, because they're probably more efficient; they're a bit more carefully designed to modify Excel in a helpful way. If there's something you want to do, I don't know, create a pivot table or nicely format a chart, you might not be able to do that with one of these read and write tools, in which case you can use the execute Office JavaScript tool as a backup. But all those decisions come down to how you've prompted the LLM, how you've given it guidance about which tools to use, what tools it has available, and ultimately how the LLM itself decides to use those tools.
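The way an agent declares its tools to the LLM can be sketched as follows, loosely in the style of common function-calling APIs. The specific tool names, descriptions, and guidance text below are illustrative assumptions, not Claude's actual tool definitions.

```python
# Hypothetical sketch: tools and guidance assembled into the context the
# LLM sees up front. Names and wording are invented for illustration.

TOOLS = [
    {"name": "read_range",
     "description": "Read values and formulas from a range."},
    {"name": "write_range",
     "description": "Write values or formulas to a range."},
    {"name": "execute_office_js",
     "description": "Run an Office JS script. Fallback for things the "
                    "read/write tools cannot do, e.g. pivot tables or "
                    "chart formatting."},
]

GUIDANCE = (
    "Prefer read_range/write_range where possible; they are more efficient "
    "and modify Excel in a controlled way. Use execute_office_js only as a backup."
)

def system_prompt() -> str:
    """Assemble the tool list and guidance into a system prompt."""
    tool_lines = "\n".join(f"- {t['name']}: {t['description']}" for t in TOOLS)
    return f"You have access to these tools:\n{tool_lines}\n\n{GUIDANCE}"

print(system_prompt())
```

Every vendor fills in this template differently, which is exactly where, as Paul notes, the differentiation between agents lives.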
Co-Host 1: Giles (34:43):
So it has taken me an hour, but I think I'm there, because I'm slow. The agent is the collection of the tools around the LLM. If you take any of these major LLMs, you've got an LLM, which is I guess unique versus the others, and you've got a set of tools that are also probably unique, albeit some of them might be similar, and that makes up the agent, which as a result is also unique.
Guest: Tim Jacks (35:10):
Plus you've got the prompts and the stuff you've got around that.
Host: Paul Barnhurst (35:15):
Bringing this back, I think we've gone deep down the rabbit hole a little bit. Hopefully everybody's still here with this. Let's bring this back to how does this apply? What do we want our audience to take away as modellers from what we just talked about, Giles, I think the light bulb just went on for you and I'd love to get your thoughts listening to all this. Does this change the way you're going to prompt the tools? How does this help you or what have you taken away from this?
Co-Host 1: Giles (35:41):
So genuinely, I'm glad we've gone down the rabbit hole quite far, because this is the step I hoped we'd take, albeit it's taxed my brain more than any other episode we've done. It just helps me to understand what actually is going on when we send that prompt. Not perfectly, but I now understand that there is an LLM, and there is this bespoke set of tools around that LLM, and the package of that is essentially what we're calling the agent. I don't know whether it will change my behaviour, other than I'm quite intrigued by the fact that you could switch LLM partway through or at the end and, in a sense, it doesn't really matter: it's getting the same context. I think that's quite fascinating. I'm just relieved that I understand a little bit more than I did an hour ago.
Host: Paul Barnhurst (36:34):
So it feels like, if I'm hearing you, and then we'll get Ian's thoughts, that this episode is really more of a knowledge one than one that changes the way you're going to work. I kind of tend to agree with that. It's not that what I've learned isn't great, Tim, but it's not necessarily going to change the way I prompt, right? It's more that it helps me understand how these work. Ian, your thoughts? I know you came in a little late, so you might not have got all the context.
Co-Host 2: Ian Schnoor (37:02):
It's certainly interesting to hear you talk about this, understanding how things work under the hood, or under the bonnet. What happens underneath? I don't really want to know; I'm a finance guy. I love that people like you know and get it; I'm glad people like Tim are out there. I'm trying to figure out whether it's knowledge that I need. Is it important that a finance professional know this? Will you be better at using your AI tools if you understand it? By the same token, I use advanced Excel techniques, I teach advanced Excel myself, and I don't know how, say, the data table or the pivot table were coded.
(37:54):
I don't know how the engineers at Microsoft built those tools. Has it impacted my ability to use them? No. Is the programming behind each of those features different in how they perform? It hasn't really occurred to me to figure that out, or whether, in this new world order, it's important to understand. But again, it's fascinating, a new frontier here. It's not a disparaging comment; I'm trying to think through this through the lens of our users, our listeners: how much is important for us to understand?
Host: Paul Barnhurst (38:28):
I'll share one thought on that and then I'll let you answer, Tim. As I've listened to everybody give their answer, I don't think, for the average modeller, it's that important to know this in order to use these tools. For some of you who are more advanced and maybe a little more technical, knowing some of that stuff may sometimes help: they may go call a certain tool, or they may do some things that they've found get better results. But that's the 1% or less; very few are really going to go to that level. I think this is more an interesting under-the-hood type of episode for those that want to learn. As Giles said, I don't know that I'm going to do any different prompting. I don't know that it's changed the way I'd work today, but it helps me know how to think about it and understand better how the tools are a little different. And when I talk to these companies about, okay, what makes you different, I can ask about the tools. So for me, I think it's good education, but I'm kind of with both Ian and Giles on this: I'm not sure that I'm going to do anything different if I'm prompting the agent to build a model. So I'd love to get your thoughts as we share our thoughts so far, Tim.
Guest: Tim Jacks (39:29):
I think that's probably right. There'll be power users like me who want to distort the way these agents work a bit by trying to influence what tools they use. I would say understanding the tools is helpful while these agents aren't very good; it helps someone like me understand why one is doing things wrong and get a sense of whether it's something they could fix in future or not, and that's interesting to me. In your day-to-day, I think once the agents are good enough, then Ian's right: you shouldn't really be caring about how they do what they do. You should just be able to review what they do as they do it, and if they're doing it well, then that's fine. But maybe if there's one thing from today that might help you in your general work with AI, not necessarily just Excel agents, it's that understanding that these LLMs don't have memory is quite a powerful concept. Okay, so imagine you are on a date and the date's going really, really well.
Host: Paul Barnhurst (40:27):
What do you call that? What is that? No, I'm just kidding.
Guest: Tim Jacks (40:31):
It's going really, really well. And then you accidentally insult your date's mother. Unfortunately, in the real world, that's that. In the world of AI, remember, these LLMs have no memory, so you can undo that mistake you made and the LLM will never know that mistake happened. So if you're having a conversation, and this might be in a chatbot or in an agent, if there is a rewind functionality, that's where you go into a previous message and edit it. You might have done that in ChatGPT or similar apps. That's more useful and powerful than maybe you realise, because what you are doing is saying: oh, actually, this conversation went in a direction I didn't want it to go in; it wasn't helpful. Let's go back to this previous point in the conversation and continue from there. And you don't have to worry about the LLM being confused by memories of multiple forks of the conversation. For all intents and purposes, that last bit of the conversation just didn't happen. You can go back and start fresh.
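Because the model is stateless, "rewinding" a conversation is nothing more than truncating the message list before resending it. A minimal sketch, assuming a hypothetical agent that stores its chat as a plain list of messages:

```python
# Statelessness in practice: the whole conversation is resent every turn,
# so editing a past message just means cutting the list there and continuing.
# The conversation store and message shape here are illustrative assumptions.

def rewind_and_edit(messages, index, new_content):
    """Drop everything from `index` on and replace it with an edited message."""
    forked = messages[:index]
    forked.append({"role": "user", "content": new_content})
    return forked

chat = [
    {"role": "user", "content": "Build me a model."},
    {"role": "assistant", "content": "Done. Anything else?"},
    {"role": "user", "content": "Why are you so stupid?"},   # the insult
    {"role": "assistant", "content": "Sorry about that..."},
]

# Rewind to message 2 and ask something better instead. The abandoned fork
# is simply never sent again - the LLM cannot "remember" it.
chat = rewind_and_edit(chat, 2, "Please add a quarterly P&L.")
print(len(chat), chat[-1]["content"])
```

The model receives only the truncated list on the next call, so from its perspective the insult never happened.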
Co-Host 1: Giles (41:36):
That's potentially really useful for me because I go down the route of going, why are you so stupid? Next prompt, why didn't you get this right? Do it again. I wanted this, but what you're saying is I should go into
Host: Paul Barnhurst (41:48):
That's, I feel like you talked to me Giles,
Co-Host 1: Giles (41:51):
But I should actually, if I've got the ability in an agent, go back and edit the point where the prompting started to go wrong. That's something I,
Guest: Tim Jacks (42:02):
There's one useful takeaway then that might be it. Yeah. Yeah.
Host: Paul Barnhurst (42:05):
That's great. Interesting. Any final thoughts from anyone before we wrap up here? Any final thoughts you want to add?
Co-Host 2: Ian Schnoor (42:11):
We're going to continue the journey and we're going to continue to navigate it here. I think it's great to have you on, Tim.
Host: Paul Barnhurst (42:17):
I echo that. It was great. Thank you, Tim. I definitely learned a lot, and it helped clarify a lot of things, because I've talked to the CEOs or founders of almost all these different agent companies and asked them what's different, and they've shared different things, but now I would ask them different questions. It gives me a little better understanding as I'm talking to them. So I definitely found that useful for me as I talk to companies. On the whole, I think this is an episode for the nerds that want to know more about how it works on the backend.
Speaker 6 (42:46):
Yep. Good stuff.
Host: Paul Barnhurst (42:47):
Thank you so much again, Tim.