Testing Excel AI Software Tracelight on Excel Esports, Financial Modeling, and FP&A with Ian and Giles

Oct 15

Disclaimer: This episode is best experienced in video format. Since we’re testing new tools like Tracelight, the visuals add valuable context and make everything much clearer. We highly recommend watching the video for the full experience.

In this episode of the ModSquad series on Financial Modelers’ Corner, host Paul Barnhurst is joined by modeling experts Ian Schnoor and Giles Male to evaluate Tracelight, a new Excel-based modeling tool. Together, they explore how well it performs across real-world use cases in FP&A, Excel eSports, and corporate finance. From auditing balance sheets to solving Excel battle cases and building five-year forecast models, the ModSquad puts Tracelight through a comprehensive test to assess what it gets right, what it gets wrong, and where it still needs improvement.

Download Tracelight

Expect to Learn

How Tracelight performs across different financial modeling scenarios
The impact of LLM selection and prompt design on tool output
Where Tracelight excels, and where it falls short, in building models and running checks
How Excel eSports challenges expose strengths and weaknesses in formula logic
When these tools can save time, and when they create more work

Here are a few quotes from the episode:

"If you don't know your stuff, you're going to learn the wrong things; or worse, think it's right when it’s not." - Giles Male
"The best Excel modellers aren’t the fastest, they’re the most thoughtful." - Ian Schnoor

Follow Ian Schnoor:
LinkedIn - https://www.linkedin.com/in/ianschnoor/

Follow Giles Male:
LinkedIn - https://www.linkedin.com/in/giles-male-30643b15/

Tracelight: https://tracelight.ai/

In today’s episode:
[02:22] - Tracelight Overview & Positioning
[09:46] - Giles vs. Excel eSports Battle Case
[18:00] - Reviewing Tracelight’s Logic and Output
[28:04] - Advanced Test: Championship-Level Modeling
[33:59] - AI Tools and the Learning Curve
[41:49] - Deferred Revenue Schedule Test
[56:26] - Why Doesn’t the Balance Sheet Balance?
[1:02:21] - Fixing Balance Sheet Errors Using Claude Opus
[1:316:00] - Episode Wrap-Up and Final Thoughts

Full Show Transcript

[00:01:06] Host : Paul Barnhurst: Welcome to Financial Modelers’ Corner. I am your host, Paul Barnhurst, the FP&A Guy. And this week I'm joined by two exciting financial modelers for an exciting series we're going to be doing. So let's start by introducing our modelers, then we'll tell you all about the series. Ian, why don't you quickly introduce yourself?

[00:01:24] Co-host 1: Ian Schnoor : Fantastic, Paul, and I hope everyone has seen our ads by now. We are the ModSquad. I'm Ian Schnoor, executive director of the Financial Modeling Institute. I'm really excited to be having fun with the two of you as we test all of the new AI financial modeling tools on the market. My focus will be on the corporate finance angle of things: three-statement modeling, valuation, etc. It's going to be a great series.

[00:01:45] Host : Paul Barnhurst: Over to you. Humble MVP, sometimes known as Giles or even Miles, depending on who you ask.

[00:01:51] Co-host 1: Ian Schnoor : Oh, you had to do that.

[00:01:52] Co-host 2: Giles Male: It's a pleasure to be here too. Yes. So my name's Giles. I'm co-founder of Full Stack Modeller. I'm one of Ian's FMI Master Financial Modelers, apparently, and also very excited to get going on this. I'm going to be testing these tools with some Excel Esports battle cases, and then a bit of what I would think of as localized data wrangling, data management, and data analytics in the grid of Excel. Looking forward to getting started.

[00:02:20] Host : Paul Barnhurst: Giles, if your goal was to make me feel not worthy, you did it by saying, "I'm an MFM."

[00:02:25] Co-host 2: Giles Male: Very small elite group. I think I got in from my music videos. Is that right, Ian?

[00:02:30] Co-host 1: Ian Schnoor : 100%. The other MFMs voted you in entirely on the back of your videos. Just kidding. But it's not bad for you to think that.

[00:02:38] Host : Paul Barnhurst: We know Giles is Mr. Video and he's a good modeler, so we're excited for that. All right. So as I mentioned, I'm Paul Barnhurst and I'll be testing FP&A use cases as the FP&A Guy. Why don't we just jump right into things. There are several tools we're going to test. Today's tool is Tracelight. We've put together a little document I want to share, and I'd like to get your thoughts about Tracelight here. Let me bring it up on the screen. So here's a little bit of information about Tracelight. And Ian and Giles, feel free to jump in at any time. So, product type: they're an Excel add-in. They may be adding Google Sheets at some time. I know many of them are, but currently they're strictly Excel. They're in general release. The product came out of beta just in August of this year, but the company's been around for about a year. Their focus is modeling and finance use cases. They're really going after the financial analyst and consultant as their ideal customer. Pricing is $40 per month and then, you know, the normal enterprise.

[00:03:38] Co-host 2: Giles Male: Contact us. A couple of thoughts in there straight away. For me, the fact it's an Excel add-in, versus being taken somewhere external, I like, and I think most of them are, to be honest. And then the pricing point: I think we see a wide range of prices already in this space, and that $40 a month is right at the low end, as far as I'm aware.

[00:04:01] Co-host 1: Ian Schnoor : For the premium product, obviously. Right.

[00:04:02] Co-host 2: Giles Male: Yeah. For the pro.

[00:04:03] Host : Paul Barnhurst: I think for the pro versions we typically see anywhere from $20 to $50, with $50 probably the high end. A couple of them have a student or a lite version that is around $10, and we see the high-end max versions running $200 to $500, and most of them have an enterprise tier on top of that. So it's really interesting. So basically anywhere from free with very little usage to the highest one I think I've seen outside of enterprise, which is 500 bucks a month. I'd say for most people, the general individual user, it's in that $15 to $50 or $60 range. If you want the high-end tool, kind of their max versions, you're often at a couple hundred. So they're definitely right in that range; I think they're realistic. LLM integration: the one thing we definitely notice about Tracelight is they have several different LLMs as an option. Their goal is to allow you to select, and there are certain stages where they may use a fine-tuned model to help improve things. Then, from their perspective, what makes them unique: we asked each of them to tell us that. So they say Tracelight builds models, checks for errors, automates workflows, and personalizes formatting, all within Excel. So they're really emphasizing, like you said, Giles, right in Excel. The reality is Excel isn't going anywhere anytime soon. Any of us think it will be gone in the next five, ten years? No.

[00:05:22] Co-host 2: Giles Male: Not unless there's a crazy AI evolution that we can't control. But that's probably a different topic.

[00:05:28] Host : Paul Barnhurst: We'll cover that when it happens too. We got it all for you, right?

[00:05:31] Co-host 2: Giles Male: Gotcha.

[00:05:32] Host : Paul Barnhurst: So that's a little bit about them. And then they put some best practices for optimal results. We won't go through and read those, but they're really some of the best practices for prompting. You know, you get the best results by starting a new chat when the conversation gets really long. And if you're going to do something that's really complex, create a plan first. Don't just try to figure it all out in one shot. Pretty practical stuff. Now take a minute and look at the website. Any first thoughts as you look at it?

[00:06:01] Co-host 1: Ian Schnoor : I like it. It's clean, it's simple. It's got a nice sort of animation on it. I think it's sort of a minimum viable product. It feels like that anyway. As for the website, I like it; it's attractive and it feels clean, you know, simple. So it appeals to me and it made me want to check it out.

[00:06:19] Co-host 2: Giles Male: My attention is drawn just to the wording, so the front headline on the page is "become superhuman in Excel." And then I think a word we would probably all gravitate towards is that it's saying it's an assistant to financial modelers or financial modeling, which, again, feels like maybe a more appropriate statement than just, like, this is going to swap out Excel completely and you'll never need an analyst on your team again.

[00:06:45] Co-host 1: Ian Schnoor : We're seeing both, right? We're seeing two different tones from the tools. Some tools say, "We'll be your assistant," like here. I like how they're a little tongue in cheek, a little cheeky, trying to have fun with it: "become a superhuman Excel user." But still, it means you're going to need to work in Excel, and we'll be your assistant. And we've seen other tools say, "You're gone, you're out of here. We're going to get you out of here. You're never going to be needed again." Right? That's the tone on others. And so I think, you know, our thinking, and we've talked about it, is that this feels more in line with where we think things are headed right now.

[00:07:19] Host : Paul Barnhurst: All right. Just a quick run-through. Stop me if you see anything. "Build with language": the whole idea of tell it what you want and then it builds it. "Discover errors": it has a check in the tool. We'll show it here in a minute. "Format instantly": I think this is a great one, where if you have a sample page, it can do all the formatting. Sure, some of that you can do with macros a little bit, but I think there are some cool things there. I like this one, being able to save your workflows. I think this is becoming more and more important as we start to really try to replicate things, because we all know with gen AI, and we've all run enough tools: do we get the exact same answer every time?

[00:08:00] Co-host 2: Giles Male: No, that was an interesting one to me. Saving prompts. It's not something I'd really thought about, but yeah, it feels like there could be huge value in that, right?

[00:08:08] Host : Paul Barnhurst: Because we do know you get a big difference, right? If you've got a really good prompt and it worked well, you may want to save it for next time. Maybe it's something you do every month. You may not get the exact same response next time, but you're likely to get something much better than if you start over with the prompt every time.

[00:08:21] Co-host 1: Ian Schnoor : I find that even using the same prompt is yielding different results at times, and so saving the prompt is definitely a nice feature.

[00:08:32] Host : Paul Barnhurst: Ask anything. Yeah, you can ask it to do whatever you want. And then we've looked at this before. You know, we showed you the pricing. No changes. They do have a free tier: limited generations, limited chat, unlimited tracing. So try it out for free. Uh, the only other thing, I think, is security. I think it's good to see, you know, they really have an emphasis there. I believe they've changed this since the last time I looked at it, but I know in here they mention SOC 2 compliance, which nowadays is a minimum. You've got to have some strong security standards. So I think that's good to see. So that's about it on the website. Should we jump into the tool?

[00:09:13] Co-host 1: Ian Schnoor : Let's jump in. And I think we've rolled the dice here, and we're going to start with Giles today, aren't we? We'll timestamp these for everyone to see after. But we'll start with Giles and move into Paul, and then I'll finish up with corporate finance queries and requests towards the end. Isn't that right?

[00:09:30] Co-host 2: Giles Male: All right. So let me try and do this at a reasonable pace. And I know you two will stop me if I'm waffling. This is the first of two battle cases, so this is considered an easy battle case. Esports players will know it. It's Excel Athlete Basics revisited. The way these battle cases work is you've typically got sort of seven levels that increase in difficulty. You've got to solve these levels. They're just different problems with text and numbers and things like that. You put your answers in the green cells. There's five bonus questions at the top. I'm probably just going to leave those out. But what I do want to see if these tools can do is solve the levels 1 to 7.

[00:10:10] Co-host 1: Ian Schnoor : For someone who's never seen these before, and many will never have seen one: it sounds like if you scroll to the top, there's a case overview, a description, and then as you work your way down, your job as the competitor is to populate formulas into the dark green cells. Is that correct?

[00:10:26] Co-host 2: Giles Male: Exactly. So let me highlight level one. We'll do level one. It's a nice easy one. So in level one we want to extract a certain number of characters of a word, starting from the left, as requested. What are the first characters, according to the given word and the number of characters? You always get an example line here. So, example one: the word is here on the right hand side, "filter," and the number of characters we're being asked to extract is the first three. And it gives you that example answer, the first three letters of that word: "fil." So if you were doing this with functions, and I do want the tool to give me functions back, really, not hard-coded answers, you would do something like LEFT, point at the word, point at the characters. So it's nice and dynamic. You get the answer. And then I could just copy that down to all of the other lines in green. Job done, level one. And if it can do this and give me a similar answer with functions, I'll be very happy. I won't go through all the other levels because I just think it would take too much time. But there's a mixture of things here. You've got things like splitting text, getting to numbers, sorting and summing some numbers. Uh, you've got some questions where you've got to pull out the column number from the letter reference, and counting the number of characters in strings. And then as you get towards the last levels, it's going to have to lean on a separate tab with a map with numbers in, and essentially do things with those numbers: either give us the number in a cell or, in the last level, there's even a direction, where you've got to go, okay, find the number in the cell above the reference, or below. So quite a lot in there, bearing in mind I'm just going to tell it to read the question and give me the answer.
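As a rough illustration of the level one logic Giles describes (take the first N characters of a word, the job Excel's LEFT function does), here is a minimal Python sketch. Only the "filter" example with three characters comes from the episode; the other rows and the function name `left` are hypothetical and exist purely to show the same rule being copied down a column.

```python
def left(text: str, num_chars: int) -> str:
    """Return the first num_chars characters of text, like Excel's LEFT."""
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    return text[:num_chars]


# Each row pairs a word with the number of characters to extract.
# Only ("filter", 3) is from the episode; the rest are made-up examples.
rows = [("filter", 3), ("spreadsheet", 6), ("model", 2)]

# One rule applied to every row, the equivalent of copying a formula down.
answers = [left(word, count) for word, count in rows]
print(answers)  # ['fil', 'spread', 'mo']
```

The point of the sketch is the shape of the solution: a single dynamic rule that reads its inputs from each row, rather than a typed-in answer per row.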

[00:12:11] Co-host 1: Ian Schnoor : And just to clarify, someone in real life doing this under time pressure is going to be using some combination of traditional Excel grid features, like various lookup functions, perhaps INDIRECT functions, ADDRESS functions, and building longer formulas. Or I think you're allowed in the competition to build a lot of small steps off to the right and then aggregate them. You can do whatever you want, but the job is to solve it formulaically, isn't it?

[00:12:39] Co-host 2: Giles Male: In battle competitions you could go manual, but we want the tool to do something that's helpful and repeatable.

[00:12:45] Host : Paul Barnhurst: You could do it in Power Query, you could do VBA, you could do whatever you want. You could do Python. Right, exactly.

[00:12:51] Co-host 2: Giles Male: So what I'm going to do: I've got quite a lengthy prompt, but at the end of this prompt I've said, hey, just do this for level one. Let me just talk you through the prompt quickly. "I'd like you to analyze the case tab in particular, which is an Excel Esports battle case with five bonus questions and seven levels. These will increase in difficulty. Some information about the levels and the questions is provided under each row with the level number." So I've just pointed it, to try and be helpful, to row 40, where we've got a reference to level one. "The answers to each level must go in the cells with dark green fill in column E throughout. Any supporting data needed to answer the questions is found in the columns to the right of column E. Solve all the questions in this workbook, putting answers in the green cells, but start by just solving level one, and give me the solution so I can check you're working the right way. Then we can tackle the remaining levels." I mean, I'm not a prompting expert, I'm just trying to talk to it very naturally. So I'm going to hit go and we'll see if it interprets the question or the prompt properly. Hopefully we're just going to get a quick answer back for level one. It's going to have to do some thinking, and that's what you're going to see. But let's see what it can do. Hopefully we get a LEFT function, and it's going to link to the same cells you saw me linking to.

[00:14:05] Host : Paul Barnhurst: All right. And I'm timing it. We'll see how long it takes for this first one.

[00:14:10] Co-host 1: Ian Schnoor : And again, I didn't read exactly what you put most recently, but you didn't tell it what direction to go, right? You didn't say you must use a formula. So in theory, if it came back and it just hard-entered the correct answer into the green cells, it's not the worst thing, right? It's doing it. I mean, we would like to see it done formulaically. But, uh, it might try a variety of different ways to fill in those dark green cells for question one. And that's what we're looking for, right?

[00:14:36] Host : Paul Barnhurst: Yeah.

[00:14:36] Co-host 2: Giles Male: All right. So hopefully it's still doing a lot of thinking. It seems to be going around a bit of a weird loop, but I'm hoping it's getting there. It's definitely still just on level one.

[00:14:46] Host : Paul Barnhurst: And you can see it's mentioning the LEFT function. So it is definitely on the right track.

[00:14:52] Co-host 1: Ian Schnoor : Yeah, I think we tried this earlier, right? When you ask it to do too many questions at the same time, it can take quite a while to process.

[00:15:00] Co-host 2: Giles Male: 100%. It's done, right. Uh, easy level, but it's understood. I mean, I didn't tell it exactly what to do. I just said, read the instructions, solve it.

[00:15:08] Host : Paul Barnhurst: So let's scroll through, just to show all the information we got for just this first one on the right hand side. You can go all the way up to the top. You can see it's thinking. You know, sometimes this is a little overwhelming, but it's also helpful to be able to validate: hey, is it thinking about things right? You can see it talking about extracting the first letters, and, for example, "let me create the formula," and "okay, here's your answer." And so you can get pretty comfortable. Pretty good job.

[00:15:38] Co-host 1: Ian Schnoor : I want to add, I mean, this is wildly impressive. The fact that there is an AI tool that can now read English instructions and populate the cells you've asked for. So let's not lose sight of that. I know we're all getting desensitized to this idea of AI building things for us, but to me, this is still extremely impressive that it can do that much. But we're going to push it even harder, aren't we? So now we're going to ask it to do all the remaining levels at the same time. Is that right?

[00:16:07] Co-host 2: Giles Male: Let me show you a few things. Just in terms of what we've seen on the right hand side within the Tracelight user interface, you can see it stepping through bit by bit: okay, we're going to look through each level, thinking, and it's getting to the right answers. It gives this quite nice summary at the end: all levels complete, 1 to 7; bonus levels complete. It's also got this quite nice checklist that it's made itself. So it gave itself a list of eight tasks, the levels and the bonuses, and then it's ticked them off one by one. So you can see the kind of structure it's gone through.

[00:16:43] Co-host 1: Ian Schnoor : I love when the AI tools are organized and they build their own to-do lists, and then they tick off their own to-do lists, right? It's very...

[00:16:49] Host : Paul Barnhurst: One thing I noticed in the chat that I thought was interesting: in a couple of questions it said, hey, I applied one method and it's not working right. Or in one area it said, hey, I used arrays, but those may not work in all situations, so I took a different approach. I think it's interesting to see that it often, you know, at least mentions to us that it used more than one approach.

[00:17:08] Co-host 2: Giles Male: And that's it. So it's almost sense-checking its own work. And I mean, it's still not going to get it 100% right. However, if I just scroll down quickly, we've got answers for all the levels. Very strangely, and I'll give you an example or two, in a few green cells it's just not put an answer in some of the levels. I don't know why. We might see if we can reprompt it to fix that. Again, on the last level, for some reason it didn't put anything in the first six green cells, but then it did put something in the cells after that. So out of all the levels, it did get one level wrong, which I think was level five. It's not quite the right logic. I won't go into exactly what it is, it's just not quite right. If we skim through the others, level two was about splitting this text string that includes numbers. You then just had to sum the top two, the biggest two numbers. It has got that right. It's done it in a bit of a strange way, because rather than referencing the text string, it's hard-coded the numbers.
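To make the contrast concrete, here is a small Python sketch of the level two idea done dynamically: split a text string that contains numbers, convert the pieces, and sum the two largest. The comma delimiter, the sample string, and the function name `sum_top_two` are assumptions for illustration; the episode does not show the actual delimiter or data.

```python
def sum_top_two(text: str, delimiter: str = ",") -> int:
    """Split text on delimiter, convert each piece to int, sum the two largest."""
    # Skip empty pieces so a trailing delimiter does not break the conversion.
    numbers = [int(piece) for piece in text.split(delimiter) if piece.strip()]
    # Sort descending and keep the top two, like SORT plus TAKE, or LARGE(.., {1,2}).
    return sum(sorted(numbers, reverse=True)[:2])


# Hypothetical input string; the real case data is not shown in the transcript.
source_text = "4,17,9,23,1"
print(sum_top_two(source_text))  # 40, because 23 + 17 = 40
```

Because the function reads from `source_text`, changing the input changes the answer. A version with the numbers typed directly into the calculation, which is what the tool produced, gives the same result today but silently goes stale if the source string changes.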

[00:18:11] Co-host 1: Ian Schnoor : Well, it is fascinating, right? Because we're trained as humans to be allergic to this. So if everyone can see that, it's taken the raw digits from column G there, and it has found a way to copy and paste them as dead numbers into the formula. But maybe that's not the worst thing in the world. I mean, the AI can do that instantly, whereas for us as humans it would take a long time. It would make no sense. But it got the right answer, and it did it very quickly, obviously. Right, Giles?

[00:18:42] Co-host 2: Giles Male: Four minutes. And I feel like this might be a theme throughout: it's got the right answer. It kind of fulfilled my prompt, because I said make this formula-based, and it has done that. But then it's embedded hard-coded numbers in a way we would try and avoid.

[00:18:57] Co-host 1: Ian Schnoor : Now, extremely impressive. I mean, first of all, it went through all seven questions in, as we said, about 5 or 6 minutes in total. You know, how long would it take an average competitor to have gone through those seven questions, just to do them? I mean, would it have taken half an hour?

[00:19:17] Co-host 2: Giles Male: It wouldn't be too bad. I mean, an inexperienced player, you've probably got to do things like text split, convert the text split output to numbers, use LARGE or SORT and take the top two. So there would be a few steps. It's not exactly an easy level; it would take a good few minutes. Level three is another really good example. Okay, so in level three you've got to take this cell reference here and return what the column number is. So the column letter there is F, and in number conversion F is six. It's got the right answer, and it's also fulfilled my prompt to say make it a formula. But the formula is equals six.
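For comparison with the tool's "equals six" answer, here is a hedged Python sketch of what a dynamic level three solution has to compute: the column number for the letter part of a cell reference. The function name `column_number` and the sample references other than column F are illustrative assumptions.

```python
def column_number(cell_ref: str) -> int:
    """Convert the column letters of a reference like 'F7' to a 1-based number."""
    # Keep only ASCII letters, so '$F$7' and 'f7' are handled as well.
    letters = "".join(ch for ch in cell_ref if ch.isascii() and ch.isalpha()).upper()
    if not letters:
        raise ValueError(f"no column letters found in {cell_ref!r}")
    number = 0
    for ch in letters:
        # Column letters behave like base-26 digits with A=1 through Z=26.
        number = number * 26 + (ord(ch) - ord("A") + 1)
    return number


print(column_number("F7"))   # 6
print(column_number("Y13"))  # 25
print(column_number("AA1"))  # 27
```

A formula-based Excel answer would derive the same number from the referenced cell on every row, which is what makes it checkable and reusable.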

[00:19:55] Co-host 1: Ian Schnoor : Here we go. It was clever: Giles wanted a formula, so we'll make it equals six.

[00:20:01] Host : Paul Barnhurst: You should have said use functions within the formulas.

[00:20:05] Co-host 2: Giles Male: And I'm very conscious with all of this. Maybe with a different prompt you would get there. It's still interpreted the question correctly and given me the right answers quickly.

[00:20:16] Co-host 1: Ian Schnoor : For an esports challenge like this, that's all that matters, right? It ultimately is: get the right answer. But it would be difficult to learn from. You wouldn't learn any new Excel functionality as a human, right, by looking at what it did there, but it has certainly got the right answer. That's impressive.

[00:20:32] Co-host 2: Giles Male: So level four, again, it's actually given the right answer formulaically. We're trying to count the number of times this character appears in a string. And this is a very classic esports kind of learning step: length of a string, minus length of the string where you substitute the character you're looking for with blank, will give you the number of times that character appears. So level four: correct, function-based answer. Level five it got wrong. Uh, I won't go into more detail than that. Level six involves looking at the map and ideally pulling the number from the map reference. Again, similar to before, it's not used the cell reference here; it's just manually linking to that cell directly on the map tab. So not exactly how we'd want to do it in an esports case, but it got the right answer. And then level seven: some blanks, and I don't really understand why. But again, it's got the right answer. So level seven was saying start from this cell reference on the map, but then go to the cell above or below. And if you look at this one, the starting cell reference is Y13. The reference it's pulling out directly is Y12, which is one above Y13. So correct, but just a weird way to do it.
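The level four trick Giles describes (length of the string, minus the length of the string with the target character removed) can be sketched in Python as follows. The function name `count_char` and the sample strings are illustrative and not taken from the case file.

```python
def count_char(text: str, char: str) -> int:
    """Count occurrences of char in text using the length-difference trick."""
    if not char:
        raise ValueError("char must be a non-empty string")
    # Mirrors LEN(text) - LEN(SUBSTITUTE(text, char, "")) in Excel.
    removed = len(text) - len(text.replace(char, ""))
    # Dividing by len(char) keeps the count right for multi-character targets.
    return removed // len(char)


print(count_char("banana", "a"))       # 3
print(count_char("mississippi", "s"))  # 4
```

Python strings also have a built-in `count` method that gives the same result; the version above is written out only to mirror the Excel logic step by step.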

[00:21:44] Co-host 1: Ian Schnoor : It's brilliant that it can solve it. Obviously not something that you could replicate that way. But listen, that is, uh, interesting. Okay. And where's the one that had the formula correct?

[00:21:55] Co-host 2: Giles Male: Level four was one where it had the formula correct. This was the LEN minus LEN approach. That's good. That's what we would expect to do as players.

[00:22:04] Co-host 1: Ian Schnoor : So overall impressive.

[00:22:06] Co-host 2: Giles Male: Six out of seven. I mean, do you want me to have a go and see if it will correct itself if I say, hey, you left some blanks?

[00:22:14] Host : Paul Barnhurst: Why don't we try 218 and 219 where it's just two.

[00:22:16] Co-host 2: Giles Male: So 218. "The green cells in rows 218 and 219 are blank. Please add the solutions to these cells."

[00:22:32] Co-host 1: Ian Schnoor : As it's going, I'll tell you one of my other hesitations. So again, it's all very impressive. But if you scroll down to one where it did a manual link: of course it has solved these by doing a manual link, as everyone can see in the formula, and it did that faster than you could. My other hesitation around solving it this way is that if your job was to check this, you couldn't check one formula, copy it down, and recognize it was correct. To check this, you would literally have to check every single one, because maybe it got some of them right and some of them wrong, right? Whereas if it had built one consistent formula all the way down, you could check once and make sure that formula was exactly the same. But presumably if it was prompted differently, maybe it would, you know, solve it differently. But let's see here. It's taken another minute. Has it filled in your two? Working on it.
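Ian's point about one consistent formula can be illustrated with a short Python sketch of the level six and seven logic: resolve a cell reference on a map, optionally stepping one cell above or below. The tiny 3x3 grid, the function names, and the direction labels are all assumptions for illustration; the real map tab is far larger and is not reproduced in the transcript.

```python
import re

# Row offsets for each hypothetical direction label.
ROW_OFFSETS = {"same": 0, "above": -1, "below": 1}


def parse_ref(cell_ref: str) -> tuple[int, int]:
    """Turn a reference like 'B2' into 1-based (row, column) numbers."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", cell_ref)
    if match is None:
        raise ValueError(f"not a cell reference: {cell_ref!r}")
    letters, row_text = match.groups()
    column = 0
    for ch in letters.upper():
        column = column * 26 + (ord(ch) - ord("A") + 1)
    return int(row_text), column


def lookup(grid: list[list[int]], cell_ref: str, direction: str = "same") -> int:
    """Return the value at cell_ref, or one cell above or below it."""
    row, column = parse_ref(cell_ref)
    row += ROW_OFFSETS[direction]
    # Guard against stepping off the map (and against negative-index wraparound).
    if not (1 <= row <= len(grid)) or not (1 <= column <= len(grid[0])):
        raise IndexError(f"{cell_ref} moved {direction} falls outside the map")
    return grid[row - 1][column - 1]


# Hypothetical 3x3 map standing in for the case's map tab.
grid = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

# The same rule applied to every question row, so checking one checks them all.
questions = [("B2", "same"), ("B2", "above"), ("B2", "below")]
print([lookup(grid, ref, direction) for ref, direction in questions])  # [5, 2, 8]
```

With a single rule like `lookup`, a reviewer verifies the logic once. With a separate hand-picked link in every cell, each link has to be checked individually, which is the hesitation raised above.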

[00:23:22] Co-host 2: Giles Male: Yeah. So apply, and it's now fixed. So again, it's just another level of checking, and I think this is what I've experienced with these tools so far. You have to check the answers and go, hang on a minute, you've got that wrong. Hang on a minute, you've missed something. But it's now got the right answer in those missing cells.

[00:23:38] Co-host 1: Ian Schnoor : And the fact that it missed two and it got the rest right, that also leads into my nervousness. Because it linked each of them individually, maybe some of the later ones are linked correctly and maybe some of them are wrong, right? So because it's doing it individually, I would feel compelled to check them. But still, the fact that it's gone this far is very impressive.

[00:23:58] Co-host 2: Giles Male: I'm going to switch over. So we have a second case. We'll be coming back to this one every time as well. It's got my face all over it because it's one of the finals cases from the Excel UK Championship. This is a harder case. It's all about me and my fur coats.

[00:24:12] Host : Paul Barnhurst: And that's not what makes it hard.

[00:24:15] Co-host 2: Giles Male: No, no, but there are some harder questions in here to do with my closet. I should just show you very quickly: how many elements of color are there in each coat of various types? You've got some prices against the colors. So some of the questions are about whether or not I convert the numbers of each color segment to a cost or a total price. The later levels are about my dance troupe. If I go down here, there's a lot of stuff in here about my dance troupe and formations on a grid. So again, there's a lot of spatial logic it has to work out. I'm not going to go through every level. I think what I'll do is just go back up here. I've got the prompt ready to go.

[00:24:58] Co-host 1: Ian Schnoor : So why don't we show the prompt. This has got to be the most difficult question that's ever been written in esports. It's about you, right? So it doesn't get more challenging than that. I would assume.

[00:25:08] Co-host 2: Giles Male: They get very, very hard. This is probably mid-tier.

[00:25:13] Co-host 1: Ian Schnoor : Mid-tier.

[00:25:14] Co-host 2: Giles Male: I've taken out the part of the prompt about doing the first level. I think for this, let's just get it all done and then we can take that pause. So it's almost exactly the same prompt. "I'd like you to analyze the case tab. You've got six bonuses and seven levels." So the information about the levels is provided. I've given it a guide to row 52 for where the level reference is. "Each level must go into the dark purples," uh, dark purple cells. Let me just remove that; I had a typo there. I don't want to confuse it. "Purple cells. Read the text carefully and review your own work."

[00:25:47] Host : Paul Barnhurst: Giles, why don't you walk us through how it did? Why don't we start there?

[00:25:50] Co-host 2: Giles Male: Yeah. Um, so kind of as expected. This is a harder case, so it's not done as well. And actually, we had to stop it at 15 or so minutes of thinking. It managed to get answers out for levels 1 to 3, and it looks like it's got almost all of them wrong. There are a few answers in level two that it has got right. I think it is worth mentioning: Tracelight is a tool where you can choose from a handful of LLMs. When I tested this before, it did better on this, and I think it was using Opus.

[00:26:23] Host : Paul Barnhurst: Which, if you read the descriptions, it says for complex thinking use Opus. So we used Sonnet, not thinking about it, and you can see the difference.

[00:26:33] Co-host 2: Giles Male: Our point is that having the choice of LLMs is very useful and interesting. But unless you're an expert in LLMs, I don't think any of us necessarily think people are going to know which one to turn to.

[00:26:45] Co-host 1: Ian Schnoor : The user is going to have to have some understanding of what the difference is. Otherwise most users will think that they need to do everything four times, trying it on each LLM to get the right answer.

[00:26:56] Host : Paul Barnhurst: And in fairness, let's think about this. How much time do you spend learning the basics of Excel? You know, after a day of doing this, you're going to start to get an idea of which models to use. They give you a little bit of guidance, so it's not like you can't do it. It's just one more barrier to overcome, so to speak.

[00:27:13] Co-host 1: Ian Schnoor : The LLMs are changing so fast. They're changing so fast that you might know that Opus is better than Sonnet today. But next week? What if another one's better for the type of task you're running? I think that you're going to want to test in a few of them.

[00:27:28] Host : Paul Barnhurst: So there's a couple of lessons here. You know, first, you get a different answer every time, even if you use the same model. Second, you've got to figure out how to select models. And even as you do that, these models are changing so quickly that it's hard to always stay on top of it. So I think what we're seeing is that they're amazing, but there's a lot of caveats. And even in this example it's not so amazing.

[00:27:49] Co-host 1: Ian Schnoor : This looks like something you write manually. Is that right? Or did uh.

[00:27:53] Co-host 2: Giles Male: So this seems to be the level three answer. And again, I don't know, had I specifically said, "Make sure you follow modeling best practice. Do not build a formula longer than one line of the formula bar," maybe it wouldn't have done this. But this is not the sort of formula you ever want to see in any solution model. So it's not really done well at all on this, with the caveat that it did better with a different LLM before.

[00:28:18] Host : Paul Barnhurst: Yeah, I remember it scored a 929, which we said was, you know, really good for the competition. It was top three, maybe top one. It just shows how huge a difference the model makes, right? The score here is probably under 100 for what it's solved so far.

[00:28:32] Co-host 2: Giles Male: You have some raw data, okay. You've essentially got almost like trial balance information. There are some interesting insights in the sense that you've got monthly data of revenues, gross profit, and net income. If you look at this, you can see there's a pretty high net income, relatively, at the start of the year. It turns negative towards the end and then pops back to just positive. All I'm going to ask the tool to do is analyze the raw data. I wasn't sure what the tab was going to be called, so: on the Trial Balance Data tab, I'd like you to produce a comprehensive financial summary. Include charts as well as some KPIs.

[00:29:15] Host : Paul Barnhurst: It ran for roughly four minutes. We used Claude Opus this time to see how it does on a slightly more complex task. Giles, why don't you walk us through what we got?

[00:29:25] Co-host 2: Giles Male: So we asked for some kind of summary on a different tab. We wanted charts. We wanted KPIs. Again, this is a real mixed bag for me. It has produced a new tab called Executive Dashboard. It's got some KPIs on here, but this is certainly not something where I would just be happy as a manager to click a button and go, cool, dashboard done, all good. So I asked it to look at the trial balance data, which was the raw data, and to produce the analysis off of that. There was also a tab in here which was just hard coded summary numbers, for my benefit really, so that I could see, for net income as an example, that across this year of monthly data the net income starts at a number, drops negative, and then goes back up right in the last month, in December. It's pointing to that monthly summary for the executive summary, which is not what I asked it to do. It's got some labeling that I would probably question. It's got EBITDA here as a label, but that's pointing to the net income line, and there is depreciation within this data. So that's interesting. That doesn't feel too accurate. Most of these references are to the monthly summary, with one exception here. It's produced some key insights, and I'll give it some credit. It looks like the insights are okay in the sense that it said, hey look, nice revenue recovery in Q4. So it's kind of picking up on the fact that you've got negative EBITDA, as it's calling it.

[00:30:59] Host : Paul Barnhurst: I always really like the areas of concern, right. That it mentioned that the cost of sales are trending upward. Operating expense ratio has increased saying look you got an expense problem. It made that pretty clear. That's at least good, good info.

[00:31:12] Co-host 2: Giles Male: I think I'm trying to think of how you might use this. It might be at the moment. It could be a useful ally to go, hey, I'm going to do some analysis. You analyze the data as well. Tell me what you find. I certainly wouldn't just be swapping any work on my part as an analyst and going, hey, here's a one click report that I can rely on. I guess that's where my head is.

[00:31:33] Host : Paul Barnhurst: Yeah. So probably a good way to get to look at, get you some ideas, get you thinking, but it's not presentation ready.

[00:31:40] Co-host 2: Giles Male: Any closing thoughts on my set of tests, Ian?

[00:31:44] Co-host 1: Ian Schnoor : No. I mean, listen, on one hand, as we keep saying, it's very impressive that it has the capability to do this. But I think it's probably not quite a desk-ready sort of tool that's going to start to do your work for you, if that's what some of the thinking is. Some are predicting that that's going to happen, but it's not quite there yet. I think we'd agree it definitely needed a lot of us checking in and getting comfortable with what it was doing, right?

[00:32:07] Host : Paul Barnhurst: And I think what it also highlights is, could you get there? Probably, with, you know, breaking up prompts and doing it very slowly. But then how much time are you saving?

[00:32:18] Co-host 2: Giles Male: I want to be coming to these tools and going, hey, analyze. It's the same thing I've always said with Copilot: I want to have a data set and go, analyze that. Not, here's a 50-step guide to what I want you to do. Because I could just do it myself.

[00:32:30] Co-host 1: Ian Schnoor : Yeah, but, you know, let's be clear as well. I think maybe the best usage is to use it to help you learn, right? To say, I'm not sure how I would solve question one. What do you think? Help me learn how I could solve it, spend a couple of minutes, and then I can learn from what it does. So if it's only linking directly to cells or putting in hard coded numbers, it's not a good look. What we're saying is, if it takes too long, we're not going to bother doing it. If it's giving us the wrong answers, we're not going to do it. If we have to format it, it's going to add some extra time. If we can learn from it, that'll be helpful. But in some cases it seems to want to just, you know, dump in a hard coded answer without giving us an opportunity to learn from it. So listen, obviously there's a place for it. Impressive. But, you know, limitations so far.

[00:33:10] Host : Paul Barnhurst: And I'll play a little bit of devil's advocate. I think the first time you do something, you're really going to have to work through the prompts. But if you save all those, how much time is it the second time, the third time? If you're continuing similar work, you could substantially reduce your time. But it's just like learning the formulas. You're going to have to put in a lot of time upfront. I think in the long term, these tools will save us a lot of time. But you have to be willing to invest, like learning VBA or anything. This is not a press of a button, go give it to your boss, outside of very basic things.

[00:33:41] Co-host 2: Giles Male: Maybe that's part of the journey we will go on through this series. You know, I know we're going to stick to the same prompts, but if we go to, whatever, series two, three, ten, maybe we will evolve our prompting and think about things like that.

[00:33:54] Host : Paul Barnhurst: And I think we may, even a little bit, as we learn. You know, we'll try to stick to the same basic ideas, but we're learning as well. I think that's the message here. So, we have a workbook that has some price volume mix variance analysis. Let me see if I can make that a little bigger for everybody here. This is one we used in a case called Paul Gets Spicy. Don't ask why I get spicy, but I do from time to time. So it's a business that has three products. Let me just slide this out of the way so we can walk through it. It has Barnhurst Basics, which is a smoothie with certain ingredients, Paul's Mango Punch, and then the BOSS, the Barnhurst Organic Superfood Smoothie. And what I can see here is the budget said we're going to sell 800 units. We sold 900 units, but our gross margin went down 20%. So what we're going to ask it to do is see how much of that was related to price and how much is volume altogether, so that we can isolate mix for each product. We're going to step through it. And since this one will be really quick for each of them, we might step through a couple of different models and see how each does it, just so you can see a little bit of that. So let me grab the prompt here. Basically, with the prompt, all I've told it is, hey, in this workbook I have a price volume mix analysis I want done on gross margin.

[00:35:25] Host : Paul Barnhurst: Can you analyze the data in cells B24 through J24? That shouldn't be J24. Let me change that; it should be J29. Make sure that's right. And provide the price variance for the three products listed. The products are BB in cell B26, PMP in cell B27, and BOSS in cell B28. Let me get the rest of the prompt there. So I'm just asking it to solve the first one. All right. So that took about a minute to run. What you'll see here is it created one to-do, which is basically calculate and populate the price variance formula. What's interesting is it didn't check off that it finished the to-do there. But here, okay, here's where it put it. At the bottom it gave a little bit of information about each of these. It said, here are the results. The price increased from this to this; it's a $50 impact. The price decreased from this to this; it's $33 unfavorable. And then this one here is $10 favorable. So what we see overall is 50, -30, and ten. It got the answers right. It gave us some analysis. I mean, on the whole I'd say an A on this, no issues. Any thoughts, Giles or Ian, on this one?

[00:36:56] Co-host 2: Giles Male: Yeah. It looks like it's done it well. I mean, it's understood the task. It's put the answers in the right place. I wouldn't necessarily say that's an easy thing to interpret. So, yeah, that looks quite impressive.
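
For readers who want to sanity-check a price variance by hand, here is a minimal Python sketch. It assumes one common convention, (actual price minus budget price) times actual units; the episode does not say which convention the workbook uses. The product names, prices, and unit counts below are hypothetical and are not the figures from the case.

```python
def price_variance(budget_price, actual_price, actual_units):
    """Price variance under the assumed convention:
    (actual price - budget price) * actual units.
    Positive is favorable, negative is unfavorable."""
    return (actual_price - budget_price) * actual_units


# Hypothetical inputs for illustration only (not the workbook's data).
products = {
    "A": {"budget_price": 4.00, "actual_price": 4.50, "actual_units": 80},
    "B": {"budget_price": 6.00, "actual_price": 5.75, "actual_units": 60},
    "C": {"budget_price": 8.00, "actual_price": 8.25, "actual_units": 20},
}

# One variance per product, plus the overall price effect.
variances = {name: price_variance(**inputs) for name, inputs in products.items()}
total_price_variance = sum(variances.values())

for name, value in variances.items():
    label = "favorable" if value >= 0 else "unfavorable"
    print(f"Product {name}: {value:+.2f} ({label})")
print(f"Total price variance: {total_price_variance:+.2f}")
```

Having an independent calculation like this is one way to confirm that the tool's answers are right rather than taking its summary on trust.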

[00:37:10] Host : Paul Barnhurst: All right. So let's try it again. Since this one only takes about a minute to run, let's copy it again, and we're going to try it with a different LLM this time.

[00:37:19] Co-host 2: Giles Male: So worth deleting the answer before you hit go.

[00:37:23] Host : Paul Barnhurst: Yes it is. So I've deleted the answer. We're going to copy the question back, and we're going to run it again using Sonnet 4.5. So we ran it with Sonnet last time. This one ran for about 45 seconds. We did Claude Sonnet 4.5. Once again, it gave us the answers. It was right, and it gave us the information. The price variance results were roughly the same. It didn't have quite as much explanation as the first one. It started here, and that's all it had. If you look at the first one, it gave a little more detail in its answer, but very similar. So no issues. It's a relatively simple task; I could select either model. Now let's try it with the other two. We have Claude Opus and we have ChatGPT. We just ran that with Claude Opus 4.1. Again, it gave us the right answer. It took a little over 50 seconds for that one. Notice, uh, if we scroll up, here's where I put it in. You can see the answer here. Very similar analysis to the other ones, a little bit above, a little below.

[00:38:30] Host : Paul Barnhurst: No real difference. It gave us a key insight. So almost all of these are exactly the same for the simple problem. Now let's try a different company's tool. We've done three from Claude. Let's see what happens with ChatGPT, and then we'll move on to the next task. This one ran for about three minutes and hung up. So I went ahead and created a new chat and processed it again, and it's been running almost three or four minutes and hasn't finished. So the results we're finding, at least in this situation, are that all the Anthropic models gave us the same, or a very similar, answer. The ChatGPT one is just hanging. And I think the message we have, and this will be a theme throughout (we'll get a much better idea as we test everything), is that over time we want these tools to just tell us which model is best. Choosing takes a lot of time. Just tell us to use this model, or choose it for us, is what I mean by that.

[00:39:28] Co-host 1: Ian Schnoor : What we're saying is, as these things change at lightning pace, right now the tool is leaving it to the user to effectively have to run every single prompt four times, once under each LLM. Because how do we know exactly which LLM is going to be the best to solve it? Tomorrow it could be different. And I guess, yeah, we'd love to see the tools say, listen, based on your request, we think the best solution, or look, we ran it, and the best solution is going to come out of this LLM. And so here's your answer. Otherwise it's not optimal, as you're saying.

[00:39:59] Co-host 2: Giles Male: Giles, what does the Tracelight tool, or swap Tracelight for any other tool, do on top of the LLM? It would be a huge value-added feature if it could pick the best LLM for a particular challenge. But I guess we're not there yet.

[00:40:14] Host : Paul Barnhurst: So what I have here is about 300 rows of data, roughly. I think it is, yeah, 270-something records with an invoice number, a contract (let's make this a little larger), contract start and end date, contract months, the amount, the product name, address, city, state. What I'm going to do is ask the tool to create a deferred revenue schedule for me. Right? Anyone who's done any accounting knows I can't recognize the full, you know, in this case, basically $24,000 up front. I need to recognize that over 24 months. And that could take a little bit of time, as anyone who's built a deferred revenue schedule knows. So let's see how it does. I have a prompt here. Let me pull it up.

[00:40:57] Co-host 2: Giles Male: So you're expecting a kind of horizontally profiled timeline with essentially revenue by period?

[00:41:05] Host : Paul Barnhurst: Yep. Here's the prompt I gave it. We'll go through it. I think this is a relatively complex task, much more than the PVM. So I'm going to go ahead and, let me just make this a little bigger, I'm going to select Claude Opus. We've kind of talked about that. The prompt says: using the data on the raw billing data sheet, build a deferred revenue schedule that can be used to recognize the revenue by month. And I've told it what the column headers should include: customer name, product name, contract amount, contract start date, contract end date, and then columns for every month from January 1st, 2026 through December 31st, 2029. The rows should include every dealership, the amount of revenue to recognize each month, and a check to make sure the amount recognized equals the contract amount over the period. This should be done in a new worksheet. Let's see how it does. We can see here it did some thinking. It created a to-do list with five items. All made sense. This is Claude Opus. It's applied the deferred revenue schedule. We're looking at it. We can see: hey, let me pull the dates. It walked through each step, showed how many seconds it thought for, and validated the formulas. So what we have here, I'm going to close this, is this deferred revenue schedule. And I'm just going to put in freeze panes so we can scroll over. You can see it did a check right here to make sure the contract amount and the total recognized, which it did with a formula, equal each other. And what I like here is it did do a little bit of rounding. So hey, it doesn't have to be an exact match if it's within a penny. That's good. It returns OK or error. And then it has each of the months all the way out. What I didn't like is it didn't format the months.

[00:42:55] Co-host 2: Giles Male: Hard coded as well.

[00:42:56] Host : Paul Barnhurst: It is. We'll get there in a second. So let's change this real quick. There we go. See, what's weird is it decided to use the 26th: 1/26, 2/26, 3/26.

[00:43:11] Co-host 1: Ian Schnoor : Why is it doing that?

[00:43:12] Host : Paul Barnhurst: And then it did 2025 there. So it didn't do these dates right across the top.

[00:43:17] Co-host 2: Giles Male: Yeah yeah yeah okay.

[00:43:18] Host : Paul Barnhurst: That's but it got the formulas right because it hard coded the formulas. That is not right. Usually it gets it right. I think I've used it before. I've run this a couple times. So interesting you can see. But that's the right number for that month.

[00:43:36] Co-host 1: Ian Schnoor : Fundamentally, I think my view and I'm sure this is shared by many Excel users. Fundamentally, even though it can do this instantaneously, it makes me uncomfortable that it sticks dead hard coded values inside any function inside. As an example here, inside your date function, right, you want to click on that date. Fundamentally, I don't like based on modeling discipline that there's dead numbers because now it makes me wonder if one of them might be wrong? Did it get the number? I mean, I think fundamentally from a discipline standpoint, I'm curious if you guys agree. I want to see an automated function that I can test and check all the way down and know that it built it correctly all the way down.

[00:44:15] Host : Paul Barnhurst: Agreed. And what's interesting is the reason it's working, right, is it's basically saying, hey, are both these dates between that date, then divide by 24? Really we'd normally do the beginning and end of the month. Okay. Yeah, they both fit in there. So it works, but it's clunky in that the reader still has this look of, like, what? Why?

[00:44:38] Co-host 1: Ian Schnoor : It's not dynamic. Being dynamic is one of the key attributes of any good spreadsheet. And if you do understand this, you're going to be uncomfortable. If you don't understand this, then you won't be able to modify it later.

[00:44:51] Co-host 2: Giles Male: It's also wrong. If you look at the headers there, they are 2025 for the first block, and the dates in the hard coded date functions are 2026. So there's a misalignment between the column headers and the dates it's using to put the numbers in.

[00:45:08] Host : Paul Barnhurst: I know that because that's what's here. And it did this right but it did not. I've noticed that I've run this. So I will tell you I've run this three times. It's always used hard coded a little differently. Usually it puts the month end depending on the model and the dates aren't right. The totals are all right. The numbers are right. Okay. You know, if you're experienced in Excel. I could have it run this in three minutes and then just go tweak it. Right. But if you're not experienced in Excel and you want to make it dynamic. So next month I could add more customers.

[00:45:42] Co-host 2: Giles Male: Yeah. The problem is every single cell with a date function there has got a different set of hard coded numbers. So I mean, let's be honest, that's a two minute job as an Excel user. You've got the dates in the headers. You do the greater than or less than or less than and equal to. And it's two minutes. Copy it across the whole block.
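
The header-driven approach Giles describes can be sketched in a few lines of Python. This is a minimal illustration, not Tracelight's output or the workbook's actual formulas: the function names are hypothetical, it assumes straight-line recognition, and it assumes every contract starts on the first of a month and ends on a month end. The $24,000 over 24 months and the January 2026 to December 2029 header range come from the episode.

```python
from datetime import date


def month_starts(first, count):
    """Return `count` consecutive month-start dates beginning at `first`.
    These play the role of the date headers across the top of the schedule."""
    headers = []
    year, month = first.year, first.month
    for _ in range(count):
        headers.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return headers


def contract_months(start, end):
    """Whole months covered, assuming a first-of-month start and month-end finish."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


def schedule_row(amount, start, end, headers):
    """Straight-line revenue per header month; zero outside the contract term.
    Every cell uses the same rule against its own header, so nothing is hard coded."""
    monthly = amount / contract_months(start, end)
    return [monthly if start <= header <= end else 0.0 for header in headers]


# 48 monthly columns: January 2026 through December 2029.
headers = month_starts(date(2026, 1, 1), 48)

# One contract: $24,000 recognized over 24 months.
row = schedule_row(24000.0, date(2026, 1, 1), date(2027, 12, 31), headers)

# Check column: total recognized should equal the contract amount within a penny.
check = "OK" if abs(sum(row) - 24000.0) <= 0.01 else "ERROR"
print(row[0], row[23], row[24], check)
```

Because each cell compares the contract dates with its own column header, changing the first header date or adding a customer row needs no edits to the logic, which is the "dynamic" property Ian is asking for.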

[00:45:58] Host : Paul Barnhurst: You got the right numbers. But there are definitely some issues. Yeah, some things to work out. And could you get there if you continue to prompt? Sure you could. But this part is not hard to just do yourself. All right. I'm going to stop because we've spent plenty of time. I think I'll just do these two for now, and we'll turn it over to our man Ian for a modeling example. Both Giles and I have done some testing. I've done a couple of FP&A use cases and Giles has done his esports, or whatever that is he spends all his time on. He loves solving things in Excel. Now we're going to turn it over to the man, the myth, the legend, Ian, to do some modeling cases.

[00:46:41] Co-host 1: Ian Schnoor : You'll make my head swell, Paul. Don't be doing that. But yeah, no, we're going to get into some traditional corporate finance three statement modeling, and I'll see what we have time for here. If I don't get through some of the tests that I want in this one, I'll try some other ones on other tools. But we'll get a sense. What I'm going to do is first run some testing on an existing, ready-built model to see if it can help me check this model. And then I'm going to move into just a raw set of historical financial statements of this company, Henderson. I'm going to ask it to do some building. I'll see what it looks like to build a financial model with somewhat limited guidance at this point, and we'll see where we get to. Okay. So I've got a model here. And by the way, this is an example of the type of model that our candidates in the AFM program at FMI need to build from scratch on the AFM exam. It's a vertically oriented model. It's got a cover sheet and executive summary. It's a five year, three statement model. It's got an assumption page here. It's got scenarios where I can run base, best, and worst. And then the model sheet is out here. I've got the revenue forecast and the cost forecast schedule. And then I have the financial statements. I've got the income statement. Scroll down, I've got the cash flow statement, and I've got the balance sheet. So this is a model that's already built. It's completely built. I am going to ask it to do a couple of tests. I'm going to start off with some gentle tests to kind of check this file.

[00:48:06] Host : Paul Barnhurst: Did you have to pick an AFM case? It brings back nightmares from practicing this.

[00:48:11] Co-host 1: Ian Schnoor : Nightmares, Paul? You passed. You are an AFM charter holder, right? You did an outstanding job building a model just like this one on your exam. This actually is your exam submission. No, I'm just kidding. That is not yours. It is not yours. I wouldn't do that to you. But this is the exact type of model that you both ended up creating on your own AFM exam. So let's go through. The first thing I want to do, and you know, I run a lot of webinars all over the world on checking and auditing and making sure models are working, is a quick test right off the bat. I'm just going to type in: are there any hidden sheets? It's something I always tell people to look for. Are there any hidden sheets in this file? And let's see if it can just run through it on its own here without pausing. Now there are two types of hidden sheets. There are normal hidden sheets and then very hidden sheets, which most people don't even know about.

[00:49:01] Host : Paul Barnhurst: I was going to say, did you put something very hidden in here, Ian?

[00:49:04] Co-host 1: Ian Schnoor : Well, I want to see what it can do, Paul. I'm trying to test this thing. So let me show you. Now, by the way, this doesn't take long in real life. Let's see who can do it faster. So it's running. I've asked if there are any hidden sheets. To do this manually, I would literally go Home, Format, Hide & Unhide. And I know that there's a hidden sheet because the unhide option is lit up. So I can tell within three seconds that there's a hidden sheet called Covert. But let's see if we can get our Claude AI tool working. I've tried it under Sonnet. So Sonnet is thinking, and it's working through this. We can see it's looking, it's looking. Now it's taken longer than it would take me already manually. But it's going through it. It says, great, I found a hidden sheet. It's looking at values. And it has historically done this quite quickly for me. But that's an interesting one. And there we go. It's got it. Look, it said yes. I like this. Yes, there are two hidden sheets in this workbook. There's a Covert sheet, which I just showed you. And then there's a Clandestine sheet, and that is deeply undercover. It is a hidden sheet.

[00:50:10] Co-host 1: Ian Schnoor : There is. And it even tells me where to find it as well. So it did a great job here. It surfaced something that you might not know to find, even if you knew about finding hidden sheets. I could unhide this one, right? Here's a sheet called Covert. But what it figured out was that if I press Alt+F11 to get to the VBA editor (you don't need to know VBA), you can see a list of all the sheets, and they all show the sheet name. And then at the bottom it tells you the status: it's visible. The one that was Covert used to say hidden. Now we've made it visible. But there is a sheet here called Clandestine that's very hidden. And it figured that out. It was able to see that. I was very pleased. So it passed that test, and it's done a great job. It found the hidden and the very hidden sheet. The next thing I want to ask it to do, if I'm checking this model, is see if there are any white values. Sometimes people hide white values.
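
The same hidden versus very hidden check can be done outside Excel. The sketch below is an assumption-laden illustration and not how Tracelight works: it relies on the fact that an .xlsx file is a zip archive whose xl/workbook.xml lists each sheet with an optional state attribute of hidden or veryHidden. The function name is hypothetical, and the demo builds a tiny stand-in archive in memory (not a complete workbook) using the sheet names from the episode.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by SpreadsheetML workbook parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def sheet_states(xlsx_file):
    """Return {sheet name: state} read from xl/workbook.xml.
    A sheet with no `state` attribute is treated as visible."""
    with zipfile.ZipFile(xlsx_file) as archive:
        root = ET.fromstring(archive.read("xl/workbook.xml"))
    return {
        sheet.get("name"): sheet.get("state", "visible")
        for sheet in root.findall("m:sheets/m:sheet", NS)
    }


# Minimal stand-in for a workbook: only the part this sketch reads.
demo_xml = (
    '<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<sheets>"
    '<sheet name="Model" sheetId="1"/>'
    '<sheet name="Covert" sheetId="2" state="hidden"/>'
    '<sheet name="Clandestine" sheetId="3" state="veryHidden"/>'
    "</sheets></workbook>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("xl/workbook.xml", demo_xml)
buffer.seek(0)

states = sheet_states(buffer)
hidden = {name: state for name, state in states.items() if state != "visible"}
print(hidden)
```

On a real file you would pass the path to the workbook instead of the in-memory buffer. Either way, the result is worth confirming in the VBA editor as Ian does.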

[00:51:11] Host : Paul Barnhurst: I think it's interesting. It's made an assumption. It says the hidden sheet appears to be part of the AFM process. They're testing you.

[00:51:18] Co-host 1: Ian Schnoor : The hidden sheet appears to be part of the financial advanced exam solution. Interesting. Likely demonstrating advanced Excel techniques for candidates. I like that it's, um, suggesting why it was put in there. The reality is, it was put in to demonstrate how to discover these things. But thank you. I do like that it is giving me some theories about why it's there. Interesting. Now what it's looking for is any white values in the sheet. I have tested other AI tools that are not able to detect color, right? So I was curious about this. Other tools I've looked at have said, I can't tell what color cells are. But look at this here. It says yes, there are white colored cells. White values on the model sheet would be invisible against the white background. Oh, look how excited it is. It's giving me an exclamation mark. I found them. It has correctly found them in S9 to U12. Let's see here, S9 to U12. Yes, it has. It has found that in this spot there were some white cells. Love it. So it passed that test. It also discovered, and I didn't even know this, that from H8 to J9, that's down here, this is just a bunch of blanks.

[00:52:41] Co-host 1: Ian Schnoor : I wasn't even sure why, but I realized it is correct. I didn't even realize that in this cell the font was white, but it was. And it knew, even though there was nothing in the cell, right? The cell was actually blank, but it was aware that the font was a white font. You can't see it. So it's done a nice job. These are some decent checks. Now, these things don't take long to check on your own manually, but I like that it did pass that test. The next thing I want to show is, let's get a little bit more complex. In this model, what I deliberately did is I went to the balance sheet and I made it go out of balance. And the way I did that is with a classic modeling error. What I did is on the working capital schedule. Now, anyone who's done modeling with us knows that one, uh, you know, really strong best practice in modeling is to not have any big formulas on your financial statements, but rather link them to schedules. So the change in working capital on the cash flow statement is linking to the working capital schedule. Now, any modeler would know that the change in working capital should always be last year minus the current year.

[00:53:58] Co-host 1: Ian Schnoor : And what I did before we started this video is I switched it. I made a classic modeling error. This is a common error. I calculated this as the current year minus last year. People do this all the time, and it gives me the wrong answer. So each of these numbers is the right number, but the sign is wrong. This one should be positive 6.9, not -6.9. Each one of these should be the opposite sign. It should be last year minus this year. Now, I will tell you, when I messed it up and then ran the AI, it saw what I did to mess it up and it figured it out. It said, well, you just made it out of balance, so I figured out what you did wrong. So this time I'm starting a new search with the model already out of balance. But I'm showing you what the error is. The error is that the change in working capital is wrong. Let's imagine... this happens to me all the time. People email me. They say, Ian, my balance sheet is not balancing and I can't figure out why. Can you help me? Sure. So I help people do that all the time.
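
The sign convention Ian describes can be made concrete with a short Python sketch. The balances below are hypothetical and the function names are made up for illustration; only the rule itself (last year minus the current year for the cash flow statement) comes from the episode.

```python
def change_in_working_capital(prior_year, current_year):
    """Cash flow impact of working capital: last year minus the current year.
    A build-up of working capital therefore shows as a cash outflow (negative)."""
    return prior_year - current_year


def flipped_change(prior_year, current_year):
    """The classic error: current year minus last year. Same size, wrong sign."""
    return current_year - prior_year


# Hypothetical net working capital balances for five forecast years.
balances = [120, 113, 118, 125, 121]

# Compare each year with the one before it.
pairs = list(zip(balances, balances[1:]))
correct = [change_in_working_capital(prior, current) for prior, current in pairs]
wrong = [flipped_change(prior, current) for prior, current in pairs]

print(correct)  # what the cash flow statement should pick up
print(wrong)    # the deliberately broken version
```

Every value in the flipped list has the right magnitude and the opposite sign, which is why this error is easy to miss when scanning numbers and why it throws the balance sheet out of balance.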

[00:55:03] Host : Paul Barnhurst: Have you had someone message you on LinkedIn and ask you to do their homework?

[00:55:08] Co-host 1: Ian Schnoor : Exactly. It happens all the time. I shouldn't have mentioned that. It happens a lot, but I actually enjoy doing it because usually it's pretty quick to find the answers. There's only, you know, I've created videos on this. There's only ten reasons why a balance sheet is potentially out of balance. I'm not going to go through them now, but I'm going to ask. I'm going to ask. I'm going to say, why doesn't the balance sheet balance. And I'm going to press enter and let's see if it can figure out why this balance sheet is not balancing. Now this is a harder question. It needs to go through. It says it's asking me why and I'm curious to see what it does. And then I'm also maybe we'll try it again and we'll ask Claude Opus.

[00:55:44] Co-host 2: Giles Male: There are lots of model auditing tools out there on the market. I don't think there are any that would have this level of insight. A lot of them go, hey, you know, there are this many errors in the model, this is the last active cell. There are some that probably do hidden values to some degree. But to have something say, your balance sheet doesn't balance, here's why, and I can fix it? That would be something we've not seen before.

[00:56:10] Co-host 1: Ian Schnoor : I recognize this is a very difficult test. This is something that a strong financial modeler could do and should be expected to do. If anybody sent me this model and it was not balancing, I would expect that within 5 or 10 minutes on my own I should be able to find it, and a strong modeler should be able to figure that out, because there are some common reasons why balance sheets don't balance. One of the key things to check is working capital. Any good modeler understands how working capital is calculated, and they need to understand how changes in working capital are calculated, which feeds into the cash flow statement. It tells us... oops. It says here.

[00:56:49] Host : Paul Barnhurst: That the issue is in the net PPE calculation. Which is...

[00:56:53] Co-host 1: Ian Schnoor : Interesting. Giles?

[00:56:54] Co-host 2: Giles Male: That confident statement might not be true.

[00:56:58] Co-host 1: Ian Schnoor : I love it that it adds exclamation marks. It's so confident. It's like an excited child. I found the issue. No, I mean I like it, right? I found the issue. The problem is in the net PPE calculation of the balance sheet. And the balance sheet doesn't balance because of an error in the net PPE. And now it's telling me the CapEx values are negative. Well, we know this isn't right. It's kind of making this up, right? This is clearly so. It's actually giving me what it believes the solution should be. And it's giving me an option to apply this. I'm actually very interested in this because when I ran this previously, it told me that the reason the balance sheet wasn't balanced was because of the debt. So we know that that's not correct, but let's see. Let's apply this and actually see what it does. I'm very curious to know what it actually does. It's going to do something in row 163 here. It claims it's going to fix the problem. Uh oh. Looks like it's just made it worse, Giles. And now it's made a real mess out of this. But it has, uh, so in row 163 it has done something here. And well, what it did is it actually took the prior year's PPE on the balance sheet, and it is now adding in the CapEx, which is a negative number. So that's wrong. The formula should be what it was. As a modeler I know that I need to minus the CapEx because it's already negative. So minusing it will add the CapEx in and then subtract the depreciation. So I need to be a strong modeler to understand that what it did was wrong. And I'm going to copy that back and populate it.
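
The sign logic Ian describes here can be sketched in Python. This is only an illustration with made-up numbers, not the formula from the model on screen. It assumes CapEx is stored as a negative cash-flow figure and depreciation as a positive expense; the function names are hypothetical, and whether the tool's rewritten formula still subtracted depreciation is an assumption.

```python
def net_ppe_correct(prior_ppe: float, capex: float, depreciation: float) -> float:
    """Roll net PPE forward when capex is stored as a negative number."""
    # Subtracting a negative capex adds the spend to the asset base.
    return prior_ppe - capex - depreciation


def net_ppe_adding_negative_capex(prior_ppe: float, capex: float, depreciation: float) -> float:
    """The mistaken version: adding a negative capex shrinks PPE instead of growing it."""
    return prior_ppe + capex - depreciation


# Illustrative figures only (assumed, not taken from the episode).
prior_ppe = 1000.0
capex = -200.0        # cash outflow, stored as a negative
depreciation = 50.0

print(net_ppe_correct(prior_ppe, capex, depreciation))                # 1150.0
print(net_ppe_adding_negative_capex(prior_ppe, capex, depreciation))  # 750.0
```

With these figures the mistaken version understates net PPE by twice the CapEx amount, which is the kind of gap that pushes a balance sheet further out of balance.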

[00:58:40] Co-host 2: Giles Male: Isn't that one of the big problems at the moment, not just for Tracelight but any LLM? It's so confident in saying, I found the problem. This is it. Boom. And if you don't know your stuff, you don't check it.

[00:58:51] Co-host 1: Ian Schnoor : If you don't know your stuff, you're in trouble. And let's go fix it. Now let me prove to our viewers here. Let me prove I agree with you. Just let me prove to our viewers that really the problem was what I mentioned. The change in working capital is current minus last year, which is wrong. It's got to be last year minus the current year. That should be. And I'm going to copy that to the right control R and now we see I have the same value.
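
The working capital fix Ian makes can be sketched as follows. This is a minimal illustration under stated assumptions: working capital is treated as a net asset (so an increase uses cash), the numbers are invented, and the function names are hypothetical rather than anything from the model in the episode.

```python
def wc_cash_flow_correct(prior_wc: float, current_wc: float) -> float:
    """Cash flow impact of working capital: last year minus the current year."""
    # If working capital grows, cash is tied up, so the result is negative.
    return prior_wc - current_wc


def wc_cash_flow_wrong(prior_wc: float, current_wc: float) -> float:
    """The planted error: current year minus last year flips the sign."""
    return current_wc - prior_wc


# Illustrative figures only (assumed).
prior_wc, current_wc = 100.0, 130.0

correct = wc_cash_flow_correct(prior_wc, current_wc)  # -30.0, a use of cash
wrong = wc_cash_flow_wrong(prior_wc, current_wc)      # 30.0, shown as a source of cash

# The flipped sign misstates cash by twice the change in working capital.
print(correct, wrong, wrong - correct)  # -30.0 30.0 60.0
```

Because the cash flow statement feeds ending cash on the balance sheet, a flipped sign here throws the balance check off by twice the working capital movement in each year.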

[00:59:14] Host : Paul Barnhurst: I did not tie.

[00:59:15] Co-host 1: Ian Schnoor : Then we're starting the recording from scratch. No, just kidding. But here it is. And I will just run it through. Clear out the zeros by recalculating it. And of course it's perfect. So as I indicated, the real problem was that working capital. But this concerns me, right, Giles? It concerns me that, you know, I kind of wish it said I don't know, right? I wish it kind of said I'm not sure why it's not balancing, but rather it has said this: I found it. The problem, confidently, is in the net PPE. And it took a correct formula and it changed it so that it was now incorrect.

[00:59:53] Host : Paul Barnhurst: Interesting on this, I saw a study the other day where they're trying a new LLM. I can't remember what it was called, but basically the whole idea is to reduce hallucinations, and reduce them from 13% to 1%.

[01:00:06] Co-host 1: Ian Schnoor : Let me go again. Current minus last year. Let me get it out of balance. Now, you see, it watched me do that, so it now knows what I did to mess it up. But let's try a new one. Let's try a brand new one. Um. Now, by the way, listen, we recognize that we are asking it to do very, very challenging stuff. This is not easy, but let's try it with Opus. Let's just see. What do you guys think?

[01:00:30] Co-host 2: Giles Male: I hope it is so challenging. But again, where my bar is to really start using this is: I need the tool to save me time and thought process energy. And if you could just point it and say, check this model, tell me what's wrong, that's kind of what I need from an auditing tool.

[01:00:50] Co-host 1: Ian Schnoor : You know, my belief, which people are going to hear me say over and over the more I use these tools, and again, I'm excited about AI, I really am, but my strong view and belief is that the best users, those who really want to climb the ladder and position themselves well, are going to have to use AI as their partner, as someone to bounce ideas around with, someone that can give you ideas, but you can push back and say, that's not right. I don't believe that, or I don't think that's the best way to do it. And I actually want to go in a different direction. And you can have a real intellectual discourse and conversation with it, so to speak. If you're simply relying on it, I don't think that's going to serve you well. I believe you're still going to have to deliver an answer to another person, and you're going to have to understand it. So I needed to be able to fix this balance sheet on my own. So let's try it here. So here, using Claude Opus, it's looking through all the pieces, right? It's showing imbalances. Good. It's trying to trace the issue. It's checking the equity schedule. I mean, it's checking cash.

[01:01:47] Host : Paul Barnhurst: It's doing all the right things in the sense of what we do.

[01:01:51] Co-host 1: Ian Schnoor : It seems that looking at the change log, there's a critical error on the work, the root cause. That's what I'm worried about because as I mentioned, when I tried this previously, before we recorded, I had Tracelight on and it was watching me make the mistake. And then when it watched me make the mistake and then I ran the search, it was able to say, hey, again, I know we're just trying to run it. It's still taking a while here. It appeared that it was on to something it showed us. It seemed to recognize there was a working capital issue, but as it processed and processed and processed, at the end of its processing, it now says the root cause: according to the changelog, the net PPE formula was modified. So it has abandoned the view that working capital is the issue. But now it says the real issue is that, well, there's a slight mismatch, okay. And it seems to be leading us down the path once again where technically it thinks the PPE formula is incorrect. So we know that's not right. And it's also checking for circular references. I'm not sure why it's doing that. There's no problem there. So again, it actually was very close to determining this was working capital, but then it abandoned that idea. And it's come back to thinking that the problem is PPE. So listen, again, really impressive to watch it go through all of this. This is a complex task. I'm not trying to make light of that or minimize it. But you know, if I was a brand new modeler and had no idea how models worked and was asking it to help me, I would probably become frustrated fairly quickly watching it spin and not getting me to an answer. Would you guys feel differently?

[01:03:33] Co-host 2: Giles Male: Yeah. You're not going to learn anything from that either, just because it's wrong. So that's you're almost going to learn incorrect things if you try to rely on it for some sort of training. So yeah.

[01:03:43] Co-host 1: Ian Schnoor : And that's the thing is, I think that a good AI tool should help teach you, should help educate you as well on what it's doing. So it's struggling a little bit there. And this is going to be the ultimate test. What I'm going to do is very, very simple. You see, all I have on this sheet is an income statement, cash flow statement and a balance sheet. I've got three actual years and that's it. This is what the beginning of the AFM exam looks like. You get three years of financial statements and then you have to build a five-year forecast model. So what I'm going to do is I'm going to go in and we're going to use Opus, right. We've said we'll use Opus again. I'm just going to say build a five-year forecast model on the model sheet. That's this one here for the years 2025 to 2029. Build separate schedules for revenues, costs, depreciation, income tax, working capital, debt and equity. And then I've simply said make reasonable assumptions. I don't care what assumptions it uses. I want to understand the mechanics, the logic. I want to see if it understands the integration of all the financial statements and the calculations. I'm not that fussed about how it makes assumptions for now.

[01:04:48] Co-host 2: Giles Male: This is the area where I've seen the most, the boldest statements, let's say, on LinkedIn, about what some of these tools can do. I don't think Tracelight has been doing this, but if a model can do a full and accurate forecast like this, I'll be really impressed.

[01:05:05] Co-host 1: Ian Schnoor : And what it's done is it took about two minutes and it says, I'll help you build a five-year model, and it's thinking about a proposed plan. It says, how about we do this? How about I build a model with supporting schedules. This is my approach. It's walking me through. Yeah. It wants to use growth rates, price volume drivers. It wants to use COGS as a percentage, SG&A. So I'm okay with that for now. I want to see what it can create. It's proposing how it's going to build the pieces. And then it's talking about integration. So it's basically saying, are you okay? It wants to use revenue growth of 3 to 5%, margins. Thoughtful. It's actually really impressive. This is really impressive so far, very impressive to watch, I'm not going to lie. I told it to build the model after it provided a recommended approach. There's an apply button that I'll push in a moment, but it says, let me start by building the support schedules. It was actually fascinating to watch it think through this. It broke it down into nine tasks and it said it was going to build the cost assumption schedule, whatever. Then it built the depreciation schedule. It took about, what, I think we said six minutes in total. It built the working capital. It's building the debt schedule. And it was iterating and going back and forth. So it obviously is learning and figuring out how to do these things. The cash flow statement is complete.

[01:06:16] Co-host 1: Ian Schnoor : It says I noticed the ending cash is going negative. And then it says the five-year forecast model is complete. This is what it did. It's telling me all the things it did. Key observation, cash management, suggested adjustments. It's ready to go. It says the model is ready to go. So let's see. Uh, we're going to do this live on the fly. It says click apply to add it into the sheet. So let's click apply and see what we get here. And then we'll kind of wrap it up. So here's apply and let's see what happens. So it's putting it into a sheet, and I am going to close out of Tracelight for a second so we can see. All right. It's added. It's changed some column widths here. So it's made some columns a little bit wider. But all right, well, I've got a five-year forecast of the income statement. And I'm actually going to make them a little bit skinnier so that I can see what's going on. I mean the formatting is not great obviously, but it's got a five-year forecast of an income statement. It's got a five-year forecast of a cash flow statement. I don't know why it's taking a value and multiplying it by a half. This is a dividend number. I mean, I have no idea why it would be. But anyway, that's fine.

[01:07:29] Host : Paul Barnhurst: Well, what's K30? Is that income? Is it basically saying 50% of income?

[01:07:32] Co-host 1: Ian Schnoor : Yeah, I'm not sure. Well, I'm not sure. It's a hard number and that's okay. So let's just kind of evaluate. It's got a cash flow. It's got a balance sheet. And of course let's just double check here. I've got a balance sheet. So let's just say there was no check line before. But assets minus liabilities, of course. The historical.

[01:07:49] Co-host 2: Giles Male: As you were skimming through, I was seeing that you were drawing on the debt revolver, but the cash was also showing a balance that there's little things that a modeler would spot that you think doesn't look right.

[01:07:59] Co-host 1: Ian Schnoor : No. It actually built a schedule for me. Now I can see why it had these columns wider. It built a schedule. One thing, of course, that none of us are going to like is that it's not vertical. It's chosen to put 2025 in the revenue schedule, but it's not underneath 2025. It should be in column J. Right. We always want to make sure the columns are aligned. But that's okay for now. I'm not sure what this one is doing here. So it's done some interesting work. But I think we would all agree this is not exactly client ready yet, is it? There is a working capital schedule here. It kind of has some formatting, but I'm gonna have to fix the formatting. It's trying. I mean, and maybe with some prompts we can get it to clean it up. But I'm going to need to kind of play with this and flex this to get it working. What I am going to do here very quickly is I want to just line up, I want to put this in G, I want to take all my numbers, and I'm going to just move them into the correct columns here. So at least I did that. No, actually the other thing it has done is it's added a column in between our three historicals and the forecast. I don't know why it has sort of a blank column here.

[01:09:22] Co-host 2: Giles Male: Could you also do a spot check? What's it done for the start of the cash flow statement, in the indirect operations? What's the net income line doing?

[01:09:33] Co-host 1: Ian Schnoor : Okay, it's great, but the point is I'm going to need to rely. You know, I think heavily on strong modeling skills to see.

[01:09:40] Host : Paul Barnhurst: And you notice it did. Remember how dividends did K30 times 0.5? It just took the net income and said half of it. Which means you're expecting them to give you dividends because you have a negative net income.

[01:09:54] Co-host 1: Ian Schnoor : So it's an assumption. When I ran this earlier it actually had built an assumption section at the top. So let's see on the revenue schedule what it did here. It's the assumptions. This time it's chosen to put its assumptions here. And I'm okay with this. It's decided on some growth rates for now. And it is building them off the prior year. That's fine, I'm okay. I mean, it's okay for this current run of the model. You know, my biggest concern lies in some issues around design and structure, around why there's a brand new blank column in the middle here. I also had to align them myself. And of course, the big issue is it's not balancing. So we're going to have to go through and figure out. In fact, let's just try this quickly. I tried this before. I'll say the forecast balance sheet doesn't balance. I'll give it an exclamation mark. What's the problem? Let's just see what it does here. I'm going to say the forecast balance sheet doesn't balance. What's the problem? Let's see what it... "the user," they're calling me the user. The user is saying that the balance sheet doesn't balance. And so now it's flying through here. It apologized to me on an earlier iteration.

[01:11:11] Host : Paul Barnhurst: That's pretty common to get an apology. Sorry I was wrong. Even though I was very confident I was right.

[01:11:17] Co-host 1: Ian Schnoor : I see potential issues, so let's see if it can figure it out, because I, you know, the three of us have not analyzed this enough to know why. But, you know, here's the thing. If you're a junior investment banker or you're working in research or you're working in corporate development, if you're building a model like this, your boss isn't going to care. You know why it's out of balance, or that you used AI to help you. Your boss just wants to know what's wrong and how are we going to fix it. They're going to look at you. You are going to need to know what the problem is and how to fix it.

[01:11:49] Co-host 2: Giles Male: And again, as I was pointing out earlier, you're on the revolver line there. So you're drawing on the revolver for 50 million, and then on the cash line, on 78, you've got a negative of the same balance. So that's kind of jumping out to me as...

[01:12:03] Co-host 1: Ian Schnoor : And the revolver.

[01:12:04] Co-host 2: Giles Male: Is not right.

[01:12:05] Co-host 1: Ian Schnoor : And you know what, Giles? You're right. Like, again, I know you and I know this because we can build models. But so we are seeing here an increase in the bank debt, the revolver. It's gone from 0 to 50, which means this change on the balance sheet has to get picked up on the cash flow statement. Obviously, by having a $50 million increase in their revolver, that same 50 should have shown up as a cash inflow on the cash flow statement. It's not. It says I found it. Forecast columns are incorrectly referencing column G when they should reference K to oh, I mean, I don't think that's right. Let's just click apply. I think it's going to make a bit of a mess again.
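
The check Ian and Giles are doing by eye can be sketched in Python: the change in the revolver balance between two balance sheets should equal the revolver line in the financing section of the cash flow statement. The function names and the tolerance are hypothetical, and the figures simply mirror the 0 to 50 draw mentioned above.

```python
def revolver_cash_flow(prior_balance: float, current_balance: float) -> float:
    """Cash inflow (positive) or repayment (negative) implied by the balance sheet."""
    return current_balance - prior_balance


def revolver_is_reconciled(prior_balance: float, current_balance: float,
                           reported_cash_flow: float, tolerance: float = 1e-6) -> bool:
    """True when the cash flow statement picks up the balance sheet movement."""
    implied = revolver_cash_flow(prior_balance, current_balance)
    return abs(implied - reported_cash_flow) <= tolerance


# Revolver goes from 0 to 50, but the cash flow statement shows no inflow.
print(revolver_cash_flow(0.0, 50.0))            # 50.0
print(revolver_is_reconciled(0.0, 50.0, 0.0))   # False
print(revolver_is_reconciled(0.0, 50.0, 50.0))  # True
```

A check like this only flags the mismatch; working out which formula is mislinked still takes the modeling knowledge discussed in the episode.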

[01:12:45] Host : Paul Barnhurst: No, it's not. You can already tell the cash stayed at zero.

[01:12:48] Co-host 1: Ian Schnoor : I'm not sure exactly what it did here, but I think we can agree that um, and I'm on the balance sheet, so we still have. This is the revolver row. It's, uh, the revolver is still not showing any cash inflow. That's not the issue. But all the other numbers have changed. And are we still out of balance? Yeah. I mean, our balance sheet is still significantly out of balance. Yeah. I think I'm going to struggle if I ask it to help try and solve the problem. Giles, would you agree? We're going to have to roll up our sleeves on our own.

[01:13:17] Co-host 2: Giles Male: I mean, that's probably the end of where we kind of do our analysis for this episode. I guess it's not at the point where you could rely on it as a replacement for an analyst or anything else like that.

[01:13:29] Host : Paul Barnhurst: And the bottom line, there's a couple of things going on here, right? Some of it, you could argue, is prompt, and could we chunk it out and maybe get there? Sure. But then do you really get savings? How experienced do you have to be? I mean, if I had to vote overall, I'd say Tracelight is impressive. It's doing good work. Is it ready for prime time, where I would give it to a junior analyst and say, go to work? I don't see it yet, not without a lot of struggling.

[01:13:54] Co-host 1: Ian Schnoor : And I would say again, I mean, again, it's easy to be critical. It's wild. I agree with both of you. It's wildly impressive on how it can do things, how it can act as our partner, wildly impressive on what it did, you know, for the esports where it had to fill in, you know, pre-defined cells and on, you know, some of the things you asked it to do. And I think that this would be great for someone like you Paul or Giles or me. I mean, if I wanted to get a head start on building the structure of a model, maybe this would help save me time. But again, I think that, you know, the user I believe is going to have to be stronger than ever to understand how to troubleshoot, how it's making mistakes, where it's making mistakes, how to fix it. I mean, I'd be happy to work with this, but someone who doesn't know modeling, if their view is that they can use this as a replacement to their own skills and knowledge, I think they're in for a bit of a shock or a bit of a rude awakening, at least right now. But again, extremely impressed that we have these tools and they're evolving. I can't wait to see where they go, but I think it's been really fun to test this out and I give a lot of kudos. It's unbelievable to know what it's able to do so far in one year.

[01:14:56] Host : Paul Barnhurst: You know, kudos to Tracelight for building this, for getting out there and helping the market. And the bottom line is we don't think we're there yet. But we're going to test more tools, see what they all show. And after we've done a few of them, we'll come back and talk about some of the strengths and weaknesses. So we have a comparison. All right. Now that we've had a little bit of fun we're gonna sign off. Thank you everybody.

[01:15:20] Co-host 2: Giles Male: Bye bye.

[01:15:21] Co-host 1: Ian Schnoor : Take care everybody.

Paul Barnhurst
