Testing Shortcut AI's bold claims: Did it live up to the hype with Giles Male

In this episode of The ModSquad on Financial Modeler’s Corner, Paul Barnhurst and Giles Male put Shortcut under the AI microscope, testing one of the most hyped AI tools in the financial modeling world. With claims like “the most accurate Excel agent in the world” and the ability to outperform human champions in modeling tasks, Shortcut has made a big splash, but does it live up to its own bold promises? Paul and Giles run it through a rigorous series of real-world modeling challenges, from esports cases and financial forecasts to dashboard analysis and deferred revenue schedules. What they find is a tool with clear potential, and some serious red flags.


Expect to Learn

  • Where Shortcut impresses with formatting, speed, and usability

  • Where it fails, especially with modeling logic 

  • How Shortcut compares to Excel Agent and TabAI across key modeling tasks

  • Why reversing formatting in a model is a huge red flag

  • What to consider when investing in premium AI tools for modeling


Here are a few quotes from the episode:

  • “Shortcut has potential, but right now it’s flash over fundamentals.” - Giles Male

  • “Could you imagine an analyst reversing formatting to make a number look negative? They’d be out the door.” - Giles Male

Despite the hype, Shortcut proved to be a solid tool with promise. It delivered impressive formatting and UI, yet had some serious issues like incorrect logic, hardcoded values, and non-balancing models which held it back. A promising AI assistant, just not a replacement for real modeling expertise.

Follow Giles Male:
LinkedIn -  https://www.linkedin.com/in/giles-male-30643b15/


In today’s episode:

[01:15] - Intro & Where the AI Modeling Journey Stands

[05:11] - Shortcut: First Impressions & Bold Claims

[14:28] - Viral Demo Video Breakdown

[23:12] - Esports Challenge: Basic Excel Tasks

[31:19] - Intermediate Case: Modeling Accuracy

[36:10] - Building a 3-Statement Forecast

[44:24] - Red Flags: Formatting & Balance Sheet Errors

[50:27] - Deferred Revenue Test

[56:32] - Trial Balance Dashboard: Visuals vs. Substance

[1:07:22] - Final Thoughts & Shortcut's Ranking


Full Show Transcript

[00:01:15] Host: Paul Barnhurst: Welcome to another episode of Financial Modeler’s Corner, featuring two of the founding members of the Mod Squad. One of our illustrious members is on a flight back from London, so Ian Schnoor will not be able to join us, but I have here with me my trusty copilot, Giles Male. Why don't we start by having you introduce yourself real quick, Giles?


[00:01:40] Co-Host: Giles : Sure. Hi, Paul. And hello, Ian, if you're watching. My name is Giles Male. I'm co-founder of Full Stack Modeller. I do a whole bunch of Excel and financial modeling training. And, yeah, I am part of this gang going around testing all of the AI tools. And now I think probably at some stage LLMs and things like Microsoft Agent. I'm really enjoying the series so far. I don't know about you. I feel like I'm learning a huge amount as I go along with this. I'm sure there's a lot more to uncover, and I'm looking forward to testing Shortcut today.


[00:02:11] Host: Paul Barnhurst: I'm definitely learning a lot as well, and realizing, frankly, how beneficial they can be when used right, and also how scary they can be when used incorrectly. It's like I said before: it's a magnifier. If you know what you're doing, great. If you don't, please don't use it.


[00:02:30] Co-Host: Giles : Yeah. And I think we've said before, you know, I have said certainly in a few places, we're getting to a dangerous phase where the outputs are really looking good in lots of places, which I think is a fantastic step forward. But if you don't know what you're looking at and you don't know how to review what it is that's been built, that's probably the most dangerous time. It's only going to get better from here on in. So I reckon this is the most dangerous moment in the lifetime of Excel financial modeling tools.


[00:03:01] Host: Paul Barnhurst: I agree. They're going to be a lot better in six months. Yes, Excel agents are going to be better. And the tools that survive are going to be better because they are only going to survive if they get better. Yeah, right. Absolutely. I mean, think of where AI was three years ago. So before we jump into this episode, I'll do a quick couple-minute recap of where we're at, if this is the first episode you're coming in at. A little bit about our co-host who's not with us, Ian Schnoor: he's the head of the Financial Modeling Institute, and he's usually here with us each week. But he's been in London this last week. You got to hang out with him there, Giles. Anything you want to share from that experience?


[00:03:38] Co-Host: Giles : Yeah, it was great. So the FMI and Ian and Amy ran a kind of networking members session in London. So I think there must have been 50 or 60 people there. Oh, fabulous. Two fms I was there. Jeff Robinson was there as well. Ian did a great speech in this bar that we were at, which was great, and I think it was just a really nice showcase of the community that the FMI have kind of started to build globally. And it was certainly very strong in London.


[00:04:07] Host: Paul Barnhurst: Awesome. Well, that's great. If anyone doesn't know me, I didn't technically introduce myself at the beginning, but I'm the FP&A Guy, Paul Barnhurst. I've been hosting this podcast for a couple of years now, so hopefully you know who I am. If you don't, you recognize the beard. There we go. That's all I ask. Kind of like with Giles: you recognize that sometimes he wears a fur coat and he's humble. Yeah, we both go for brands. All right, so quick recap. We're at, I think, episode 6 or 7 of the series now. I think this is seven. We've tested several tools. We've looked at Excel agents. We've looked at Rosie AI, which has shut down. Shout out to Dennis. I love the work he did there. Just for a moment: he was one of the very first, two years ago, to really start on this journey, and I think he helped push things along a lot. So a good shout out to him for all the hard work he did with Rosie, and we're sorry it didn't work out for him. Others: we've tested Tracelight, we've tested TabAI, and we've tested Excel Agent. And today we are on to the next one on the list. Why don't you tell our audience what they're going to win today, what we're testing?


[00:05:11] Co-Host: Giles : Paul, what are they going to win? I don't know what they're going to win. Uh, but we're going to test Shortcut. So I think, for reasons that we definitely should go into, we have to be open with the audience. I think this is an interesting one because this tool and the founder, Nico, got more hype, through what I would say was quite deliberately hyped-up marketing, than any other tool. But for me, it potentially started the ball rolling on a lot of the hype in our space. Big claims, which we are going to have to unpick. So I'm expecting a lot from Shortcut. And again, as we have always said, I want good tools on the market. So if Shortcut lives up to the claims, it's going to be really impressive.


[00:05:58] Host: Paul Barnhurst: Yeah, exciting. I mean, they definitely did. I think we showed the web page. Many of you will probably have seen the one video that was hyped. We might run that. It's only a minute long, and I'll show the web page here in a minute and we'll run through it. They've done some great things. I've had Nico on my podcast. They're clearly working very hard. They clearly believe they have the best tool out there. They've shared that many times. They've definitely talked a lot of hype, so it'll be exciting to test it. Unfortunately, as we've gone through this, many of the tools responded to us, but not all of them. When we sent out surveys to try to get people to participate, Shortcut was one we did not hear from, so we had to sign up ourselves for their free trial. We'll talk a little bit about that here in a minute. And you're smiling because...


[00:06:42] Co-Host: Giles : A little bit nervous, Paul.


[00:06:43] Host: Paul Barnhurst: I am a little bit nervous. We'll explain why here in a moment. But why don't we jump into their website and kind of walk through that like we've done with all the others: what we like, what we don't. This one's definitely a little different. This is the first tool we've tested that has its own spreadsheet agent, and then added an Excel agent. And at first, that Excel agent was only available for Macs. And then it's so many credits, and they've, you know, kind of changed that as they've gone. We've seen a lot of changes in the few months I think they've been around, which is common for startups. So let me go ahead and share my screen and let's run through it. So why don't I give you first shot, just looking at it right now. Does anything jump out to you?


[00:07:22] Co-Host: Giles : Yes. I mean so this is why I'm saying as much as, you know, I have personal opinions on the approach that's taken just from a professional, neutral perspective. They have set the bar themselves for the world's best Excel AI. So that's I guess my comment is what I'm expecting.


[00:07:44] Host: Paul Barnhurst: Yep. And you know, I think it's trusted by thousands of paying users at top companies. Great. So they're growing. You know, you know, some of the others have mentioned similar type things. I do find it interesting, the first prompt you read right here is to build me a three statement model template for my company. So build me a template to use. When's the last time you used a template to build your three statement models?


[00:08:06] Co-Host: Giles : Well, I mean, I've actually got them, just because I used to consult ten years ago. I do have templates all over the place. It is interesting to get an AI to build it. I can see a world where that is useful when it's reliable, because I think a lot of professional modelers will tend to work off templates of various sorts. Interesting.


[00:08:25] Host: Paul Barnhurst: Yeah it is. I've never been a big I mean, I definitely think there's places for templates, but I find especially for every company's revenues, things are so different. I was always just building kind of really starting from scratch.


[00:08:36] Co-Host: Giles : Well, I also just think if you are a professional modeler, whether you're a one man or woman band or you're part of a big modeling team at Operis or whatever, Forvis Mazars, F1F9, all the big names, I mean, you're already going to have templates.


[00:08:49] Host: Paul Barnhurst: So again there's definitely a template level. So yeah I agree.


[00:08:53] Co-Host: Giles : But those big companies might be looking at tools like this. And I'm sure part of their thinking is, okay, well, can they help us do this better somehow? And we don't know the answer to that yet.


[00:09:05] Host: Paul Barnhurst: Yeah. Well, we know. I mean, Ian's told us, from some of his travels just recently, that one of the big firms he met with is doing a bake-off, looking at Excel and Shortcut for use in their firm. They still think the tools are not quite there. They have a similar opinion to us. That doesn't mean they may not select one to use and get benefit, but I think it's like what everybody says: vibe coding at first just wasn't there. Now people are really starting to feel like it's much more there. But you know, vibe coding has been going on for about 18 months now.


[00:09:33] Co-Host: Giles : Word of the year, apparently.


[00:09:35] Host: Paul Barnhurst: Yeah.


[00:09:36] Co-Host: Giles : Word of the year that nearly made me fall off my chair. Good lord.


[00:09:39] Host: Paul Barnhurst: Well, another dictionary, uh, said the word of the year was this phrase "six seven," which I don't know if you've even heard. It's been popular with kids in the US, and nobody really even knows what it means. Sometimes they'll say it to me and I love you. Sometimes it's just a nonsense word. It's really weird. My daughter doesn't do it at all. But she's that age. It's kind of junior high. And one of the dictionaries labeled "six seven," okay, it's two numbers, as the word of the year. It's not even a word. But anyway.


[00:10:06] Co-Host: Giles : I think we're old. I think we're just old 100%.


[00:10:10] Host: Paul Barnhurst: I felt old, there's no question there. I'm going to enlarge the screen a little bit. So here's what we get. "Shortcut turns hours of Excel work into minutes." You know, then we have their video running. And then I think this is an interesting accounting use case. And maybe we'll just run through a few of these. "Take my aging accounts receivable data and show me days outstanding for my customer." All right, fine. So that's an example. "PE diligence for me by performing a customer concentration analysis." All right. Yeah. And then they keep going, and we won't run through all of them. But, you know, "Take my bank statements attached." So it looks like you can attach files, which I do like. Not every one of them had that. They are using a Mac, so we have to end the episode. I'm just kidding.


[00:10:57] Co-Host: Giles : I mean, a third-party tool that's not in Excel, on a Mac.


[00:11:02] Host: Paul Barnhurst: Alrighty. And then finance and operations. So there's that. Let's kind of keep going down. "Everything you love about Shortcut, Excel native. Works with your existing macros, keyboard shortcuts, large files, and more. Use Shortcut as an Excel plugin." So what I think's interesting is how quickly, even though they have their own spreadsheet, they've realized "we have to do this in Excel." Would you know, if you didn't already know before, that it's its own spreadsheet in addition to a plugin?


[00:11:29] Co-Host: Giles : No. And it's one of the things that I kind of got a little bit frustrated about with some of the early videos. I don't think it was obvious that it wasn't originally in Excel. Um, and I would just say this again, from a personal perspective, I don't want to be using a spreadsheet tool that isn't Excel. I don't want to use Google Sheets. I'm not interested in using something else that looks like Excel. For me, if I'm going to use an add-in for my spreadsheets, it's going to be in Excel. But other people may be.


[00:11:59] Host: Paul Barnhurst: I actually kind of find it fun to test the different ones and see where they're good. I haven't gotten far enough to really start using them, just because of that switching cost, even though spreadsheets are pretty similar. I've seen some cool things from others, so I like to see them and test them. I think it's kind of fun when they do it, but I still do all my work in Excel.


[00:12:18] Co-Host: Giles : Is this the main video below that I see? Is this the...


[00:12:21] Host: Paul Barnhurst: So let me make sure I have sound turned on here. Let's, uh, see if there's anything else below that. Then we'll run this video, because I think it's really germane to this whole conversation, because it drove a lot of the hype. I know you called him out with a little bit of concern on it. There were several conversations. So here again are the claims. Let's kind of run through this. "The most accurate Excel agent in the world." They show their benchmarking, beating Copilot in Excel. Now, what's interesting in this benchmark is that the others haven't chosen to participate. TabAI isn't here. Tracelight isn't here. You know, TabAI has many different models, and all these other tools. There's, you know, probably 20 we could list. So it'd be really interesting if all of them participated to see if that holds true. Maybe it does, and great if it does. But it would be interesting to see.


[00:13:10] Co-Host: Giles : Exactly that, I think. Great if it does. These are just enormously bold, black-and-white claims: "the most accurate Excel agent in the world." So again, going into this, I can't be as generous as I think we're always trying to be, because they're not coming at this neutrally like everyone else. They are saying they're the best out there. So that's what we want to see, right?


[00:13:33] Host: Paul Barnhurst: So then you have, okay, auditability, Excel parity. So Shortcut is immediately able to open and export Excel files. So here's where you really see "but without the manual work." It's still almost hidden that there's a spreadsheet here. I would like it much clearer what they're offering. I think there's some confusion. That's my take here. And then, you know, speed. So nothing surprising there. Again, the industry leader in gold standard, securing the future of finance. No big shock here. They have all the standard security that any tool that wants to work in the enterprise space is going to have to have, you know, and then media, the YouTube video, "AI powered spreadsheet demo."


[00:14:19] Co-Host: Giles : There's been a lot of this. And again, this drove me nuts because it was getting reposted all over LinkedIn. I don't know whether people have, you know, been incentivized to say these things on their behalf. Possibly. Possibly not. But we had another round of "Excel's dead, Shortcut's here." And that stuff really annoys me.


[00:14:40] Host: Paul Barnhurst: Me too, right? I mean, you can see it right now. "Is Shortcut the new Excel?" "Rest in peace, Excel."


[00:14:45] Co-Host: Giles : It's clickbait, isn't it? It's a classic visual and.


[00:14:48] Host: Paul Barnhurst: Excel agent at the top. They claim to have thousands of customers. Excel has 700 million. Excel is not dead.


[00:14:56] Co-Host: Giles : But like, genuinely, if they've got thousands of paying customers and you're on the hook for $14,000.


[00:15:02] Host: Paul Barnhurst: Yeah, we'll get there in a second. Big reveal.


[00:15:07] Co-Host: Giles : Can we watch the video? Because I'm hoping it's the original video. Um, I think it's important.


[00:15:12] Host: Paul Barnhurst: It is. Let me make sure. I'm going to remove and reshare because I don't think I turned on the sound. So give me one second here to stop my screen share and reshare. Bear with us, everybody. Okay. I did have sound on. So let's go ahead and watch this video. All right. So this video came out maybe three months ago now, roughly 3 or 4 months ago.


[00:15:37] Co-Host: Giles : Maybe a bit more. I can't remember exactly, but it's not been that long.


[00:15:42] Host: Paul Barnhurst: So what were you. I know you have very strong feelings about this video. Like this. For example, you felt like they didn't make it clear that they were using their own spreadsheet. I know that was one frustration for you, right?


[00:15:55] Co-Host: Giles : So what? I'll do 1 or 2 statements that, uh, I don't like. I'll just focus on one, and I might just let it. I'll stop talking just so everyone can hear it. Okay.


[00:16:05] Nico: Hey, I'm Nico from Fundamental, and today we're launching an early preview of Shortcut, a superhuman Excel and spreadsheet agent. Shortcut's pretty good at Excel. In fact, it can solve the Excel and Financial Modeling World Cup championship cases ten times faster than the human champions. Let's look at a couple examples. You'll notice Shortcut looks pretty familiar. It has near perfect feature parity with Excel. You can open up existing Excel files and work from there, or you can export them. And of course you can run formulas, but you can also one-shot most of your work and come back when it's done. Here's Shortcut solving one of the hardest problems in all of Excel, a Financial Modeling World Cup case that takes about an hour. I attach multiple PDFs and I ask it to solve it using the existing model and to make a new sheet with the answers. Now it actually uses the existing model the way you're supposed to, using real Excel formulas. So it fills out the net revenues and then goes and starts to tackle costs and fills out the rest of the income statement. Then it starts tackling the balance sheet, and it realizes that the PP&E line item is messed up and it just recursively solves its own mistakes. So it fixes PP&E, fills out the rest of the balance sheet, and then it proceeds to the cash flow statement.


[00:17:03] Nico: Then it makes a new sheet for the answers. Shortcut completed this case in less than ten minutes and in a single shot. Shortcut is also very strong at building complicated models from scratch. Here I asked it to build a multi-tab pro forma cap table for a Series A company, and it did this in a single shot in less than ten minutes, something that takes professional lawyers up to a full day to get done correctly. Shortcut can clean, edit and analyze data way faster than humans can. Here I fed it a 5,000-row CSV of all of the companies that have ever been through Y Combinator, and I asked it just for some deep insights. It charted it and it built dashboards that explained the data to me. And here it turns out the majority of companies are still active. They are B2B, but consumer has a higher success rate. Until now, the 2 billion white collar workers using Excel haven't really felt the wave of AI that's coming. Shortcut can already do 80% of human work on Excel, and it's only getting better. Our early preview is live now. Comment for an invite code and I'll give you access.


[00:18:03] Co-Host: Giles : There's a lot I didn't like about this. I didn't like the classic LinkedIn tactics of, like, you've got to comment a thing in the comments to get a link to whatever. There are some big statements in there: it can already do 80% of the work that's done in Excel. The big one for me, partly just because, you know, I'm involved with the esports stuff: it can solve the Excel and Financial Modeling World Cup cases ten times faster than human champions. That's a big, broad statement, and I'm guessing they didn't test it on all of the cases. Um, the esports cases can vary in difficulty, so we will test it on what I would call an easy and a sort of intermediate case. So it should genuinely get those 100% correct within a couple of minutes. Financial modeling cases are hard. So I don't believe it is solving the most complex Financial Modeling World Cup cases and the hardest questions ten times faster than the best humans. I'm sure we will get there eventually, but if it's doing that now reliably, everyone should be using Shortcut.


[00:19:12] Host: Paul Barnhurst: All right. So we watched the video. We've gone through that. Let's go back, and I want to show just one other thing here. So I'm going to bring back up my screen. Let me just put it back on, since you were presenting, before we get into testing. We know we've been going a little while, but this is all important, so bear with us. So the first thing I think is interesting is, right, you can test it two ways. We're going to test it in Excel today because that's what we prefer. But we want to show you it also has its own spreadsheet, you know. And so we're going to see a few things here with its own spreadsheet. I'm going to make this a little larger so we can see it a little bit better. It has its own prompt library. I always like that. You can save your prompts. I think that's a nice feature. You know, you can put in your rules and instructions, your AI preferences. Again, others have that. So I'm not seeing anything groundbreaking here. Uh, you know, the new chat: "context remaining" is your memory. I do like that it tells you what context is remaining, because when you run out of context, you need to start a new chat. That's something that I think is really nice to have. You know, they have their Pro and Max versions. We're using the Max. They have, hey, do you want speed or do you want power in an answer? So they give you that.


[00:20:29] Host: Paul Barnhurst: But they're still selecting behind the scenes what they're doing, which I prefer. So I like that. You know, similar to the others, they have, hey, do you want it to perform the action, or do you want it to ask questions without making changes to your spreadsheet? Again, I think that's an important feature. So, you know, they have their own model guide for building a model. I'm going to stop it there. I don't know exactly what it means, but we'll come back to that later. But you can click on a build model guide, and I think it helps walk you through things. The one thing I want to show is that if you click on account here, you have several options. We got the Max unlimited. We signed up for the free seven day trial. We'll show that here in a second. We've got the tutorial, privacy, blog. They have a community there, support, settings, subscription, etc. So what we had to do... and that's not the image I wanted. Let me pull up my image. Why do I not see it? Oh, I think I saved it, so let me just pull it up. Give me one second here. We'll pull it up. I'm going to pause for one second. We'll be right back. All right. We're back. Giles, anything jump out to you from that screen? And then I'll explain it.


[00:21:36] Co-Host: Giles : Yeah, I just thought, I hope you've been saving, uh, saving hard, Paul.


[00:21:40] Host: Paul Barnhurst: So Shortcut offers a 14 day free trial for teams, not their Pro or Max. It requires a credit card. If you don't cancel in time, you're on the hook. I did three members for a year, so the cost is almost $5,000 per member. What does it come out to? $4,800. So $400 a month for each seat. So you're paying $1,200 a month here. And so I had to put in my credit card. You can see that information is hidden outside of, you know, my bank there. But yeah, uh, I told Giles I will be canceling at the end of this session, before he gets off the call, to make sure I'm not on the hook for $14,400. So pricey.
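
The pricing arithmetic Paul walks through can be checked with a short sketch. The inputs ($400 per seat per month, three seats, a one-year term) are the figures quoted in the episode; the variable and function names are illustrative assumptions, and Python is used only for convenience.

```python
# Figures as quoted in the episode; names are hypothetical.
SEATS = 3
MONTHLY_PRICE_PER_SEAT = 400  # USD per seat per month


def annual_cost(seats: int, monthly_price: int) -> int:
    """Cost of a 12-month commitment for the given number of seats."""
    return seats * monthly_price * 12


per_seat_per_year = annual_cost(1, MONTHLY_PRICE_PER_SEAT)   # one seat, one year
team_per_month = SEATS * MONTHLY_PRICE_PER_SEAT              # three seats, one month
team_per_year = annual_cost(SEATS, MONTHLY_PRICE_PER_SEAT)   # three seats, one year

print(per_seat_per_year, team_per_month, team_per_year)  # 4800 1200 14400
```

The three results line up with the $4,800, $1,200 and $14,400 mentioned in the conversation.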


[00:22:21] Co-Host: Giles : Pricey, you know, but for a lot of big companies, if it can genuinely save the swathes of time that is spent, you know, organizing data, tidying up data, processing things, then that might feel like an extra model.


[00:22:36] Host: Paul Barnhurst: Half a model a month if you're working in PE, if every analyst is giving you an extra half a model a month, it's worth a lot more than $400. Drop in the bucket?


[00:22:44] Co-Host: Giles : Absolutely. But I guess for us, having gone through this journey, it's right at the top end of what we've seen. Uh, so.


[00:22:51] Host: Paul Barnhurst: It is, and it was a harder process, a little clunkier than some of the others, to be fair. And so that kind of jumped out. But what we're going to do now, I think we'll start with Giles. Why don't we get into testing? That's what everybody's here for. They don't want to hear us rant forever. Really?


[00:23:07] Co-Host: Giles : So you don't want to hear that?


[00:23:08] Host: Paul Barnhurst: Oh my God. Well, I don't want to hear you rant forever, so I assume nobody else does. How's that for an answer, Giles?


[00:23:14] Co-Host: Giles : That's great. Do you want to put my screen up?


[00:23:16] Host: Paul Barnhurst: Yep. We're going to go ahead and throw Giles' screen up here.


[00:23:20] Co-Host: Giles : So I'm sticking to Shortcut on the web. I've uploaded the first.


[00:23:25] Host: Paul Barnhurst: All right. So we're trying the web version here first.


[00:23:28] Co-Host: Giles : Yeah. And depending on time maybe we'll try the desktop as well.


[00:23:32] Host: Paul Barnhurst: Well, we could do the first case web, do the second desktop, whatever.


[00:23:35] Co-Host: Giles : We can try that. I assume it's the same.


[00:23:37] Host: Paul Barnhurst: Uh, yeah, it's the same agent behind the scenes, so.


[00:23:40] Co-Host: Giles : Okay, so hopefully everybody knows what this is now, but I'll do a quick recap. This is what we think of as a sort of intro-level Excel esports case. So we've got seven levels of challenges that get harder and harder. The first level, as we've seen, uh, if I go up here, is to extract a number of characters from a string. So we want to see something like the LEFT function with the character choices and the word. What we're really looking for is functions getting the right answers, but hopefully getting them in the right way. But also, it's really impressive even for us when we just see a tool understanding what the challenge is, because the prompt is really just saying, look at this case tab, there are five bonus levels and seven, uh, main levels. Answer the questions.
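
For readers less familiar with the level one task, here is a minimal sketch of what Excel's LEFT function does, written in Python. The function name `excel_left` and the sample word are assumptions for illustration; the actual case inputs are not shown in the episode.

```python
def excel_left(text: str, num_chars: int = 1) -> str:
    """Return the first num_chars characters of text, like Excel's LEFT.

    Excel returns an error for a negative count, so this sketch raises.
    Asking for more characters than exist returns the whole string.
    """
    if num_chars < 0:
        raise ValueError("num_chars must be zero or greater")
    return text[:num_chars]


print(excel_left("Shortcut", 5))  # Short
```

The point Giles makes still applies: the count should come from the driving cell (here, an argument) rather than being typed into the formula.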


[00:24:27] Host: Paul Barnhurst: And just to give context, no tool has got 100% on this tab. Last week everyone tried, got the cloak, got all seven. Right? Which others have done. Got three of the five bonus. Some of the others have got four. But what was impressive is TabAI got really close on the color one. They made a slight mistake there. So on the whole they got the closest, even though they didn't have the highest score. Yeah. And so let's see how Shortcut does. So really, for Shortcut to be the best here, we would expect 100%: all the bonuses and all seven levels. So we're going to kick it off with a similar prompt to the one we've used before to do everything. We're going to go ahead and pause it here. We'll be back. We're going to let it run for a minute. We're back. Solved level one. What would you like to add, Giles?


[00:25:11] Co-Host: Giles : No. Fine. Happy so far. So it's solved level one. It's used the LEFT function. It's linked to the kind of driving cells. I think even in one of the earlier tools, we were kind of not seeing it use the cells that we wanted. Level one done. The prompt said stop at level one. So what I'll do now is get it to proceed with the other levels. And it's probably worth pausing there because I'm going to guess it's going to take a few more minutes.


[00:25:38] Host: Paul Barnhurst: And we'll probably be 3 or 4 minutes here and we'll be back. All right. We're back. Uh, Shortcut ran for several minutes. It went through levels one through seven. One thing Giles and I noticed, you'll see here, is it didn't answer the bonus questions. All the other tools did. So we're going to prompt it to answer those while we go through what it did. So that's a little disappointing, wouldn't you agree, Giles? All the others picked that up but it didn't.


[00:26:00] Co-Host: Giles : Yeah. I think it's going to be a mixed bag. So let's ask it to do the bonuses.


[00:26:06] Host: Paul Barnhurst: And then how about you run us through how it did kind of what you, what you thought of its performance. Yeah.


[00:26:11] Co-Host: Giles : Do you want do you want me to do that now or. Yeah.


[00:26:13] Host: Paul Barnhurst: Yeah. Let's just let it run. Why don't we go through levels one through seven now, while it works on the bonus questions, if we can.


[00:26:18] Co-Host: Giles : Okay. So good and bad. Uh, I think it's got the I don't understand what this overlay is, but again, that's just I'm not familiar with the tool in detail. So level two is a great example right. Level one it got right left function perfect. Level two it has. Oh, God. It keeps jumping.


[00:26:38] Host: Paul Barnhurst: Yeah. It's gonna move around a little bit on you. So we'll do level two.


[00:26:41] Co-Host: Giles : It's got the right answers here. Okay. You had to split the text string with numbers and sum the two largest. But it's done that initially. Okay. Uh, it's going to annoy me jumping around, but but anyway, this is the formula. So it's actually kind of split each of the numbers out here. But look at what it's doing.


[00:27:04] Host: Paul Barnhurst: Why don't we pause and come back? We'll let the bonus questions finish, so you don't have to see the screen bouncing back and forth.


[00:27:10] Co-Host: Giles : Yeah, sure. Let's do that.


[00:27:11] Host: Paul Barnhurst: Okay. We'll be back in one minute or so. All right, we're back. We had to prompt it on the bonuses, but it got all of them right but one, and it was one off. We can make a reasonable assumption of why it's off. I think it's that it counted the color pink in the legend, plus the map. Right?


[00:27:31] Co-Host: Giles : Yeah, I think we can forgive it that. I mean, that's hard, because we wanted it to count the number of pink cells in the grid here, and it included this one. So it's one off. I think for me, okay, it got it wrong by one, but it's quite understandable why it would get that wrong.


[00:27:50] Host: Paul Barnhurst: So it had the right logic. There was just a misunderstanding, which is reasonable. Now, it hard coded it, so we don't know how it did it.


[00:27:59] Co-Host: Giles : We don't know which is not perfect but it's very close. So I think very.


[00:28:04] Host: Paul Barnhurst: Good job I give him. Yeah.


[00:28:05] Co-Host: Giles : So here's where I'm at with this. In terms of the answers I think it's got all of the levels right, which is great. But I have massive issues with the way it's done it. So this is going to be a classic thing where if the challenge was just get the right answer. I don't care how you build the model. Great. It's got full marks. Uh, this was the one where you've got to pick the top two numbers from this text string and add them up. Uh, so I thought it had a dynamic range here, but it's actually individually built a calculation that is adding.


[00:28:44] Host: Paul Barnhurst: The splitting each amount individually.


[00:28:46] Co-Host: Giles : If you look at this, I'm assuming if I keep going to the right here.


[00:28:51] Host: Paul Barnhurst: Each one gets bigger.


[00:28:52] Co-Host: Giles : Yeah. So let me and I've also noticed my Excel spreadsheet is now seriously slow. So I might just go back one. I just want to show you this. So here is the formula that it created for I don't know I can't even bring that.


[00:29:12] Host: Paul Barnhurst: Looks like that may be that that burns my eyes. Could you stop.


[00:29:17] Co-Host: Giles : So there's getting the right answer and then there's building something like that, which to me is absolutely horrific.


[00:29:25] Host: Paul Barnhurst: Agree, that is a terrible method to build it.
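
For readers following along, the level two task Giles describes (split a text string of numbers and sum the two largest) is small when written outside Excel. The Python sketch below is only an illustration and is not the formula Shortcut generated. The function name `sum_two_largest` and the sample string are assumptions; the real delimiter in the case is not shown in the episode, so the sketch simply pulls out every run of digits.

```python
import heapq
import re


def sum_two_largest(text: str) -> float:
    """Extract every non-negative number in `text` and add the two largest."""
    # Assumption: numbers are plain digits with an optional decimal part,
    # separated by any non-numeric delimiter (comma, space, pipe, ...).
    numbers = [float(token) for token in re.findall(r"\d+(?:\.\d+)?", text)]
    if len(numbers) < 2:
        raise ValueError("need at least two numbers in the text")
    return sum(heapq.nlargest(2, numbers))


# Hypothetical input, not taken from the case file.
print(sum_two_largest("4, 17, 9, 23, 8"))
```

The point of the sketch is that one rule handles any number of values, whereas the formula shown on screen split out each amount individually.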


[00:29:27] Co-Host: Giles : So it's then doing some weird things. So the next level was, uh, pull out the column number corresponding to the reference here. Uh, so if you take this, we've got F10, and F is the sixth column. It's taken the right approach in our eyes, COLUMN and INDIRECT, but it's made it a range, which you don't need to do. So I don't know why it's done that. Uh, got the right answer though, which is good. If you go down here, good work. It's done LEN minus LEN of SUBSTITUTE. It's using the cells, which is what we complained about with some of the others, where it was sort of, um, not linking. So I think that's good. This is horrendously overcomplicated, but it is eventually getting to the right answer. Uh, so ticking the box. And then INDIRECT map is great. And the final level is where you've got to go to the map and offset by one up or down.
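
Two of the techniques mentioned here translate directly into a few lines of Python. This is a hedged sketch of the ideas only (column letter to column number, and the LEN minus LEN of SUBSTITUTE counting trick), not a reproduction of the formulas on screen. The function names are hypothetical.

```python
def column_number(cell_ref: str) -> int:
    """Return the 1-based column index of a reference such as 'F10' or '$AA$3'."""
    letters = "".join(ch for ch in cell_ref if ch.isalpha()).upper()
    if not letters:
        raise ValueError(f"no column letters in {cell_ref!r}")
    number = 0
    for ch in letters:
        # Column letters behave like a base-26 number with digits A=1 .. Z=26.
        number = number * 26 + (ord(ch) - ord("A") + 1)
    return number


def count_occurrences(text: str, needle: str) -> int:
    """Count `needle` in `text` the way LEN(text) - LEN(SUBSTITUTE(text, needle, "")) does."""
    if not needle:
        raise ValueError("needle must not be empty")
    removed = len(text) - len(text.replace(needle, ""))
    return removed // len(needle)


print(column_number("F10"))               # F is the sixth column
print(count_occurrences("a,b,,c", ","))   # three commas
```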


[00:30:24] Host: Paul Barnhurst: Good, that works. Interesting, it did the IF statement first. With most of the others you see the OFFSET and then the IF statement. But irrelevant.


[00:30:32] Co-Host: Giles : Yeah I'm okay with that. So I think.


[00:30:34] Host: Paul Barnhurst: Okay with that as.


[00:30:35] Co-Host: Giles : Well. I think it's done a good job. But you would not in a million years accept that formula in in a professional model. No chance.


[00:30:46] Host: Paul Barnhurst: Sure. Definitely concerned with that. All right. Why don't we move on to the next one? In the interest of time? We've been going a while. We're going to give it the harder case. Why don't we do that in Excel itself versus its web version? You're good with that.


[00:30:57] Co-Host: Giles : Yeah, I think I am. So I have to figure out how exactly I do it. But bear with me one second.


[00:31:03] Host: Paul Barnhurst: We're going to pause this for one second while Giles goes to Remedial Shortcut School. We'll be right back. We're back. We've moved on to the, uh, we'll call an intermediate case, one that was used for the Excel UK Championships in London. Right? Yep.


[00:31:19] Co-Host: Giles : Harder case. Just like before, I'm going to give it a pretty high level prompt to analyze the case tab and answer the questions. I'll hit go and let's see what it does. I'll probably have to pause for this.


[00:31:30] Host: Paul Barnhurst: We are back. We're going to see how, uh, shortcut did here with this intermediate case. So Giles, why don't you take this one away?


[00:31:36] Co-Host: Giles : Yeah, sure. So, uh, if you ignore knowing anything about the claims made, I'd say it's done really well. So it's got five out of seven levels right. Got level one wrong. Uh, got level two right. I would say that the formulas it's using here are okay. Nothing horrendous.


[00:31:56] Host: Paul Barnhurst: I think these are some of the better, shorter formulas we've seen so far.


[00:31:59] Co-Host: Giles : I feel like I need to look into that one myself. What how is that doing that so quickly. That. So there's quite a lot of that is fascinating to me. I need to figure out a match.


[00:32:11] Host: Paul Barnhurst: So it's doing a max of the range and matching the max.
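
The pattern Paul describes, taking the max of a range and then matching that max back against the range, returns the position of the largest value. A minimal Python equivalent, with a hypothetical function name and made-up sample data, looks like this:

```python
def position_of_max(values: list[float]) -> int:
    """1-based position of the first maximum, like MATCH(MAX(rng), rng, 0)."""
    if not values:
        raise ValueError("values must not be empty")
    # list.index returns the first match, as an exact-match MATCH does.
    return values.index(max(values)) + 1


print(position_of_max([3, 9, 4, 9]))
```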


[00:32:16] Co-Host: Giles : I'll have to look at that because there's quite a lot of layers to what goes on in this, uh, where it's kind of changing the scores. I'm just wondering whether it's actually changed the underlying, uh, scores here, because it doesn't look like there's much other logic going on in these. Oh, oh, hang on a minute. Oh.


[00:32:39] Host: Paul Barnhurst: So they use different logic for each one. It's not a drag and drop.


[00:32:43] Co-Host: Giles : So what was confusing me was the kind of, um, emotions here change the scores here, and I couldn't see how it was doing what is quite complex work. The answer is, it's doing it all manually, one by one. So again, this is something that Ian calls out. That's fine. It's got times two, but these twos are hard coded. It's fine, but you couldn't replicate that. So it's good. It's a hard level. It got the right answer, but it didn't do it in a very good way. Level five, I mean, okay, it's another hard level. It's got the right answer. That's pretty crazily horrendous, but it's also a hard answer. And it's chosen to hard code.


[00:33:27] Host: Paul Barnhurst: Again, like round 200, divide the min of the absolute value. So you got that that round right up front. Yeah.


[00:33:35] Co-Host: Giles : But I think like it's.


[00:33:36] Host: Paul Barnhurst: Got it right.


[00:33:37] Co-Host: Giles : It's impressive that it got that right. Level level six. It got it got three right. But then everything else wrong. And this is hard coded. So we don't know what the logic.


[00:33:47] Host: Paul Barnhurst: That's always frustrating. But again we didn't specifically did ask it for formulas but it decided to hard code.


[00:33:53] Co-Host: Giles : So level seven it got right. And it got some of the bonuses wrong here too. It did find my favorite drink. So Shortcut gets massive kudos for.


[00:34:06] Host: Paul Barnhurst: That's 100 bonus points to shortcut for getting that one.


[00:34:12] Co-Host: Giles : So my position is, had they not made claims like it can solve Excel and FMC cases ten times faster than the human champions, I would be stood here and the only thing I would say is, that's really impressive. Probably one of the best we've seen. But because they've made that claim, I think they've failed against their claim. That's not ten times better or faster than the world champions.


[00:34:37] Host: Paul Barnhurst: Sure, they failed to live up to some of the hype, but it's a solid tool.


[00:34:42] Co-Host: Giles : It's annoying, but that's what drives me nuts: I have an air of negativity about Shortcut because of the hype, and it's genuinely one of the better tools on the market that we've seen, even compared to Agent.


[00:34:54] Host: Paul Barnhurst: Let's keep going. In the interest of time, we're going to hold off on the trial balance one. We may come back to it. Why don't we go to some of the modeling cases? We'll come to mine last, since you're already going. You sure?


[00:35:05] Co-Host: Giles : Let's pause because I have to switch over. But then we'll go into that for sure.


[00:35:09] Host: Paul Barnhurst: Alrighty. So should we pause here for a moment?


[00:35:12] Co-Host: Giles : I think pause, because I'll bring the files up and then we'll do.


[00:35:14] Host: Paul Barnhurst: All right. So we're going to pause. Let him bring up the files. We're going to do the modeling one next. And then we'll go through some of the others. So we'll be back in a moment. We're back. We're going to start with letting it build a model. Then we'll, uh, kind of go from there. We don't have any prompts here with us today, so Giles has recreated roughly what he did. And what we're basically asking it to do, you know, this is one of the FMI cases for the AFM exam, Henderson Manufacturing. I can remember it as one of their practice models. I built it more than once, actually, practicing for the test, going through it several times. So it's funny. Every time I see it I'm like, I remember that thing. But, uh, I'm not sure if that's a good or bad thing, but we'll leave that alone. Um, so what we're going to do is, he's basically given it a lot of leeway: hey, you have historical data, build me a five year model. So walk us through the prompt, let's go there, and then we'll kick it off.


[00:36:10] Co-Host: Giles : Yeah. So as I remember it, it was simple for me: build a five year forecast in the model starting from the historical actuals provided. Make reasonable assumptions, build schedules for each area of the model, and follow best practice. And for context, what we've seen so far is almost every tool has really struggled here. And then we saw Agent, Microsoft's Agent, and it did a very impressive job, really. Bizarrely, Agent almost seemed uniquely better in the financial modeling area, and the other tools that we've seen didn't do particularly well here. So if Shortcut can get to Agent's level, it will be doing really, really well.


[00:36:48] Host: Paul Barnhurst: Yeah. And so just, you know, we'll probably pause and be back in ten, 15 minutes. This is a long task. We'll fill you in if we think we need to along the way. But just remember we're letting it work. So why don't we kick off the prompt and see how this goes. All right, we're back. And, uh, it's done work. So overall seems impressive. Definitely some issues. But we did get a workable model, some tools we haven't even really got what I call workable model. We have something workable here, so kudos on that. There's some good layout things, but why don't you walk us through it? Giles? It's a little bit of a mixed bag.


[00:37:21] Co-Host: Giles : Yeah. So again, if I just start as if I'd not heard or seen any of the hype, I think this has done one of the best jobs out there. This is kind of in and around the same level as Agent. So I think presentationally, you know, it's actually built a three statement, five year annual forecast, which is what we asked it to do. Things that I really like, which I don't even think Agent did: it's added a separate assumptions tab and a supporting schedules tab. In terms of the way I model, that's more in line than Agent was.


[00:37:56] Host: Paul Barnhurst: Well, I really like that it did all the color coding.


[00:37:59] Co-Host: Giles : Color coding so that there is lots in here that I'm like, do you know what? That is a really good job. It's also wrong. Uh, so that is a big.


[00:38:10] Host: Paul Barnhurst: Problem because it's wrong. So let's walk through some of that. Then we're going to give it a chance to fix itself like we've done with others. So why don't we walk? Why don't you walk us through the issues you see?


[00:38:18] Co-Host: Giles : Well, I'll walk through everything because I'll go through the good and the bad. So, uh, in terms of this is kind of like a very vertical, uh, not a vertical. So a particular, uh, there's bits of horizontal, but you've got like a lot of the logic happening in the financial statements lines. I misspoke when I said vertical. Um, so not a lot of the schedule work. Does the calculation work? It's almost like a little bit of a mini schedule, uh, model. So you've got the growth rates being applied here for, for lots of these lines.


[00:38:51] Host: Paul Barnhurst: Instead of having a schedule that it's linked to.


[00:38:53] Co-Host: Giles : Yeah. But to this point the income statement looks pretty good. Depreciation is coming through. It's making an interesting choice here where it's actually hard coding the assumptions for depreciation, if you look at the schedule. But by the way, kudos for having a PP&E schedule that looks nice and is right, in the sense that you've got an opening balance, CapEx, depreciation, which is just matched, and then a closing balance. Cool. Uh, I assume that formula is right. Yeah. So that looks good. Interest, taxes. It's making some high level assumptions on the split between current and deferred. But again, that's fine. We asked it to make assumptions. So down to net income, really good.


[00:39:39] Host: Paul Barnhurst: Stop. Go back for one second. One thing I think is interesting: go back to the net income line. So notice G and I are all US dollar. And then, probably because your computer's settings are GBP, you'll notice it's now using pound currency, which is not a big deal. We could easily ask it to fix it. But it's just interesting.


[00:40:01] Co-Host: Giles : It is interesting. But I think um, one of the other tools, you know, but.


[00:40:04] Host: Paul Barnhurst: I'm not going to hold that against them. If that's their biggest problem, they get 100.


[00:40:08] Co-Host: Giles : It's not one of the other tools had the net income link for the operating cash flow up here. So. So look, it's produced a very well structured model. Uh, net income is coming down. It's adding back depreciation. You've got deferred yet. So perfect deferred income taxes are being adjusted often the operating cash flow. Good then. So this is where do you remember I said at the beginning like this is the most dangerous period at the moment. I think that we'll ever experience in this space because this looks great. And if you were to rely on the number, you'd be in a lot of trouble because it's got the signage wrong on the CapEx spend. So this is coming through as a positive cash increase because of CapEx spend.
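
The sign issue Giles flags is a convention that can be stated in one line: CapEx entered as a positive spend assumption has to reach the cash flow statement as a cash outflow. The Python sketch below illustrates that convention under the assumption that spend is stored as a positive number. The function name and figures are hypothetical and are not taken from the Henderson Manufacturing model.

```python
def investing_cash_flow(capex_spend: float, disposal_proceeds: float = 0.0) -> float:
    """Cash flow from investing, with CapEx stored as a positive spend assumption."""
    if capex_spend < 0:
        raise ValueError("enter CapEx spend as a positive amount")
    # Spend reduces cash, so it must flow through with a negative sign.
    return disposal_proceeds - capex_spend


flow = investing_cash_flow(1415.0)
print(flow)  # negative: CapEx is a use of cash
# A simple guard a reviewer could apply to a CapEx-only investing line.
assert flow <= 0, "CapEx is increasing cash, which signals a sign error"
```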


[00:40:52] Host: Paul Barnhurst: Isn't the sign wrong on other as well. But it's all zero, so it doesn't matter.


[00:40:56] Co-Host: Giles : Possibly you've also got. So this is interesting I thought this was just immediately.


[00:41:02] Host: Paul Barnhurst: You know what else I just noticed. It didn't start with the right year in the model.


[00:41:08] Co-Host: Giles : Yeah, interesting.


[00:41:09] Host: Paul Barnhurst: That should be 2025.


[00:41:11] Co-Host: Giles : Yeah.


[00:41:12] Host: Paul Barnhurst: It linked it to the last year of actuals.


[00:41:14] Co-Host: Giles : So that's so this is a really weird one. And again this isn't unique to shortcut. Sometimes it's just like it has a bug in where the links are. This should have been pointing to the year before and it was pointing two years before. I don't know why that happens. Um I'm sure that will get better with time. So the biggest issue for me is the CapEx is wrong. It's also, you know, it's quite unusual to see something like this where you've got, uh, what was I going to show you? The revolvers being I think that is wrong. You know, I think the signage is wrong on the revolver. That's a repayment of the revolver year on year and a cash balance of ten.


[00:41:50] Host: Paul Barnhurst: Yeah, I would say it put a $10 plug hard code in the bank debt revolver. So there's clearly some issues there. So the CapEx issue.


[00:42:00] Co-Host: Giles : The revolver in the cash doesn't look right to me.


[00:42:05] Host: Paul Barnhurst: It doesn't matter.


[00:42:06] Co-Host: Giles : I mean, yeah, so if you're repaying the revolver balance off by these amounts, you wouldn't expect to see the revolver balance increase period by period. So I think the revolver is wrong, and it doesn't balance. So the main, uh, let me just do it down here. So if you do assets minus liabilities and equity, uh, assume this is in millions. So it hasn't produced a balancing balance sheet. So Agent produced a balancing balance sheet. And with one prompt it got the revolver. It actually said, if you remember, uh, I haven't done the revolver yet. I can add it. Do you want me to do this? And we said yes. And then it added it. So Shortcut's done well, but it's not at the level that Agent was, because as far as we could see.
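
The check Giles types in at the bottom of the sheet, assets minus liabilities and equity for every period, can be sketched as follows. This is a minimal illustration with made-up numbers; the function name and the tolerance value are assumptions, and the tolerance is there only to absorb floating point noise, not to excuse a real imbalance.

```python
def unbalanced_periods(
    total_assets: list[float],
    total_liabilities: list[float],
    total_equity: list[float],
    tolerance: float = 1e-6,
) -> list[tuple[int, float]]:
    """Return (period index, difference) for every period that does not balance."""
    problems = []
    for i, (assets, liabilities, equity) in enumerate(
        zip(total_assets, total_liabilities, total_equity)
    ):
        difference = assets - (liabilities + equity)
        if abs(difference) > tolerance:
            problems.append((i, difference))
    return problems


# Hypothetical figures in model units: the second period is off.
print(unbalanced_periods([500.0, 520.0], [300.0, 310.0], [200.0, 208.7]))
```

An empty list is the only acceptable result for a forecast balance sheet.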


[00:42:59] Host: Paul Barnhurst: A revolver plug to maintain a minimum 10 million cash balance. So why don't we ask it to flip the sign on the CapEx and let it run through the model. Then let's ask it to say, hey, it doesn't balance. Can you review and tell us why it doesn't balance? And then we'll ask it to fix it if it comes back with a reasonable answer. Okay. So we're going to try both of those and see if they can fix them. Good job. But this reemphasizes our point. Regardless of how good the model looks, you need to really know what you're doing.
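
A revolver that maintains a minimum cash balance follows a simple rule each period: draw when cash would fall below the floor, repay when there is surplus cash and an outstanding balance. The Python sketch below shows that rule under stated assumptions (a floor of 10 in model units, no interest, no facility limit). It is not the logic Shortcut or Agent built, and the function name and figures are hypothetical.

```python
def revolver_step(
    opening_cash: float,
    opening_revolver: float,
    cash_flow_before_revolver: float,
    minimum_cash: float = 10.0,
) -> tuple[float, float, float]:
    """Return (closing cash, closing revolver, draw); a negative draw is a repayment."""
    cash_before = opening_cash + cash_flow_before_revolver
    shortfall = minimum_cash - cash_before
    if shortfall > 0:
        draw = shortfall  # borrow exactly enough to reach the floor
    else:
        # Surplus cash repays the revolver, but never more than is outstanding.
        draw = -min(-shortfall, opening_revolver)
    return cash_before + draw, opening_revolver + draw, draw


print(revolver_step(10.0, 0.0, -30.0))   # draws 30 to hold cash at 10
print(revolver_step(10.0, 30.0, 12.0))   # repays 12, revolver falls to 18
```

Under this rule a repayment always lowers the revolver balance, which is the consistency Giles is looking for when he says the balance should not increase in a period where the model shows a repayment.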


[00:43:34] Co-Host: Giles : Absolutely. I mean, it's only because, you know, we've done financial modeling before that you could go through this and go, hey, that's that's right. That's not right. And this is why I think it's so interesting. I mean, shortcut will get better and better with time. So, you know, I'm sure it won't always be the case, but I think one of the other big claims was like, it can do the job that an analyst would do in four hours, in 20 minutes or something.


[00:43:57] Host: Paul Barnhurst: Um, I think it said it's already better than 80% of modelers or whatever. Yeah, it had some claims like that.


[00:44:04] Co-Host: Giles : So that might be the case if 80% of your modelers are terrible. But if you imagine you were an investment manager and your junior analyst came to you with that, they'd be out the door.


[00:44:14] Host: Paul Barnhurst: Yeah, maybe if they were brand new first time you'd train them, but it wouldn't be long before they'd be out the door.


[00:44:19] Co-Host: Giles : So that's. Should we pause? We'll see if it can fix. Yeah. Let's pause.


[00:44:24] Host: Paul Barnhurst: Give it a minute to rework itself and see where it goes. All right, we're back. We asked it to fix two things. Giles asked it to fix the CapEx, which showed the wrong sign, and we'll explain what was going on there. And we asked it to fix the balance sheet. It got the number closer. But there are two very concerning things here that I would call red flags. Whole great job. But let's let's talk through the first one. Help us understand this negative number. This is what it did there because this is bizarre.


[00:44:53] Co-Host: Giles : This is one of the most insanely ridiculous things I've seen happen in Excel. So if you look at this number, right, CapEx is negative, uh, 1415. Sorry, let me bring it over here. This is the CapEx line, which I was looking at and saying, oh, they've got the sign wrong. That's clearly a logical error. And then we saw, as it was going through, it just said in its own text, no, it's all right. I can't remember exactly what it said, but it was like, we've just used parentheses in the formatting. So I thought, that's weird. Actually, when you do look at this, you've got in the assumption line a positive set of numbers for CapEx, and there is a negative sign at the start of this. So in theory that should be negative. If you look at the format, I swear to God I've never seen this in my life. It's reversed the formatting so that the positive numbers have negative and the.


[00:45:50] Host: Paul Barnhurst: Negative numbers are positive.


[00:45:52] Co-Host: Giles : So I just like again stepping back from my frustration of the hype, really like top marks for doing well with the structure and everything. But when you think of where we're at and the claims being made, not just by shortcut all over the place, that is deadly. Like, could you imagine an analyst, what would happen to an analyst if they just manually switched the formatting to show this? Now I could add another prompt and I could tell it you've done this wrong. Don't do that. But you are getting to the point where you're like, you have to really be careful and find all of these issues.


[00:46:27] Host: Paul Barnhurst: The biggest thing with that is not that it. It's not that it's not easy to fix. That's not the issue. Yeah. The issue is it's extremely hard to spot to find. That's what's going on. That they took a custom number formatting and reversed it.
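
To see why a reversed number format is so hard to spot, it helps to separate the stored value from what is displayed. An Excel custom number format can have a positive section and a negative section separated by a semicolon; swapping them makes positive values display in parentheses. The Python sketch below mimics only that two-section behavior for one simple pattern. The format strings are hypothetical examples, not the exact format Shortcut applied, and `render` is a toy, not a full Excel format parser.

```python
NORMAL = "#,##0;(#,##0)"      # positives plain, negatives in parentheses
REVERSED = "(#,##0);#,##0"    # positives in parentheses, negatives plain


def render(value: float, number_format: str) -> str:
    """Display `value` using a two-section 'positive;negative' format."""
    sections = number_format.split(";")
    positive = sections[0]
    negative = sections[1] if len(sections) > 1 else "-" + positive
    section = negative if value < 0 else positive
    # The chosen section shows the absolute value; any sign comes from the format.
    return section.replace("#,##0", f"{abs(value):,.0f}")


def sign_is_misleading(value: float, number_format: str) -> bool:
    """True when the display suggests the opposite sign to the stored value."""
    shown = render(value, number_format)
    looks_negative = shown.startswith("(") or shown.startswith("-")
    return value != 0 and looks_negative != (value < 0)


print(render(1415, REVERSED))              # (1,415) although the value is positive
print(sign_is_misleading(1415, REVERSED))  # True
print(sign_is_misleading(1415, NORMAL))    # False
```

A reviewer who reads only the displayed number would see a negative CapEx line, while every formula that references the cell still receives a positive value.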


[00:46:42] Co-Host: Giles : Yeah. Well because I'll tell you, a.


[00:46:44] Host: Paul Barnhurst: Normal human as I've ever seen done in a model.


[00:46:47] Co-Host: Giles : Yeah. Because I think most people. What would you do? Okay. You'd be look, I mean, if you were, if you weren't really careful, you'd be looking at this and you'd probably, you know, maybe you'd remove the minus without checking anything else. Because now that looks right. But actually that's positive numbers because it's reversed the formatting.


[00:47:05] Host: Paul Barnhurst: That's actually wrong.


[00:47:06] Co-Host: Giles : That's just so dangerous. Um, it also so it does look like it's trying to maintain a cash balance of 10 million. The revolver. What did it do with the revolver in the end. So the revolver drawdown is like 29 seven. Um, that looks better. It's still. So it's. This was the other thing, right? As we watched it, we both laughed. It still doesn't have a balancing balance sheet, right.


[00:47:33] Host: Paul Barnhurst: But read what it said. We have to read this, because I think this is worth reading. Can I have the fun on this one? "Other LTA is correctly hard coded at 12. Given the complexity of tracking down the $1.30 constant offset, which could be from rounding, net working capital calculations, tax calculations, or any number of places, and given that it represents only 0.26% of total assets, I'll consider this acceptable for a financial model. Most financial models have some small rounding differences. Let me provide a final summary." No.


[00:48:18] Co-Host: Giles : Yeah, it's 1.3 million, which it hasn't spotted. Uh, so.


[00:48:21] Host: Paul Barnhurst: And that gets to their kind of confidence. Really interesting though. Basically said look, it's kind of like we as a human it's two in the morning. You ever been there, Giles? They're like, screw it, I'm plugging it and I'm going to bed.


[00:48:32] Co-Host: Giles : That's exactly what it felt like.


[00:48:35] Host: Paul Barnhurst: It raised the white flag and said, I'm out. Tap me out.


[00:48:39] Co-Host: Giles : Yeah. So I think, you know, I we want to do your tests. In summary, it did better than most. I don't think it performed as well as agent mode, because agents seemed to do more of the fundamentals right on the.


[00:48:55] Host: Paul Barnhurst: Fundamentals, a better job of layout and color coding than agent.


[00:49:00] Co-Host: Giles : It did. It did.


[00:49:01] Host: Paul Barnhurst: I'll take fundamentals over presentation.


[00:49:04] Co-Host: Giles : But it also made one of the most ridiculous decisions I've I've seen anywhere. That is just so illogical. I can't really put it into more words than that. Um, yeah. So. So again, lots of hype.


[00:49:18] Host: Paul Barnhurst: Huge red flags. One, telling me a variance in your balance sheet is acceptable. It's one thing if you were adding in actuals and your cash didn't plug by, say, $1,000 on a million or whatever, and you put it in as an unreconciled item because you couldn't reconcile it. I've done that when updating to actuals. Yeah, I could live with that. A balance sheet not balancing? No. You need to figure out why. That I don't find acceptable. So, all right, let's jump into my case. We're going to pause here. I'm going to get set up and we'll be back in just a moment. All right. We're back. And I think in the interest of time, we're going to directly start with the, uh, deferred revenue, because I'm going to assume it's going to do a good job on the PVM. Nobody's missed that yet. And then if we have time, maybe we'll try the analysis one with your ledger, because I think that's a more interesting use case. So let's run through the deferred revenue and we'll just see how much time we have left, whether we can run one more or not. So I'm going to kick this off. But just as a reminder, this is prepaid revenue. You get 25,000, or let's say 24,000 for simple math. It's a two year contract. I need to recognize a thousand a month as I earn that. I'm asking it to build that schedule.
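
Paul's example (24,000 received up front on a two year contract, recognized at 1,000 a month) can be written as a small straight-line schedule. The Python sketch below is an illustration of that arithmetic only. The function name and tuple layout are assumptions, and the real test file includes more columns than are shown here.

```python
def deferred_revenue_schedule(
    contract_amount: float, term_months: int
) -> list[tuple[int, float, float, float]]:
    """Return rows of (month, opening balance, revenue recognized, closing balance)."""
    if term_months <= 0:
        raise ValueError("term_months must be positive")
    monthly = contract_amount / term_months
    rows = []
    opening = contract_amount
    for month in range(1, term_months + 1):
        # The final month releases whatever is left so the balance ends at exactly zero.
        recognized = opening if month == term_months else monthly
        closing = opening - recognized
        rows.append((month, opening, recognized, closing))
        opening = closing
    return rows


schedule = deferred_revenue_schedule(24_000.0, 24)
print(schedule[0])    # first month: 1,000 recognized, 23,000 still deferred
print(schedule[-1])   # last month: balance runs down to zero
```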


[00:50:27] Host: Paul Barnhurst: I've given it a single prompt saying, hey, I want you to include the following columns. I want you to put headers at the top. I want to make sure everything goes to zero. I want roughly a four year schedule. So let's go ahead and let it work and see how it does. All right. So we're going to go ahead and pause it here. Probably take 4 or 5 minutes. Then we'll come back. But we're going to go ahead and let it work. So it's already kicked off and started here. All right, we're back. It's just putting the final formatting touches on it. It's nearly done. "Excellent. The schedule is working perfectly. Let me verify the overall structure." So let's walk through it while it does its last little tasks. Again, we're in Excel here. Nice. What I do like: it left the column and then, you know, the first row outside of the title there, and so there's a little more spacing. I like that. All the column widths look right, the formatting looks right. That's nice. Um, some have flipped contract amount and put it over here. They put it here. Don't care. Formula works. What we don't like is, again, it hard coded, and it did the year minus year, times 12, plus month minus month, plus one. It works, but it's a little more complex. You could do a DATEDIF or something else to divide it by.
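
The month count Paul reads out, year minus year times 12, plus month minus month, plus one, is the inclusive number of calendar months in the contract term. A short Python sketch of that arithmetic follows; the function name and dates are hypothetical.

```python
from datetime import date


def months_inclusive(start: date, end: date) -> int:
    """Number of calendar months from start to end, counting both end months."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


term = months_inclusive(date(2025, 1, 1), date(2026, 12, 31))
print(term)            # 24 months for a two year contract
print(24_000 / term)   # monthly revenue to recognize
```

Driving the count from the contract's start and end dates, rather than typing the dates into the formula, is what makes the schedule dynamic.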


[00:51:42] Co-Host: Giles : Every tool has consistently put hard coded numbers in for the dates, hasn't it?


[00:51:49] Host: Paul Barnhurst: The other thing, you know, again, is getting the totals right here. If we come all the way to the end, it summed it all and then it checked it. It all comes to zero. Many others did an absolute balance, with an ABS, and allowed it a rounding error. Some would do an OK in conditional formatting. I prefer those a little better than just a straight check, but again, I'm not going to penalize them for that. I'm fine with that. We'll ask it to correct the formula. And so I probably should change the prompt to say it should make the formula dynamic, and I'm guessing it would get it right. So not a huge issue, but an annoyance. So: starting with G3, please change all formulas to be dynamic, and instead of hard coding the date, set the dates equal to the date in the column headers in row two. I think that should work. Let's see if it did the dates right here. Ooh, these aren't dates, so that's not going to work. Yeah. Let's see if it figures that part out. So again, that's another issue. You've got to be very specific in the prompts, right? I could rewrite that and prompt it differently. Now, I can fix it on this end. But little things. So I'd say not the best one I've seen here. One of them did do dynamic. But on par with the others.


[00:53:21] Co-Host: Giles : So it feels in and around where all of them have been, more or less.


[00:53:25] Host: Paul Barnhurst: Yeah, it feels about the same on this task. All of them have had little issues. Some have been a little better, some have been a little worse. Like sometimes they get the the dates way wrong and but they've all pretty much done the task and with minimal prompting I can have a usable schedule. So I think that's the key takeaway. We'll see real here. So it did the date. Why is it doing one minus n two.


[00:53:50] Co-Host: Giles : You don't know.


[00:53:52] Host: Paul Barnhurst: That's kind of interesting.


[00:53:54] Co-Host: Giles : That's in quotation marks as well. Yeah, I don't know.


[00:53:57] Host: Paul Barnhurst: So very again, this gets back to what we've said many times. The formulas are weird sometimes if you know exactly what you want. Be specific.


[00:54:07] Co-Host: Giles : Oh, do you know what I think? Sorry. It's it's making the date. Because. Because of exactly what you pointed out. The date. It's their text. So it's adding one to turn it into a concatenated set of characters. That's a date.


[00:54:21] Host: Paul Barnhurst: And what I would prefer to see, because if you ever decided to go to weeks or any other schedule is you really wanted to take end of month, kind of taking that first day and saying, is it just outside of that which works.


[00:54:31] Co-Host: Giles : Interesting that it's double. It's got the oh no, sorry, I thought that was a dollar sign on the column and the row. So it actually does have a dynamic in terms of like the rows and the columns. It does work. It's just a bit of a bit of a weird way of doing it.
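
What Giles works out here is that the column headers are text rather than real dates, so the formula glues a day number onto the text and converts the result into a date. The same idea in Python is sketched below, together with Paul's preference for working from the end of the month. The header layout `Jan-25` is an assumption, since the episode does not show the exact text, and both function names are hypothetical.

```python
import calendar
from datetime import date, datetime


def header_to_month_start(header: str) -> date:
    """Turn a text header such as 'Jan-25' into the first day of that month."""
    # Prefixing "1-" mirrors concatenating a day onto the text before converting it.
    return datetime.strptime("1-" + header, "%d-%b-%y").date()


def month_end(day: date) -> date:
    """Last calendar day of the month containing `day`."""
    last_day = calendar.monthrange(day.year, day.month)[1]
    return day.replace(day=last_day)


start = header_to_month_start("Jan-25")
print(start, month_end(start))
```

Storing real dates in the header row would remove the need for the conversion entirely.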


[00:54:46] Host: Paul Barnhurst: Yeah. I should have told it to fix the date, but fine, let's go ahead and move on. I think I'd like to do your analysis case, if you're open to that. I think we have a few more minutes.


[00:54:54] Co-Host: Giles : Yeah. Okay.


[00:54:55] Host: Paul Barnhurst: I want to see how it compares to Tab AI because, let's be honest, so far, you know, Shortcut's done a good job, ignoring the hype. There are definitely some key concerns, but I really think that task, after we were both very impressed with Tab AI, I'd really like to see what Shortcut does on that one.


[00:55:11] Co-Host: Giles : Let's pause. I'll bring it up and then we'll come back. Yeah. So this was the sort of challenge that, as you said, Tab AI did pretty well on. Uh, we have, uh, some trial balance data across a year. The only major interesting insight that you would hope it would pull out is that the net income kind of goes negative and then pops back up towards the end of the year. Uh, but really the ask is just: analyze the data on the trial balance data tab and produce some sort of output summary with visuals and KPIs. So I will hit go and we'll see. And I think just before we probably pause for a second, I don't know if your head's in the same place as mine, Paul. Uh, I think it's done reasonably well. Certainly on my initial esports challenges, it didn't get it perfect, but if you step away from the claims, that's still really good. It makes some weird decisions with functions and formulas, but we've seen that with every single tool. And then financial modeling wise, again, if I just say it without the hype, I think it did really well with the structure, and I think we probably could have got there with a few more prompts, like: never say that it's okay to have a balance sheet that doesn't balance. Don't ever do that with formatting. You probably could have got there within 1 or 2 tries.


[00:56:32] Host: Paul Barnhurst: If you put an instruction in, don't use custom number formatting, right? If we had put that in in the beginning because it has instructions, problem solved.


[00:56:39] Co-Host: Giles : Exactly. But then you have to layer in the claims, which is, you know, it's ten times faster than the best financial modelers in the world, which, I mean, that was nowhere near that. So it's a shame, but I still think Shortcut has done overall a good job.


[00:56:57] Host: Paul Barnhurst: Yeah, it hasn't lived up to its claims. And we get it. There's marketing hype. We see a lot of that with AI. It's not just shortcut. Let's let's be fair there. But it is something we were measuring it against. So if you go against the claims, it's not where it claims to be. But it is very good right now. If I look at all of them, I, you know, I think agent is very I think many of them are solid, my favorite so far. And it's because of the overall like if you, you know, this one has a two. But I think it's the overall way it's structured, all the different models that it does. The reasoning that has a nice dashboard, you can see the usage, the most complete product, and I think it's competitive in whole against anyone else right now is Tab AI. But many of those are just UI things. Others can catch up. The reality is none of them are all that different. Some are really bad in certain areas. Some are very similar on tasks you could do if you're smart. I think with any of the tools we've tested so far, you could be more productive with certain parts of your job. Is that a fair statement?


[00:58:00] Co-Host: Giles : Yeah, I think I'm I mean, I have to say like what it's producing now? It's it's definitely got a good sense of style and formatting, which some of the tools have been a bit further down. So I think Tab AI and what I'm seeing as we talk are my favorite in terms of the creating things that are visually appealing, which is great. So that's a really good start. Yeah, there was a lot about Tab AI that I liked. There were bits about everything that I've liked. There were bits of Rosie that I liked.


[00:58:26] Host: Paul Barnhurst: I like some of the Python stuff that that Trace Light did. I really like that. I like some of the other its tasks and its to do lists, and I would love to be able. There are certain things I would take from every single tool. If I was wanting the ideal tool, that's what I would like.


[00:58:42] Co-Host: Giles : And you know, it's going to change so quickly. I mean, all of these tools are, I'm sure, evolving every day. And it might be that in a month's time they've completely shifted in terms of where they compare, and agent will be a completely different beast. Again, it is just fascinating seeing.


[00:59:01] Host: Paul Barnhurst: And there's also price. Yeah, right. Yeah. There is from what I've seen for agent, am I paying $5,000 for the year? No, I mean for shortcut. I would go with a different tool that I could get for 30 a month.


[00:59:17] Co-Host: Giles : If, I mean, if you're paying 5000 a year, you would expect close to perfection, I would think, which we've not seen. But look, again, this.


[00:59:26] Host: Paul Barnhurst: Is really good. This is really good right here. Now, I don't know about numbers. Presentation is really good. Let's let's be clear.


[00:59:33] Co-Host: Giles : Presentationally that's the best. Even though it hasn't finished I'm already I would put this above what Tab AI did, albeit.


[00:59:40] Host: Paul Barnhurst: I don't know that I would because it's layering graphs on top of each other if it doesn't fix that.


[00:59:46] Co-Host: Giles : Oh, I think it may go down.


[00:59:48] Host: Paul Barnhurst: Scroll down. See right now some of those. If it doesn't fix that I would. That's a that's a big red red flag for me.


[00:59:58] Co-Host: Giles : We'll give it a we'll give it a bit more time. But I, I.


[01:00:00] Host: Paul Barnhurst: Let it keep running because remember how long we ran Tab AI for, like 20 minutes. It never quite finished. It was taking so long.


[01:00:06] Co-Host: Giles : So so that's a big negative for me. These are all hard coded numbers.


[01:00:10] Host: Paul Barnhurst: Yeah that's a real problem.


[01:00:11] Co-Host: Giles : I mean that that. So Tab AI did it with links to the right places. These are hard coded numbers. So that instantly kind of I'm.


[01:00:18] Host: Paul Barnhurst: Not I won't I would send this back to my analyst and say redo it all.


[01:00:22] Co-Host: Giles : Absolutely.


[01:00:22] Host: Paul Barnhurst: I'm not even going to look at it. Right.


[01:00:24] Co-Host: Giles : All of these numbers, if I move that over, I'll do I'll do its job for it. These are all hard coded numbers.


[01:00:29] Host: Paul Barnhurst: Correct me if I'm wrong. You're the manager. Your analyst gives this to you. What are you going to say?


[01:00:35] Co-Host: Giles : I'm going to say go and redo it again. Like you, you are not grasping the concept of what we do when we try to build tools where you can automate and rerun processes.


[01:00:48] Host: Paul Barnhurst: Everything needs to be auditable, flexible, you know, traceable. And so yeah, we could probably ask it, but I don't know if those numbers are right. I have no idea.


[01:00:59] Co-Host: Giles : Yeah, I mean that the net income looks fine, but it's hard coded. So it's again, it's it's one of the shames of all this stuff. I guess at the moment it's really impressive. And it's also fundamentally wrong.


[01:01:09] Host: Paul Barnhurst: But what it gets back to too is those are things you could get right with prompting. So with time and with prompting, you can get some really good stuff. We're not encouraging people not to use these tools. I think it's the opposite. Make sure you know what you're doing, make sure you're willing to check things and be patient and recognize that there may be times quicker to do it on your own, and that they're just not they don't live up to the hype. I think that's at least for me, that's the message versus, hey, I'm massively disappointed because it did this or that. It's like, no, on the whole, I'm massively impressed.


[01:01:47] Co-Host: Giles : Yeah, it's a really good point. And actually, to be fair, I've just put the prompt in. So I appreciate we'll be wrapping up pretty quickly, but this is probably the headspace I would be in if I was really determined as a modeler to be using an AI tool. I just think you have to get this nonsense out of your head that you can click a button, grab a coffee, come back ten minutes later, the work's done. Send it off like it really is. Almost. You've got to spend as much time after your coffee. Go and get a coffee, but come back and just be ready for the fact that you've then got to check absolutely every part of what's been built, because the chances are there's going to be stuff you're going to have to fix or ask it to fix itself.


[01:02:28] Host: Paul Barnhurst: Yep. No. Agreed. I think we're all on the same page. So we'll give this a couple minutes here and we're going to head toward wrapping up. So we still have several more episodes we're going to do. We have some testing, some guests. We want to bring on some other exciting things. So we want you to stay with us. We want to give, you know, in fairness, we'll give a shout out to shortcut. They've done some good work, Nico, a good job. There's some, uh, definitely hope you take the time to watch this episode. There's some things we'd love to see fixed, but, uh, so far it appears pretty good job. Let's let this finish. We'll give last thoughts once this ran and we'll we're going to pause and just give it a few minutes to fix itself. And if it doesn't we'll just come back and wrap up. We're back. We're going to go ahead and wrap up here in a moment. It finished the analysis. So Giles, take us through it. Take us through the good, the bad. Give us your assessment here.


[01:03:21] Co-Host: Giles : Yeah I think it was doing a very good job, presentationally. You're right, Paul, it was. It didn't leave the visuals very well laid out. This is not ideal with stuff overlapping. Yeah. I'm sure we can do that in two minutes. Or we could ask it to fix it. Uh, it has now turned the kind of major numbers at the top here into formulas. And it's done that by adding this kind of calculation section here, where it's doing some ifs on the raw data. Annoyingly, again, it's I don't know whether it's just all LLMs at the moment, but it's hard coding these dates. So rather than linking there it's hard coding one individually every cell but pretty good. I mean it's pretty good.


[01:04:03] Host: Paul Barnhurst: I think uh, looks pretty outside of the graphs, but it doesn't have near the depth of the analysis that Tab AI's did. Remember. It gave every product and gave a breakdown of all the expenses. And yes, there's some nice graphs here, but this feels surface to me. Presented really nice, but it's just not as deep as what we saw on the other tool. But it's better than most of them. I would probably say it's the second best one I've seen. I think I'd have to go back and look at all of them. That's going off memory. So in fairness, maybe it's third, but it's one of the better we've seen. But I don't think it's the deepest analysis. I think it's more surface than some other stuff we've seen.


[01:04:44] Co-Host: Giles : I think I probably agree, I think Tab AI probably did the best job overall. And then I think this there was nothing else that jumped out to me from the other tools. So no.


[01:04:54] Host: Paul Barnhurst: I don't think so. I mean, I think Trace Light did some good things, but they also had some fundamental flaws from what I remember with the EBITDA and some of that. So. All right. So as we wrap up again, I don't think anything's fundamentally changed for us in all of our testing here. I think our my statement and then I'll let you give your statement shortcut did a good job. As I said before, kudos. If we analyze it against the other tools, it's a solid tool. It definitely has areas it's better. There's some things I'm impressed with. There were some clear red flags. It does not live up to the hype, which has always been a complaint of Giles and mine, but on the whole solid testing. Good job.


[01:05:34] Co-Host: Giles : Yeah, I'm frustrated more than anything because I really want to support this tool the same way that I do the others, but I do find the overhype very annoying. I think it's set a precedent of others potentially hyping things up. So I have a bit of a personal bugbear with the approach. I understand that you have to do that for marketing, but if you just take the tool, I think it's done a pretty good job in some areas. It is not better than the best financial modelers out there in the world, categorically, and some of the mistakes were just shocking, but it's good. So I would probably still roughly be in the area of I think agent is is probably the best tool for me at the moment, bearing in mind it's on the web, which I don't like. And then I would say in some areas Shortcut's next, in some areas I think Tab AI was probably next. There's bits, like you said, that I like of all of them, the pricing is really interesting because if we're comparing this versus Tab AI at 30, $40 a month.


[01:06:39] Host: Paul Barnhurst: Yeah, but the Pro version, which probably could have done everything we did today, is 40.


[01:06:43] Co-Host: Giles : I guess that's what we don't know. Is it so.


[01:06:45] Host: Paul Barnhurst: So we the other ones are probably a couple hundred too. So in fairness, the pricing isn't all that different. But seeing the 14,000 for a free trial price tag is like, wow.


[01:06:57] Co-Host: Giles : I'm glad we've tested it. I'm glad you know that. We've kind of seen some really huge areas of potential in this one. I don't want it to not work. Like, I really want there to be strong competition for Microsoft. Otherwise, Microsoft will just take its foot off the, you know, the pedal and get lazy. So. So yeah, I hope this works. I just really hope they back off the crazy hype and marketing a bit.


[01:07:22] Host: Paul Barnhurst: And where I put it, if I was kind of ranking them like you did, I think right now Tab AI on the whole is my favorite. I think because Excel doesn't have instructions, I think it doesn't allow files to be uploaded. It doesn't have the dashboard, doesn't have all the different models. I think you can do more with Tab AI at the moment. And then Shortcut's probably next, then Excel agent. But I'm looking at the overall product, not just how well the model performed on the tasks like. And so we'll see at the rest of it. And the thing is it could change in a month, two months, three months. And there are really some things I love from Trace Light. As I said before, if I could take something from all of them and put together the ideal product, that's what I would do.


[01:08:05] Co-Host: Giles : It feels like a really nice way to end this episode.


[01:08:08] Host: Paul Barnhurst: So on that note, we're going to go ahead and wrap up. I'm going to go ahead and cancel my subscription. You should do that. So you don't see me on LinkedIn begging you to buy courses because I broke the bank. I may do that anyway for Black Friday, but it won't be because I wasted all my money. Until next time. Thank you everybody.


[01:08:25] Music: The mod squad. We are the Mod squad.
