Nicholas

How Two Engineers Ship Like a Team of 15 With AI Agents

If you’re using AI to just write code, you’re missing out. Two engineers at Every shipped six features, five bug fixes, and three infrastructure updates in one week, and they did it by designing workflows with AI agents, where each task makes the next one easier, faster, and more reliable. In this episode of AI & I, Dan Shipper interviewed the pair (Kieran Klaassen, general manager of Cora, our inbox management tool, and Cora engineer Nityesh Agarwal) about how they’re compounding their engineering with AI. They walk Dan through their workflow in Anthropic’s agentic coding tool, Claude Code, and the mental models they’ve developed for making AI agents truly useful. Kieran, our resident AI-agent aficionado, also ranked all the AI coding assistants he’s used. If you found this episode interesting, please like, subscribe, comment, and share! Want even more? Sign up for Every to unlock our ultimate guide to prompting ChatGPT here: https://every.ck.page/ultimate-guide-to-prompting-chatgpt. It’s usually only for paying subscribers, but you can get it here for free. To hear more from Dan Shipper: Subscribe to Every: https://every.to/subscribe Follow him on X: https://twitter.com/danshipper Sponsors: Microsoft Teams. Want seamless collaboration without the cost? Microsoft Teams offers a robust free plan for individuals that delivers unlimited chat, 60-minute video meetings, and file sharing, all within one intuitive workspace that keeps your projects moving forward. Head to https://aka.ms/every to use Teams for free, and experience effortless collaboration, today.

Published Jun 11, 2025
Uploaded Jun 12, 2026
File type: POD

Full transcript

AI-generated transcript with timestamped sections.

0:00-1:42

[00:00] You're figuring out how to do compounding engineering. There's two people on the team, but it really feels like there's 15. Coding with AI is more than just the coding part. Utilizing it for research, for workflows, it should be used for everything. We haven't touched Windsurf or Cursor in the last three weeks. Both of those have agentic coding capabilities, but Claude just [00:23] takes it one step further by simplifying it by a factor of 10. We've been really leaning into, let's let AI do the work for us, and we're just [00:32] managing the AI. You speak your feature into Claude Code and then it does all the research to create that long document and then just adds it into GitHub issues. That's really cool. [00:44] What you did first is spend time building a prompt that effectively builds other prompts. Having an idea that has a lot of outcomes. This is part of the compounding effect. We had, I think, six or seven running at the same time because we were just like, "New idea, let's go. New idea, let's go." [01:15] Kieran, Nityesh, welcome to the show. [01:17] Thank you so much for having us, Dan. [01:20] I'm psyched to have you. So for people who don't know, both of you work on Cora, which is Every's AI email assistant. Kieran, you're the GM. Nityesh, you're an engineer. And beyond the fact that Cora is a really cool product, and I'm really excited to bring that to everybody who listens to this show or watches this show, I wanted to do an episode with the two of you because I think that you're figuring out a new way to do engineering.

1:44-3:32

[01:44] Because really, Cora has, you know, there's two people on the team, but it really feels like there's 15, [01:50] because you've got agents who are pulling down PRs and working on branches, and then you're pushing them up and other agents are reviewing. And it's just this kind of crazy thing that... [02:04] It's a new way to build software. [02:07] And Kieran, you said something the other day that really stuck with me, which is like... [02:12] Thank you. [02:13] you're figuring out how to do compounding engineering. So with each piece of work you do, you're making it easier to do the next piece of work. And I just think that it's really important to bring what you guys are learning [02:27] to everybody that watches the show, because we have new tools, and so we need new principles and new workflows for using those tools. And so I'm really excited to talk to you about that. [02:39] Yeah, thanks. It's really fun to [02:41] build Cora, but [02:42] being part of Every, and being in an environment where you get access to tools, access to thinking, access to exciting new ways to work, really helps us rethink how we build. So it's really an experiment. We're building a product, Cora, but at the same time we're figuring out how we should build, and that's super interesting. And we're like [03:07] right in the middle, where people say, what do you think of this new model? How do we use this research tool? And we're just trying things out. And Nityesh and I, we've been really feeling a shift in the last weeks, I would say, where things are changing. And we're not the only ones, we hear other people say that as well. But not a lot of people. And

3:33-5:21

[03:33] What we've learned is... [03:36] a lot, and we want to share a little bit of what we learned. And also what we know is we're just barely starting, we're scratching the surface of this, and it's a big shift that's happening right now, by new models, by how people think, by MCP, by like, just [03:53] it's a lot. And yeah, it's great to talk about that from different perspectives. [03:59] Yeah, I agree. And I think it is so special to be at Every because, like, every day there's someone new in the Discord who's like, I built an AI agent, do you want to use it before we launch it? And so, you know, we get access to OpenAI models before they come out, and sometimes Anthropic models. And so we have this early edge. And then you guys are so good at [04:22] figuring out how to actually incorporate them into a production process. So you said, Kieran, that something changed. So I guess I want to get a sense of what you think changed, and draw the broad strokes of what the workflow is that's starting to emerge for you guys. [04:38] Yeah, for me, obviously, it's everything coming together. But I think the biggest thing is a realization [04:47] in myself that [04:50] coding with AI [04:53] is more than just the coding part. And it's really about [04:57] like [04:58] utilizing it for research, for workflows, for everything. It should be used for everything. And we're now at a point where the agents are good enough that they can actually do everything. So we need to rethink again, like, hey, Cursor, Windsurf, the old-school way of coding was great, more of the vibe coding, that was one step. And then now

5:21-7:16

[05:21] is the realization, oh, actually, we can just give a task and it will do it. But still, the work needs to be done of, like, what do we do? How do we do it? And just a realization that we should lean into that more and really go deep. [05:37] And it's like Claude Code, it's just good coding agents, or agents available that actually start to work, with new models like Claude 4 that are really good at following directions and instructions. And it's all that coming together. And that's [05:52] when I realized, oh, we're here, the future is here. This thing we've been talking about that was going to be the agentic evolution, suddenly it works, and it's working in the real world, not experimental playing. We're just building an app and it's working, building the app. [06:11] So what I'm hearing is, [06:14] it's not just about developing with AI, it's all the things that go into developing that you're using AI for, and that [06:21] the thing that you're using the most for this is Claude Code. Is that right? And if it's right, tell me about, like, [06:28] for people who don't know Claude Code or haven't used it, give us a little introduction to Claude Code and then tell us about exactly how you're using it. [06:35] Yeah, Claude Code is basically the coding agent from Anthropic that uses Claude under the hood. And [06:44] it runs in your... [06:46] your terminal as a CLI tool, which is kind of... Do you want to share your screen and show us? Yeah. Yeah, so Claude Code is a tool that you use in your terminal. And I know for non-technical people, this is like, oh, this is scary. But I've converted friends who were not technical to use Claude Code, and they were like, oh, this is great. It's really simple. You start your terminal, you say claude, and an interface will pop up.

7:16-8:57

[07:16] And basically, for people who are listening instead of watching, he's in his terminal. It's, you know, the classic black screen that feels like you're using DOS or something. And he just typed claude. And then we just got a thing that says, Welcome to Claude Code. And there's a little text box for him to type in any command. [07:34] Yeah, and why is this different, or what makes this different? This has access to the directory or the computer, so it can look through files on my computer already, it can run things on my computer, it can take screenshots of websites, it can search the web. It has tools, but way more tools than are available in a normal [07:56] Claude version. And [08:01] that's important because engineering work, like building stuff, you do need more tools than just the basics. You need GitHub to see what you need to build, or what the status is, or what the CI pipeline does. Do the tests fail? Having all these things available in one coding agent [08:22] actually makes it possible for me to have a workflow, or a thing I do, actually be done by an agent. And that's the important thing. Really, the compounding comes in by [08:38] doing more than just coding, because if you talk to an engineer, most of the work is maybe coding, but maybe it's actually 20%, and maybe 80% of the work is figuring out what to do next, or understanding what people's feedback is and how to interpret it.

8:57-10:27

[08:57] Um, [08:58] and what you can do here is, for example, a fun way is to use it to say, let's say, what did we ship in the last week? [09:08] So it knows stuff. [09:11] So I'm asking it [09:13] what we shipped. And [09:15] it will most likely look at [09:18] the Git log, because that's how we track what we did ship. And yeah, so it looks through the Git log, it looks at what we merged to main. [09:28] And... [09:29] Yeah, that's a fun way to use it. And for example, we can use this for product marketing. And it says, oh, these are the bug fixes: brief skip functionality, chat panel state, email summary XML tags. Major features: brief help monitoring, time zone auto-detection. These are all things we released. And now I can say... And it's written in a nice write-up. Yeah, it's more technical. Yeah. [09:59] There's, what, like six major features and five important bug fixes and three infrastructure updates. That's a lot. [10:08] Yes, it's a lot. And this week, we've been really leaning into, let's let AI do the work for us, and we're just [10:16] managing the AI. One other thing, for example, is if you have someone come to you like, oh, what is the status on this? Or, what are you going to ship next week? Let's see what it will do.
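A rough sketch of the "what did we ship in the last week?" summary, for readers who want to see the shape of it. In the episode, Claude Code reads the Git log on its own; the code below only illustrates the grouping step. The `feat:` / `fix:` / `infra:` prefix convention and both function names are assumptions made for illustration, not something described in the conversation.

```python
import subprocess
from collections import defaultdict

# Hypothetical prefix convention; the episode does not say how commits are labeled.
CATEGORIES = {
    "feat": "Major features",
    "fix": "Bug fixes",
    "infra": "Infrastructure updates",
}


def group_commits(subjects):
    """Group commit subject lines by their 'prefix:' label."""
    groups = defaultdict(list)
    for subject in subjects:
        prefix, sep, rest = subject.partition(":")
        label = CATEGORIES.get(prefix.strip().lower()) if sep else None
        if label:
            groups[label].append(rest.strip())
        else:
            # Anything without a known prefix is kept whole under "Other".
            groups["Other"].append(subject.strip())
    return dict(groups)


def last_week_subjects(branch="main"):
    """Read the subject lines of commits on `branch` from the last week."""
    out = subprocess.run(
        ["git", "log", branch, "--since=1 week ago", "--pretty=format:%s"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line.strip()]
```

Inside a repository, `group_commits(last_week_subjects())` would give a dictionary that can be turned into the kind of write-up read out above.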

10:28-12:26

[10:28] Can you see what is in the pipeline and what will come out soon? [10:34] So this is awesome. Nityesh, while this is going, if you want to jump in at any point, feel free to. At some point, I'll lob it to you, but also just feel free to jump in. [10:43] Yeah. [10:44] Yeah. [10:49] Um, [10:50] so we'll see. I don't know if it has project access, but you get the gist. If you have the information connected to the agent, [10:59] it's very easy to use it. And, you know, [11:02] it's very important to use a tool you're familiar with. And at this point, I think [11:08] Claude Code works the best for me. It is the most flexible because it doesn't only solve coding issues. And that's important. Lots of these coding agents are made to code. But I want to do more than coding. I want it to be [11:23] like a support in engineering in general. And I think the Claude [11:29] team really thought about that. They made it not too specific and they kept it general, while actually being really good at solving things and looking at what it did, thinking about the mistakes it made, and self-correcting. So [11:44] that is stuff coming together that's very hard, that makes it possible to [11:49] use now, yeah. [11:51] What's the difference between coding in Cursor and agentic coding? [11:57] Claude Code is such a simple departure from the Cursor and Windsurf that we're used to. Both of those have agentic coding capabilities, but Claude just takes it one step further by simplifying it by, I think, like a factor of 10. So what Kieran was telling earlier about how Claude Code may feel intimidating because it is a terminal, but in reality, it is so much simpler

12:27-14:04

[12:27] than Windsurf and Cursor, because there is nothing except a text box here. There's no Command-K, no shortcuts, no accept, delete, reject, remove. There's nothing. It's just a text box, and it works because the underlying Claude model is so much more capable now. So it's able to work for longer and do tool calls. So it works. [12:55] It's a simpler UI, which makes it at the same time more powerful, even though the underlying model behind Cursor and Claude Code is the same. [13:07] Yeah. And an example of this is, this morning I was pulling some metrics. I was like, why didn't we get any responses to this form? And for context, basically we have a form where we ask people how disappointed they would be if they could no longer use Cora, so we can tell how well we're doing. And we have a weekly meeting where we go through all the metrics, and you noticed that no one had filled out that form. So you're going into Claude Code and you're asking, [13:37] hey, like, [13:38] why is no one filling out this form? [13:40] Yeah, I was like, there has to be something, like this form was not sent. And I asked, hey, 14 days ago, something went wrong, [13:48] can you see what went wrong? And what it did, it made a checklist of to-dos, like fetching recent log changes to the controller, searching the code base. So it looked through what changed around that date, and it found...

14:04-15:36

[14:04] we removed a piece of code that [14:07] adds people there, which is here. It says, hey, actually, you just need to add this. And I said, OK, do it for me. Create a pull request. And it did that. And I said, oh, yeah, by the way, I'm also going to create a script that will then add everyone that we missed to it, [14:24] migrate it. [14:27] And that was it. And the fun part was, [14:30] it didn't cost me any energy. It was as easy as me writing it down in GitHub to look at later. I don't [14:39] need to. I just [14:40] ask it and it does it immediately, which is really nice. It's like inbox zero: does it take less than five minutes? Do it, kind of thing. Yeah, I think the thing that people may not fully realize is that [14:54] that task could take [14:56] anywhere from 30 minutes to a couple hours without... [15:00] without AI. And it's not just that, it would require you to focus on it and put aside time to sit down and do it. And now you just sort of [15:10] send off requests like that, and then you can send off another one and another one, and you have a bunch of these sort of working in parallel. So give me a snapshot of what that looks like concretely. What is your actual workflow? What are you actually doing? How many tabs do you have open? Are you actually doing any hand coding yourself? Do you have like five in parallel? Are you just using Claude Code? Give me a sense of that. [15:33] Yeah, I'll show you my screen as well.
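The "something went wrong 14 days ago, look at the history" hint from the story above amounts to narrowing the Git history to a window around a date. Below is a minimal sketch of that narrowing step, assuming a standard `git log` invocation. The function name, the three-day window, and the `app/controllers` path are hypothetical; the episode does not show what Claude Code actually ran.

```python
from datetime import date, timedelta


def history_command(days_ago, window_days=3, path=".", today=None):
    """Build a `git log` command covering a window around a past date."""
    today = today or date.today()
    center = today - timedelta(days=days_ago)
    since = center - timedelta(days=window_days)
    until = center + timedelta(days=window_days)
    return [
        "git", "log",
        f"--since={since.isoformat()}",
        f"--until={until.isoformat()}",
        "--stat",      # list the files each commit touched
        "--", path,    # restrict the history to one path
    ]


# Example: changes near "14 days ago" under a hypothetical controllers folder.
print(" ".join(history_command(14, path="app/controllers")))
```

The returned list can be passed to `subprocess.run` inside a repository; keeping it as a pure function makes the date arithmetic easy to check separately.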

15:36-17:16

[15:36] Maybe, Nityesh, you can tell what we did before, like when we got early access to Claude, we were excited, what we did. I'll share my screen. [15:45] Yeah, yeah. So this is like one day before the live stream was scheduled. We were like, okay, tomorrow coding is going to change. We'll have a much more capable model, which will be able to work for everything that we want. We're basically going to get like a coding genie for us. [16:15] we want the future, like tomorrow's superior model to solve. And we did that. We created like 20 issues in terms of, you know, what we want to fix, what were the things that we were planning to work on, and prepared the system for the new Claude model. [16:37] Yeah, and it was funny because Nityesh, he prompted ChatGPT to say, hey, tomorrow we have, we reached AGI. Can you help us come up with [16:52] everything we need to do, and prepare the AGI to solve everything we did? And then we fed that into the prompt improver of Anthropic. And then we used that as a prompt and [17:05] we created it. Wait, before you move on. So for people who are listening, basically you have this sort of Trello-board-type thing inside of GitHub, a Kanban board.

17:16-18:49

[17:16] And for each thing that you've identified as what you want to do, it looks like you have a document that lays out in detail, OK, if it's a feature or it's a bug fix or whatever, what it is and how to actually do it. Can you open up one of them? OK, so a feature is, you want to have AI-generated synthetic data. And this document has everything from a problem statement to a solution vision to all the requirements, [17:44] and all the technical requirements and a bunch of stuff. And it seems like it even has implementation steps with day counts and stuff like that. Which is funny. So this is all ChatGPT generated, right? One day is like one second. Okay. Yeah, yeah, yeah. So we use Claude Code and we have this custom prompt that we generated to create these. [18:06] Because it's a lot of work to create these. And even with ChatGPT, there's a lot of steps. You need to look at all the code, [18:14] think about it a lot. You have to think about it. Yeah, there's a lot of thinking, so it's really hard to do well. So what we did, we created a command in Claude Code. A command is kind of a custom prompt that you use a lot. And ours is like, hey, [18:30] Sorry, this is a command in Claude Code or a command in Cursor? Because you have Cursor open. Yeah, I have Cursor open because that's how I edit files, but it's Claude Code. So you can see we're in Claude and... [18:42] And I can use this command by hitting CCI, which is Claude Code, and then I say,

18:49-20:37

[18:49] something, like a problem I have, a bug, a problem, anything. So it's very low friction. So I have this CCI command, and Nityesh and I were just [18:59] jamming. We're like, oh, what if we do this? Oh, that sounds cool. And then voice-to-text, and it starts. [19:07] So let's see how this works, and then while it's running, we can go over the thing. [19:13] I want infinite scroll in Cora, where if I am at the end of a brief, it should [19:21] load the next brief, [19:23] and it should go until every brief that's unread is read. So, yeah, I just want people to understand, Kieran almost never types anything and does all voice-to-text. So he was just doing voice-to-text into his terminal, into Claude Code, with, I believe, an internal, as-of-yet-unreleased Every incubation called Monologue, [19:48] which he is the number four biggest user of, but still under wraps. But, you know, a little preview in here, coming soon. And basically what it seems like it's doing is it's taking that. Is it turning that into that document that we were looking at earlier, or is it actually going and executing it? [20:10] Yeah, so what it does is it will insert whatever I said here in the feature description, and then it will follow all these steps. And these steps are research, research best practices. So one is grounding itself in the code base, so researching what exists. Then it's researching best practices, so it's searching the web, finding open source patterns. So it's grounding it in best practices in general.

20:40-22:25

[20:40] Mm-hmm. [20:41] And when I say, yep, sounds good. I like that human-in-the-loop review for the plan, because sometimes it [20:47] does it wrong, but most of the time it's right. Then I say, yep, sounds good. And then it creates the GitHub issue and it will put it in the right lane. [20:56] Oh, interesting. So that whole Kanban we were looking at in GitHub earlier, you've created a way for you to speak your feature into Claude Code, and then it does all the research to create that long document and then just adds it into GitHub issues. That's really cool. [21:26] for it, like the tool's made to code. Yes, you can create markdown files and all of that, but let's lean into [21:34] like an issue tracker. It exists and it works well and people use it, and it already hooks into existing [21:43] patterns. We can give this to a developer and they can implement it. [21:49] Yeah. And one of the things just to point out is, you're running this, and I think one of the special things, when we saw Opus 4 for the first time and we were like, holy shit, is that it just runs forever [22:03] without any intervention and then gives you a pretty good result. We've had sort of agentic-type things for a little while, but it's just a way different level of autonomy and quality than we've ever had before. And it's just checking things off of this to-do list in a way that I think other agent loops are just going to be a lot less thorough.
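The issue-creation workflow described above (ground in the code base, research best practices, draft a plan, get a human "yep, sounds good," then file the issue) can be sketched as a small pipeline. This is an illustrative outline under stated assumptions, not the team's implementation: every function and class name here is hypothetical, and the research and planning steps are passed in as plain callables standing in for whatever the agent does.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Issue:
    title: str
    body: str


def plan_feature(
    description: str,
    research_codebase: Callable[[str], str],
    research_best_practices: Callable[[str], str],
    draft_plan: Callable[[str, str, str], str],
    approve: Callable[[str], bool],
) -> Optional[Issue]:
    """Run research, draft a plan, and file an issue only after human approval."""
    codebase_notes = research_codebase(description)        # what already exists
    practice_notes = research_best_practices(description)  # outside patterns
    plan = draft_plan(description, codebase_notes, practice_notes)

    # Human-in-the-loop gate: nothing is filed until the plan is approved.
    if not approve(plan):
        return None

    lines = description.strip().splitlines() or ["Untitled"]
    return Issue(title=lines[0][:80], body=plan)
```

The point of the `approve` gate is the one made in the conversation: the plan is usually right, but a wrong plan is cheapest to catch before it becomes an issue someone implements.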

22:25-23:56

[22:25] Yes, absolutely. [22:27] Me and Kieran have a fun thing going on where we're trying to see who can have Claude Code running for the maximum amount of time. [22:34] Kieran is topping the list right now. 25 minutes. He ran it for 25 minutes. I'm only at eight minutes right now. Oh, man. [22:45] How did you get it to go so long, Kieran? [22:49] A very, very long plan. Yeah, it's just a very, very complicated long plan, and it also includes a lot of tests, and just make sure that it runs all the tests and fixes all the tests. Interesting. It goes pretty long, yeah. Take me through, wait, I want to understand, how did you make that prompt that creates the prompt? Like the prompt that creates the research document? So how did you know which elements to put in? Did you just [23:17] do the same thing, where you use the Claude prompt improver, the Anthropic prompt improver, to make that? Or? Yeah. How did you think about putting that together? [23:27] Yeah, this is part of the compounding effect. It's having an idea that has a lot of outcomes. So this was what Nityesh sent me. He said, we just got AGI. It got delivered and we can write software. [23:47] This was your initial prompt, which is kind of fun. It's very dramatic. And then ChatGPT said, I'm ready. Okay, so now do this.

23:57-25:33

[23:57] Okay, yeah, that's fine. [24:01] That's fine, but do you know the [24:06] Anthropic Console prompt improver? You're like, what is that? Well, for anyone that doesn't know, this is the... Oh, they changed it. [24:15] Is this Console? Yeah, it is the... [24:17] Oh, yeah, it is. Okay. They changed it a little bit. This is great because basically you paste in a prompt or something like that. And you can say, yeah, we love thinking, and you click generate and it will improve the prompt automatically. [24:30] And you think, [24:31] how good can it be? It's pretty good, because it's also very low friction. So it's very easy to just take a minute to see if something comes out, if it works. If it doesn't work, delete it. [24:43] Doesn't matter. We were just jamming and we were like, well, we're going to come up with 30 research tasks, so we better have a prompt. So I just copied this prompt. [24:53] And that became the document. Yeah. OK. Into here, and changed the arguments. And then you can trigger those in Claude by doing slash. And we have these two [25:04] custom prompts here. [25:06] Hmm. And then... [25:08] I think that actually gives me a much better idea of what you mean by compounding engineering, because what it says to me is, what you did first is spend time building a prompt that effectively builds other prompts, because those research documents are effectively prompts for Claude Code. And so now that you have a prompt that builds prompts, every time you want to make a new feature, you have to specify less.
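To make "a prompt that builds other prompts" concrete: Claude Code custom slash commands are stored as Markdown prompt files, with a `$ARGUMENTS` placeholder standing in for whatever is typed (or dictated) after the command. The sketch below imitates that substitution in plain Python. The template wording and the function name are invented for illustration and are not the prompt used in the episode.

```python
# Hypothetical command template; the real prompt is not shown in full in the episode.
COMMAND_TEMPLATE = """\
You are preparing a GitHub issue for this feature request:

<feature_description>
$ARGUMENTS
</feature_description>

Steps:
1. Research the existing code base for related code.
2. Research best practices and open source patterns on the web.
3. Present a plan and wait for human review.
4. After approval, create a detailed GitHub issue.
"""


def render_command(template: str, arguments: str) -> str:
    """Fill the $ARGUMENTS placeholder with the spoken or typed request."""
    return template.replace("$ARGUMENTS", arguments.strip())


prompt = render_command(
    COMMAND_TEMPLATE,
    "I want infinite scroll in Cora: at the end of a brief, load the next one.",
)
print(prompt)
```

The compounding shows up in the ratio of input to output: one sentence of request expands into a full set of research and planning instructions, and improving the template once improves every later request.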

25:37-27:18

[25:37] build it out into a big document, versus before, every single time you have to do a feature, you have to say, first, I want you to research it. And then I want you to think through all these different corner cases, or the ways that, you know, I like things built, or whatever. [25:51] I think that's so cool. And what's also really interesting to point out is, it's working while we've been talking, and that's just a different way to code. We were on the phone together last week or the week before, and we were testing this out together, and I shipped a feature that went to prod while we were talking, and I'm not in the code base at all, so it's kind of crazy that that actually happened. [26:21] more social way to code, like we're coding right now, building stuff, which was not possible before. Hey there, Dan here. I wanted to take a one-minute break from the episode to tell you about our latest sponsor. All right, let's play a game. What powerhouse productivity tool is also free for individuals? Nope, not that one. Try again. You may not expect this, but it's Microsoft Teams. [26:51] big enterprises swear by also has a free plan for individuals. Whether you're jamming on a side project or bootstrapping a startup or building a community, Teams has all of the features that other platforms nickel-and-dime you for using. You can get unlimited chat, 60-minute video meetings, file sharing, and collaborative workspaces, all for free. And the real magic is that everything is integrated in one seamless collaborative workspace. That means there's no more hopping

27:21-28:52

[27:21] meetings and file sharing. Teams puts it all at your fingertips to save you time and money. So ditch the app overload and the subscription fatigue and use Teams to experience effortless collaboration today. Are you ready to streamline your workflow? Head to aka.ms slash every to use Teams for free. [27:40] Yeah. [27:40] Your productivity will thank you, and so will your wallet. This episode is brought to you by Attio, the AI-native CRM built for the next era of companies. [27:49] With Attio, setup takes minutes. Connect your email and calendar and it instantly builds a CRM that mirrors your business, [27:55] with every contact enriched and organized from the start. From there, Attio's AI goes to work. It gives you real-time intelligence during calls. It prospects leads with research agents. And it automates your team's most complex workflows. Industry leaders like Union Square Ventures, Flatfile, and Modal are already building the future of customer relationships on Attio. Go to attio.com slash every and get 15% off your first year. That's A-T-T-I-O dot com slash every. And now, back to the show. [28:25] Yeah, absolutely. [28:27] And so while we were talking, we did the research and we created this issue, which is cool. And we had, I think, six or seven running at the same time because we were just like, new idea, let's go. New idea, let's go. And what we also did, we went through user feedback. We read emails. Just everything we could, we gathered, and we were just brainstorming. And it's really fun because...

28:52-30:28

[28:52] if you're in this brainstorming [28:54] place, you can just kick off agents and see what they come up with. And [29:01] take another time to then review. So what we do also is... [29:06] Yeah. [29:08] I agree with you, it's really fun to do this together on a call, [29:13] because that's where magic happens, and there is still a human review [29:19] step here, because we found that [29:21] we want to look at it, see if it makes sense, if anything is missing. This is having taste, experience, intuition. [29:30] Thank you. [29:31] Like the bug I solved earlier with the email not going out. Nityesh did the same with his Claude Code, but it didn't give the right answer. Yeah. So there is still a human touch of intuition. Like, I hinted at, look at the history, [29:51] and that actually made it think in the right direction. And Nityesh didn't add, look at the history. And then it said, no, everything works fine. So there is still intuition, and [30:01] it's still a skill. It's still a skill. [30:05] It is a skill for sure. Yeah, it's not. [30:08] Yeah, it's absolutely a skill. There's no magic prompt that does everything. [30:13] It is about using it the right way and using it to its strengths, for sure. [30:19] Yeah. Nityesh, how have you found this all? Because I know, you know, Kieran is like a long-time Rails expert,

30:28-32:14

[30:28] person who's just an incredible programmer. And I think you're a little bit earlier in your programming journey. So what has that been like, to come to Every, start working on Cora, and start working on it in this way? [30:40] Yeah, no, this has been [30:42] incredibly eye-opening, I would say, because honestly, my experience with programming is that two years ago when ChatGPT came out, I thought, okay, now it's perfect for me to teach myself programming and build that SaaS application that I always wanted to. So I taught myself programming using ChatGPT from the very first day. So I have gone through all the transitions. I went from ChatGPT, and then when Cursor came out, I shifted the workflow to Cursor. [31:12] And, you know, I was always thinking, okay, I am [31:15] at the forefront. I don't know any of my friends who are doing so much with AI, and I'm at the forefront. And then I join Every and start working with Kieran, and Kieran is at a whole other level. [31:29] In our meetings he's like never writing code, he's never typing, he's always speaking into the computer. So I was like, okay, I need to log that into the flow. And then even when [31:45] Kieran actually pushed me into using it, and clearly it is now the way to program. Me and Kieran, both of us, we haven't even touched Windsurf or Cursor in the last three weeks or so. Or even if we do touch it, it's usually just because we want to read something. It's basically like we're using it because we don't have VS Code on our computer. It wouldn't matter if it was VS Code, like the older VS Code, or Cursor, Windsurf, because

32:15-33:46

[32:15] all the AI stuff is happening with Claude Code now, and it's really fun to [32:23] be in this position where the entire coding landscape just changes completely every three months, and you realize nobody's at the forefront. [32:33] I got to say, I'm jealous of you learning to code right when ChatGPT came out, because I learned to code from books like 20 years ago. PHP 4 for Dummies. Yeah, like BASIC. Learn BASIC in 24 days. Like Sams Teach Yourself BASIC or whatever. Dell 5.5. Yeah. And also, it's so funny for you to say, I thought I was sort of at the forefront of AI coding and then I joined Every and started working with Kieran. [33:03] I don't know if, like, there's a scene in Star Wars, the prequel, Episode I, where they're under the water and they're being attacked by a sea monster and it looks like they're going to die. And then another, bigger sea monster comes out and just eats the one that's killing them. And Qui-Gon is like, there's always a bigger fish. There's always a bigger fish. [33:26] And yeah, Kieran is the bigger fish. But I feel the same. [33:33] say that about me, but I'm like, I have no idea what I'm doing. Like, I'm running behind. We need to do like a million more things. So that's just the reality of the landscape. There is always more.

33:46-35:28

[33:46] But it's really about practice. Like, you should practice using AI. You should push yourself every day. If you don't, you'll miss... [33:54] very cool stuff. Yeah. Well, I guess I'm curious, [34:00] personally and also for people in the audience, what are the problems with this, right? So basically, it sounds like you're moving to... [34:07] a form of coding where you don't touch the code. [34:10] You're one level above. And so what are the problems that come up with that, and how are you solving them? Like, what are the new engineering practices that you need to incorporate in order to make sure that things go well? [34:22] For me, the most important realization has been this thing that I always keep going back to, especially with Claude Code. I read this in that management book, High Output Management, which the Intel CEO wrote, like, [34:40] 50 years ago. And in the first chapter he mentioned something like how, in any production process, you should fix any problem at the lowest-value stage. And I just can't stop thinking about that statement, because AI and Claude Code can now do so many things for us that it has become really important to focus on the earliest part of things. So what I mean by that is, [35:07] when we are using the workflow that Kieran just showed to create a very detailed GitHub issue, then it's very tempting to start another Claude Code and ask it to just, hey, go now work on this GitHub issue and fix it. But that's actually going to be a problem because...

35:28-36:52

[35:28] there are chances that, you know, the plan that Claude was able to give in that issue wasn't the direction that you wanted to go. And you want to catch that before you ask Claude to go and implement the solution, and then you want to fix it over there. [35:46] That makes perfect sense. I really, really like that idea. The thing it reminds me of is just... [35:52] it's like all this stuff is like a lever, and the further out you get on the lever, the more power you have, but also the more power you have to go in the wrong direction. Like, every little inch makes a big difference at the end. And so trying to catch it earlier, I think, is the thing that makes sure that you're not shooting off into space. Or, this lever metaphor is totally breaking, but you know what I mean. Like, if you point... [36:22] Like, one inch means thousands of miles of difference. [36:28] And so I guess the same thing is true with AI stuff. And I think that's actually a good lesson for me, because I tend to want to rush through the planning stuff. It's just hard for me to look at a document like that, like the thing that Claude is writing, and concentrate on it. [36:46] How have you guys found that? [36:49] Yeah, it's kind of boring to read most of the time.

36:57-38:40

[36:57] But you can make it more fun. Like, you can just say, minimal, minimal, this is too much. But then the thing is, then it misses things again. So it's actually important. So... [37:08] for code, I like it to... [37:11] focus on user stories, or, like, asking questions and answering them. So let's say, hey, what are some questions a good [37:19] PM would ask about this that we should consider, and give, like, two options. It's more fun to read that than, like, week one, we'll do this; week two, we'll do that. [37:30] PRDs are boring, and you can make them a little bit more fun or give more examples. Like, you can shape that research. And that's normally what we do in the human review step. It's like, do we see any red flags? [37:47] Do we need more stuff to be added? Because it will save so much time. Yeah. That actually reminds me of something that we're finding in another part of the business. So Danny, who's been on this show, is the GM of Spiral. And inside of Spiral, we're building a writing agent. So you can think of it sort of like Claude Code, but specifically for writing tasks. And I think there's something similar about that, where sometimes you want that writing agent to shift into, like, an interview mode, where it tries to understand more about who you [38:17] are and what you want, rather than just spitting out a bunch of stuff that you then have to read through. [38:22] And it sounds like there's maybe something missing here in Claude Code or these sort of coding workflows, where it would be really nice if, instead of having to read that long document, it's finding ways to ask you questions so that the thing it outputs is more likely to be right without you having to read through the whole thing.

38:41-40:12

[38:41] Yeah, absolutely. That's an interesting idea for a custom command. Kieran, we should totally try that. Yeah, for sure. Like, this is something we should automate and make better, for sure. And at the same time, it knows a lot because it has access to your code base and your style. And that's very powerful. So, like, you have the code base and it's actually pretty good at [39:06] doing it. Like, I think, [39:08] in addition to making it very good at the beginning, I think just boring, traditional tests and evals are very important as well. [39:19] Because how do you know... [39:21] what you did is actually working well? You can open a console and click through it, but, like, why? Just have it test it. Write a test for it. Like, just bare-minimum smoke tests are great, where you just see, [39:36] does it kind of work? Because otherwise it does way too much. But it's a very good way to have it iterate and fix things by itself. And we haven't tried it as much yet, but we use the Figma MCP, where we say, hey, implement this from Figma, and then... [39:55] Now you can have Puppeteer take a screenshot of a mobile version and then say, compare the two. Like, we haven't really tried it out, but we want to try more of that out. So there are these checks in place, tests in place, that you normally [40:09] do manually.
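To make the "bare-minimum smoke test" idea above concrete, here is a minimal sketch in Python. The function `summarize_inbox` is a hypothetical stand-in for whatever feature the agent just implemented; it is not taken from the Cora code base. The point is only that the test answers "does it kind of work?" and gives the agent something cheap to run while it iterates.

```python
def summarize_inbox(emails):
    """Hypothetical feature under test: count total and unread emails."""
    unread = [email for email in emails if not email.get("read", False)]
    return {"total": len(emails), "unread": len(unread)}


def smoke_test():
    """Bare-minimum check: the feature runs and returns plausible numbers."""
    sample = [
        {"subject": "Hi", "read": False},
        {"subject": "Re: Hi", "read": True},
    ]
    result = summarize_inbox(sample)
    assert isinstance(result, dict)
    assert result["total"] == 2
    assert result["unread"] == 1
    return True


if __name__ == "__main__":
    smoke_test()
    print("smoke test passed")
```

A test this small does not prove the feature is correct. It only catches the case where the change does not run at all, which is the failure an agent can fix by itself without a human clicking through a console.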

40:12-41:58

[40:12] And the same for prompts, like evals for prompts. So I kind of think of an eval as like [40:21] writing a test for [40:22] code: an eval is a test for a prompt. And what I've seen last week as well, I had Claude Code [40:30] run an eval and then say, actually, [40:33] it fails... [40:34] four out of ten times. I said, run it ten times. Does it always pass? No. Four times it doesn't. I said, oh, look at the output. Why didn't it call that tool? It was a tool-call test. And it says, oh yeah, it wasn't specific enough. And I said, okay, just [40:51] keep going and change the prompt until it's passing consistently, all the time. And it did it. Like, I just walked downstairs, got a coffee, walked up, and... [41:01] that was it. So evals are also very powerful because... [41:07] they will tell you if a prompt works, [41:10] similar to writing code, where tests say your code works. So leaning into those more boring, traditional ways of... [41:18] is also very powerful. [41:20] That makes sense. I have a thought. And because one of the things I think is really special, and I think, Nityesh, you're in this boat too, so you tell me if I'm wrong. But one of the things I think is really special about you, Kieran, is that you just test everything. So, like, you've tested every single agent. Nityesh, have you used a lot of the agents as well? [41:41] No, that's Kieran. [41:42] Okay. Well, I think we could still do this. I think it'd be kind of fun. I want to spend five minutes with Kieran doing an S-tier through F-tier ranking of agents. And so what I'm going to do is I'm going to share my screen.
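The eval loop described earlier in this segment (run the same prompt ten times, count how often the expected tool call happens, and keep changing the prompt until it passes every time) can be sketched roughly as follows. This is an illustration under stated assumptions, not the setup used on Cora: `scripted_runs` is a hypothetical stand-in for a real model call, and `pass_rate` and `is_consistent` are names invented here.

```python
def pass_rate(run_once, trials=10):
    """Run the eval `trials` times and return the fraction of passing runs."""
    passes = sum(1 for _ in range(trials) if run_once())
    return passes / trials


def is_consistent(run_once, trials=10, threshold=1.0):
    """True only if the prompt passes at least `threshold` of the runs."""
    return pass_rate(run_once, trials) >= threshold


def scripted_runs(outcomes):
    """Hypothetical stand-in for a model call: replays fixed pass/fail results."""
    remaining = iter(outcomes)

    def run_once():
        # A real eval would send the prompt to the model here and
        # check whether the expected tool was called.
        return next(remaining)

    return run_once


if __name__ == "__main__":
    flaky_prompt = scripted_runs([True] * 6 + [False] * 4)
    print(pass_rate(flaky_prompt))  # six passes out of ten runs
```

Running the eval many times matters because model output is not deterministic: a prompt that passes once can still fail four times out of ten, as in the episode. Requiring a threshold of 1.0 corresponds to "passing consistently, all the time."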

41:59-43:36

[41:59] And I'll call out an agent and then you tell me where it ranks. [42:05] Are you game? [42:07] Yeah, let's do it. Okay, cool. [42:10] Let's see. [42:15] Let's do Cursor. [42:18] Yeah, so it's fun, because Cursor, like, which Cursor? Is it Claude 4? Is it Max? Is it... Cursor on the best possible settings. [42:29] And is it the background agent or is it the... [42:33] Okay, Cursor, traditional, best possible settings, Claude. Like, that's the confusing part about Cursor and Windsurf. There are, like, a million versions of it. And, like, why don't you just have the best version? And that's what I love about certain agents. They just say, look, this is the best agent. [42:50] So that's why it wouldn't be the best. I would say... [42:55] A. A? Okay. Cursor is very good with Claude 4. All right. Windsurf. [43:05] C, because they don't have... [43:08] Claude 4. It's ridiculous, because three weeks ago they would be A. [43:15] And now they're not. Wow. Wow. Okay. Because I switched from Windsurf to... [43:22] or from Cursor to Windsurf, like, a few months back. But I switched back. [43:28] Okay, so we've got Windsurf is a C, Cursor is an A.

43:37-45:08

[43:37] Let's see, Devin. [43:43] It's a B. B. [43:45] Why? [43:47] Um, [43:48] it's... [43:50] it's not as integrated. It's a little bit hard to set up. And the code quality is... [43:57] like, it's not as well-rounded as... [44:01] Cursor or Claude Code. I don't know if they use Claude for the background, but it's not as usable [44:08] as the others. [44:10] Charlie. [44:13] Charlie is... [44:16] like, for code reviews. So we use Charlie for code reviews mostly. So [44:21] I haven't really used it as an agent as much. I think Charlie as an agent is B, but it's A as a code reviewer. Like, I really... [44:32] like the code reviews it does. So that's interesting. Like, it's really good at something. [44:37] And then what about Friday? [44:39] I put Friday... [44:45] higher than Cursor, maybe between S and A. [44:51] And it's funny, because they don't even use Claude 4 yet. They're still... Wow. They're still working on how... [44:59] how they, like, really make it work well. It's 3.7. But, like, why I like it there, it's definitely different than Claude Code, but Friday

45:09-46:49

[45:09] has a very opinionated way of [45:12] working, [45:14] and I love their opinions, and it really works well. And it just does it. Like, you give an issue, they make a plan, you approve, and it does it. [45:24] It creates a pull request. And I've seen it do stuff that [45:28] I couldn't do with Claude Code. Like, for example, implement this Figma design. It just one-shotted a Figma design for the assistant. [45:37] And I've seen [45:39] multiple moments like that, where it did things and I was like, wow, okay, this... [45:44] I taste the future, which is really unique. And it's a small team as well. So, really cool. Codex. [45:52] This is... [45:55] B for me. All right, Codex is a B. [46:00] Copilot. [46:04] Uh, I haven't used Copilot. [46:08] You never used GitHub Copilot? No, I mean, I used it three years ago, but, like, no. Okay, let's be fair. I tried it maybe half a year ago, and after one second I stopped using it. [46:24] Where do you rank it? Uh, D. Like, it was not agentic. But, I mean, I should try the new version, [46:34] for sure. We have not tried... Yeah, we haven't tried the agentic Copilot. So that's not totally fair. But okay. Are we missing anything? I feel like we are. Claude. Claude Code. Well, obviously Claude Code. But I assume S tier, baby.

46:50-48:20

[46:50] We have Factory as well. Oh, yeah. Where do you rank Factory? It's interesting. Factory with certain things is, like, better than any others. But it's not my style. [47:04] It's for more enterprise-y people that are very nerdy and want, like, absolute bangers of... [47:11] code? [47:12] And it's actually good at, like, multi-repo stuff like that. It's a little bit hard to use because it's on the web, but also local. So [47:20] I'd rate it a B. [47:21] Maybe a little bit below Codex and Devin. [47:25] Yeah, but it's like there is a use for it, for sure. There's something going on. There's something good there, but it's maybe not for us. It's not my thing. Yeah. Yeah. [47:34] Amp also. [47:36] Amp? What's that? Amp. A-M-P. Amp. Oh, Amp. Sorry. How's that? I would put it... [47:48] S tier, under Claude Code, between... Whoa, another S tier? Yes. All right, why? It's very good at just getting work done. [48:02] The ergonomics are pretty good, good tools already. Like, the people that [48:08] build it use that tool. [48:09] They're dogfooding. Like, you can feel from Claude Code and Amp, they're developers that love agents, and they're just building the best thing, and they're trying new things out. [48:19] Um,

48:21-50:10

[48:21] So yeah, that's it. Let's see... [48:24] Thanks. [48:25] This is exactly why Kieran is the big fish. [48:28] I mean, you're stringing them together. Like, you're using Claude Code and Friday and other stuff all at the same time, which is, yeah, the thing that is really cool. [48:41] Yeah, like, how I think about it, it's like you're interviewing for a role and you find a developer to solve a certain problem. I think it's similar with coding agents. Like, Friday is good at doing UI now. So if I need UI work, I will go to Friday. If I need to do research, I go to Claude Code. Yeah. [49:03] And yeah, if I want a code review, I use Charlie. Like, [49:08] it's fun, and agents work together. You don't need to have one agent. We have Claude Code. And that's because Charlie, like, works in GitHub. So you can just, like, CC Charlie and Charlie will do the code review on the PR. [49:21] Yeah, so we use GitHub and pull requests and normal developer flows, so [49:27] humans can hook in. So we can hire someone that's very good at a specific thing to review code, and then Claude Code will just do the work. But it's very powerful because it is just an ecosystem that we [49:40] refined over, like, 20 years or whatever, and it works. So let's lean into that. And that's probably why Copilot will probably be [49:50] fine, since it's in there already. Wait, you actually did that recently. We had some infrastructure things. We've handled tons and tons and tons of emails at Cora, so we had some infrastructure issues to work out. And I think you brought in someone who's a real expert and then worked with them in a specific agentic way so that

50:11-51:43

[50:11] you got what you needed from them, but it was less work for them. [50:14] Yeah. Yeah. So, like, there was no issue yet, but we wanted more visibility into delivery of, like, the most important things. And, like, I'm not very good at it, or, like, I know stuff, but let's bring in someone. And what we did, we just had a conversation, like a two-hour call, and I recorded everything. And at the end, I just fed that into Claude and said, okay, can you make [50:41] two issues, research issues, from this? And, like, 10 minutes later, I said, okay, here are the issues. Can you review them? And he was like, holy what? What? Like, this guy, he's not an AI skeptic, but he's very good at what he does. And normally what he does, AI is not good at yet, because there are things AI is not as good at yet. But he was very impressed with it, and he had, like, very good [51:07] comments on it to iterate over it. And what we basically did, we just iterated more quickly through ideas because we had something to talk about. And then the next day, when he had done the human review, I said, let's go. I just used Claude Code to implement it. And we sat down and did the code review. So it's like, [51:29] it just accelerated. What would have taken two weeks, maybe, is now done in, like, a few hours, which is really cool. [51:36] I love it. Well, there you have it. You've got your tier list of agents. Claude Code takes the cake. We've got Amp

51:44-53:10

[51:44] coming up in second, and GitHub Copilot, unfortunately, bringing up the rear, but with room for improvement once we try out their agentic capabilities. Anything else you guys want to say or talk about before we end today? [52:00] Everyone should use Claude Code or try it out, even if you're not technical. Subscribe to their Max or Pro plan. It's only $100 per month. You have unlimited access. If you're skeptical about being technical, it's very easy. And I've seen people... A friend of mine, he used Cursor. And I said, just use Claude Code. It's better. He was like, how much better can it be? And he said, yes, it's better. And he rebuilt everything he did with Cursor, vibe coded, [52:30] in Claude Code. And he's like, yeah, this is great. He felt that next step. And everyone should try it and really [52:38] push their tools. Yeah. [52:40] Mm. [52:41] Nityesh, any other words of wisdom? [52:44] Just be sure to check the AI's work at the lowest-value stage. You want to catch those problems early. [52:52] Yeah, that's a great one. And also use Cora. cora.computer, check it out. It's pretty awesome. We're shipping new things all the time. Thank you both for coming on. This is a true pleasure. I cannot wait to see what else you cook up over the next couple months. And we'll talk soon. Thank you. Thank you so much.

53:18-54:01

[53:18] Oh my gosh, folks. You absolutely, positively have to smash that like button and subscribe to AI & I. Why? Because this show is the epitome of awesomeness. It's like finding a treasure chest in your backyard. But instead of gold, it's filled with pure, unadulterated knowledge bombs about ChatGPT... [53:41] on the edge of your seat, [53:42] craving for more. It's not just a show. It's a journey into the future with Dan Shipper as the captain of the spaceship. [53:50] So do yourself a favor, hit like, smash subscribe, and strap in for the ride of your life. [53:55] And now, without any further ado, let me just say, Dan, I'm absolutely, hopelessly in love with you.
