Transcript

Cheap execution and building all the candidates

0:00 · This is maybe a somewhat contrarian view to a lot of people in AI. I actually don't think that the future is going to be hyper-personalized software down to the point where everyone is running their own version.

0:09 · I actually think it's going to be quite hard for all of us to have our own internal chat tool. Silicon Valley overall is undervaluing the local computer. And my default argument for that is always: how come we're all using MacBooks and not an iPad or a Chromebook? And now, when I think about Claude as this entity that is supposed to be very useful to you, tremendously useful to you, I think that entity needs to have access to all the same tools you have access to. Otherwise, it's going to be hamstrung in all these complex ways.

0:41 · Hey everyone, welcome to the Latent Space podcast, our first one in the new studio. This is Alessio, founder of Kernel Labs, and I'm joined by swyx, editor of Latent Space.

Intro in the new Kernel studio

0:50 · Yeah, so nice to be here. Thanks to TJ, Alessio, and Ellen for helping to set everything up. It looks beautiful. We even have the logo outside. Yeah. It's really nice when you walk in here as a guest; you're like, "Ah, this is a serious production." You feel it immediately.

1:04 · Yeah. Felix, you're currently product manager of Cowork, or really technical lead. Yeah. The identities are kind of vague. Member of technical staff.

1:15 · I know, member of technical staff is the official title we'll carry around forever.

1:19 · Yeah.

1:19 · We've been kind of obsessed. I've been using it a lot, even for managing Latent Space: Cowork helps me upload videos, title things, edit, and everything. It's really amazing.

1:32 · Cool. He said multiple times "Cowork is AGI" in the group chat.

1:36 · Yeah.

1:36 · Yeah. So we have a second channel for Latent Space TV, and this is our Discord meetup.

1:46 · And we have one like "Claude Cowork might be AGI." I don't know if we have uploaded it yet, but one of the sessions was a Claude Cowork thing. I would love to see it. I'm so curious. One of the most fun parts of my job is to constantly see the weird things people use Cowork for, because it's obviously very hard for us to design for specific use cases. We do, but every single person who's most amazed is usually amazed about a thing that I didn't even expect Cowork would be good at. We have a new designer, and as one of the first small tasks I was like, "Hey, we need a new emoji for Cowork for our internal Slack."

2:20 · It's a pretty small thing. I was like, "Can you please do it?" And he drew an SVG and just gave it to Cowork and was like, "Can you animate this emoji?"

2:26 · And now it has this beautiful loopy animation. I think obviously this goes down to "it turns out you can do more things with code than you expected," but it's that kind of stuff that is really fun to me. So, long story short, I would love to see the kind of things you're doing.

2:41 · I'll pull it up. I'll pull it up. Yeah.

2:43 · But before we get into it, I always want to start with a top-level "what is Claude Cowork?" for people who haven't heard of it or haven't tried it out.

Okay, real quick: Claude Cowork is a user-friendly version of Claude Code. The way it basically works is we have Claude Code, for us a fairly impressive agent harness, and over December we noticed more and more people using it even though they're not technical and not at home in the terminal, or they are at home in the terminal but started using Claude Code for non-coding workloads, like managing expenses, filling out receipts, or organizing a knowledge base. There was a big Obsidian moment that a lot of people liked, and we wanted to capitalize on that but also bring this capability to people who are not terminal native and who might not know how to brew install something. So Cowork is Claude Code running in a virtual machine with a little bit of padding and a few more guardrails, making it a little safer and a little more convenient for people who don't want to first open up the terminal when they go to work.

It's interesting that it's pitched that way, as a more user-friendly thing, because I treat it differently. I'm familiar with Claude Code, we did a Claude Code episode a year ago, but to me this one is even more of a power-user tool, because it integrates much better with Claude in Chrome and all the other tooling. But maybe that's a perception thing, right?

No, honestly, I don't think you're wrong. This is a thing I've been thinking a lot about for the last two weeks. When people say "user-friendly," they hear "oh, it's the dumbed-down version," but no, actually this is the superset. A similar thing happened to me about 10 years ago, maybe 12 years ago, when I was at Microsoft and we started working on Electron and browser-based technologies and cross-platform stuff. One of the first use cases was Visual Studio Code, which used to be a website, and the initial narrative was, "Oh, Visual Studio Code is a more user-friendly version of Visual Studio." In a similar vein, there were some voices saying, "Oh, this is not for serious developers, we're not going to use this for anything." People have different stories about why Visual Studio Code became such a big thing, but my personal belief is that the hackability and the extensibility played a pretty big role, right? You can hook Visual Studio Code into almost any workload. It's so easy to hack on, so easy to build extensions for. And I think Cowork might be hitting a similar thing, where it's very easy to extend and very easy to bring into your workflows. So the convenience is obviously the thing we strive for as developers, but I think the way people then find value in it is by properly mapping it onto whatever they actually have to do in their job.

What Claude Cowork is

Why user-friendly can be more powerful

5:29 · So, end of last year, you see the spike of non-technical usage in Claude Code. What's the design process to say we should make Claude Cowork? Because, I mean, you built it in only 10 days.

How Anthropic built Cowork

5:38 · I'm sure there was some discussion before on what "easier to use" means.

5:43 · You know, making a desktop GUI is obviously one way to do it, but there's a lot of nuance in the product. Maybe talk people through what the trigger was for "we should build a separate thing."

5:55 · We should not build a different Claude Code thing. And then maybe some of the more interesting design decisions that maybe you didn't take.

Yeah. I think at Anthropic we've been thinking about ways to move people who are comfortable with using Claude to answer questions toward the power of this thing to execute tasks for you, right? It can solve problems for you, it can build things for you. How do we bring that capability to people who are currently mostly comfortable with a question-answer paradigm within the chat? We've had a lot of prototypes around that, going back easily a year and a half. We had a lot of people working on that. And internally, Anthropic is a very prototype-first, demo-first culture. We have a lot of internal prototypes that don't reach the public. What Cowork actually became is that we picked the right pieces out of the many prototypes that we had, right? And that's maybe also an important qualifier whenever people mention this 10-day number. It's important to me to mention that we didn't start from scratch. There was a lot of stuff already happening, right? I think it's important for people to remember that when you build a website, you use React, you use a bunch of other things, and this is a similar scenario, with a lot of pieces we already had. And in terms of decision path, I think we live in an interesting new world where execution is actually quite cheap.

Prototype-first product development

7:12 · Mhm.

7:13 · So maybe what you would do... That's so crazy to hear. It's wild, right? It used to be: ideas are cheap, execution is the hard part.

No, but we used to live in this world, maybe, where you would take a product manager, and the product manager would go to a number of potential customers and, in this very low-bandwidth way, would try to tease out what problems they're having and what they're willing to buy, and then maybe what you can build to address that need. Then you go back and draft a spec and think about it, and then you make a design and you execute it. Internally at Anthropic, we're now probably much closer to the point where we're like: don't even write a memo, just build. Let's build all the candidates very quickly, let's just build all of them, and then pick the best ones.

7:56 · I think the decision that is most impactful, both for the product and for the users right now, is the way we put value on your local computer. I think that's a big decision point. A lot of people have thought about whether this thing, whatever it is, should ultimately run on your computer or in the cloud, because there are big trade-offs, right?

I guess if we solved auth, it would be easy to do in the cloud. But I think the fact that I can just download any file from anywhere and then put it in Cowork is a big unlock. It's interesting you mentioned reusing certain pieces. This is something I've been thinking about even with Claude Code, right? The price of writing code is going to zero, blah blah blah, but it actually seems like the value of having some sort of platform substrate is increasing, because as you build these new things you can plug them together.

Why local computers still matter

8:45 · Yeah.

8:45 · So I almost feel like, when people are saying the value of a lot of software is going to zero because you can recreate it, to me it's almost the opposite: having an existing platform to build on top of is even more valuable, because you can bolt things on.

8:59 · Yeah.

8:59 · You obviously have MCPs, you have skills, you have the models, which are a big part. All these things come together. Do you feel like that's a valid way to think about it, where people should invest even more in these primitives to build on, or are you recreating a lot of it each time, because things change and it's easier to rewrite than reuse? I think you're right.

Skills, primitives, and platform leverage

9:22 · I think you're right that the holistic platform is really useful, and this is maybe a somewhat contrarian view to a lot of people in AI. I actually don't think that the future is going to be hyper-personalized software down to the point where everyone is running their own version. I actually think it's going to be quite hard for all of us to have our own internal chat tool. If I want to talk to you, how is that going to work, right? In the context of Cowork and how we build it, I think it's a bit of a combination. The execution that gets cheap is not necessarily rebuilding all the primitives. I think a priori there's also not a lot of value in that.

9:56 · So, for instance, my team did not think about rebuilding Claude Code. We very much started with the core thesis that this should be Claude Code, and then we'd build things on top of it. The part of the execution that gets a little cheaper is how you take all of these Lego pieces and put them together in a way that makes sense for users and is actually valuable.

10:14 · You have so many different approaches now in terms of what kinds of things you actually elevate to a primitive. Do you strongly believe that all your products should be built by just combining primitives that are also available to everyone, or do you keep some things internal? I think that's still evolving. But what's probably going to go away, and I'm not sure if it's going to fully go away, but for me personally, I will probably no longer try to come up with a really good product without testing it out with people. This is not a new concept, but wherever you used to have to make costly decisions around "do we pick technology A or technology B" or "do we build it this way or the other way," I really strongly believe now you just build all of them and try them out with a small focus group, and whatever is better is what you go with, right? And that is probably quite different even from how we worked a year ago. I think this happened very recently.

11:14 · Yeah.

11:14 · I started building something on Electron, since you're here. Coincidence. But then with Electron and SQLite, there are some issues between development and building. Anyway, I was like, let's just rebuild the whole thing in Swift, and it just recreated the whole thing in Swift, and it's done, you know; that didn't take any effort. I don't even know Swift. But yeah, exactly, I was like, I'm not reviewing it anyway. Whatever. You can write it in whatever language you pick. The important stuff that I did was not writing the Electron bindings; it was the logic of what happens in the app, you know, and then the model is like, yeah, I can just recreate the same thing in Swift.

Yeah, I think, especially for people who are doing high-performance software or really complex software, you still want some view of the architecture, but you can use Markdown for that, right? You don't actually have to read the code.

Again, I'm still on a sort of definitional thing. Can we build a good mental model of Claude Cowork? This is what I have, right? Like you said, it's fundamentally Claude Code, and we don't want to touch it. There's the Claude app, there's Claude in Chrome. I think you guys do something different in planning, but I've been talking with Tariq, who's on the Claude Code team, and he's like, no, we just expose planning. Maybe you can clarify: what are the major pieces that people should be aware go into Cowork?

Okay, I think you basically have them. You can take planning more or less out. I think there are a few things that are really valuable in Cowork. The virtual machine is probably the most powerful thing. We currently run a lightweight VM, and we put Claude Code into the VM, and we do that for a number of reasons. Safety and security is a big one. But even if you ignore safety and security for a second and you're just like, okay, YOLO, I want this thing to do whatever, it is quite powerful to give Claude its own computer. That is generally a good idea. And in terms of architecture and UX and everything else that we've been working on at Anthropic, it often is quite useful to anthropomorphize Claude aggressively and just be like: this is a person. What would you do if you had a person? The analogy I gave my dad this morning, who is still quite insistent on using chat even for coding things, is: imagine you were a developer and your employer told you that you don't need a computer, they're just going to send you emails with the code and you send emails with code back. That maybe worked for patch files in the past, but it is not very effective.

Cowork’s architecture: VM + Chrome + system prompt

13:36 · So what we can do with the VM is, because it's a Linux system, Claude Code has more or less free rein to install whatever it needs to install.

13:43 · It can install Python, it can install Node.js. We do have strict network ingress and egress control. So you can still, as a user, in plain human language, make it clear to the entire system what you're okay with and what you're not okay with. But at no point do we have to ask a real person, like a person who might be in marketing or a lawyer. I don't have to go to a lawyer and be like, "Are you okay with me installing Homebrew?"

14:05 · Yeah.

14:06 · Right.

14:06 · Because the implications of the question and the answer are complex and nuanced and not easy to reason about. And this gives us a lot of abstraction that makes Claude very powerful. Then around it we have a number of things, a list that keeps growing almost every single week, as you're probably noticing, that make Cowork maybe better for certain tasks than just Claude Code on its own.

14:29 · But most of those actually live in the system prompt. They're about: what can we infer about the work that you do?

14:34 · What can we introduce into the system prompt to make that more effective?
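The strict network egress control mentioned for the VM can be pictured as an allowlist check in front of every outbound request. The sketch below is only an illustration of that idea: the host list, the function name `is_egress_allowed`, and the subdomain rule are assumptions, since the conversation does not describe how Cowork's policy is actually expressed or enforced.

```python
from urllib.parse import urlparse

# Hypothetical allowlist. The real policy format is not described in the
# conversation; these hosts are examples only.
ALLOWED_HOSTS = frozenset({"pypi.org", "files.pythonhosted.org", "registry.npmjs.org"})


def is_egress_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Return True if the URL's host is an allowed host or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(
        host == allowed or host.endswith("." + allowed)
        for allowed in allowed_hosts
    )


if __name__ == "__main__":
    for candidate in ("https://pypi.org/simple/requests/", "https://example.com/"):
        print(candidate, "->", "allow" if is_egress_allowed(candidate) else "deny")
```

Matching on the parsed hostname rather than on the raw string keeps a lookalike such as `evilpypi.org` from passing a naive suffix test.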

14:37 · There's of course very tight integration with Claude in Chrome. You're noticing that a lot of people, especially as the models get better, throw up their hands when it comes to MCP connectors in this era.

14:47 · I'm not going to go through 25 MCP connectors, click auth everywhere, and then half of them don't let me do things anyway. So Claude in Chrome is quite powerful, because we can just talk to the Claude in Chrome agent and that'll just do things for you.

Yeah. So, one example, right? Honestly, I think the state of MCP is that it's kind of really hard to integrate. I needed to add the Figma MCP to the coding agent that I use.

15:14 · Yeah.

15:15 · But I didn't want to read the docs, so I just had Claude do it, and it's great at reading docs. Same way, I had to set up a Google Cloud account for some project I was working on and get some API key somewhere, and Google Cloud is famously super hard to navigate. So I just didn't want to deal with any of it. I just used Cowork.

15:34 · Within the first week of developing on Cowork, and this happened very, very quickly, I caught myself starting to use Cowork for coding tasks, which is ostensibly not what we built it for, right? But I found myself on the internal tool that we have to collect crashes and debugging information, picking out the ones that I think we can easily fix versus the ones that might be kernel corruption or something else on the operating system. I found myself picking these out and then just telling Claude, "Go fix this bug." And I was like, "What am I doing here?" Go one level up. Tell Cowork: I want you to go to all these crash tools.

Felix’s own bug-fixing Cowork workflows

16:09 · I want you to find all the bugs that you think are fixable and not an operating system crash. And then I want you to tell another Claude to fix all of that.

16:17 · And that's sort of another Claude. So you can spin up another instance, or, currently what I do, and this is a bit of a hack, is I tell it to use Claude Code remote itself.

16:31 · Yeah, that's interesting. So if you imagine a dashboard with 20 bugs... Is this Remote Control or Claude Code remote?

16:40 · Sorry, I just wanted to confirm. The way I'm using it is: I have Cowork, and I'm telling Cowork, here's where I normally go every morning to find the latest bugs. Go read the entire bug list, separate out which ones are fixable and which ones are not, and then for the fixable ones, it's almost a for loop: for each bug, write a Markdown file with a prompt. And then for each Markdown file that is a prompt, start off a Claude session.

17:05 · So natively, Claude Code has this concept of subagents. And this is basically a subagent, but you're not using the subagents functionality.

17:12 · I'm not using the subagents functionality. And the reason I'm not is because I'm firing that off as a Claude Code remote task. Yeah, that's kind of nice, because then it can just fire it off. I can go to my next meeting, and in Claude Code remote the work is now happening.
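The loop described above (read the bug list, keep the fixable ones, write one Markdown prompt per bug, start one session per prompt) can be sketched in a few lines. Everything named here is hypothetical: the `Bug` fields, the prompt wording, and the `launch` callback, which stands in for whatever actually starts a remote session. In the workflow described, Cowork itself makes the fixable-or-not judgment; the `fixable` flag below just represents that decision.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Iterable, List


@dataclass
class Bug:
    bug_id: str
    title: str
    details: str
    fixable: bool  # stands in for the agent's own triage judgment


def write_prompt(bug: Bug, out_dir: Path) -> Path:
    """Write one Markdown prompt file for a single bug and return its path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{bug.bug_id}.md"
    path.write_text(
        f"# Fix bug {bug.bug_id}: {bug.title}\n\n"
        f"{bug.details}\n\n"
        "Find the root cause, fix it, and add a regression test.\n",
        encoding="utf-8",
    )
    return path


def triage_and_dispatch(
    bugs: Iterable[Bug], out_dir: Path, launch: Callable[[Path], None]
) -> List[Path]:
    """Skip unfixable bugs; write a prompt and launch a session for the rest."""
    dispatched = []
    for bug in bugs:
        if not bug.fixable:
            continue  # e.g. an operating-system-level crash
        prompt_path = write_prompt(bug, out_dir)
        launch(prompt_path)  # hypothetical hook: start a separate session
        dispatched.append(prompt_path)
    return dispatched
```

Passing `launch` in as a callback keeps the sketch independent of any particular command or API, which is why no real tool invocation appears here.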

17:25 · Yeah.

17:25 · You see, you're already starting to use the cloud over your local machine, and I think this is one of those things where, well, shouldn't everything just be cloud first, right?

17:33 · Ah, this is such a good question. I have so many thoughts. Okay, so I generally believe that Silicon Valley overall is undervaluing the local computer, and my default argument for that is always: how come we're all using MacBooks and not an iPad or a Chromebook? There's still value in having a local machine. And now, when I think about Claude as this entity that is supposed to be tremendously useful to you, I think that entity needs to have access to all the same tools you have access to. Otherwise, it's going to be hamstrung in all these complex ways. There are sort of two approaches we could take. We could say, okay, we're going to chip away, one by one, at everything that is on your computer and move it into the cloud.

Local-first agents

18:16 · That's one way to do it, and I think other products have taken that path. This is a very personal opinion, but for the number of tools that I use, I just don't have the patience to give another tool permissions to every single thing and keep those permissions up to date. The second thing that I'm still grappling with, and I don't have a good answer for anyone just yet, is: what does it look like for someone to slurp up your entire work and put that in the cloud? As an example, if you could click a button and I just cloned your entire computer into the cloud, is that something that you would want? I'm not totally convinced yet that everyone will.

18:57 · Mhm.

18:57 · And that is sort of upstream of all the technical issues we're going to have, because in general I think the world is not ready for this kind of stuff. I'll give you one quick example that would probably be very easy for us. As a desktop app, we in theory, with your permission, can do a lot of things on your computer, including reading your Chrome cookies if we really wanted to, right? We could take your Chrome cookies, you would have to decrypt them for us, and we could put those in the cloud if we really felt like it.

19:23 · Pretty easy solution. That would be super cool. We could just be like, "Oh, we can do all your tasks in the cloud." Now, a lot of websites, banks included, if they see the same authentication from two different locations, will just lock down your account, and now you have to go to the branch and be like, "Okay, I'm here with my passport."

19:38 · How do you actually know that? Wow. You know, as tired as we all are of the term "agent," for the agentic future I think there's a lot of stuff that slowly needs to catch up. And until that's the case, the way I, as someone who's working on Claude, can make Claude most effective is to put it where you're working.

Anything else with our mental model? The more I understand how it works, the more I can use it to its full potential, right?

20:04 · Yeah.

20:04 · And so what I'm hearing from you is, you told me to delete the planning thing. You're not doing anything special that's exclusive to Claude Cowork.

20:12 · We have some tricks, but they sort of change week over week. We eval Cowork maybe against different use cases than you would eval Claude Code, right?

How do you think about it? So Claude Code is not Claude Cowork?

Yeah. So Claude Code is quite optimized for coding tasks, and we mostly evaluate whether we're getting better or worse depending on how good it is at a typical SWE job. Claude Cowork, on the other hand, we evaluate more against typical knowledge work, the kind of stuff you would find in finance or maybe in a legal office. My personal use case is always managing my own things, like managing my personal mortgage or something like that, right? Or planning for me and my family. Those are the kinds of use cases we eval Cowork on. And what you might be picking up on is the subtle changes we make to the system prompt: what we put in the system prompt, how we steer Claude with the tools we give it. It will be better in one or the other direction where there's a trade-off. Trade-offs exist a lot. Claude Code will be better for code, and Claude Cowork will be better for non-coding tasks. Whether those gaps will still exist in the next few generations of models is a little unclear to me, though, because I'm not sure for how long these hyper-optimizations we make right now will still be relevant.

Evals, planning, and knowledge-work optimization

21:24 · I think what I was referring to was also that it just qualitatively felt different. Probably it's all just prompting and I'm reading too much into it, but the fact that it comes out with a nine-step plan, and I can edit the plan and give feedback and see it execute the plan, felt more long-range than in Claude Code. But maybe that already existed in Claude Code and you just built a nicer UI for it.

21:48 · It's kind of both. If the Claude Code people who built the planning functionality were sitting here, they would probably say, yes, we have all of those things in Claude Code, and they do. I think people tend to give Cowork tasks that are maybe of a longer time horizon.

22:03 · I thought it was so long. Yeah, that's one thing, right? The chunk of work tends to be maybe a little bigger. And the second thing is that because the work, when it gets longer, gets a little more ambiguous, we do tell Cowork to make heavy use of the planning tool or of the ask-user-question tool, right? We do want it to come up with different scenarios: okay, tease out what the user actually wants. Don't go off to work for four hours and then come back with the wrong thing. And you're probably picking up on that.

22:30 · I wish I could tell you I built this magical thing and there's some secret sauce.

No, no, no. I mean, clarity is good. Engineers just want to know so they can plan around it. And I'm realizing I have to switch to my other machine, because this is a new machine that doesn't have my session. But yeah, the planning is really important for me to approve, or to see whether it's right. The ask-user-question tool is so beautifully presented. I mean, it's also available in Cursor and in Claude Code, but it's so nice to see. It's kind of for me to understand that it gets me. It gets what I want to do.

23:09 · Yeah. Yeah. Prior art.

23:10 · Just on the topic of evals: when you say eval, I think people are very vague about what it means. Is it just vibe testing, or do you have automated, programmatic evals of Claude Cowork?

When we say eval, what we really mean is that we essentially take the entire transcript, including all the tools that were available to it, and we then measure what the outputs are depending on what we tweak, right? So we do run that a lot. We use that in training. We use that if you sort of separate out post-training from the scaffolding around it.

What Anthropic means by evals

23:45 · Cowork sort of exists in the scaffolding space, but obviously we also train on it a little bit. So when we say eval, we mean: given a certain transcript, what do the outputs look like? That includes the file outputs as well as the actual token outputs, like the ones that you see in the chat window.

I'm curious how much of the failure modes are the model intelligence versus the usage of the end tool to put the intelligence in. That planning example is a good one, right?
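The definition of an eval given here (hold the transcript fixed, tweak something such as the system prompt, and compare both the chat output and the file outputs) can be sketched as a small harness. This is not Anthropic's tooling: `RunResult`, `compare_variants`, and the agent and grader callables are all hypothetical names used only to make the shape of the comparison concrete.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class RunResult:
    chat_text: str  # the token output a user would see in the chat window
    files: Dict[str, str] = field(default_factory=dict)  # file outputs by name


# Hypothetical signatures: an agent maps (transcript, system_prompt) to a
# result, and a grader maps a result to a numeric score.
Agent = Callable[[List[dict], str], RunResult]
Grader = Callable[[RunResult], float]


def compare_variants(
    transcript: List[dict],
    variants: Dict[str, str],
    agent: Agent,
    grader: Grader,
) -> Dict[str, float]:
    """Run the same transcript under each system-prompt variant and score it."""
    return {
        name: grader(agent(transcript, system_prompt))
        for name, system_prompt in variants.items()
    }
```

Because the transcript is held constant, any difference in score between variants can be attributed to the tweak rather than to the task.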

24:10 · One thing is to come up with a plan. The other thing is to make a nice spreadsheet that runs you through the plan. How have you seen that evolve?

The thing that I grapple with a lot is that, whatever scaffolding you come up with, I think we still have a bit of a model overhang, where the model is dramatically more capable than what users end up using it for. And I think part of that is that we're just not giving the model all the tools to do all the things it's in theory capable of, right? That's one thing. However, whenever you do put in the scaffolding, I'm sort of wondering at what point that scaffolding will go away, and how much you invest in figuring out what the right scaffolding is is a little bit of a bet, right? One thing that I as an engineer quite enjoy is that, working at Anthropic, working at a frontier lab, I maybe have a little more insight into what's coming down the chute: what's the next model, what is the model capable of, what is it good at, what is it bad at. And I'm increasingly wondering: is the right thing for us to invest heavily in these scaffolding corrections, where the model might otherwise not misbehave but just not do the thing that you want? Or is it to give it as many capabilities as possible, try to make those safe so that the worst-case scenario is not as bad as it might be otherwise, and then simply wait a second for the next model drop? I'm personally currently leaning more toward the latter. I think we're going to see a lot of applications and companies that do very impressive things with AI that in the short term might seem very effective because they're very specialized to individual use cases. But once models get better at generalization and get better at those specific use cases without being super guided on them, I'm not sure how long that's going to stick around. And you can kind of already see this in skills and MCP servers, right?

Scaffolding, tools, and why skills matter

25:54 · We've already seen this slow shift from MCP servers to skills, and maybe a good example is Barry, who made skills. He was initially hacking on something that honestly looked a lot like what Cowork does today. He was thinking about "what if Claude Code, but for people who don't want to write code," and he too did that as a prototype inside the desktop app. One of the first questions we thought about was: what are coding-like use cases that could really benefit from graphical interfaces and from being a little separated from the actual underlying code? Everyone comes up with the same answer, which is data analysis. Questions like "how many users do we have today?" It's always data analysis. And I think the thing that ultimately led to skills is that we wanted to connect this little prototype to our data warehouse, and the team very quickly discovered that, instead of building a custom tool for the thing to talk to our data warehouse, they could just make a Markdown file: "Dear Claude, if you want to get data, here's the endpoint. Here's what the API looks like. You figure it out." And then hand over control.

27:00 · Yeah.

27:00 · Yeah. Also just maybe go one step up in the layers of abstraction, right? Instead of telling the thing, here's a CLI, please call this CLI, or here's an MCP, please call this interface shape, it's just: this is the endpoint. If you want to know something, if you post here, maybe you can post SQL, it's going to be okay. And that ended up being so effective that they started trying the same pattern of just giving the model a markdown file that describes whatever it needs to do, and the whole thing eventually became skills. We were like, we should package this up, this is a good idea. Yeah. We've had Barry and Mahesh at our conference, and he's definitely got a good idea there. Yeah, I wanted to show you how I've been using Claude Cowork. This was my favorite part.
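
A minimal sketch of the pattern described above, in Python: instead of a custom tool, a plain markdown file tells the model where the data endpoint is. The folder name, the endpoint URL, and the wording of the instructions are all hypothetical (the transcript names none of them), and the `SKILL.md` file name with `name` and `description` front matter is an assumption about the packaging convention, not something stated in this conversation.

```python
from pathlib import Path
import tempfile

# Hypothetical endpoint; the conversation does not name the real one.
WAREHOUSE_URL = "https://warehouse.example.internal/query"

# The whole "integration" is text: where to send a request and what to expect.
SKILL_TEXT = f"""---
name: warehouse-query
description: How to fetch data from the (hypothetical) data warehouse.
---

Dear Claude, if you want to get data, here is the endpoint:

    POST {WAREHOUSE_URL}

Send the SQL you want to run as the request body.
The response is JSON rows. Figure out the rest yourself.
"""


def write_skill(root: Path, folder: str = "warehouse-query") -> Path:
    """Create <root>/<folder>/SKILL.md and return its path."""
    skill_dir = root / folder
    skill_dir.mkdir(parents=True, exist_ok=True)
    skill_file = skill_dir / "SKILL.md"
    skill_file.write_text(SKILL_TEXT, encoding="utf-8")
    return skill_file


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = write_skill(Path(tmp))
        print(path.read_text(encoding="utf-8"))
```

The point of the sketch is that nothing here is a tool definition or an interface schema; the file only describes the endpoint and leaves the calling to the model.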

Demo: YouTube uploads and self-generated skills

27:45 · So this is me. This is how we run the Discord. At first I didn't trust Claude Cowork. This is my very first usage.

27:54 · Okay.

27:54 · Right.

27:54 · So then I was like, okay, I will just try to manually download all my recordings from Zoom and upload them to YouTube, because this is a very laborious process. I've got to click, click, click; YouTube isn't super user friendly. And it just did it. And then I was like, actually, you know, even the download-from-Zoom part I should also put into Claude Cowork, and then I did it. Right, here's a bunch of them. It starts compacting here, and it starts to even be able to do things like look through the individual frames of the video to name the video so I can upload it automatically. And this replaces my job as a YouTuber. We will forever appreciate your creativity. But then, by the way, it compacts and makes a new thing, right? So I don't have the initial thing, but then I asked it to make its own skills, so that something that's repetitive, one-off, and human-guided becomes more automated, and I can use the skills independently and reuse them. And it obviously can write skills, and that goes into context and skills at the bottom here, which is so nice. So I have all these skills that I now sort of run on a weekly basis. I know you've released scheduled Cowork tasks, which I haven't done yet, but of course I should try them. I think this is so wonderful and fun for me to see, because one thing that is very fun for me about skills in particular is that they're so easy to make. Anyone can make a skill. A text message could be a skill, and they can be so hyper-personalized to you. And this is sort of the abstraction layer, right? I'm just guessing, but I assume you're very good at your job. You've probably given this thing some guidance about how to do it, right?

29:29 · I I just said wrap everything up into into a skill, right?

29:32 · Yeah.

29:32 · And then I was like, actually, sometimes I might need to break things apart, because some parts fail or some parts might be needed individually. So I told it to split one skill into three skills. So it's like a skill-splitting thing, and then there's a parent skill that just orchestrates all of them if I want to use that. I think that's really good. And there's one more part, which is the Google Chrome thing that I told you about, where I'm like, okay, you know what's better than using Claude Cowork to upload to YouTube? Actually looking at the docs to programmatically upload to YouTube and then putting that in a skill. I've never done that before. I don't want to deal with Google Cloud, so Claude Cowork does it for me.
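
The split described here (three small skills plus a parent that orchestrates them) can be sketched in Python as three steps and a runner. This is only an illustration of the structure: the step names, the `state` dictionary, and the file name are hypothetical stand-ins, and real skills would be markdown files plus scripts rather than Python functions.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for the three split-out skills.
def download_recordings(state: Dict) -> Dict:
    state["files"] = ["meetup-01.mp4"]
    return state


def name_videos(state: Dict) -> Dict:
    state["titles"] = {f: f.replace(".mp4", "").title() for f in state["files"]}
    return state


def upload_videos(state: Dict) -> Dict:
    state["uploaded"] = sorted(state["titles"])
    return state


STEPS: List[Callable[[Dict], Dict]] = [download_recordings, name_videos, upload_videos]


def run_all(state: Dict, steps: List[Callable[[Dict], Dict]] = STEPS) -> Dict:
    """Parent 'skill': run each step in order and record which one failed."""
    for step in steps:
        try:
            state = step(state)
        except Exception as exc:
            # Stop here so the failed part can be fixed and rerun on its own.
            state["failed_step"] = step.__name__
            state["error"] = str(exc)
            break
    return state


if __name__ == "__main__":
    print(run_all({}))
```

Keeping each step callable on its own is what makes the "some parts fail, some parts are needed individually" case workable; the parent only adds ordering.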

30:11 · That is really cool.

30:12 · So I just don't care. I just let it do its thing. It doesn't really matter.

30:17 · That is really cool. And then you, I assume, paired the skill with the script that it built.

30:20 · Yeah. And then I just update the skill.

30:22 · Ah, that is beautiful. Yeah, that's wonderful.

30:24 · It's kind of like a skill. Basically, I think the way that people ease into Claude Cowork is: take a knowledge work task that you would normally be clicking around for, and try to turn that into a Cowork task. And then you ask, okay, well, what if you went further? And then what if you went further? And you sort of expand the scope of Cowork as you gain trust with it, and also teach it how to replace you.

30:46 · Yeah, it's like a little bit like playing Factorio but for your own life. Like you say, you start really small.

30:51 · Yeah. You start automating something really tiny, and once it clicks, you keep adding onto this automation empire. It just makes your life easier and easier. My favorite skill has been: every single morning Cowork starts looking at my calendar and makes sure that there are no conflicts, because people tend to schedule a lot of meetings, sometimes last minute, or sometimes miss it. It's often painful, and a lot of products like that have existed. I've written a custom prompt there. I haven't made it a skill.

Calendar automation and cleaning your desktop

31:20 · Um honestly should. Yeah.

31:22 · But I've given it pretty clear instructions: okay, here are some people; if they book over other meetings, I'm probably going to go to their meeting. Like if Dario schedules a meeting, right?

31:30 · Don't try to reschedule Dario, right? And I think there are some other rules in there about what kind of meetings I care more about, what kind of meetings I care less about, what is okay to maybe punt, when I want to be working, when I don't want to be working. And it's those really small things that I think kind of click with people. Right when we launched Cowork, I think one of the user phrases that went most viral on Twitter/X was "clean up your desktop," which is of course silly. That's such a thing, right? You don't need a model to clean up your desktop. Not really. Like this: clean up my desktop.
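
The calendar rules described in this answer (some people's meetings always win, other meetings can be moved) could be sketched in Python roughly as follows. This is a hedged illustration, not how Cowork works: in the conversation the rules live in a natural-language prompt, and the `Meeting` type, the `PRIORITY_ORGANIZERS` set, and the suggestion wording are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple


@dataclass
class Meeting:
    title: str
    organizer: str
    start: datetime
    end: datetime


# Hypothetical rule: meetings from these organizers always win a conflict.
PRIORITY_ORGANIZERS = {"dario"}


def overlaps(a: Meeting, b: Meeting) -> bool:
    """Two meetings conflict when each starts before the other ends."""
    return a.start < b.end and b.start < a.end


def find_conflicts(meetings: List[Meeting]) -> List[Tuple[Meeting, Meeting]]:
    """Return every overlapping pair, earliest-starting meeting first."""
    ordered = sorted(meetings, key=lambda m: m.start)
    return [
        (a, b)
        for i, a in enumerate(ordered)
        for b in ordered[i + 1:]
        if overlaps(a, b)
    ]


def suggest(a: Meeting, b: Meeting) -> str:
    """Apply the priority rule; when it does not decide, defer to the user."""
    a_pri = a.organizer.lower() in PRIORITY_ORGANIZERS
    b_pri = b.organizer.lower() in PRIORITY_ORGANIZERS
    if a_pri and not b_pri:
        return f"keep '{a.title}', move '{b.title}'"
    if b_pri and not a_pri:
        return f"keep '{b.title}', move '{a.title}'"
    return f"ask the user about '{a.title}' vs '{b.title}'"
```

The fallback branch matters: anything the explicit rules do not cover is handed back to the person rather than rescheduled silently.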

32:04 · Yeah, exactly. Yeah, I need to I need to choose my desktop, right? I guess give it access to my desktop.

32:09 · Yeah.

32:10 · Okay. Uh okay. This is very scary. Go do it.

32:14 · I did I did it with my downloads folder. It was like you have so many term sheets and there's like eight copies of your rental lease for your office. I was like all right, like don't yell at me.

32:23 · It's like it's such a small task and like I I would never go out there normally otherwise and tell people I've built a product. they can organize your photo for you.

32:32 · Um because it feels small, but I think to your point like Oh, here's here's the here's the ask user questions.

32:38 · Yeah.

32:38 · Uh beautiful, right? Is it obvious junk?

32:41 · You probably shouldn't click that.

32:42 · No, if it's not done, as long as it's reversible, I don't maybe. Yeah.

32:48 · Uh yeah. No, I have a I have a typical everything is super messy folder. So, yes, I think this this is super helpful.

32:54 · So, this is a pretty simple task, but, okay, here it is. Right. Here's the progress. I don't see this in Claude Code. I'm like, this has got to be something different than Claude Code. Yeah, we do system-prompt it. We're like, all right, we want you to think about this task and method. Yeah.

33:12 · And then I can I can I can do like little suggestions for for for these things. It's beautiful. Look at this. I I can I can like say like, oh, don't do that. Don't do this. It's amazing.

33:21 · I'm so happy you like it. I mean, the other way around, we're part of the Claude Code team, if you would like this in Claude Code. Yeah. Yeah. So yeah, this is really good. Obviously I'm kind of raving about it. You know, I have other things, like sign up for PG&E.

33:39 · So if you can do phone calls for me that would be great. Um I I do people have done that. Obviously you can't do that natively but people have done that with like various other providers.

33:47 · Yeah.

33:47 · And then this is signing up for the Figma MCP. I really am trying to do everything, data analysis as well. I do think, oh, design to code, very, very good, right? So here's a Figma file, I'll take it. A lot of other tasks are knowledge work, like replacing my manual clicking, but this one isn't. I would normally use Claude Code for this, but because I perceive that you have better Chrome integration, I think you can actually do a better job of this. And this one-shotted my conference website. That's pretty cool. At some point I would love to hear how you feel about Code in the desktop app, which I never use, which is the same team. Same team. So I use Claude Code in the terminal, which I perceive to be the default way of using Claude Code. So one thing this has... sorry, I'm not here to rep all these products, so can I talk about other stuff? I'm not sure if people out there want to hear me advertise my stuff for an hour. But this thing has a built-in browser, which is a thing a lot of products have, and I think giving Claude eyes into what you're actually working on makes it so much more effective. And that's probably what you've seen in Cowork: because it can see Chrome, it can debug the DOM, it can see things. That does make it more powerful.

Browser context and why DOM access matters

35:04 · Yeah.

35:04 · So I think my mental model was kind of broken, because I only use Cowork because I thought it had a browser thing in it, but I understand that the Claude Code app, or the app version of Claude Code, does have a built-in browser. I've seen this preview thing.

35:18 · Yeah.

35:19 · I just I've never used it.

35:21 · But in the end, in the end, you sort of die by hard.

35:23 · Yeah.

35:23 · You basically get the same thing, right? The additional skill that you're describing is: Claude is better if it can see what it's working on, right? That's sort of the summary here. And whether it's using your Chrome or it's just making up its own little browser, it doesn't really make a big difference, because either way it's going to see what it's working on, and that just makes it much better, and then you don't have to run QA for your Claude.

35:45 · Why doesn't it pick up my existing Claude Code sessions? Because, I mean, obviously I've used Claude Code. Excellent question. Um, don't have a good answer other than like we're honest.

35:54 · Yeah. Yeah. Okay. This is what the open team does, right? Uh, cool. I I I don't have other like I I just I I do want to expand people's minds and also maybe show people if they haven't really done it, but like I I think it's very interesting how I sometimes use this more than I use I mean I use DIA, right?

36:10 · And I've used all the other agentic browsers, and Anthropic didn't have to build an agentic browser, because you just had Claude Cowork and that's enough.

36:20 · Yeah.

36:20 · I also think maybe integrating with the number of excellent browsers out there is currently a little higher on my personal priority list than trying to rebuild a browser from scratch.

36:31 · Yeah.

36:31 · You know, never say never. But I think going back to this idea of like we want to plug this into your entire existing workflow. I think our goal is actually to not replace any of the applications you have on your computer, but instead like work really well with a new workflow, make the new one. Yeah.

36:46 · Yeah.

36:46 · It seems that nowadays, especially in the browser, most of the innovation is user ergonomics. It's not really the underlying browser engine. So I feel like to Claude it doesn't really matter if it's Dia or Chrome or Atlas or whatever.

37:01 · Yeah.

37:01 · We want to meet you wherever you are, which, obviously I would say that, but it's also just genuinely true, because I don't want to restrict my potential user base artificially by saying, okay, I'm going to start building for the people who are willing to switch browsers, right? You know, many lawsuits have been filed over who gets to be the browser, and a lot of money has changed hands over the question of which browser is default and which search engine is default within the browser. I just want to build for, yeah, I want to build for swyx, essentially. I want to build for people who have a number of annoying tasks that they feel like maybe Claude could do for them.

37:43 · Yeah.

37:43 · What do you think about skills portability? I think there's been one thing I use another thing called Zo which is kind of like a cloud computer plus agent and I have a skill to add visitors to the office. Yeah. So whenever somebody has to come in after hours they need to check in downstairs. Um, but I want to like text the thing.

Skills portability and plugins

38:02 · So, it doesn't really work in Cowork. But now that skill is in the Zo harness and it's not in my Cowork thing, and then if I make a change, I've got to sync them. How do you see that going? I see memory as kind of Claude-personal; I don't necessarily want my memories to be cross-thing.

38:20 · Yeah.

38:20 · But I do want my skills to work across the agents that I use. I think with MCPs people did the same thing. It's like, oh, MCP gateway, MCP registry. I don't really know if that's a business.

38:31 · So, I'm curious if you've had any thoughts in the area. I think for me, this is sort of where I go back to the really basic primitives. For us, skills are file-based instead of this complicated thing that exists inside a bliss somewhere that is super proprietary. I'm really leaning into the idea that it's all just files and folders, and that makes it very portable in its own right. We do have skills as part of this container format, which we just call plugins, and plugins are available for both Claude Code and Claude Cowork in the same format. You can install plugins; this works in Code today. You can basically say, I'm going to add a GitHub repo as a skills marketplace or a plugin marketplace, and that's how we're doing portability. I think we have a lot of room left to grow in how we make it easy for people to know that they can write skills, and how we make it easy for them to just share a skill. Because obviously, with all the words I just said, I'm losing most of the knowledge worker base out there. If I start by saying, oh, you can connect a GitHub repo, that's not exactly how most people will end up working in a general knowledge worker space. But I think there's something there. And another thing that I think has not really been properly explored is the combination of which part of the skill is very portable and which part of the skill is very personal to you, right? And I think that's something we haven't really solved here as an industry.

39:53 · It's like which you want to introduce more structure to the skill or have always have like public skill private skill, you know, pairs.

40:00 · Yeah. Yeah. Kind of. I think there's like a like easiest way to do this would just we do like use string interpolation or something, right?

40:08 · Insert username here. Insert like phone number. Insert like known folder locations. That kind of stuff. Um that's probably clunky. That's why we haven't built it. Um, but I do think someone is going to come up with like an interesting way to keep everything we like about skills. The portability is just a file. It's just markdown. It's just text honestly, right? Like a text for words. The complete lack of structure, which means you don't need any kind of tutorial to write a skill.
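
The "string interpolation" idea floated here can be sketched with Python's standard `string.Template`: the portable part of a skill is shared text with placeholders, and the personal part is a small set of values filled in locally. The skill text, the placeholder names, and the example values below are all hypothetical, and as the speaker says, nothing like this has actually been built.

```python
from string import Template

# Portable part: plain text that anyone could share, with $placeholders.
PORTABLE_SKILL = Template(
    "# Add a visitor to the office\n"
    "Text the front desk at $front_desk_phone and say that\n"
    "$username is expecting a guest after hours.\n"
    "Save the confirmation in $notes_folder.\n"
)

# Personal part: values that stay on one person's machine (made-up examples).
PERSONAL_VALUES = {
    "username": "alex",
    "front_desk_phone": "555-0100",
    "notes_folder": "~/Documents/visitors",
}


def render_skill(template: Template, values: dict) -> str:
    """Fill in the personal values; raises KeyError if one is missing."""
    return template.substitute(values)


if __name__ == "__main__":
    print(render_skill(PORTABLE_SKILL, PERSONAL_VALUES))
```

Using `substitute` rather than `safe_substitute` is a deliberate choice in this sketch: a skill with an unfilled placeholder fails loudly instead of being handed to the model half-personalized.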

40:35 · Just like explain it to Claude the way you would explain it to me and Claude will probably get it before I will get it, right? just like for booking a flight, tell Claude how to book a flight the same way we're telling him somewhere I just started about today, but combine that with a very like personal thing. Um maybe we'll stick with the booking a flight example. I don't actually think AI should be booking flights. I think the tools we have is yeah finally somebody says it it's the default demo that everyone was making.

41:03 · I've been hating on booking demos. It was not a good showcase.

41:07 · Yeah.

41:07 · I'm like I just want to book my flight myself. Um, I think there's a lot of things that have a personal and a non-personal component and that's maybe why people reach for flight booking because some things are very universal, cheaper flight is usually better, right?

41:22 · Few people try to book the most expensive flight. And then some things are quite personal, like what times you prefer, which seat you prefer, which airports you prefer. Combining that in a skill format that is actually portable, compatible, and easy for people to understand, I think that would be very exciting. We just haven't figured it out yet. Yeah, on the text part, I think everybody by now has some sort of cloud file thing, either Dropbox, Google Drive, whatever.

41:45 · So it feels like, in a way, it should basically symlink my skills into all my agent harnesses and just keep those in sync. We have internally this valuable tokens repo, which is all the commands and sub-agents. Good.

41:58 · Uh, and then I built like a TUI where you can start and be like, you know, install this command and the three sub agents into this agent in this folder and just copy paste this. It doesn't do anything. Literally like CP the file into that. But I feel like there should be something similar where like whenever I go into a new thing, it's like, hey, here's like the link to exactly the cloud folder and just bring down these skills into this. Like today, it doesn't quite work like that. Like if I install a new agent, I cannot I have to like copy paste all the skills and I don't even know where they are. That's like the big problem. It's like where do I find them?
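
A minimal Python sketch of the "symlink my skills into all my agent harnesses" idea, assuming each skill is a folder inside one shared directory (for example a synced cloud folder) and each harness reads skills from a directory of its own. The directory layout is hypothetical; the conversation does not say where any particular agent actually looks for skills, and creating symlinks on Windows may need extra privileges.

```python
import os
from pathlib import Path
from typing import Iterable, List


def link_skills(source: Path, harness_dirs: Iterable[Path]) -> List[Path]:
    """Symlink every skill folder in `source` into each harness directory.

    Existing entries are left alone, so the function is safe to rerun.
    Returns the links that were newly created.
    """
    created: List[Path] = []
    skills = sorted(p for p in source.iterdir() if p.is_dir())
    for harness in harness_dirs:
        harness.mkdir(parents=True, exist_ok=True)
        for skill in skills:
            link = harness / skill.name
            if link.is_symlink() or link.exists():
                continue  # already linked, or a local copy we should not clobber
            os.symlink(skill.resolve(), link, target_is_directory=True)
            created.append(link)
    return created
```

Because the harness directories only hold links, editing a skill in the shared folder changes it for every agent at once, which is the sync behavior asked for above; a copy-based tool like the TUI described here would need to be rerun after each edit.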

42:32 · Um so I'm curious like in the future like that that almost feels like my personal productivity thing will be my skills.

42:38 · Yeah.

42:39 · It's not really the product that I use because everybody has access to the same product.

42:43 · But today that just looks like copy-pasting MD files.

42:46 · I think so many things. I really like thinking about agents and LLMs just as another coworker. So many attempts have been made to build documentation companies that are like, "Oh, we're going to solve all your documentation problems." I myself spent a little bit of time working at Notion, right? I'm deeply familiar with the concept of, let's get everyone on the same page, right? And what you're basically saying here is you want all your agents to be on the same page about your preferences, about the skills, about the way they ought to work and how they ought to execute. And I'm not sure what the right thing is going to be. If it's going to be some company that can say, "All right, we're an independent body.

43:23 · We're not trying to push into any particular product. It's our job to be the skill authority, and we provide, I don't know, we're going to be the Dropbox of skills, and you can just symlink us into all the products you want to use." I'm not sure that's going to be a viable business, but as an idea, it would be cool, right?

43:40 · Yeah.

43:40 · Yeah. I think so many things are just going away as businesses. It's like, how am I supposed to do it? I'm not even asking somebody to make a product about it; I want to personally know. And there are things like you said, where you almost want a skill and then interpolate it between personal and work. So booking a flight for work is different from booking a flight personally.

44:00 · Yeah.

44:00 · In some ways, but a lot of the scaffolding is the same, you know? I mean, as an engineer, I will tell you, technical person to technical person, I will just be like: symlinks.

44:10 · Well, that's what I do with CLAUDE.md and AGENTS.md. It's just the same file, symlinked. So that works, but it feels like, yeah, I don't know. Maybe we could always go one level up: you can always tell Cowork a problem and then Cowork will solve it for you. It makes the symlinks. That's one way to do it. That's true, that's true. All right, everything is called Cowork. A potentially spicy question for both of you: which of these industries will go away? Okay, so what Felix was saying before is interesting. There's basically the short-term pressure of, we need to turn these tokens into valuable things, which is, I should build the last-mile product that harnesses the model. And then there's the question of, long term, which ones are going to still be valuable? And I think you're kind of seeing this today with the coding space. In a way, everybody's moving up and up the stack, because you need more than just turning tokens into code. I think search, like enterprise search, is kind of seeing the same thing, with Glean and all these different companies. At the end of the day, if Cowork is the one doing all the work.

Which AI categories survive?

45:13 · The search itself is like such a small part that like I don't know if I'm really going to pay that much money just to do search. It's almost like everything is like a co-work vertical.

45:24 · So like how much can co-work first party support?

45:28 · Mhm.

45:29 · And how much can it not? I think for a lot of these things the planning thing that you were showing did the the planning the planning.

45:36 · Okay.

45:36 · Yeah. Yeah. Like that's one thing where like most of the value that these agents provide is like they're better at planning for specific tasks and have better tools for it.

45:44 · Yeah.

45:45 · But I think the models are now moving in that direction and they have the right harnesses and they're on your computer.

45:51 · So for me it's almost like, if the end customer trusts your startup to be the provider of that task result, then I think that works. This is something, a short spike, that we're working on. Yeah, I think, look, I'll tell you this: I don't think I'm the best person to actually estimate which industry is going to be hit the hardest, but I do think that at Anthropic, as a group of people, we're deeply worried about the impact that the tools are going to have on the labor market, especially for junior employees. I think it's only honest to say that when we talk about automating away a lot of the work that we personally find annoying, that we maybe think is not the best use of our time, in a lot of industries that kind of work would have been given to a junior, entry-level employee, right? And I think it's only right to be really worried about that, and to worry about what that's going to do in particular to people who are entering the job market.

Junior jobs, simulated work, and labor disruption

46:55 · I have a solution for that, which is you create simulated jobs for them.

47:02 · Okay.

47:02 · So this is this is like half joke half true. So if you think about software engineering, when you're like a junior engineer, you work like one, two, three years. And in those three years, there's like maybe like a handful of moments where like you really learn something and then a bunch of other days where like you're not really progressing.

47:18 · Yeah.

47:18 · I think now we can use AI and these models to actually like shortcut these careers and almost like simulate the early years of your work and like just make them like super dense in like these learnings. is like, hey, we're working on this feature which is like a distributed system and you need to learn this thing that might take three months at a company and so you take three months here. It's like we're just simulating the whole thing.

47:41 · It's actually not a real thing and in one week we kind of speedrun through the whole thing and you kind of learn your lesson from there and we kind of repeat that in like one year you basically get like three years worth of like projects and experience.

47:55 · Yeah.

47:55 · I think it's harder for like things like sales or for things like, you know, marketing because you don't really have a way to get the feedback loop. But I think a lot of it, it sounds kind of silly. It's like you're making the new a fake job, but it's almost like you go to college, right? People pay to learn how to do it. And this might feel similar where it's like, hey, we have the Jane Street simulator. It's like, you want to come work at Jane Street?

48:18 · We'll just put you in the simulator for like three months and you'll come out of it like, you know, I'm ready. So there is an aspect here. I'm not expert enough to actually know what is going to happen to marketing or legal or finance, right? I don't work in those jobs and I don't think I should talk about them. But I am an engineer and I think I have a pretty good idea of what engineering is like. And one thing we're seeing is that, as a company and also as the public, we're deeply worried about entry level, but we're also seeing more senior engineers accelerate. We feel like they're more productive. They actually increase the value they provide. And a thing that I'm thinking about a lot is the fact that even before all of this happened, I've always had a lot of respect for the University of Waterloo, and the new grads that have joined my teams coming from the University of Waterloo always felt more ready than new grads who literally spent their entire time at the university, regardless of how good, but never actually had to work inside an environment where you have to ship things that eventually will be used by users. And I'm German. I initially went to a German university, and I think the information systems programs there tend to be very theoretical, right? I often give people the example of trying to become a doctor, but you first have to do four years of biology. And as a result, when you get a new grad, you sort of have to teach them what it's like to actually build products and to work in a company and work with other people. Some people will have different opinions, and how do you do all of those things? And at the University of Waterloo, it seems like they just spend half of their time, I don't know if it's true, but I think it's a year, right? They spend so much time, as part of the curriculum, doing a year in internships.

49:57 · Yeah.

49:57 · They just like go from company to company. They show up on your team is like a junior engineer who's been to like 20 companies. Not really, but like it seems like a lot of my new grads have also briefly worked at Apple, Google, Tesla. Yes. And uh there's a common meme where they like collect all these logos like infinity stones but and they always put it on LinkedIn and it's very unclear that they were an intern like Yeah.

50:18 · Yeah. Exactly. But it does actually make them so much better compared to other new grads. And I wonder if that's a useful model maybe for the future, when we also have to crunch down the amount of time you have as a junior employee, because the value you have as a junior employee is going to be impacted. My sort of pro-young-people take is that you have higher neuroplasticity. You can learn more. You have fewer pre-existing biases. And, I assume it's true for you too, what OpenAI often says is that it's actually the younger, fresh-grad engineers that use Codex or their coding stuff more innovatively than the experienced engineers, who have a set and preferred way of doing things.

50:59 · Yeah. As I talk to people, I I had similar experience.

51:01 · Yeah. So maybe you're more AI native and therefore you're you you get cut. But like I think the problem is you don't need that many of them.

51:09 · I mean Anthropic is on the record of saying we do believe that the impact on the market is going to be sizable and we do not think that people overall are ready right and we do actually think we should probably talk about it as a society much more. Yeah, I'm not sure that I'm like the individual that can add like anything useful there, but I think as societies with economists and governments that need to wrestle those questions in a way that is probably more meaningful than me wrestling with them, we're probably not doing it enough.

51:38 · Yeah.

51:38 · Well, we'll try to educate. And then I I think also just releasing frequently as as as you guys do or probably maybe too frequently uh is helping people to adjust over time, right? But rather than one big bang thing, there's like sort of this gradual takeoff that people are living through that waking people up, right?

51:57 · Yeah.

51:57 · But I think a lot of us are wondering at what point we actually have full takeoff, right? We're all sort of expecting this big-bang moment where things will accelerate so quickly that it becomes a self-reinforcing loop.

Gradual takeoff vs big-bang takeoff

52:08 · Mhm.

52:09 · And at that point it's sort of like off to the races and there will be no more like slowly catching up. You not just have Claude being so good at everything.

52:16 · Yeah. It's when Cowork is training models. It's when it's looking at TensorBoard and Weights & Biases values and training things.

52:24 · And like we can all debate like how many years it's away, right? Like some people make a bet around like maybe it's 10 years away, maybe it's a year away. Um I'm not entirely sure where where I come on this line, but I'm not entirely sure that ultimately it matters all that much whether or not it happens in four or five years. If we have a decent amount of certainty that's going to happen, it's probably something we should wrestle with.

52:41 · I wanted to talk. So, by the way, the scheduled task completed, the clean-my-desktop task completed, and it organized by file type, which, okay. But, you know, I was trying to get it to do something more thematic: read the file, understand what it's about, and group by topic rather than file type. But I mean, you can just follow up and have it do that.
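
For contrast with the thematic grouping asked for here, organizing by file type is the mechanical version of the task, and it can be sketched in a few lines of Python. This is an illustration under assumptions, not what Cowork ran: the category names and extension table are hypothetical. The plan is computed first and only applied in a separate step, in the spirit of the "as long as it's reversible" comment earlier in the conversation.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List

# Hypothetical extension-to-folder table; anything unknown goes to "Other".
CATEGORIES = {
    ".pdf": "Documents", ".docx": "Documents",
    ".png": "Images", ".jpg": "Images",
    ".mp4": "Videos", ".mov": "Videos",
}


def plan_by_type(folder: Path) -> Dict[str, List[str]]:
    """Dry run: map each category to the file names that would move there."""
    plan: Dict[str, List[str]] = defaultdict(list)
    for item in sorted(folder.iterdir()):
        if item.is_file():
            category = CATEGORIES.get(item.suffix.lower(), "Other")
            plan[category].append(item.name)
    return dict(plan)


def apply_plan(folder: Path, plan: Dict[str, List[str]]) -> None:
    """Move files into category subfolders according to a reviewed plan."""
    for category, names in plan.items():
        target = folder / category
        target.mkdir(exist_ok=True)
        for name in names:
            (folder / name).rename(target / name)
```

Grouping by topic cannot be done from the extension alone; it requires reading each file, which is why it is the kind of follow-up that gets handed to the model rather than to a script like this.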

53:01 · Oh, like it it is proposing, right?

53:03 · Yeah.

53:03 · So, it's got some topical things, but yeah, I could probably do better. So I probably need to give it a skill to read video files so that it understands, here's how I like to... Honestly though, I see that you're using Opus 4.6, right? My recommendation for people is increasingly: don't worry about it anymore. Just tell it what you want it to do and it's probably going to figure out a way to do it.

53:25 · Okay.

53:25 · It might not be the way that you like necessarily or the way that you've gone about it.

53:29 · Videos deeper, but we're outsourcing organizing all of it. So, let's fight.

53:34 · Yeah. Yeah.

53:35 · I'm honestly so curious what Claude is going to come up with.

53:38 · I'll kick that off. I wanted to also just talk about the overall, you know, you talk about data analysis, you talk about your personal finances. You also said, which by the way for us is very timely, tax season, right? Use Claude Cowork for tax season. It is not responsible for any mistakes, but might as well, right? It's free knowledge work for you.

Finance, taxes, and enterprise verticals

53:57 · So I think Claude for finance is a big deal, and this is definitely in that mix. I wonder, is it a separate team? Do you talk to them? How important is it? Because you can also natively output Excel files now.

54:11 · Yeah.

54:12 · Just talk about the finance effort.

54:14 · We care about the verticals quite a bit. So we do have a dedicated vertical team. We also have a dedicated enterprise team. And this is engineering, not sales?

54:21 · It's engineering. Yeah. Yeah. It's engineering. So we do have people who sort of come to work every single day and ask themselves, how do we make Cowork extremely effective for people in those specific industries? How do we make it easier for them to understand? How do we make it easier for them to plug into this and sort of get the same value out of it that software engineers get? I think it's no real surprise that software engineers ended up being sort of at the forefront of the entire AI moment, because so much of it is this Rube Goldberg machine where we're already used to automating things, right? It's part of our job.

54:49 · Yeah.

54:49 · So we care about it quite a bit. I think it also really matches what we see Claude being very good at as a model. I think it provides a tremendous amount of value to those customers in particular because we can do so much with the amount of data they have. Those are data-heavy industries. They're industries where correctness matters quite a bit.

55:08 · So for I've used it to analyze my business. I just can't show it.

55:12 · That makes sense. I had a similar question about taxes. I did tweet about, oh, Cowork is doing my taxes, this is honestly incredible. And it's annoying, because this is so cool, but Twitter is maybe not the audience that needs to see my tax return.

55:30 · Yeah, but here here it is. It's reading all the videos. So, it's like Yeah, it's getting more.

55:34 · Yeah.

55:35 · How did it actually do it? I'm actually curious.

55:36 · Oh, usually it just like takes a screenshot and then it reads the screenshot by Vision.

55:41 · So, this is what I do for my Zoom upload thing, right? Because I have paper club sessions that I need to upload to Zoom and I wanted to automatically title them and do show notes and everything. So it just takes screenshots and tries its best. Yeah, it would benefit from transcribing; it's operating by pure vision now, but it's good enough.

56:01 · And then I do have to call out to Nano Banana to do images. So unless you guys do images for me, uh I have to call other people images.

56:10 · We're aware. It's just so fun for me, because this is the thing that I'm increasingly curious about: Claude's creativity, and figuring out what it's great at.

56:19 · Claude's approach is like certain problem.

56:20 · Yeah.

56:20 · Vision for everything is like the superpower, right? And computer use, you guys were the first to do computer use, right? And when it was launched I was very unimpressed. I was like, it's slow, it's unreliable. It's wild how much better it is than one year ago. Yeah, I know, it was barely usable. Yeah, I remember it was barely usable, but isn't it wild how much better things have gotten over one year? We went to the Anthropic office for the launch event for computer use. There was this hackathon, and nobody hacked on computer use. But, I don't know if you're okay with me saying that, I did see briefly that you do have an automate-macOS MCP server installed, right? Do you use that ever?

Vision and the improvement in computer use

57:02 · What? Sorry, which one? Where?

57:03 · Um, if you go to your settings Oh, settings. Okay. Uh, where? Sorry, this one.

57:08 · Yeah.

57:09 · Yeah.

57:09 · Um, I noticed that in your connectors.

57:11 · Uhhuh. Uh, I probably set it up one time, but I don't use it actively.

57:15 · Okay.

57:15 · The a Mac automator. Yeah. Yeah.

57:17 · So, so I Yeah, this one I really wanted to just automate everything in my thing. I didn't find I didn't find it super reliable.

57:23 · Okay.

57:24 · Yeah. Why?

57:26 · No, no, no question at all.

57:27 · Claude is much better at writing AppleScript and executing its own AppleScript than relying on these third-party tools.

Why Claude writes its own scripts

57:35 · Yeah.

57:35 · So I initially installed iMCP and all these other MCPs that people built, but now I don't use any of them anymore. Just let Claude write its own thing.

57:46 · It's going to be more custom made. We keep going up the stack, but computer use is a fairly interesting area to me. And it's also interesting in the sense that I don't think we're far away from Claude being very effective at using your computer and not just a theoretical computer.

58:02 · Mhm.

58:02 · What's the relationship between the user and the computer? There were some tweets about how huge some of the VMs that Claude Cowork creates are, like 12, 15 gigabytes, and people complain. But at some point it's like, if you're using the computer, you're taking action on it, is this just your computer and I'm just looking at it? I think that's why people like the idea of the Mac Mini with OpenClaw or whatever on it, because it's got its own home, you know? It's doing its thing, I'm doing my thing. I think there's some kind of, not race condition, but it's like, okay, if I kickstart this task now, I can't really use the computer, because Cowork is doing things on it, and it's kind of awkward. Yeah, I'm not sure. I do think it's a super interesting area, because I can maybe tell you some of the things I thought about that I think are actually bad ideas. When we initially started working on Cowork, I did have some dreams about what it would look like for Claude to have its own cursor. Could be cool, right? It's a computer, we can write code, we can touch everything. Who says that computers need to have one cursor? We could do a second cursor. But that actually breaks down quite a bit, even if you go and present cool dreams to both Apple and Microsoft. It breaks down because so many of our models on a computer are built around this idea that there's only one thing working on it. There's a foreground app, a background app. Claude in Chrome can work in the background, but that's within one application; at the operating system layer that is a lot harder to implement. So I'm still grappling with what it means for Claude to actually act on your computer.
Is the right format for Claude to have its own computer that you set up, and maybe every now and then you zoom in and you play with it? Or is the right format for Claude to just wait until you're stepping away for a little bit and take over while you're gone? Or is the right move for Claude to just have its own computer in the cloud, and whatever you want Claude to do you have to set up yourself? There's a number of different options. This is a thing I think about a lot: what is the relationship between you and your computer, and you and your data on the computer? Because how intimate that relationship is kind of depends on the tool and the thing that you're looking at, right?

Should Claude have its own computer?

1:00:14 · Like we're quite comfortable sharing some things, very uncomfortable sharing other things.

1:00:18 · And I think whatever product is going to be successful, we'll have to deal with those like with those different things.

1:00:26 · But even if Claude was capable of making that determination, would you want Claude to make that determination in the first place? It's very tricky, because it's more than just privacy. It's almost intimacy, and it's tricky to reason about in a way that will make everyone comfortable. Yeah, I could see, you know, a VirtualBox, like an actual VirtualBox app, where you run the VM and then you have a screen within the screen. You can put it in the background, but then you can jump into the screen. That's not a bad idea. Yeah.

1:00:56 · You know, like I mean I used it, you know, people used to do it virtualizing like Kali Linux in a Windows machine.

1:01:03 · Yeah.

1:01:03 · And like you just jump in and then you jump out, but it's like it's not like a dual boot. It's like within the thing. The problem is that you need twice the amount of RAM, twice the amount of, you know, it's like it's kind of taxing on the machine.

1:01:15 · But I think that would be cool, kind of, to see, you know, the little Claude window. I can see its desktop. Look how cute it is, clicking around things.

1:01:22 · I was going to bring up, he's the original machine-in-the-machine guy, because he has the Windows 95 project. Where's the Windows 95 project at?

Windows 95 in JavaScript

1:01:30 · There's probably some on my GitHub, right?

1:01:32 · No, no, no, no. I I It's like the first thing you see is this one.

1:01:35 · Nice.

1:01:36 · Yeah.

1:01:38 · Yeah, exactly.

1:01:39 · That was honestly a very fun project though. I should say this just so that no one gets the wrong impression: obviously I didn't build Windows 95, because I was a child, but I also did not build the actual engine that is capable of simulating an x86 processor in JavaScript and WASM. That's a tool called v86, which is very cool and everyone should try. But this came out of a debate we had at work where people were, like they often are, debating the merits of Electron and whether or not we should be building software in JavaScript, yes or no. And I still am very upset that I can run all of Windows 95 in JavaScript and launch Microsoft Excel inside the virtualized JavaScript Windows 95 machine, and I can do that entire chain faster than I can do a lot of other things in traditional SaaS applications. Mhm.

1:02:30 · And this is sort of like a performance rampage that I went on. So I mostly built this as a joke for some of my colleagues at Slack. This took like one night. What?

1:02:41 · But it was not hard to do. All the hard work is in v86. If you go to the repo, it's going to say like 99% of this work is done by a guy who goes by the name Copy. His name is Fabian.

1:02:56 · Yeah.

1:02:56 · Cool. I think you're kind of back on the Windows grind because you're building out the Windows support. I thought there were some really cool technical stories to tell, and it gives people an appreciation of, well, here's how hard it is and here's how important it is to invest in the sandbox. So maybe this is a good opportunity to talk about some of the details.

1:03:16 · Oh yeah, the VM honestly is so cool. There's a lot of things we dislike about the VM, right? There's a lot of things that are real trade-offs, and you want to know why you're making those trade-offs. And you're right, a lot of people write me like, "Hey, how come Claude is taking up 10 GB?" I could say on that part, it's not actually taking up 10 GB. It's just the way that macOS displays bytes. It's, like, wrong. The way we actually write it to disk is we collapse the empty space in the image. So it's not actually taking up 10 gigs. But that's a technical differentiation that's probably not going to matter too long.
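
A minimal sketch of the difference between the size a file reports and the space it actually occupies, using a sparse file as a stand-in. That Cowork's VM image is stored as a sparse file is an assumption made for illustration; the transcript only says the empty space is collapsed on disk. The helper name `sparse_file_sizes` is made up, `st_blocks` is only available on POSIX systems, and whether the hole really stays unallocated depends on the file system.

```python
import os
import tempfile


def sparse_file_sizes(apparent_bytes):
    """Create a file that is one big hole; return (apparent size, allocated bytes)."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
        # truncate() extends the file to the requested length without writing data.
        f.truncate(apparent_bytes)
    try:
        info = os.stat(path)
        # st_blocks counts 512-byte units on POSIX; it does not exist on Windows.
        return info.st_size, info.st_blocks * 512
    finally:
        os.remove(path)


if __name__ == "__main__":
    apparent, allocated = sparse_file_sizes(10 * 1024 * 1024)
    print(f"apparent: {apparent} bytes, allocated on disk: {allocated} bytes")
```

On file systems that support sparse files, the allocated figure is typically far below the apparent one, which is the kind of gap a file browser can hide when it only shows the apparent size.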

VM tradeoffs and sandbox design

1:03:47 · To me, it's the, how come it takes so long to start? Yeah. It's like 30 seconds sometimes. I don't know. Oh, it should be faster than that. Whatever. It's going to be 10, but it feels like 30.

1:03:58 · Yeah.

1:03:58 · Like, either way, whatever it is, it's going to be slower than just running code directly on your computer, right?

1:04:05 · So, the trade-offs are real. But what we're doing on Windows, we're using the Windows Host Compute System.

1:04:10 · It's the same thing that WSL2 runs on, the Windows Subsystem for Linux that I think a lot of developers appreciate quite a bit. Yeah. And it's pretty cool, because we sort of have to separate out which system space this virtual machine runs in and who gets to talk to that virtual machine, because obviously you give this virtual machine this amount of power. How do we optimize not just the connection between the two systems, but also how do we make sure that a random other application doesn't get to talk to Claude? We do some pretty interesting things.

1:04:39 · Last week we started writing a new networking service, a networking driver that optimizes how Claude talks to the internet if your company is doing weird internet things like packet inspection and, you know, taking apart SSL inside your company. I think there was probably a very small, easy version of Cowork to build that is much simpler but also breaks on most users' computers, and this one is quite nice because it works on most users' computers. And the default example I always go for is, I really want this to be highly effective on a machine that most people pick up, and that machine will probably not have Python. It will not have Node.js.

1:05:16 · And even if I just take away those two things, Claude is going to be so much less effective on your computer.

1:05:22 · So what do you do? You don't even, I mean, maybe require people to install Node and Python.

1:05:28 · Oh, like you mean for like a what does the future look like without a VM?

1:05:31 · No, no, no. So, so like like you said, right? Let's say target machine is whatever is a default spec windows laptop.

1:05:37 · We do this, which is quite cool. So on macOS we use the Apple Virtualization framework, which is pretty solidly optimized. Like, it's good stuff. Simple API call, right?

1:05:48 · Yeah it's like super simple.

1:05:49 · I saw the code recently and I was like, that's it? Well, once you start shipping production code on it, you start adding all of these edge cases and it ends up being a little longer. But I think Apple really cooked with the Virtualization framework. It's very, very good. It is very fast, it's very reliable. And same on Windows: the Host Compute System, and I think WSL2 as well, is maybe one of the diamonds within Windows. It's one of the few things that developers universally rave about. It's very, very cool, and hooking into the same subsystem makes it a lot easier for us to say we don't really care how locked down your computer is.

1:06:25 · Maybe it's like your employer's computer and your employer has decided that you get to install nothing.

1:06:29 · Mhm.

1:06:30 · Not trusted. But it's true in a lot of environments, right? Even at Anthropic, our IT department controls what kind of software you install, which is a pretty common experience for many companies. And this makes IT departments' job so much easier, because we can say you can separate out Claude's computer from the user's computer. And then for Claude's computer, what you probably care about is data loss. You care about a potentially hostile actor. You care about maybe data being exfiltrated. And once you control the network and the file system layer, you don't necessarily care anymore that Claude might be writing super useful Python scripts. What worries you is that once you install Python, now anyone can do anything on the computer. But once you put that in a VM, that risk really goes down.

1:07:17 · Yeah. So that's why we jump through all of these hoops.

1:07:20 · Yeah.

1:07:20 · I think you had a different tweet about this. But it's almost like people also have approval exhaustion. Like, you can't approve every single command. By default, in some of the CLIs, I think even early Claude Code, you had to approve every single command.

Approval fatigue and safe delegation

1:07:37 · Yeah. And so there's this sort of dichotomy between either approve every step or dangerously skip permissions.

1:07:44 · Yeah.

1:07:44 · And actually sandboxing is like kind of like the middle ground.

1:07:48 · Yeah.

1:07:48 · I do think it's maybe on us as the industry to come up with something better than, oh, this is super safe as long as it doesn't do anything, right? If you want this to be useful, then you have to approve every single step of the way. And computer use is a good example. The only way to make computer use on your host super safe, like really super safe, is probably if you approve every single action, right? The model's like, I would like to type the word L. You're like, okay, that seems fine, because I know which cursor is focused.

1:08:17 · Yeah, it's not functional if you don't delegate.

1:08:20 · Yeah, exactly. You need to properly delegate. You need to be able to delegate and walk away and trust that this thing is not going to mess up automatically. And I don't even think we need to build perfect systems. I don't think we need to wait for 100% model alignment. We can rely on the same Swiss cheese model we've used in the industry for a long time. But I do think we need to universally invest more, and that's what we're doing, in systems where we can say you do not need to approve everything.
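
A minimal sketch of the Swiss cheese idea as layered checks, so that most actions go through without a prompt. Everything here is hypothetical: the `Action` type, the layer names, the `/sandbox/` root and the allowed host are illustrative choices, not Cowork's design. The layers loosely mirror things mentioned in the conversation: controlling the file system and network, and checking in with the user before irreversible actions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str                   # e.g. "read_file", "http_request", "send_email"
    target: str                 # path, hostname, or recipient the action touches
    irreversible: bool = False


SANDBOX_ROOT = "/sandbox/"      # hypothetical sandbox mount point
ALLOWED_HOSTS = {"example.com"}  # hypothetical network allowlist


def filesystem_layer(action):
    # Naive prefix check; a real system would normalize paths first.
    if action.kind in {"read_file", "write_file"} and not action.target.startswith(SANDBOX_ROOT):
        return "block"
    return "pass"


def network_layer(action):
    if action.kind == "http_request" and action.target not in ALLOWED_HOSTS:
        return "block"
    return "pass"


def confirmation_layer(action):
    # Only irreversible actions interrupt the user.
    return "ask" if action.irreversible else "pass"


LAYERS = (filesystem_layer, network_layer, confirmation_layer)


def decide(action):
    """Return 'block' if any layer blocks, else 'ask' if any layer asks, else 'allow'."""
    verdicts = [layer(action) for layer in LAYERS]
    if "block" in verdicts:
        return "block"
    if "ask" in verdicts:
        return "ask"
    return "allow"


if __name__ == "__main__":
    print(decide(Action("write_file", "/sandbox/report.csv")))
    print(decide(Action("send_email", "bob@example.com", irreversible=True)))
```

No single layer has to be perfect: each one only needs to catch what the others miss, and the user is asked only when an action cannot be undone.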

1:08:45 · Speaking of Swiss cheese model, he just wrote a thing about this.

1:08:49 · Oh, cool. Yeah.

1:08:49 · Uh yeah. Um yeah, super cool. I mean, yeah, it's it's weird how like I guess usually I think safety and security is kind of like a boring word to to engineers. They're like this can be unsafe to me. Unsecure. But um I think achieving the right thing like you're going after a consumer/pro.

1:09:08 · Yeah.

1:09:08 · Yeah. Kind of like both. I think I also want to capture people who would have no trouble using Claude Code, like yourself, right? But still find it maybe just convenient, easier. You're like, "Oh, cool. There's the to-do list on the right. I can edit it." Those things are just easier to do if you have to.

1:09:23 · Yeah, but this is clearly the knowledge work side. Claude Code will clearly capture the development workflow. But I do think you have to sweat these safety and security details in order for people to trust it. And even Claude in Chrome, having the whatever API it uses to do the background thing.

1:09:40 · Yeah.

1:09:40 · Um that's the only reason I use it is because otherwise I would have to just get a separate machine.

1:09:46 · Yeah.

1:09:46 · And just run it run.

1:09:48 · That sounds super annoying.

1:09:49 · Yeah.

1:09:49 · I mean, I'm currently doing it. And I think also as developers, maybe we are more risk tolerant, but I think we also just have, I don't want to say arrogance, but sort of the trust that if the really bad thing happens, we can probably fix it.

1:10:05 · I just tell Claude to check with me before doing any irreversible action, like sending an email or doing something permanently. It's good enough. But not even Claude. I mean simple things such as npm install: we're all running npm install with full user permissions, and if it wants to read your SSH keys, it will. Crazy that that is the default. Kind of, yeah, I know. I agree, I agree. It's fine, I'm obviously doing it every single day. And I think obviously npm and GitHub too have done a pretty good job, maybe over the last couple of months, to clean house and come up with more specific tokens. But generally speaking, I think as engineers we've always been a little bit more risk tolerant. And if you do a little bit of introspection and you ask yourself, is that how we should be doing things? You might not always come up with the right answer. And I think for models too, my approach is: the safest thing is to do nothing. We do want products that are quite capable, but to the extent possible, I don't want to ask you, are you okay with this script? Because I kind of believe that once it starts becoming a part of your workflow, either you don't have the skill to understand whether or not this Python script is safe, or you're not going to read it anyway.

1:11:15 · Cool. I guess a couple parting questions. What's the future of Cowork?

The future of Cowork

1:11:19 · I think we're still such early days. We're going to keep iterating on this thing pretty quickly, which means you can sort of continue to expect that every single week there's going to be a small new feature, if not a big new feature. I'm going to continue probably to double down on your computer, making Claude effective on your computer. We're starting to grapple more, as we talked about today, with the question of what does your computer mean? Does it have to be the one in front of you, or a VM on your computer, or a computer somewhere else? And then the third thing that I'm quite excited about is we're continuing to hill-climb on slowly taking users who are used to asking questions and getting an answer, and slowly teaching them to step more and more away and let Claude take over bigger and bigger tasks and work, both in time as well as in scope. And I think you can probably see most of our investments in our feature releases work on both of those things: the ability to do more on your computer, and then the ability to do it more independently and for longer.

1:12:23 · Does remote control work for Claude Cowork yet?

1:12:25 · No. Right.

1:12:26 · Excellent question.

What comes next for agentic knowledge work

1:12:29 · Coming soon. I mean, that's an obvious thing if you want to keep betting on your computer. But to me, you know, we talk about how people are not ready this year, there's no wall, it's accelerating. To me it's like, what will we be doing differently at the end of this year that we're maybe not even thinking about at the start of this year? I'm just trying to look ahead as to what's a good use case that we sort of aim towards. So for example, for the machine learning scientist, it's always, okay, I want an AI scientist that can automate machine learning. But for knowledge work, I mean, I can already get it to sign up for Google Cloud, which to me is AGI, because Google Cloud. But beyond that, I don't know.

1:13:12 · I think it's basically the idea that you still had to tell it to build your script, right? You were still kind of involved.

1:13:18 · Yes.

1:13:18 · In maybe a way that felt kind of magical to you, but like maybe to me on the other side as the person building this product still feels kind of heavy-handed. I see so much process that I'm like, "Oh, let me take that away from you."

1:13:28 · But, like, how do I just... I will continue to go further and further up the stack and make your life easier and easier.

1:13:37 · Oh, here's one. Right. Yeah. Watch. You know, I don't care about my own privacy or whatever, or I trust Claude, I trust Anthropic, so just watch everything I do on a normal day-to-day basis, and at the end of the day tell me what is Claude Cowork-able. Yeah, I don't know. I think the funny thing about a lot of these products is that, for good reason, throughout my entire career I've never teased too much what I'm working on, because I think you should just, yeah, build it, release it, and then talk about it. I'm not a big fan of vague-posting about work ahead of time.

1:14:10 · But the thing that is always so fascinating to me is, both of you, multiple times today, you've mentioned things and I'm like, yeah, that is obvious, very obvious, that someone should be working on those things. And I think we're still in the space where, if you look at Cowork, the things that we will be releasing will probably not be a big surprise to either of you. You're going to be like, yeah, obviously that's valuable, obviously we're working on those things, and obviously that's good and useful. And the more I hit those points, the more our features fit into that category, I think the better it is for us, because then we don't end up building things that are too hyper-specialized or too difficult on the style.

1:14:42 · Yeah.

1:14:42 · I think the hyper-specialized thing is very important. It keeps you general purpose. It means you're not thinking too small, maybe. I don't know what the word is.

1:14:51 · Yeah.

1:14:51 · Yeah. Exactly. It's like the whole concept that at no point did we release, you know, there's no Claude Code for Node.js applications that use React and TanStack and only those two things, and if it's anything else... I know several startups like that. I'm not a VC, I'm not an investor, it's hard for me to predict where the markets go. But in terms of the building blocks that I'm interested in, Electron is probably by far the most popular thing I ever built, and Electron itself is very abstractable and generalizable. I've had so many apps run on it, and I think it would have been hard for me to predict how many apps actually end up using Electron.

Electron, Chromium, and desktop software lessons

1:15:26 · And what would have been even less useful is for me to predict what those apps do. I distinctly remember Loom coming out and being like, that is cool. Your camera is in a little circle in the corner. That is really smart.

1:15:39 · That's an Electron app. Yeah. Yeah. Or at least was. I'm not sure if it still is. It was for a while. Or like 1Password has so many interesting things, right? It's a level of the stack that I'm quite comfortable with, and whenever I give other engineers advice, it's actually that layer that I think is most valuable to invest in, because the tools at that layer are not that good, but that's where you get the most leverage for the future in general.

1:16:01 · Just a quick tangent on Electron, because I always wonder this: have you looked at Tauri?

1:16:05 · I have. Yeah.

1:16:06 · What's your take? You know, my view is, most things should be Tauri by default unless you really need the full power of Electron. But yeah, I can give my big take. Why do we ship an entire version of Chromium inside the thing, right? Why do we do that?

1:16:22 · And um people ask me this question a lot because it's like very counterintuitive.

1:16:26 · Wouldn't it be much easier to use the web views that are on the operating system? Wouldn't it be much easier not to have to do that? And the answer is yes. And obviously I did that once upon a time. There was a version of the Slack app that used just the operating system's web view. Wait, did you start the Slack app?

1:16:42 · I would Well, team effort in Yeah, but I was I was there and we built the Slack app. Yeah, it's crazy. Um I mean obviously get the Electron guy to do it, but Well, but this is an interesting point.

1:16:52 · By the time I joined Slack, they already had an app that was built with something at the time called MacGap. It was a little bit like the same app thing for mobile. It just used the operating system's web views. And that didn't work for so many reasons. And they were like, "All right, maybe we need like bigger guns.

1:17:10 · We need to take more control of the rendering stack." And there's a few things I always mention here. I think if you're building a small app, just going with the operating system's web view is perfectly fine. If you're building an app that maybe doesn't have too many users who will cry bloody murder if it doesn't work, that is fine.

1:17:27 · The reason to go with your own embedded rendering engine is because, and this is still true in 2026, the operating system rendering engines are not that good.

1:17:36 · They're just not that good. Both Microsoft and Apple are trying to move away from that. They so far really haven't. The only way to upgrade those is to upgrade your operating system. So if you're, say, a Slack, and you have a critical rendering bug in WKWebView or some of the other web view options, your only recourse is to tell your customer, "Oh, sorry, you're too poor. You didn't buy the latest MacBook." Unacceptable.

1:17:59 · Unacceptable to user. Unacceptable as a developer. So, you sort of need to like go down the stack and like find the best rendering engine, then put it in your app. Why Chromium? Even though it's very big, Chromium is by far the best thing.

1:18:11 · I often like to remind people: the Unreal Engine, when you want to render some text, they use Chromium. Chromium is part of the Unreal Engine for the same purposes. Chromium is very, very good. I think it's one of the marvels of engineering. We're in San Francisco right now as we're recording. Most of the people in the city are web developers. It's hard for me to overstate how magical it is.

1:18:36 · They can run, say, rendering a YouTube video, dynamically negotiating a bit rate, figuring out what to do about your extremely broken hardware driver. Actually, this is a fun thing. You can enter chrome://gpu.

1:18:54 · Okay. And if you scroll down a little bit, these are all the enabled workarounds because something is going wrong on your computer. If you're doing this on a Windows computer with like a GPU that is not the most popular GPU, it will be much longer. And all of these are usually just there to make sure that if I say as a developer, I want a red pixel to appear here, that that actually happens. Chrome is such a marvel because it works on all the machines that a user might throw at you and it's going to work fairly reliably. And if it doesn't, they will probably fix it within 24 hours. I see. So this is the super operating system, right? That that works everywhere.

1:19:29 · Yeah.

1:19:30 · Right. Okay. Yeah.

1:19:30 · So a lot of the magic of Electron is honestly just that it makes it very easy for you to ship Chromium in a way that serves you exactly and your use cases.

1:19:38 · Exactly.

1:19:39 · Our next interview is with Maran Dre.

1:19:41 · Yeah. Who had the phrase like desktop OSS are just poorly uh poor implications of the the actual OS which is Chrome which like actually works everywhere. And this is this is the platform where you ship apps.

1:19:54 · I think the wild thing is that like as engineers we so often sort of assume that the platform like the layer below us is like super stable and then you talk to those people and they're like we're also just like guessing. Um uh and I had like a distinct moment at Slack where one of our customers at Sack was Nvidia and for a while I really put GPU developers on this pedestal in my head and I do think they're still probably much smarter than I am but I was like hardware engineers who built the chips who then like built the drivers their work must be so much harder than mine they must be very good and we had like one bug in Slack where like if you had a YouTube video in Slack it wouldn't quite render why like it would have these weird artifacts And um that ended up being a chromium bug and I ended up on this like giant thread. So I got to see a lot of the source code and they also are just like common to do. We don't know why this is weird but if you flip this bit things work you know this is just like happening at every layer of the stack.

1:20:48 · Maybe, you know, the end-of-year AGI prediction is that Claude can build Chromium.

1:20:55 · You see, you laugh now, but someday... it's starting to get pretty good. It used to be completely useless, mostly just overwhelmed with how hyper-specialized the tools are inside the Chromium repo. For a long time, Chromium would sort of reinvent all the tools because none of them were capable of handling Chrome.

1:21:15 · I think the AGI moment I'm kind of waiting for is: at what point are we going to say Electron is probably no longer necessary because you can just build fully native apps in Swift? Yeah.

1:21:25 · Not just in Swift, because this is one thing: I think our current models are quite capable of taking an Electron app and replicating it in Swift. Are they going to be capable of building an app that is actually more performant, uses less memory, all of that stuff? That is going to go into the same hyper-optimization that developers have done for a long time. We're not quite there yet, where I can point even our best models at a thing and say, "Just replicate this in native code, make no mistakes, ultrathink." Right? We're not quite there yet. Ultrathink is back. Today it's back. Yes.

1:21:59 · Okay. Or thing for like days. This is a pretty long time for Boris, but he worked on ultrathink for days.

1:22:06 · Yeah.

1:22:06 · Why? It's just a prompt.

1:22:10 · I'll let it more goes into Yeah. Okay.

1:22:12 · Another question I had is like co-works.

1:22:15 · So, if I have my Claude Cowork, what's the multiplayer mode? I think subagents are like single player: split up the context.

Multiplayer agents and coworker-to-coworker workflows

1:22:24 · Yeah.

1:22:24 · And the multiplayer Cowork is like: my colleague has some file on their machine that I want to know about, or I want to know how their task is going so I can then update my thing. Is that interesting? Is that something that makes sense for you to build? It's super interesting to me. It almost goes back to some of the scaffolding, where I'm like, okay, will we end up building scaffolding that will just go away? A question I have here is: at what point do we just assign these things their own Gmail account? We just give them their own Slack handle, and then they will use the same tools we humans use to interact with each other. You mentioned our finance people. They've been working pretty hard on very good office integrations. And I think for a while we built so much tech around Claude leaving useful comments inside a Google Doc, and now it just does it. It just leaves a comment in your Google Doc, and that's how you interact with it. Maybe this is a similar thing, where I still have open questions around what the best interaction mode is. Is it for us to build something super custom for Cowork agents to talk to each other? Or is it, okay, let's just jump straight to the finish line and say, if you use Slack at work, we're just going to give this thing a Slack handle, and that's going to be the way it's multiplayer capable?

1:23:39 · They communicate with each other. Yeah.

1:23:42 · You know, as a fun project I built this thing called pi q, which basically takes any repo and the pi coding agent and puts them in a VPS, and then there's a public webhook where anybody can submit a coding task, and then there's a dashboard in which you review the task. Pi q? Yeah. You basically get all these tasks; anybody can submit a task. And to me, in the organization of the future, the salespeople are talking to the engineering team, which is talking to the marketing team, to the product team, and all these Coworks are going to queue up decisions for other people to approve, in a way.
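
The submit-then-review flow described here can be sketched roughly as follows. This is only an illustrative in-memory model, not the speaker's actual project: the names `TaskQueue`, `Task`, `Status`, `submit`, `pending`, and `review` are all hypothetical, and a real version would sit behind an HTTP webhook and hand approved tasks to a coding agent.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import count


class Status(Enum):
    PENDING = "pending"      # submitted, waiting for a human reviewer
    APPROVED = "approved"    # reviewer accepted the task
    REJECTED = "rejected"    # reviewer declined the task


@dataclass
class Task:
    id: int
    submitter: str
    prompt: str
    status: Status = Status.PENDING


class TaskQueue:
    """Hypothetical in-memory stand-in for the webhook + dashboard flow."""

    def __init__(self):
        self._tasks = {}        # task id -> Task, in submission order
        self._ids = count(1)    # simple incrementing id generator

    def submit(self, submitter, prompt):
        """What a public webhook handler might call for each request."""
        if not prompt.strip():
            raise ValueError("prompt must not be empty")
        task = Task(id=next(self._ids), submitter=submitter, prompt=prompt.strip())
        self._tasks[task.id] = task
        return task

    def pending(self):
        """What a review dashboard might list."""
        return [t for t in self._tasks.values() if t.status is Status.PENDING]

    def review(self, task_id, approve):
        """Record a reviewer's decision; a task can only be reviewed once."""
        task = self._tasks[task_id]
        if task.status is not Status.PENDING:
            raise ValueError(f"task {task_id} was already reviewed")
        task.status = Status.APPROVED if approve else Status.REJECTED
        return task
```

In this sketch anybody can call `submit`, but nothing moves past `PENDING` until someone calls `review`, which mirrors the "queue up decisions for other people to approve" idea.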

1:24:28 · Yeah.

1:24:28 · You know, and I'm kind of curious what that looks like, and how do I give my coworker the ability to build approved tasks without asking me?

1:24:38 · Yeah.

1:24:38 · And how to decide which one I need to review, you know, because for some of these things it's like, you know, you want to change the color or something. That's kind of like a branding decision. Or another one is like, hey, your thing is just broken.

1:24:49 · It's like this is like how you fix it.

1:24:51 · And Claude can actually review whether or not that prompt matches what it's trying to do. Today everything is still very much multiplayer within the single player, you know. I can spin up many of them, but how do I get multiple people to hand off things to each other using their particular context? Yeah. And for both of your Coworks to talk to each other, right? Right. Yeah. "Hey, we got an episode today, can you..." you know. Yeah. I know we're running out of time here, but we previously talked about sharing skills, and I did have this question of: what if your Cowork would just ask the other Coworks if they have a skill for this task? doesn't too, right?

1:25:27 · Like, okay, so skill transfer.

1:25:29 · Yeah.

1:25:29 · And again, this maybe goes back into the territory of building something very powerful and building something creepy often going hand in hand. Because I could tell from the reaction that my fellow engineers had that this is probably not what we're going to do, but we have Bluetooth LE, right? This computer can figure out that it's sitting right next to this computer. So, you're probably working on the same thing.

1:25:53 · Will you see that in Cowork?

1:25:54 · Probably not. But there are, I think, really creative solutions to problems that we really haven't tried yet.

1:26:00 · Yeah. Yeah. Yeah.

1:26:02 · Excellent. I guess the last thing is on Anthropic Labs. I always have this mental model of a model lab versus an agent lab. And this is basically Anthropic's internal agent lab, which Claude Code is now under, right? It's part of the whole org.

Anthropic Labs and closing thoughts

1:26:15 · I mean, people are so fungible, right? Okay, I don't know how real this is. I don't know.

1:26:20 · No, it's a real team. It's a team. The Labs team is primarily working, though, on things that you don't see in public yet. They're trying really wild, out-there ideas that seem quite improbable. The mad science. But are you officially under this thing or... No, Claude Code is now a fairly big group where I don't actually know how many people we are. I remember yesterday coming into our weekly cohort meeting, I was like, whoa, there are a lot of people here. But we still have a Labs team, and we actually made the Labs team a lot bigger. Mike just joined the Labs team as an IC, which I think is very cool and very fun, but they're working on things that you have not seen yet that are extremely out there and probably half broken, right? The idea of a Labs team is that it should only work on things that make really no sense for anyone else to work on.

1:27:09 · Okay.

1:27:09 · Well, looking forward to exciting things from there, but thank you so much. I know we're out of time, but I appreciate you joining us. I appreciate Claude Cowork. Everyone go use it. It is the closest I've felt to AGI this year.

1:27:20 · That's so nice of you to say. Thank you very much.

1:27:22 · Yeah. Thank you for your time.

1:27:23 · Yeah.