Transcript: AWS Security And Monitoring With Rodrigo Montoro
Host: Hi everyone. Thanks for tuning into our Scale to Zero show. I’m Purusottam, co-founder and CTO of Cloudanix. Today I’m excited to talk about threat detection and threat research in AWS, or in cloud land in general. To discuss that, we have a special guest, Rodrigo Montoro. Rodrigo is the head of threat detection research at Clavis Security and has more than 22 years of experience in IT and computer security in general. For most of his career, he has been working with open source security software like firewalls, IDS, IPS, incident detection and response, and cloud security. Rodrigo is also an author of two patents in the detection field: one to discover malicious digital documents, and the other around analyzing malicious HTTP traffic. Rodrigo, it’s wonderful to have you on the show. For our viewers who may not know you, do you want to briefly share your journey?
Rodrigo: Oh, sure. Thank you for having me here Puru. It’s a pleasure to be here.
I’m a Brazilian guy. I live here in the south of Brazil, in Floripa. Florianopolis is the capital of Santa Catarina state. For the last 22, maybe 25 years, I’ve been working with computers, mostly technology. I started, like most people my age I think, as a sysadmin, working at internet providers and managing the infrastructure. Then I started to move to security, and so they moved me to security.
After some time I moved to research, because I’m the kind of guy who likes to understand this stuff really deeply. And for the last three years or so, most of my time I’ve been playing with AWS security: studying, understanding, and discovering things in this area.
Host: Okay, that makes a lot of sense. Yeah. We are excited for the discussion. So the way we do the podcast is we have two sections. The first section is focused on the security questions, and the second one, the fun one, is the rapid fire. So let’s start with the security questions.
So when it comes to threat detection or threat research, there are many areas, right? There are bug bounty programs, there are pen tests, there is vulnerability research, threat research, et cetera.
So when it comes to cloud, and maybe let's stick to AWS for now, what areas does your work cover?
Rodrigo: Okay, I mostly work on the blue team side. At Clavis, we currently have a security operations center, so our main goal is to create detections. To create detections, I need to understand the AWS ecosystem, see how stuff works, and see how an attacker could take some path. A path means the actions they could chain: this action, plus these actions, plus these actions, and so they get access to something. I mostly study and research what I call uncommon services. What is considered uncommon? We’re talking about around 300 services in AWS, so there are a lot of opportunities, and there is a bunch of research on some specific services. If you look at S3, EC2, Lambda, or that kind of service, there is a bunch of research from people trying to abuse them. What I try to do is look at services that are not very well known or not widely used around the world, and figure out: if the company uses this service, what could I get from here to reach the data? The data will mostly be in the same place, but I try to figure out different paths. Most of the time I’m trying to understand: I’m testing, I’m clicking, I’m trying to see what’s being generated, what’s missing, this kind of thing. Most of the time that’s what I do.
When I do the research, I try to build the possible paths in different services and see what I can get from there. And when I get something, I try to share it.
Host: Okay, that’s very interesting, because generally folks focus on the common services, right, like S3 or EC2. But you are taking a different approach and focusing on the uncommon ones, because that’s where you will have much less documentation, both from the cloud provider’s perspective and in general. Right. The follow up question on that would be,
how is it different from, let's say, the regular red teaming or blue teaming activities that are performed?
Rodrigo: Okay, well, everything started when I talked to Ashish in person here at some conference. I was very excited about the current tools and detections that we have around. When I talked to him, that brought this term, uncommon services, to my mind. I used to play with CloudSploit, which is one of the open source CSPMs, right? And I figured out, okay, I’m finding a lot of stuff using CloudSploit, like misconfigurations and so on. But if I look at the number of services it covers, we are talking about almost 100 services, and we have 300 services, so most of the services we are not looking at. That’s why I started to try this kind of approach. Let’s be more pessimistic and look at the other part that nobody is looking at, because probably there is something there, some opportunities. Maybe not, but maybe yes. Talking about the red team and blue team part, and this is based on my own concept of red team and blue team, the red team for me is something more strategic. I’m not trying to find a SQL injection in your application, I’m not trying to see if you have some vulnerability. What I’m doing is testing some scenario, some techniques, or some steps, to see whether my defense and my blue team are going to detect them.
In my opinion, it’s a privileged position: I already know how to attack, so I know that if I succeed here, I need to detect here. What I’m trying to say with my research is that on the blue team side we used to be more reactive to offensive research. They discover something, and we try to detect, we try to block, we try to mitigate, this kind of thing. So I like to say that my research is making the blue team more proactive. I’m trying to figure out the paths and create a detection before...
Host: ...somebody else does that, before some outsider takes advantage of it. Right?
Rodrigo: Yeah, exactly. I think that’s my approach. I’m trying to make the blue team act in a proactive way. Instead of responding to someone else’s research, I try to do this part of the research myself. The nice thing about AWS is that, in the end, it doesn’t matter how the attackers are coming in: when they try to execute something, it should be logged in the same way. They probably have many more ideas about how they are going to attack me, but in the end the action is the same. So I try to figure out which actions matter and create the detections.
Host: Yeah, that makes sense. You mentioned this a couple of times: when it comes to AWS, there are many services. Each cloud provider has published a shared responsibility model, and as part of that they cover some areas from a security perspective, like the core infrastructure, hardware, or network. As a customer, let’s say I’m trying to deploy my workloads, and I’m responsible for the security of those areas, right?
The threat research that you do, does it apply to both the cloud provider's capabilities and the customer's workloads, or does it focus only on the customer's workloads?
Rodrigo: Well, most of my research is based on the functionality, the features of the provider itself. Some researchers do cross-tenant work, right?
I’m using my service here, and I don’t have that kind of research. What I do have on the shared responsibility model, which is something I really look at, is a presentation I gave last year: the default truth of the AWS shared responsibility model. That’s where I try to point out that there is a bunch of improvements that AWS has developed over the last years, but they are not enabled by default, and that’s the point.
So you need to understand that it’s not an AWS fault, but it’s the default. As a customer, part of your responsibility is to know that, so you can mitigate these kinds of problems and improve your security.
For example, S3 buckets: a bunch of leaks in the past started with buckets open to everyone, right? At some point, I don’t remember when, they released a flag that blocks the possibility of the bucket becoming public, like what you could do with a policy. But until now this flag has been disabled by default. If you use Terraform, for example, it’s disabled, and so someone could make your buckets public. Now, at least for this one, they are moving to enabled by default, in April. That’s quite cool.
But there is a bunch of small details in many services that you need to fix, and this kind of thing is really important to know. As I said, I’m not saying it’s the AWS fault, but since it’s not enabled by default, everybody must understand that.
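The bucket flag described above matches S3 Block Public Access, which is made up of four boolean settings. A minimal sketch of checking them, assuming the configuration has already been fetched as a plain dict in the shape of the `PublicAccessBlockConfiguration` returned by the S3 API; the helper names and the sample bucket are hypothetical.

```python
# The four S3 Block Public Access settings (field names as used by the S3 API).
BLOCK_PUBLIC_ACCESS_FLAGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def missing_public_access_blocks(config):
    """Return the flags that are absent or False in a bucket's configuration.

    `config` is assumed to be a PublicAccessBlockConfiguration dict, or None
    when the bucket has no configuration at all.
    """
    config = config or {}
    return [flag for flag in BLOCK_PUBLIC_ACCESS_FLAGS if not config.get(flag, False)]


def fully_blocked_config():
    """Build the configuration that turns every flag on."""
    return {flag: True for flag in BLOCK_PUBLIC_ACCESS_FLAGS}


if __name__ == "__main__":
    # Hypothetical bucket created by automation that left two flags off.
    sample = {"BlockPublicAcls": True, "IgnorePublicAcls": True}
    print(missing_public_access_blocks(sample))
```

Any flag reported by `missing_public_access_blocks` is a default that someone still has to change by hand or in their infrastructure code.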
Host: Yeah, so it makes sense that each capability comes with some default options enabled, and it’s your responsibility, based on your workloads, to change the defaults to be more secure. One of the things that you highlighted multiple times is that AWS has, let’s say, around 300 services, right? How do you decide which one to pick, and how do you prioritize those services as part of your threat research?
Rodrigo: Okay, that’s a good question. Well, when I started this kind of approach, everything began with some need, or something I read, or something I saw around.
I started with AppStream. The very first service I started to look at was AWS AppStream. Why? Because I was on a call; it wasn’t a customer, it was a company calling my company about an incident. They mentioned, oh, we had a problem with our AppStream, because it had a role attached, an administrator role. I started to wonder, what the heck is AppStream? Is it something they installed on an EC2 instance? And then I figured out, oh my God, it’s a service. So I started to understand how it works.
My very first thought was, oh, if they were our customer, we probably would not have detected it, because we had nothing in place for that service. So I started to look, understand the actions, play, and do all these things. That is the kind of path I took with AppStream. But I have two ways now. First, we map the services: we have a bunch of CloudTrail coming into our security operations center, and there is a field called eventSource, so we try to see which services most of our customers are using and whether there are detection possibilities there. Second, there is nice research from Noam, from Ermetic, where he talks a lot about PassRole, because when an action needs PassRole it is probably more dangerous, since we are talking about something permissions related. I think almost 100 services have actions that need PassRole, and PassRole is not logged. There is no event name called PassRole. I really like the list they released because there is a bunch of actions and opportunities, and AppStream is there; the AppStream actions that I researched are there. But it was a coincidence: I started because of the call, and when I saw Noam’s research I figured, oh, it’s here. So that’s a good point to start, because everything that has permissions attached could probably be dangerous, and permissions are a huge problem for everybody.
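The service mapping described above can be sketched in a few lines. This is a minimal illustration, assuming CloudTrail records have already been parsed into dicts that carry the standard `eventSource` field; the function names, the sample records, and the idea of a "covered" list are hypothetical.

```python
from collections import Counter


def rank_services(records):
    """Count CloudTrail records per service (eventSource), most used first."""
    counts = Counter(
        record["eventSource"] for record in records if "eventSource" in record
    )
    return counts.most_common()


def uncovered_services(records, covered):
    """Services seen in the logs that no existing detection or check covers."""
    seen = {record["eventSource"] for record in records if "eventSource" in record}
    return sorted(seen - set(covered))


if __name__ == "__main__":
    # Made-up records for illustration only.
    sample_records = [
        {"eventSource": "s3.amazonaws.com", "eventName": "GetObject"},
        {"eventSource": "s3.amazonaws.com", "eventName": "PutObject"},
        {"eventSource": "appstream.amazonaws.com", "eventName": "CreateFleet"},
    ]
    print(rank_services(sample_records))
    print(uncovered_services(sample_records, covered=["s3.amazonaws.com"]))
```

Because PassRole never appears as an event name, a ranking like this only tells you where to look; the permission itself has to be reviewed in the IAM policies.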
Host: So I think you highlighted two things which were key, right? One is that you go by which services are getting used, and you prioritize based on that. The other is that you have enabled CloudTrail logs so that events reach your SOC, you know which services are getting used, and you also look at others’ research to learn what type of logging should be captured. That helps you with the prioritization as well. So that makes a lot of sense.
So now I’m just trying to think. Let’s say I’m a user of AWS, and I’m trying to build and deploy workloads. As part of the building or design phase, we spend a lot of time on architecture to hash out the capabilities of the services and how they interact with each other.
How should organizations think about security, and at what stage should they think about it, when they are doing the architecture?
Rodrigo: Okay, yeah, that’s a hard question to answer. I think there is no recipe, right? There is no magical recipe. For me the perfect perspective is to think about the threat model. You need to understand where your attacker is, where your data or the information that needs to be protected is, and the path from point A to point B. That path could be something on premises, something based on cloud, or an application with a flaw, because sometimes they don’t need to find a flaw at the cloud provider. They could get the information using different means. It’s not all about the cloud provider. Based on threat modeling I think you have a lot of opportunities for prevention, guardrails, detection, monitoring, this kind of thing. Based on that, I would build what I call the quick wins: what is easy for me, what I have almost ready to go and can do in a short time because it’s pretty trivial. For example, if you have never run a CSPM, it’s pretty trivial to start with an open source one. You run it and you find a lot of misconfigurations. You have a lot of stuff to do; some misconfigurations are pretty easy to fix, some are not. The second problem is permissions, right? It is really, really important to have least privilege.
So when something leaks, the danger is not that big. If the key has permissions, it’s still going to be a problem, right, but in the end it’s about championships. They usually say defense wins championships and offense wins games.
They could maybe get the first access. But if the attacker didn’t get the information that you’re trying to protect, you lose the game but you win the championship, right? That’s mostly how I would try to work in AWS. There is something in AWS that I think everybody should use, and that’s my opinion: conditions. Let’s step back a bit. We have the permissions, right? We have almost 14,000 actions split across those 300 services or so. The permissions are split into five access levels: Read, List, Tagging, Permissions management, and Write. In my view, everything that is Permissions management or Write should have a condition.
I think the easiest condition, and a very effective one, is the source IP. You can use this key with this policy, but the request must come from my VPN, my bastion host, or somewhere else I control. Because the key will leak somehow, right? If the key leaks, at least they cannot use it, because they need to come from that source IP. So you lose the game, but you win the championship, because they are not going to be able to use it. I think there are a lot of small tricks that are not difficult to apply and could save you a lot of problems. But it’s not a grand framework, actually.
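The source IP restriction described above can be expressed as an IAM policy document. A minimal sketch, assuming a deny statement built on the `aws:SourceIp` and `aws:ViaAWSService` global condition keys; the function name, the chosen actions, and the CIDR range (a documentation range) are hypothetical, and the right action list depends on which Write and Permissions management actions you actually grant.

```python
import json


def deny_unless_from(cidrs, actions):
    """Build a policy that denies `actions` unless the request comes from `cidrs`."""
    if not cidrs:
        raise ValueError("at least one CIDR range is required")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideTrustedNetworks",
                "Effect": "Deny",
                "Action": list(actions),
                "Resource": "*",
                "Condition": {
                    # Matches when the caller's IP is outside every listed range.
                    "NotIpAddress": {"aws:SourceIp": list(cidrs)},
                    # Do not block calls an AWS service makes on the principal's behalf.
                    "Bool": {"aws:ViaAWSService": "false"},
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical VPN range and a couple of high impact actions.
    policy = deny_unless_from(
        ["203.0.113.0/24"], ["iam:AttachUserPolicy", "s3:PutBucketPolicy"]
    )
    print(json.dumps(policy, indent=2))
```

With a statement like this attached, a leaked key used from any other network is denied for the listed actions, which is the "lose the game, win the championship" outcome described in the answer.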
Host: That’s a great framework: wherever you are assigning high impact permissions, like Permissions management or Write, you should always have some sort of restriction, as you highlighted,
maybe IP or VPN or something like that. So even if an attacker gets the access key, they are restricted to a particular set of IPs.
Rodrigo: Because there are some small details that I talk about a lot. For example, MFA, right? You create your user there, Puru, and give Puru administrator access. When you go to the console, you log in, you enter the password and then the MFA, and boom, you’re logged in. Then you create a key, because most of the time you are using Terraform or some other tool.
But what usually leaks is not the user and password, it’s the key. And MFA is enforced only for the web console. The key does not use MFA. So there is a bunch of small details that could improve your security a lot if you just understand the defaults. A lot of people I explain this to say, but I use MFA. But MFA is just for the console. There is no MFA on the key unless you add the condition asking for MFA.
Then you need to run the commands against the STS endpoints, and you get back the ASIA key, the temporary keys, when you type the correct MFA code. But nobody does that. Well, not nobody; most people don’t know about it, so they don’t do it. If you add source IP and MFA, there’s a big chance your keys are not going to be usable anywhere else.
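The MFA point above can be sketched as a policy condition plus a small helper for telling key types apart. This is an illustration under stated assumptions: it uses the `aws:MultiFactorAuthPresent` global condition key with `BoolIfExists`, so that long-term keys, which carry no MFA context at all, are also denied; the function names and the action list are hypothetical. The temporary credentials themselves would come from an STS call such as `GetSessionToken` with an MFA serial number and token code.

```python
def key_kind(access_key_id):
    """Classify an access key ID by its prefix.

    AKIA marks a long-term IAM user key; ASIA marks temporary STS credentials.
    """
    if access_key_id.startswith("ASIA"):
        return "temporary"
    if access_key_id.startswith("AKIA"):
        return "long-term"
    return "unknown"


def deny_without_mfa(actions):
    """Build a policy that denies `actions` when the request was not MFA-authenticated."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyWithoutMfa",
                "Effect": "Deny",
                "Action": list(actions),
                "Resource": "*",
                # IfExists: also deny when the key is missing from the request,
                # which is the case for plain long-term access keys.
                "Condition": {
                    "BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}
                },
            }
        ],
    }


if __name__ == "__main__":
    print(key_kind("AKIAIOSFODNN7EXAMPLE"))
    print(deny_without_mfa(["iam:*"])["Statement"][0]["Condition"])
```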
But okay, there are a lot of cases where maybe you could not do that, but there are other conditions like that. I think it’s a cheap, easy way to create some protection for yourself. After that you start with CSPM and IAM permissions. The monitoring part is easier, because you just need to grab the information and have the detections, right? It doesn’t depend on other people’s work. That’s it. Those are the three areas:
threat modeling, restrictions, and this kind of thing.
Host: Yeah, so I really liked those two like the restrictions part, the conditional permissions and the threat modeling. Rather than just looking at what permissions you are providing to maybe one service, look at the entire attack path or the threat modeling for that service which can help you prepare in a better way.
Rodrigo: There is a company, I think it’s TrustOnCloud, that released a public S3 threat model, just for the S3 service. It has like 106 pages just for that. You don’t need to be that paranoid for the first iteration, but you need to understand that there is a bunch of opportunities in stuff like that, right?
Host: So the scale is way bigger than you can imagine, right? Just for S3 there are 100 pages of threat modeling; imagine that for 300 services, right?
Rodrigo: Because everything is connected and it’s pretty hard. I’m developing a training here and I was looking at the networking part. So far we’ve been talking about the control plane part: I have an action, I attach administrator access, I elevate my privileges, these kinds of things.
But there are ways these actions could open paths on the network side, and I was looking at VPC peering. I grabbed all the actions, grepping for peering, and I saw that GameLift, and I have no idea what GameLift does, it’s here in my tabs to be analyzed, has the same action: create a peering with a VPC. So I figured maybe I could, from GameLift, create a peering with another VPC, maybe cross account, and so move to the other side. That’s an action I’m probably not looking for. If I saw it in my logs and asked what’s going on, I would have no idea.
I never looked at that. There are so many integrations and details that could probably take someone somewhere else. I didn’t test that; I’m just saying what I saw two days ago. It’s on my mind to take a look, but I need to stop and figure it out. GameLift is the kind of uncommon service for me. If I asked a bunch of friends who is using GameLift, probably nobody would raise a hand.
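The "grep all the actions for peering" step above can be sketched as follows. A minimal illustration, assuming you already have a flat list of IAM action names in `service:Action` form; the function name is hypothetical and the short sample list stands in for the full action catalog.

```python
def grep_actions(actions, keyword):
    """Group action names containing `keyword` (case-insensitive) by service prefix."""
    matches = {}
    needle = keyword.lower()
    for action in actions:
        service, _, name = action.partition(":")
        if needle in name.lower():
            matches.setdefault(service, []).append(name)
    return matches


if __name__ == "__main__":
    # Small sample; a real run would use the complete list of IAM actions.
    sample_actions = [
        "ec2:CreateVpcPeeringConnection",
        "ec2:AcceptVpcPeeringConnection",
        "gamelift:CreateVpcPeeringConnection",
        "gamelift:CreateVpcPeeringAuthorization",
        "s3:GetObject",
    ]
    for service, names in grep_actions(sample_actions, "peering").items():
        print(service, names)
```

Grouping by service prefix makes the surprise visible: the peering actions are not limited to the service where a detection would normally look for them.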
Host: No, that absolutely goes to your philosophy of looking at uncommon services, right, because maybe nobody uses it, but it still has a vulnerability through which you can get to the data. Right?
So, as you pointed out, attackers try to exploit some vulnerability somewhere so that they can get into your cloud environment and start getting access to the data. Right? And one of the longstanding and popular vulnerabilities is EC2 instances that have IMDSv1 enabled, which gets exploited quite a bit. And I think v2 was released sometime in 2020. So can you share what IMDSv1 is, how attackers exploit it, and how we can address that? Why is it taking time, and stuff like that?
Rodrigo: Well, that’s a good point. When I was talking about the default truth, that’s one of the points. AWS created metadata v2 to protect against server-side request forgery attacks,
most likely because of the Capital One case, which was emblematic. What is metadata? Metadata is that IP, 169.254.169.254, which holds a bunch of information about the instance: the security group, the AMI you’re using, these kinds of things.
When you attach a role to the instance, the credentials of the role are stored there.
If you are on the instance, you can access that and grab that information. That’s what happened with Capital One, right? They could do the SSRF, modify the host header, get access to the metadata, grab the keys, and use them. If you had some restriction, for example, I could grab the keys but I could not use them, right? Probably because of this, they released metadata v2. With metadata v2, you don’t just point to that IP and path; you need to get a token and set more headers. So if you have a flaw in your application, a server-side request forgery,
you probably could not abuse it. But it’s disabled by default. If you go and create an instance, by default it uses metadata v1. So as I was saying before, the solution is there. They have something more secure that could improve your security. If you have an application problem, it’s going to protect you; it’s about the application. If I have access to a shell on the machine, it’s the same, it doesn’t matter.
I could abuse it in the same way. But it’s disabled by default. So what do you do? There is a check that CSPMs have. You have a bunch of instances, and there is a field there, I forgot the name of the field, I think it’s required token or something like that, which is optional or required in the describe output. If it’s optional, you’re using metadata v1; if it’s required, you’re using v2. So you can find the instances that are not compliant and enable metadata v2 on them. And the best way to avoid creating such machines is a service control policy at the organization level that denies creating a machine that is not using metadata v2. But by default, it’s metadata v1.
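The difference between the two metadata versions described above can be sketched by building, without sending, the HTTP requests each one expects. This is an illustration using only the standard library; the function names are hypothetical, and the requests would only succeed when issued from inside an EC2 instance.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"


def imdsv1_request(path):
    """IMDSv1: a single plain GET, which is what an SSRF flaw can usually forge."""
    return urllib.request.Request(f"{IMDS_BASE}/{path}", method="GET")


def imdsv2_token_request(ttl_seconds=21600):
    """IMDSv2 step 1: a PUT with a TTL header that asks for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def imdsv2_request(path, token):
    """IMDSv2 step 2: the GET must carry the token obtained in step 1."""
    return urllib.request.Request(
        f"{IMDS_BASE}/{path}",
        method="GET",
        headers={"X-aws-ec2-metadata-token": token},
    )


if __name__ == "__main__":
    creds_path = "meta-data/iam/security-credentials/"
    print(imdsv1_request(creds_path).full_url)
    print(imdsv2_token_request().get_method())
```

An application flaw that only lets the attacker choose a URL can reproduce the first request but not the PUT plus custom header pair, which is why v2 helps against SSRF and not against someone who already has a shell.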
Host: Okay, that’s very good advice: leverage an SCP to restrict it, because by default AWS doesn’t enforce that. So that’s one way to do prevention.
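The two remediation steps discussed above, finding existing instances that still allow v1 and blocking new ones with an SCP, can be sketched like this. It assumes the `describe-instances` response has already been fetched as a dict (the `MetadataOptions.HttpTokens` field takes the values `optional` or `required`) and uses the `ec2:MetadataHttpTokens` condition key for the SCP; the function names and sample instance IDs are hypothetical, and treating a missing field as `optional` is an assumption of this sketch.

```python
def instances_allowing_imdsv1(describe_instances_response):
    """Return IDs of instances whose metadata options do not require a token."""
    flagged = []
    for reservation in describe_instances_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            options = instance.get("MetadataOptions", {})
            # Assumption: a missing value is treated like "optional" (v1 allowed).
            if options.get("HttpTokens", "optional") != "required":
                flagged.append(instance["InstanceId"])
    return flagged


def scp_require_imdsv2():
    """Build an SCP that denies launching instances unless IMDSv2 is required."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "RequireImdsV2OnLaunch",
                "Effect": "Deny",
                "Action": "ec2:RunInstances",
                "Resource": "arn:aws:ec2:*:*:instance/*",
                "Condition": {
                    "StringNotEquals": {"ec2:MetadataHttpTokens": "required"}
                },
            }
        ],
    }


if __name__ == "__main__":
    sample = {
        "Reservations": [
            {
                "Instances": [
                    {"InstanceId": "i-0aaa", "MetadataOptions": {"HttpTokens": "optional"}},
                    {"InstanceId": "i-0bbb", "MetadataOptions": {"HttpTokens": "required"}},
                ]
            }
        ]
    }
    print(instances_allowing_imdsv1(sample))
```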
Rodrigo: I think the best way is to prevent, because if you depend on everybody else and trust that they are changing the automation, it’s probably not going to work. Second, in CloudTrail, when you have the RunInstances event, you have the information on whether metadata v1 or v2 is being used, so you can trigger an alert that something is not in compliance. But that’s more reactive, right? The best is prevention, because I don’t see any reason to use metadata v1 instead of v2.
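The CloudTrail alert mentioned above could look roughly like the following. This is a sketch under a loud assumption: it expects the RunInstances record to expose the setting at `requestParameters.metadataOptions.httpTokens`, which should be verified against your own logs before relying on it. The function name is hypothetical.

```python
def runinstances_without_imdsv2(event):
    """Return True for a RunInstances record that does not require IMDSv2 tokens.

    Assumption: the launch request's metadata options appear under
    requestParameters.metadataOptions.httpTokens in the CloudTrail record.
    """
    if event.get("eventSource") != "ec2.amazonaws.com":
        return False
    if event.get("eventName") != "RunInstances":
        return False
    params = event.get("requestParameters") or {}
    options = params.get("metadataOptions") or {}
    return options.get("httpTokens") != "required"


if __name__ == "__main__":
    # Made-up record for illustration only.
    sample_event = {
        "eventSource": "ec2.amazonaws.com",
        "eventName": "RunInstances",
        "requestParameters": {"metadataOptions": {"httpTokens": "optional"}},
    }
    print(runinstances_without_imdsv2(sample_event))
```

A launch that takes its metadata options from a launch template may not carry the field in the request at all, so a rule like this can raise false positives and is better treated as a prompt for review than as proof.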
Host: Makes sense. So I have one question on how you do your work, right? There are many open source threat research platforms like Sadcloud, CloudGoat, Hack The Box, CI/CD Goat, et cetera. Do you use any of them for your research, or do you have a favorite tool that you use for your research work?
Rodrigo: Well, the way I do my research, since I’m trying to find new stuff and different ways, most things are not there.
I use CloudGoat a lot. I like CloudGoat because it helps explain things to analysts; one of the problems is that sometimes they have no idea how dangerous a misconfiguration or an excessive permission is, because it’s new to everybody, right? You say, oh, you have metadata v1, and they have no idea. But when you set up CloudGoat and show the SSRF, and that from here I came through there, pivoted to the control plane, and now, because you gave the EC2 role too many permissions, I could manage your control plane and access your data, it helps a lot. It helps analysts, and it generates the information for the SOC: that’s why we’re looking for this action, or these actions plus this kind of behavior. I really like this kind of tool. There are other nice tools too: Sadcloud, and from Bishop Fox, IAM Vulnerable, which creates IAM bad practices, so that’s pretty interesting too.
Host: Okay, so what we’ll do is tag the repo as well when we publish this video, so that our viewers can go to it and use it for their own threat detection. Okay, but yeah, you were saying something, no?
Rodrigo: Yes, but most of my time what I’m trying to do is look at the uncommon services, so maybe in the future they could become part of these tools. But for now they are not.
I’m always looking for something, and when I look at something, I try to figure out, oh, how can I abuse that? I was looking at IAM Roles Anywhere. It’s not that new, but it’s new to me; I just discovered it, right? You’re used to looking for privilege escalation where they call AttachUserPolicy or AttachRolePolicy, because I could attach some different policy. And I saw that Roles Anywhere has UpdateProfile. It seems, and I just read this this morning, that it updates the permissions of the role, so it adds another privilege escalation point that I need to create a detection for. It’s the same for SSO, Identity Center. When you’re doing privilege escalation in Identity Center, you’re not looking for AttachUserPolicy or something like that. You have the permission set, something like add to permission set; I don’t know the exact name.
They are different actions. There is a bunch of things that cause the same problem, and we’re probably not looking at them because they’re not that common.
But yeah, that’s it.
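The idea above, that several different actions lead to the same privilege change, can be sketched as a small watchlist matched against CloudTrail records. The IAM and Roles Anywhere entries come from the discussion; the Identity Center entry is an assumption standing in for the permission set action whose exact name was not given, and the event source strings should be checked against real logs. The names `PRIVILEGE_CHANGE_EVENTS` and `privilege_change_events` are hypothetical.

```python
# (eventSource, eventName) pairs that can change what a principal is allowed to do.
PRIVILEGE_CHANGE_EVENTS = {
    ("iam.amazonaws.com", "AttachUserPolicy"),
    ("iam.amazonaws.com", "AttachRolePolicy"),
    ("rolesanywhere.amazonaws.com", "UpdateProfile"),
    # Assumption: one Identity Center permission set action, for illustration.
    ("sso.amazonaws.com", "AttachManagedPolicyToPermissionSet"),
}


def privilege_change_events(events, watchlist=PRIVILEGE_CHANGE_EVENTS):
    """Return the CloudTrail records whose source and name are on the watchlist."""
    return [
        event
        for event in events
        if (event.get("eventSource"), event.get("eventName")) in watchlist
    ]


if __name__ == "__main__":
    # Made-up records for illustration only.
    sample_events = [
        {"eventSource": "iam.amazonaws.com", "eventName": "AttachUserPolicy"},
        {"eventSource": "rolesanywhere.amazonaws.com", "eventName": "UpdateProfile"},
        {"eventSource": "s3.amazonaws.com", "eventName": "GetObject"},
    ]
    for hit in privilege_change_events(sample_events):
        print(hit["eventSource"], hit["eventName"])
```

Keeping the watchlist as data rather than code makes it easy to add the next uncommon service when new research turns up another equivalent action.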
Host: No, that makes sense. And I think you highlighted a key point here: IAM and Identity Center are handled slightly differently, because there are permission sets defined, and then they also map to IAM policies. So it makes it even more complicated. So the library that you highlighted, the IAM Vulnerable one,
maybe our viewers can use that to analyze some of this.
Rodrigo: Yeah, that’s more to play with and understand. I think my best advice is always to start an account and add a budget alert to make sure that you’re not going to pay a $1,000 bill, right? Then you start to click around. There is a bunch of free tiers, a bunch of trials. Read the basics, the theory, and then go to practice; I think that’s the best way to learn.
Host: That’s a very good advice and that’s a great way to end the questions section as well.
Here are a few important points that stood out from this discussion.
First, before using any new service, cloud practitioners should do a threat modeling exercise to understand the attack paths, because that helps in defining the right set of permissions and policies for the service.
Second, to limit the impact of any attack, always apply restrictions on high impact permissions, like write or put permissions, in the IAM policies. For that you can use the condition clause to restrict access to your VPN IPs, your particular IPs, or IP sets.
Third, for the IMDSv1 vulnerability, define an SCP that denies the creation of any new EC2 instances with IMDSv1 enabled.
Host: The first question that we have is what advice would you give to your 25 year old self starting in security?
Rodrigo: I like to understand things, and that’s the advice I give to most people, and I think I could have done it better myself: study the basics. Study the operating system, study the protocols, understand how things work, and then move on to something else. Currently I think people try to get to the top too fast. If you work on the basics, as a developer or system administrator, or DevOps as it’s called now, you’ll get the basics, and your foundation will be very well structured, so you can put a lot of information on top of it pretty easily. You’ll be a better security guy, a better DevOps guy, a better code review guy, this kind of thing. If I could tell myself something 25 years ago, it would be to study the basics more.
Host: Yeah, I know the basics, makes sense. The next question is a one liner quote that keeps you going.
Rodrigo: I think it’s from Linus Torvalds, from Linux. He once said, less talk, more action. I like that. Let’s do more things and not just talk about things.
There is also "talk is cheap, show me the code," right? Less talk, more action is something that I really like, because the world is full of good ideas. But if you don’t pick one and try to do it, it’s just ideas. Right?
Host: Yeah. Execute in a way. Right. That’s very valid. The third one is, what’s the biggest lie you have heard in cybersecurity?
Rodrigo: It’s secure by default.
Host: That’s very much in line with the theme of the episode as well. Right?
Rodrigo: Yeah. I was at a conference and there was a researcher from Microsoft at that time saying, why are we moving to cloud? I’m moving to cloud because it’s more secure. Oh, wait, that’s not true. It could be; I think it could be more secure than ever if you understand it, because you have more control, everything is API based, so you can control and know everything that is going on. But if you don’t know, you have the old problems, because you have patch problems, you have code problems, you have all these things, and you have a bunch of new problems. You have the control plane, you have logs disabled by default in the services, and so it’s much less secure.
If you don’t really know, this approach of, oh, it’s more secure, it’s better, this kind of thing, which is the vendor speech most of the time, that’s terrible.
Host: Yeah. No, you are absolutely spot on. Right. Most folks think that we move to cloud and we are secure, but that’s not the case. You have all these bells and whistles that you have to go through to make sure that you are secure.
Rodrigo: Yeah. I think you could be more secure than you could probably ever be on premises, but you need to really understand it and really manage it. You need to have a good team, all these things.
Host: That’s a great way to end the episode. Thank you so much, Rodrigo. It was very insightful. I learned a few things from the episode as well, and I’m hoping our viewers will learn something from it too. So thanks for coming to the show.
Rodrigo: All right, thank you. Thank you for having me here. That’s a pleasure.
Host: And to our viewers, thanks for watching. I hope you have learned something new. If you have any questions around security, share those at Scale to Zero. We’ll get those answered by an expert in the security space. See you in the next episode. Thank you.