Building Better Data Networks for the People: Ali Shaikh, CPO of Graphiant
Sep 5, 2023
Listen to Crafted, Artium's podcast about great products and the people who make them.
"People using this isn't the thing I'm thinking about. Human beings can tolerate high latency and delays to a certain level. Machines cannot. AI cannot, IoT cannot."
As Chief Product Officer of Graphiant, Ali Shaikh is building data networks that are flexible, secure, and ready to meet the demands of the next generation of people – and devices. And just as the move to the cloud enabled founders to launch their startups more quickly – because they didn’t have to worry about servers and rack space – Graphiant’s “as-a-service” network has the same potential: to free founders up to focus on what they really care about, not setting up VPN tunnels and worrying about secure connections.
Full transcript below — but we recommend you listen for the best experience.
Ali: People using this isn't the thing I'm thinking about. Human beings can tolerate high latency and delays to a certain level. Machines cannot. AI cannot. IoT cannot.
Dan: That's Ali Shaikh, Chief Product Officer of Graphiant. Graphiant is building data networks that are flexible, secure, and ready to meet the demands of the next generation of people and devices that will connect to them. And just as the move to the Cloud enabled founders to launch their startups more quickly, Graphiant's as-a-service network has the same potential.
Ali: If the network was also infrastructure as code, then I could rethink how connectivity works entirely. That's the future I see, where just like we have made compute and storage accessible, network will be accessible.
Dan: And to build that new network, Graphiant has embraced a culture of rapid iteration.
Ali: Nothing is going to be perfect. So trying to build a perfect product doesn't make sense. The only product that is perfect is a product that can continually evolve.
Dan: Welcome to Crafted, a show about great products and the people who make them. I'm your host Dan Blumberg. I'm a product and engagement leader at Artium, where my colleagues and I help companies build incredible products, recruit high-performing teams, and help you achieve the culture of craft you need to build great software long after we're gone. What is Graphiant and what is the future that it enables?
Ali: Graphiant is fundamentally a networking-as-a-service company. We came into existence to solve one very particular problem. That is that the world of connectivity is becoming dynamic, arbitrary, away from the kind of fixed modes of connectivity that we have had for many, many years. A lot of it because of Cloud, a lot of it because of IoT, AI, you name it. So the nature of connectivity is changing. The architectures that we've had so far weren't designed for this new world that we're entering. And so we created Graphiant to create a new mode of network connectivity to enable the use cases of the future. And that's just the mountain-level picture of what Graphiant is. Around 2013, our CEO of Graphiant, Khalid Raza, he was the one who founded a company called Viptela, which pioneered the software-defined wide area networking capability to start thinking about how do we bring the internet into the fold. Because now we have mobile banking, we have apps, we have iPhones, the whole internet revolution in a way had happened where Web 2.0 had really evolved, the Cloud had evolved, the app stores had evolved. So we wanted to bring the internet into the mix. And at that point in time, it was just a shift in terms of, yes, we have private connections, let's add internet connections into the mix as well, and make sure we secure them.
Dan: Fast forward to 2020 and the landscape had shifted even further.
Ali: The same application could be accessing Amazon, could be accessing Google. So the complexity of our demands from a customer standpoint, from an application space standpoint, skyrocketed. And that's where our founder Khalid said, okay, we need a new protocol. We need to innovate in terms of rethinking how anything can connect to anything without having to think about what is the physical cable in the ground. It should be completely virtualized and should be completely programmable. If you had to deploy an ATM in the early 2000s, you would have to get a connection from AT&T or Verizon, a specific private connection, because that's the only way you could secure it. You get to 2014, 2015. You still get that connection, but you augment it with an internet connection. And you use SD-WAN technologies to do that. You come to now, 2022, if I want to use dual LTE, 5G, broadband, all of the above, and be able to connect without having to redesign my whole network, that is what the Graphiant architecture is for.
Dan: So fast forward a bunch of years and people are using Graphiant. What is the future that this enables if you can now deploy IoT devices and all sorts of things and all sorts of contexts in ways that previously it was just harder?
Ali: So the funny thing, fast forward many years, people using this isn't the thing I'm thinking about. Human beings can tolerate high latency and delays to a certain level. Machines cannot. AI cannot. IoT cannot. Their expectation of network performance, just because of how those applications get designed, is very different. And they, unlike us, aren't accustomed to making a phone call to tech support to get a change control in place to get something changed for them. Things are expected to be programmatic. The network, for historical reasons, as it exists today, is quite rigid. It isn't designed for an AI to drive itself, to set metadata into packet headers and reprogram the traffic path on demand. That doesn't exist yet. That is what I see as the future, where there's six billion human beings on the planet, and there's going to be twice, thrice, any number of times more artificial, non-human entities on the network that are consuming data, generating data, and dynamically want to program the path that the data takes.
Dan: How did Graphiant come to be?
Ali: How did Graphiant specifically come to be? Viptela gets acquired in 2017 by Cisco for roughly $610 million. And so a lot of our folks, the assets, the technology, gets absorbed into Cisco and gets incorporated into their networking ecosystem. And things are going really well, and the technology is seeing good adoption. Customers are using the technology, and Cisco being Cisco continues to do its business, right? But for Khalid, the problem was becoming very obvious that we had a challenge of still rigid topologies. Architecturally, we were still using fixed connectivity models, even in the virtualized space, which is we were pinning up tunnels, VPN tunnels, from every point to every other point. And we were wasting compute in setting these up. So we were not very green efficient. We were quite rigid, and we weren't set up for a programmatic future.
Dan: The widespread practice of creating VPN tunnels everywhere was becoming a significant burden.
Ali: It's almost as if I want to drive from New York to Pittsburgh, but I'm being forced to go to Atlanta. That's the world we live in. So even though I have to drive and I want to drive, I'm being forced down a path just like the airlines are. So one of the most interesting experiences, especially over 2018, 2019: there's a lot of companies that start to take on the Cloud-first approach, which is we're going to move everything to the Cloud. Okay, fantastic. But the network that I set up for those companies isn't set up for that. So for example, this company, a lot of their applications were being delivered out of Miami. It's like, okay, everything is coming back to Miami. But now you want to go to Amazon US East 1. Well, that's in Virginia. How are you going to do that? Well, we're just gonna point all the branches at that location. Well, not all your applications migrate instantaneously, even just a lift and shift effort. Forget rewriting the application, just a lift and shift.
Dan: Yeah, first, I love that expression, lift and shift, as though it's so easy.
Ali: Exactly. It's like even that, it takes time and the network is expected to just be up and connect even to your legacy environment, as well as your new Cloud environment and it should work seamlessly and it should have all the right preferences and the logic and safety, security, everything. Okay, how do you want to do this? Well, let's just set up a VPN tunnel. Okay, from how many locations? 1,400. Okay. And you want redundancy? Yes. So you want dual tunnels. Oh, but we also have dual devices at the location for redundancy. So you have dual devices, dual circuit and LTE, and you want tunnels from all of them going into Amazon US East 1. All right. You also then realize that Amazon is not going to let you build that many tunnels on a single VPN gateway. So now you need to start creating lots of VMs. Yes. But we also need a firewall. So you need a firewall for all of them. Okay. But now you also need a load balancer because now you need ALB to handle this. It's like the whole thing starts to take on a life of its own. And at the heart of it is that we have to build a tunnel, a strict point-to-point tunnel that goes from one remote location to another. And because the protocols aren't smart enough to understand business logic, they are very much point-to-point defined. So you have no other choice. You deploy a whole bunch of VMs, a whole bunch of firewalls, connect ALB to all of those, connect to a transit gateway, set up, like, a ridiculous number of IPsec tunnels, and that's your topology. And it is fixed. If two weeks later you ask me, well, we also want US West, I have to redo all of this for US West. And it, again, isn't just a copy-paste. So this is where you start to see this problem just magnify. And this company would say, okay, this is very expensive. And my response would be, well, yes, because the design that you want is trying to be both traditional as well as Cloud-first.
And you can't be both simultaneously, or you're just going to increase the complexity and cost.
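To put rough numbers on the tunnel sprawl Ali describes, here is a minimal Python sketch of the multiplication involved. The 1,400 locations and the dual devices come from the conversation. Treating "dual circuit and LTE" as three transports per device is an assumed reading, the per-gateway tunnel limit is an illustrative placeholder rather than a real cloud quota, and the names `SiteDesign`, `tunnel_count`, and `gateways_needed` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteDesign:
    """Hypothetical per-site redundancy choices for a point-to-point VPN design."""
    devices: int = 2     # dual devices at each location (from the conversation)
    transports: int = 3  # assumed reading of "dual circuit and LTE": two circuits plus LTE


def tunnel_count(sites: int, design: SiteDesign, regions: int) -> int:
    """Tunnels needed when every device/transport pair at every site
    builds its own tunnel into every cloud region."""
    return sites * design.devices * design.transports * regions


def gateways_needed(tunnels: int, tunnels_per_gateway: int) -> int:
    """Gateway VMs required if each one terminates a fixed number of tunnels
    (ceiling division)."""
    return -(-tunnels // tunnels_per_gateway)


if __name__ == "__main__":
    design = SiteDesign()
    one_region = tunnel_count(1400, design, regions=1)
    two_regions = tunnel_count(1400, design, regions=2)
    print(one_region)   # 8400
    print(two_regions)  # 16800
    # tunnels_per_gateway=100 is an illustrative placeholder, not a real cloud limit.
    print(gateways_needed(two_regions, tunnels_per_gateway=100))  # 168
```

Under these assumptions, adding a second region doubles the tunnel count, which is why asking for US West two weeks later is not a copy-paste.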
Dan: So if that same company came to you today, came to Graphiant today, how's it different?
Ali: I would say you don't need to build all these thousands of tunnels, because the protocol doesn't require you to. It essentially treats the entire backbone, the middle mile of connectivity, and lets you connect into anything. So if you have a remote location, it can connect into US East 1, US West, and your data center in Miami without having to build a tunnel per circuit, per remote location, times 2 for redundancy, times 2 for multiple regions, et cetera, et cetera. All of that goes away. The protocol will know that you want to connect to all of these things, and it essentially allows you to connect to them without building all these tunnels. So this is typically when we go into customers. My first question always is, how many tunnels do you have and do you hate managing them?
Dan: So you leave Cisco, you and Khalid leave Cisco, you join Graphiant, you found Graphiant, you start building. What did you build first and how did you decide where to start?
Ali: What did we build first and how did we know where to start? In a way, the way I described how these different layers work, we had to build all three of these simultaneously, the control plane, the data plane, and the encryption piece. And we had to do this simultaneously. The way we started to think about building this out was we would need to build three layers. One was going to be the edge itself. So the software that gets deployed at the branch, at the data center, in the Cloud, basically where the user is, where the applications are. So that would be the edge. And it would need to have a range of functionality and be able to do encryption and policy and all those kinds of things directly at the edge. We would need to build a core that would transform the entire internet backbone so that we could treat it in this programmatic fashion without having to build a multitude of tunnels in all directions. And then we would need to build a portal, a management system that would be able to give visibility and control to the end consumers. So we essentially have three layers that need to get built. And we have to build them in parallel because they're all tightly integrated because it's delivered as a service. So for the portal, we knew that we were going to build a cloud-native application, a distributed microservices application. We need a particular talent set for that and different techniques in terms of how to build that layer. The backbone would be another layer. The core has to be built a very specific way. That's a different skill set. People who have built large-scale networks, who have dealt with subsea cables, with peering points, with large-scale routing domains, with providers. And then the edge would be almost a platform or appliance-level code, which would mean, again, a different skill set. People who know how to build different sets of applications that will go directly at the edge.
And so we start with the architecture for all of these three working in unison. And we spend a good number of months working on the architecture.
Dan: I know you believe in rapid iteration. Can you share an example of how you employed that in the early days or today at Graphiant and have incorporated customer feedback and maybe how that's led to a breakthrough or a new way of doing things?
Ali: So there's a number of things, and this is where the advantage of being a startup is quite handy. We get feedback, we make changes. We've probably overhauled the user interface several times now over the multitude of years that we've been in business. So from a user interface, from routing protocols, from different capabilities, we have iterated very quickly and we continue to iterate very quickly. An example of that would be, we built different kinds of prototypes in terms of displaying certain kinds of information to customers regarding the routes in an environment. We received a lot of feedback from a lot of customers saying, this isn't as helpful. We understand the motivation of why you're building it, because now, because of Splunk and Datadog and Dynatrace and so many other companies, there's almost kind of an expectation of certain kinds of visibility tools. So we got the feedback saying, we understand what you're doing and why you're doing it, but it doesn't really work for this use case. We'd like these kinds of things. And we changed it. Within a couple of months, we had new prototypes, showed it to customers, they liked that more, rolled it out. And we continue to make those kinds of changes. But even deeper into the stack, we'll look at things as far down as this bit in the header. Do we need it? It'll change the overhead from 34 bytes to 38 bytes. Well, why do we need those two bytes of overhead? Let's see if we can get rid of it. Or does it have actual value? So there are layers to it. And for me, it always turns into nothing is going to be perfect. So trying to build a perfect product doesn't make sense. The only product that is perfect is a product that can continually evolve.
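A quick way to see why a few header bytes are worth arguing over is to compute header overhead as a share of each packet. The sketch below is illustrative only: the 34-byte and 38-byte header sizes are the figures mentioned in the conversation, while the payload sizes and the helper name `overhead_fraction` are assumptions chosen to contrast small, IoT-style messages with near-full-size packets.

```python
def overhead_fraction(payload_bytes: int, header_bytes: int) -> float:
    """Share of each packet on the wire that is header rather than payload."""
    return header_bytes / (payload_bytes + header_bytes)


if __name__ == "__main__":
    # Payload sizes are illustrative assumptions, not measurements.
    for payload in (64, 512, 1400):
        before = overhead_fraction(payload, 34)
        after = overhead_fraction(payload, 38)
        print(f"{payload:>5} B payload: {before:.1%} -> {after:.1%}")
```

The smaller the payload, the larger the share of every packet that a fixed header consumes, so the cost of extra header bytes falls hardest on chatty machine-to-machine traffic.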
Dan: Can you share a bit how Artium helped as you try to figure out what information to present to users in the right way?
Ali: My other involvement with Artium started when I started to look around to think about different ways to design experiences. And there are two ways that Artium was very helpful. One was, I myself don't come from a design background. So one was just getting me more informed around design systems and what are the best practices for design systems and how to evaluate what a good design system looks like, make sense of those kinds of things. So one of the things that Artium helped with the most was helping me directly in terms of understanding what the new world of design looks like. And that's changing pretty dramatically as well. The second part was around creating prototypes and concepts around new experience structures, new modes of interacting with our interface. And the way I approached Artium was, let's look at very specific use cases that the network is making possible, but where the end user isn't a network operator. How would I convey this kind of information to someone who doesn't know about the tunnels and the VPNs and the routing and the forwarding and the encryption? How do I do that? Even though a lot of my users will be very, very savvy, very experienced network operators, there'll be many who aren't, and I need to find something easy for them. So Artium created a number of prototypes for us, helping us make sense of what kinds of patterns work well, what patterns don't work well, and if you're gonna launch a new product that maybe has less to do with networking and more to do with applications and marketplaces and just enabling new business opportunities, you have to think of it entirely differently than a technical product. So that's where Artium has been a very great partner in helping me in my own iterative cycle, as well as Graphiant's journey to create new things.
Dan: What are some of the things that you, as someone not schooled in design, would tell other network engineers? What's one or two things that they should really know or think about when it comes to how to present this kind of information?
Ali: There are three things that I will say. One that comes from personal bias and two that I've learned over the experience. One is, for some reason, for a number of us, those who, I would say generationally, came into existence before phones and iPads and everything else was available, hover-over still seemed like a good idea to us. And I think the designers are pretty clear and mobile platforms are very clear: that's not a good idea anymore. That's one. The second thing that I realized was, especially when you're specialized in a technology area, what is easy and simple to you is entirely impossible to grasp for someone else. And if your product continues to have that level of complexity, it's going to be a challenge. Now, the tug of war that everyone should mentally think about, especially if you're building the product: your most savvy and informed and trained and experienced operators and engineers, they will want the most complex version of things because they want all of that control. And when you give it to them, they will be the unhappiest people as well because you gave them too much control and now it's unwieldy.
Dan: Yeah, you didn't make decisions for them enough.
Ali: Exactly. The balancing act is so tricky. And this is why bringing them into the design process is very helpful as long as you're cognizant of those dual factors. They both want everything, but if you give them everything, they won't be happy about it.
Dan: Yeah. That's a great insight.
Ali: The last thing I'll say that this experience has taught me is that I have still a lot to learn and the things that are happening in, you know, Figma and Adobe and some of the things at their latest conferences, it is going to be a very interesting time in the design world. I'm just like a newbie dipping my toe in the water here. When the network works, you don't think about it. It's only when it stops working that everyone gets up in arms. And it's true from the largest company down to home users. Nobody in my house complains when the wifi is working fine, but if that Netflix stream blips for a second, I get a text: hey, is something wrong with the internet? It's like, that is the root of the problem. And it's how networking is. You are expected to have a flawless network, and you're only called when something goes wrong.
Dan: Yeah, I had to reset my router about a half hour ago, just before this call, because I don't know where the problem is. It's some gremlin, maybe you want to come to my house and it'll be figured out. I don't know. I don't know if it's the service provider, the wifi network inside the house. I don't know. Once a week or so, I have to reset it.
Ali: But see, I would say, just looking at that example, you're either the network engineer, designer, or the operator, or the devops person. And so now I would say to you, are you the person who looks around the house and goes, well, I need to make sure I have the right wifi mesh everywhere and all my devices are set up the right way with my wifi set up perfectly right? Or are you the person who simply goes, man, I'm only gonna go log in when something is not working. Just show me what's wrong, where the red light is and give me a button that'll fix it.
Dan: Yep. We were among the many who upgraded to a mesh network during the pandemic and that seemed to be good. And I don't know, there's some issue. The problem is most acute actually, you'll appreciate this: I'm consulting right now to one of the largest banks in America. I'm tunneling in through their VPN and the wifi is generally fine in the house, unless I'm using that service, in which case it's completely insufficient. That's when I have to reset it because it just can't handle that strain of trying to get into the bank enterprise land.
Ali: And there you go. Even at a single individual level, the VPN problem of artificial tunnels, creating inefficient topologies, not accounting for the distributed nature of work is a problem for the future. And this is just for human beings. Forget autonomous cars and generative AI and everything else. That's a whole other world. Try telling an AI bot, yeah, you're throttled because your VPN tunnel is inefficiently set up.
Dan: Yeah, it'll turn you into a paperclip. Go a little further into the future. So there are self-driving cars, there's bots, like, you know, go 20 years, whatever feels like the right distance out where what you're building today is enabling. And tell me if this analogy stretches things too far, but like you talked about the early days of the Cloud and the Cloud has enabled people to launch new startups in all sorts of ways. They didn't have to, you know, create their own, you know, infrastructure in large part. It was just kind of turnkey. I use that in air quotes, but like explain how this creates something that is somewhat like that for future startups that haven't even been born yet.
Ali: That's a very good point to start with. It's the Cloud as an example. The Cloud is a perfect analogy for this. Infrastructure as a service enabled us to have access to compute and storage in a way that would have otherwise been impossible for small companies, for startups, just that landscape of compute, right? Buying a server, racking and stacking it, making sure it's cooled and everything else, we couldn't do it at our homes. That would be impossible. People's labs, people's garages, there's a limited capacity before it overheats and someone in the house gets very angry with you about why the garage is overheating. This turns into why the Cloud allowed as many startups to exist as the Cloud became available. Because we could consume infrastructure on demand. The network is not like that right now. The network isn't on demand. I can't dynamically increase my capacity on the fly. I can't change quality of service on the fly. I have no control over any of that. I buy a connection and the connection is the connection I get. Anything that I need to do requires more software, more engineering, and typically only the big companies can afford that kind of stuff. But if it became as a service, if it became consumption oriented, then suddenly I would have access to a programmatic network landscape that I could use to access private connections, secure data connections to anything and everything. I could launch new applications, new services. Imagine yourself having access to this kind of service and not having to worry about getting a new VPN access or getting approved for VPN access for bank one, bank two, bank three, right? Let's use the colors, right? Red, blue, and I guess yellow. And it's just like, you wouldn't have to deal with that. You are connected. And as long as the security policy is good, you are an authorized consultant for any of them, for all of them. You don't have to redesign your whole network.
If you wanted to build a new application, you could build a new application and it is privately connected into red bank, blue bank, yellow bank, all of them. It changes that model because it enables private connectivity to anyone. The future that I see, just like we're seeing infrastructure as code, where infrastructure entirely, even within the Cloud, is treated programmatically. The applications themselves can dynamically adjust how much compute, how much storage they want. They can expand, they can shrink down, they can auto-scale, they can do all those things. But the network is still fairly rigid. If the network was also infrastructure as code, then I could rethink how connectivity works entirely. This means a future where there's autonomous cars, there is edge computing, there is AI, data is being generated not just in the data centers or in the Cloud, but data is being generated anywhere and everywhere. The transport of this data can programmatically be determined by the application itself. It can say, for this data, take this path, this quality of experience, this QoS, make this behavior happen. That's the future I see, where just like we've made compute and storage accessible, network will be accessible. Then you're not going to worry about LTE, 5G, broadband. You won't. You'll have it all. Love it. Your phone and your computer will just decide what it wants to use, when it wants to use it.
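The idea of an application saying "for this data, take this path, this QoS" can be sketched as a small piece of Python. This is purely an illustration of the concept and not Graphiant's API or protocol: `Path`, `TransportIntent`, `select_path`, and all of the latency and bandwidth figures are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Path:
    """A candidate transport with made-up performance figures."""
    name: str
    latency_ms: float
    bandwidth_mbps: float
    private: bool


@dataclass(frozen=True)
class TransportIntent:
    """What an application asks of the network for one class of data."""
    max_latency_ms: float
    min_bandwidth_mbps: float
    require_private: bool = False


def select_path(intent: TransportIntent, paths: Sequence[Path]) -> Optional[Path]:
    """Return the lowest-latency path that satisfies the intent, or None."""
    eligible = [
        p for p in paths
        if p.latency_ms <= intent.max_latency_ms
        and p.bandwidth_mbps >= intent.min_bandwidth_mbps
        and (p.private or not intent.require_private)
    ]
    return min(eligible, key=lambda p: p.latency_ms, default=None)


# Hypothetical transports available to one device.
PATHS = [
    Path("lte", latency_ms=60, bandwidth_mbps=30, private=False),
    Path("5g", latency_ms=20, bandwidth_mbps=200, private=False),
    Path("broadband", latency_ms=15, bandwidth_mbps=500, private=False),
    Path("private-core", latency_ms=25, bandwidth_mbps=1000, private=True),
]

if __name__ == "__main__":
    telemetry = TransportIntent(max_latency_ms=100, min_bandwidth_mbps=1)
    payments = TransportIntent(max_latency_ms=50, min_bandwidth_mbps=10, require_private=True)
    print(select_path(telemetry, PATHS).name)  # broadband
    print(select_path(payments, PATHS).name)   # private-core
```

The point of the sketch is that the application declares what it needs and the selection happens in code, rather than someone filing a change request to re-pin a tunnel.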
Dan: I love it. Thank you so much for thinking about these things. Most of us don't have to. Thank you so much for your time today. This has been really fun.
Ali: Absolutely. Thanks, Dan.
Dan: That's Ali Shaikh, and this is Crafted from Artium. At Artium, we love partnering with visionary companies like Graphiant to help build incredible products, recruit high-performing teams, and achieve the culture of craft needed to build great software long after we're gone. You can learn more about us at thisisartium.com and start a conversation by emailing hello at thisisartium.com. If you liked today's episode, please subscribe and spread the word.
Ali: There's some very exciting things that I see happening there.