Ever Wondered Why Netflix Still Uses AWS Kinesis Data Streams?

Yashraj Panda
Sep 21, 2020

Netflix is the world’s leading internet television network, with more than 100 million members worldwide enjoying 125 million hours of TV shows and movies each day, including original series, documentaries, and feature films. Members can watch as much as they want, anytime, anywhere, on nearly any Internet-connected screen.

It’s not uncommon for competitors to do business with each other when there’s mutual benefit.

To give an example from computing history, Microsoft helped make the Macintosh successful by creating one of the Mac’s killer apps, Microsoft Word. Microsoft and Apple later became somewhat bitter rivals when Windows started competing with the Mac, but Microsoft still continued to develop products for the Mac.

I am sure that the relationship between Netflix and Amazon is mutually beneficial. Netflix is one of the largest customers of AWS. If Netflix one day decided to just leave AWS, it would hurt Amazon’s cloud division a lot, so Amazon has every incentive to keep Netflix happy. In fact, I wouldn’t be surprised if Amazon made more net revenue from having Netflix as an AWS customer than it does from being in the streaming business. This is very hard to tell, because most people who sign up for Amazon Prime do so for the shipping.

Netflix also owns the most vital part of their infrastructure, which is the CDN. Infrastructure-wise, that’s their secret sauce. What Netflix has done by agreeing to use Amazon’s AWS is essentially to off-load the parts of their business that aren’t that special, so they can focus on the parts of the business that will give them the most strategic advantage.

I don’t think it’s really a question of having the right talent. You can always buy talent if you have enough money. It’s a question of priorities.

I’m not an insider, but from what I can tell, Netflix’s main priority right now is content. Without the right content, it doesn’t matter if they have the best infrastructure in the world. No one will use their service without content. That actually makes Hollywood Netflix’s main competitor, because Hollywood owns most of the content.

Hollywood doesn’t have a good distribution mechanism. That’s where services like Netflix and Amazon Prime come into play. But the relationship between Netflix and Hollywood is still fairly one-sided, because Netflix doesn’t have a lot of leverage over Hollywood. That’s why Netflix is aggressively building up its own content library, so it will depend less on Hollywood. This is the same recipe that made HBO successful. All the licensing agreements and content production costs require tons and tons of money and resources, which Netflix would have less of if they had to spend more on infrastructure.

And again, Netflix still owns the part of its infrastructure that’s most important, which is the CDN. Furthermore, by diverting some of its infrastructure spending to a cloud provider like AWS, Netflix helps keep the cloud nice and healthy, with plenty of cash reserves to continue improving the technology.

The one thing that Hollywood would love to see is for all these Internet clouds to evaporate, so that the world would be forced to return to the good old days of buying content on discs rather than streaming.

When you view Hollywood rather than Amazon as the main rival of Netflix, their choice makes a lot of sense.

Something else to keep in mind is that Amazon has a cloud service largely for historical reasons. Amazon was built in the 1990s, when there was no such thing as a cloud.

It made sense for Amazon to spend a ton of resources buying lots and lots of servers and data centres, so that they would dominate online shopping. But the main reason Amazon needed all that server capacity was for one time of the year, Christmas shopping. If people went to Amazon’s web site and it timed out because it was overloaded during these shopping spikes, people would get a bad impression of the company. So Amazon needed to have all of this server capacity, but most of the year, that capacity would be completely idle and wasted resources.

In other words, Amazon would have been the ideal customer for cloud computing. But since it didn’t exist at the time, they had to invent it. There is no reason for every other company to re-invent the wheel, especially a company like Netflix that is trying to focus on just one business: streaming.

Case Study: Netflix and Amazon Kinesis Data Streams

Application Monitoring on a Massive Scale

Netflix uses Amazon Web Services (AWS) for nearly all its computing and storage needs, including databases, analytics, recommendation engines, video transcoding, and more — hundreds of functions that in total use more than 100,000 server instances on AWS.

This results in an extremely complex and dynamic networking environment where applications are constantly communicating inside AWS and across the Internet. Monitoring and optimizing its network is critical for Netflix to continue improving customer experience, increasing efficiency, and reducing costs. In particular, Netflix needed a solution for ingesting, augmenting, and analyzing the multiple terabytes of data its network generates daily in the form of virtual private cloud (VPC) flow logs. This would enable Netflix to identify performance-improvement opportunities, such as identifying apps that are communicating across regions and collocating them. The company would also be able to increase uptime by quickly detecting and mitigating application downtime.

Each log record carries information about the communications between two IP addresses. However, in a dynamic environment like the one at Netflix, where an IP address can float between applications from day to day or even minute to minute, IP addresses alone don’t have much meaning. “The data sources we had before we took on this initiative were one sided,” says John Bennett, senior software engineer at Netflix. “We’d know an application was connecting to others, but we didn’t know both sides of the conversation and how to optimize those communications or the placement of the applications on the network.”

Netflix set out to establish a new data source that could give it more insight into communication among applications and regions by combining VPC flow logs with application metadata.
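To make the idea concrete, here is a minimal sketch of what "combining VPC flow logs with application metadata" could look like. The field order follows the default (version 2) VPC flow log format; the `IP_TO_APP` table, the application names, and the `EnrichedFlow` type are hypothetical illustrations, not anything taken from Netflix's system.

```python
from dataclasses import dataclass

# Field order of the default (version 2) VPC flow log record format.
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()


@dataclass(frozen=True)
class EnrichedFlow:
    """A flow record with both sides of the conversation named (hypothetical type)."""
    src_app: str
    dst_app: str
    byte_count: int


def parse_flow_log(line: str) -> dict:
    """Split one space-separated flow log line into a field -> value dict."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def enrich(record: dict, ip_to_app: dict) -> EnrichedFlow:
    """Replace raw IP addresses with application names where they are known."""
    return EnrichedFlow(
        src_app=ip_to_app.get(record["srcaddr"], "unknown"),
        dst_app=ip_to_app.get(record["dstaddr"], "unknown"),
        byte_count=int(record["bytes"]),
    )


# Hypothetical metadata: which application currently owns which IP address.
IP_TO_APP = {
    "172.31.16.139": "api-gateway",
    "172.31.16.21": "playback-service",
}

line = (
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK"
)
flow = enrich(parse_flow_log(line), IP_TO_APP)
print(flow)
```

With both `src_app` and `dst_app` filled in, a record describes a conversation between two applications rather than between two anonymous addresses.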

Centralizing Flow Logs Using Amazon Kinesis Data Streams

From the outset, AWS enabled Netflix to experiment with different approaches to analyzing its network data. “Early in the design process, the flexibility to try different ways of processing the data was important,” says Bennett. “We experimented with multiple designs and used many AWS products to get here.”

The solution Netflix ultimately deployed — known internally as Dredge — centralizes flow logs using Amazon Kinesis Data Streams. The application reads the data from Amazon Kinesis Data Streams in real time and enriches IP addresses with application metadata to provide a full picture of the networking environment. “Usually, we would put the data into a database, which would build an index to enable faster querying,” says Bennett. “Dredge joins the flow logs with application metadata as it streams and indexes it without using a database, which eliminates a lot of the complexity.”
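The case study does not describe how Dredge is implemented, so the following is only a sketch of the general technique: joining a stream against an in-memory index instead of a database. Because an IP address can move between applications over time, the index keeps a time-sorted ownership history per IP. The class name `IpOwnershipIndex`, the record keys, and the sample data are all assumptions made for illustration.

```python
import bisect
from collections import defaultdict
from typing import Iterable, Iterator


class IpOwnershipIndex:
    """In-memory, time-aware lookup: which app owned an IP at a given moment."""

    def __init__(self) -> None:
        self._starts = defaultdict(list)  # ip -> sorted assignment start times
        self._apps = defaultdict(list)    # ip -> app names, parallel to _starts

    def assign(self, ip: str, start_time: int, app: str) -> None:
        """Record that `app` took over `ip` at `start_time`."""
        pos = bisect.bisect_right(self._starts[ip], start_time)
        self._starts[ip].insert(pos, start_time)
        self._apps[ip].insert(pos, app)

    def owner_at(self, ip: str, timestamp: int) -> str:
        """Return the most recent owner of `ip` at or before `timestamp`."""
        starts = self._starts.get(ip)
        if not starts:
            return "unknown"
        pos = bisect.bisect_right(starts, timestamp) - 1
        return self._apps[ip][pos] if pos >= 0 else "unknown"


def enrich_stream(flows: Iterable[dict], index: IpOwnershipIndex) -> Iterator[dict]:
    """Join each flow with metadata as it streams past; nothing is stored."""
    for flow in flows:
        yield {
            **flow,
            "src_app": index.owner_at(flow["srcaddr"], flow["start"]),
            "dst_app": index.owner_at(flow["dstaddr"], flow["start"]),
        }


index = IpOwnershipIndex()
index.assign("10.0.0.5", 100, "app-a")
index.assign("10.0.0.5", 200, "app-b")  # the same IP floats to another app
index.assign("10.0.0.9", 0, "app-c")

flows = [
    {"srcaddr": "10.0.0.5", "dstaddr": "10.0.0.9", "start": 150},
    {"srcaddr": "10.0.0.5", "dstaddr": "10.0.0.9", "start": 250},
]
enriched = list(enrich_stream(flows, index))
```

The generator yields one enriched record per input record, so the join adds no storage step between ingestion and analytics.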

The enriched data lands in an open-source analytics application called Druid. Netflix uses the OLAP querying functionality of Druid to quickly slice data into regions, availability zones, and time windows to visualize it and gain insight into how the network is behaving and performing.
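The kind of slicing described here can be illustrated without Druid itself. The sketch below is a plain-Python stand-in for an OLAP-style rollup, not Druid's query API: it groups enriched events by a time bucket plus any chosen dimensions and sums the bytes. The event keys (`timestamp`, `src_region`, `dst_region`, `bytes`) and the sample values are hypothetical.

```python
from collections import Counter


def rollup(events, dimensions, bucket_seconds=60):
    """Sum bytes per (time bucket, *dimensions), like a simple OLAP group-by."""
    totals = Counter()
    for event in events:
        # Floor the timestamp to the start of its time window.
        bucket = event["timestamp"] - event["timestamp"] % bucket_seconds
        key = (bucket,) + tuple(event[d] for d in dimensions)
        totals[key] += event["bytes"]
    return dict(totals)


# Hypothetical enriched events.
events = [
    {"timestamp": 0, "src_region": "us-east-1", "dst_region": "us-east-1", "bytes": 100},
    {"timestamp": 30, "src_region": "us-east-1", "dst_region": "eu-west-1", "bytes": 200},
    {"timestamp": 45, "src_region": "us-east-1", "dst_region": "eu-west-1", "bytes": 50},
    {"timestamp": 70, "src_region": "us-east-1", "dst_region": "eu-west-1", "bytes": 25},
]

by_region_pair = rollup(events, ["src_region", "dst_region"])

# Total cross-region traffic: candidates for collocating the applications.
cross_region_bytes = sum(
    total for (_, src, dst), total in by_region_pair.items() if src != dst
)
```

Changing the `dimensions` list or `bucket_seconds` gives a different slice of the same data, which is the essence of the region, availability zone, and time-window views mentioned above.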

Improving Customer Experience with Real-Time Network Monitoring

Netflix’s Amazon Kinesis Data Streams-based solution has proven to be highly scalable, each day processing billions of traffic flows. Typically, about 1,000 Amazon Kinesis shards work in parallel to process the data stream. “Amazon Kinesis Data Streams processes multiple terabytes of log data each day, yet events show up in our analytics in seconds,” says Bennett. “We can discover and respond to issues in real time, ensuring high availability and a great customer experience.”
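A quick back-of-the-envelope calculation shows how shard count relates to daily volume. It assumes the documented per-shard write limits for provisioned Kinesis streams (about 1 MB per second and 1,000 records per second). The daily volumes plugged in are made-up round numbers, since the case study only says "multiple terabytes" and "billions" of flows.

```python
import math

# Per-shard write limits for a provisioned Kinesis data stream.
SHARD_WRITE_BYTES_PER_SEC = 1_000_000
SHARD_WRITE_RECORDS_PER_SEC = 1_000
SECONDS_PER_DAY = 86_400


def shards_needed(bytes_per_sec: float, records_per_sec: float) -> int:
    """Minimum shards so that neither the byte nor the record limit is exceeded."""
    by_bytes = math.ceil(bytes_per_sec / SHARD_WRITE_BYTES_PER_SEC)
    by_records = math.ceil(records_per_sec / SHARD_WRITE_RECORDS_PER_SEC)
    return max(by_bytes, by_records)


# Hypothetical daily totals, averaged evenly over the day.
daily_bytes = 5e12      # 5 TB per day (assumption)
daily_records = 10e9    # 10 billion flow records per day (assumption)

minimum = shards_needed(daily_bytes / SECONDS_PER_DAY, daily_records / SECONDS_PER_DAY)

# Upper bound on what 1,000 shards could accept in one day.
ceiling_records_per_day = 1_000 * SHARD_WRITE_RECORDS_PER_SEC * SECONDS_PER_DAY
```

Under these assumed numbers the record rate, not the byte rate, is the binding limit. Real traffic is not evenly spread over the day, so a production stream needs headroom for peaks and uneven partition keys, which is one plausible reason for running far more shards than the average load alone would require.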

Netflix is now able to identify new ways to optimize its applications, whether that means moving an application from one region to another or changing to a more appropriate network protocol for a specific type of traffic. “Our solution built on Amazon Kinesis enables us to identify ways to increase efficiency, reduce costs, and improve resiliency for the best customer experience,” says Bennett.

Although a streaming data solution is not new to the IT industry, it is an innovation in the networking space. “Netflix is heavily invested in AWS in part because it abstracts the underlying network, so we don’t have to deal with switches and routers,” says Bennett. “We’re monitoring, analyzing, and optimizing at a higher level of the stack — in ways we would never even consider if we were running our own data centers.”

Benefits of AWS

· Processes and enriches multiple terabytes each day, representing billions of events, with sub-second response times for analytics queries

· Highly cost efficient compared to competing solutions

· Freedom to experiment with system architecture to arrive at the most effective solution

· Data ingestion initiated with just a few simple API calls

· Highly elastic solution with close to 1,000 Amazon Kinesis shards working in parallel
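As an illustration of the "few simple API calls" point, here is a rough sketch of writing events to a stream with the boto3 `put_records` call. It is not how Netflix ingests its flow logs (the case study does not say); the stream name, the event shape, and the choice of `interface_id` as partition key are assumptions. The `send` function needs boto3 installed and AWS credentials configured, so only the batching helpers are exercised here.

```python
import json
from typing import Iterator, List

MAX_RECORDS_PER_CALL = 500  # PutRecords accepts at most 500 records per request


def to_kinesis_entries(events: List[dict]) -> List[dict]:
    """Convert events into the entry shape that PutRecords expects."""
    return [
        {
            "Data": json.dumps(event).encode("utf-8"),
            # The partition key decides which shard receives the record.
            "PartitionKey": event["interface_id"],
        }
        for event in events
    ]


def batches(entries: List[dict], size: int = MAX_RECORDS_PER_CALL) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` entries."""
    for i in range(0, len(entries), size):
        yield entries[i:i + size]


def send(events: List[dict], stream_name: str) -> int:
    """Write all events to the stream; return how many records were rejected."""
    import boto3  # third-party; imported lazily so the helpers work without it

    client = boto3.client("kinesis")
    failed = 0
    for batch in batches(to_kinesis_entries(events)):
        response = client.put_records(StreamName=stream_name, Records=batch)
        failed += response["FailedRecordCount"]
    return failed


# Hypothetical events; calling send(events, "flow-logs") would need a real stream.
events = [{"interface_id": f"eni-{i % 3}", "bytes": i} for i in range(1201)]
batch_sizes = [len(b) for b in batches(to_kinesis_entries(events))]
```

A production writer would also retry the records that `FailedRecordCount` reports as rejected, since `put_records` can partially succeed.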

Yashraj Panda

A B.Tech undergraduate, enthusiastic about learning new technologies and integrating them with one another.