How Ryft Uses AWS’s Reprogrammable Chips for Big Data Analytics in Hybrid Cloud
New-generation FPGAs can be reconfigured on the fly, enabling accelerated search across on-premises data centers and cloud.
September 21, 2017
Field Programmable Gate Arrays, or FPGAs, have some big advantages. You get the data parallelism of GPUs but without the high power bill; you get ASIC-like compute efficiency but with more flexibility, because you can reprogram them. But programming them is complicated, and that’s always been a problem, restricting FPGA use cases to specialized appliances.
But that’s changing, as cloud giants integrate FPGAs into their data center infrastructure. Amazon Web Services’ latest F1 FPGA cloud instances, rolled out earlier this month, enable businesses to move those specialized-appliance workloads to the cloud. One such business is Ryft, which now offers its FPGA-powered search acceleration on AWS.
“We’re doing high-speed analytics for big data applications, and we use FPGA to get the performance, but we deliver that as easy-to-use abstractions,” Bill Dentinger, Ryft VP of products, told Data Center Knowledge. The company accelerates big data analytics by removing the need for companies to index the data they capture before they can search it.
It can accelerate Splunk or SAP HANA in on-premises data centers or cloud services like Tableau or Elasticsearch. “We’re able to both search data without having to go through that [indexing] pipeline and to expand the capabilities Elasticsearch can offer,” Dentinger said. “FPGA allows us to add different search capabilities like exact search, fuzzy Hamming search, or Levenshtein edit distance.” (Levenshtein edit distance counts the minimum number of single-character insertions, deletions, or substitutions needed to turn one string into another, which is useful in everything from spell checking to comparing DNA sequences.)
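To make the two fuzzy measures Dentinger mentions concrete, here is a minimal software sketch in Python. It is an illustration of the metrics only, not Ryft's FPGA implementation; the function names, including the `fuzzy_find` helper, are hypothetical.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution
            ))
        previous = current
    return previous[-1]


def fuzzy_find(text: str, term: str, max_edits: int) -> list[str]:
    """Hypothetical helper: return the words in text that are within
    max_edits Levenshtein edits of term, ignoring case."""
    target = term.lower()
    return [
        word for word in text.split()
        if levenshtein_distance(word.lower(), target) <= max_edits
    ]
```

With a budget of one edit, `fuzzy_find("hapy hour at the happy place", "happy", 1)` matches both the misspelled "hapy" and the exact "happy", which is the kind of tolerance an exact search cannot offer.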
Chips Reconfigured Mid-Workload
The energy efficiency and parallelism of FPGAs aren’t as important to Ryft as their programmability. The Xilinx FPGAs that AWS picked for its F1 instances allow developers to partially reprogram the hardware in the middle of a workload, which is especially important to the company.
“In the big data world of analytics, the marketplace is changing dramatically all the time,” Dentinger said. “An FPGA comes with a certain amount of logic elements, and you program them to whatever functions you want to do.”
Ryft changes what the FPGA does from moment to moment, a new capability for these programmable chips and one the company needs to implement all of its functionality. The new generation of FPGAs can be partially reconfigured on the fly, while the older ones could only be reconfigured once they had completed a workload. “Almost every Ethernet switch has an FPGA to change the network interface,” he explained. “When your switch or router boots, it loads an image into the FPGA, but it doesn’t change the image until you turn it off. We have these different primitives and search capabilities, and we need the ability to change the image of the FPGA not only because the market is doing something different, but because we’ve got so many things, they won’t all fit in the FPGA image.”
Reprogramming the entire FPGA to switch between search functions would take too long, but partial reconfiguration is much faster. “If we have the FPGA set up for exact search, we can reprogram it on the fly in milliseconds to do regular expressions. If you’re searching tweets for the words ‘happy hour’ in the latitude and longitude of say, Rockville in Maryland, you start with an exact search, but if it takes three seconds to load the new (FPGA) image to do a numeric search for [latitude] and [longitude], it’s already taking three seconds too long.”
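The two-stage query in that example can be sketched in plain Python. This is a software analogy under stated assumptions, not Ryft's API: the `Tweet` and `BoundingBox` types and the function names are hypothetical, and each stage stands in for a search primitive that would be a separate FPGA image on the real hardware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tweet:
    text: str
    latitude: float
    longitude: float


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical rectangular area expressed as coordinate ranges."""
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float

    def contains(self, latitude: float, longitude: float) -> bool:
        return (self.min_lat <= latitude <= self.max_lat
                and self.min_lon <= longitude <= self.max_lon)


def exact_search(tweets: list[Tweet], phrase: str) -> list[Tweet]:
    """Stage 1: keep tweets containing the phrase, ignoring case."""
    needle = phrase.lower()
    return [t for t in tweets if needle in t.text.lower()]


def numeric_search(tweets: list[Tweet], box: BoundingBox) -> list[Tweet]:
    """Stage 2: keep tweets whose coordinates fall inside the box."""
    return [t for t in tweets if box.contains(t.latitude, t.longitude)]


def search(tweets: list[Tweet], phrase: str, box: BoundingBox) -> list[Tweet]:
    # On an FPGA, moving from stage 1 to stage 2 is the point at which
    # the image would be swapped, so swap latency adds to query time.
    return numeric_search(exact_search(tweets, phrase), box)
```

A caller would supply the phrase and a bounding box for the area of interest. In software the hand-off between stages is free; on the hardware Dentinger describes, it is the step where reconfiguration time matters.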
Hybrid Model is a Must
Ryft is offering its service on AWS because customers have been asking for it to be available in the cloud, but it’s the hybrid option that may be the most interesting. By embedding the AWS Run Command feature on its appliances in your data center, Ryft lets you create hybrid applications with images running in AWS. You’re still going to be searching data in AWS from AWS and data on your own servers using a Ryft appliance, but you’ll be able to link those searches.
“The compute in AWS will have data relevant to what you’re going to do there, and the compute in your data center will have the data that’s relevant there, and you’ll pick and choose which data you want to move to the cloud to be close to that compute,” Dentinger said. “But maybe you have human resources information and information about what devices people use in your business, and you put that in the cloud, and you do queries against that. If you want to know what IP address somebody sent an email with, and that was captured locally, you probably don't want that information to be sent to the cloud. So, you might be searching for my IP address in the cloud and then searching in the on-premise system to see what IP address I've been talking to.”
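The linked search Dentinger describes can be sketched as two lookups that each run where their data lives. This is a minimal illustration, not Ryft's API: the record types, field names, and functions are all hypothetical, and in-memory lists stand in for the cloud and on-premises data stores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceRecord:
    """Hypothetical HR/device record held in the cloud."""
    employee: str
    ip_address: str


@dataclass(frozen=True)
class EmailLogEntry:
    """Hypothetical email log entry captured and kept on-premises."""
    source_ip: str
    destination_ip: str


def search_cloud(records: list[DeviceRecord], employee: str) -> set[str]:
    """Cloud-side query: which IP addresses belong to this employee?"""
    return {r.ip_address for r in records if r.employee == employee}


def search_on_premises(log: list[EmailLogEntry],
                       source_ips: set[str]) -> set[str]:
    """Local query: which addresses did those IPs talk to?
    The log itself is only read here and never leaves this function."""
    return {e.destination_ip for e in log if e.source_ip in source_ips}


def linked_search(cloud_records: list[DeviceRecord],
                  local_log: list[EmailLogEntry],
                  employee: str) -> set[str]:
    # Only the small cloud result (a set of IPs) crosses the boundary;
    # the sensitive email log is searched in place.
    employee_ips = search_cloud(cloud_records, employee)
    return search_on_premises(local_log, employee_ips)
```

The design point is the direction of data flow: the cloud query produces a small set of IP addresses that is passed to the local query, so the locally captured log never has to be copied to the cloud.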
Ryft has to support a hybrid model, because customers aren’t going to move all their workloads to the cloud at once. “We have to be very easy to use and have the ability to go from in-cloud to on-premises without [the] customer knowing or caring. You can run Ryft on AWS with the same APIs and connectors that you use in your data centers, and that allows us flexibility.”