How FlashBlade Helps AI Expert Crater Labs Make Impossible Possible
To tackle "moonshots" (seemingly insurmountable problems) for its customers, Crater Labs needed storage that could keep up with demand. The answer: Pure Storage's FlashBlade.
Crater Labs has spent much of its short life in constant pursuit of better storage. The company, which uses advanced artificial intelligence and machine learning to create valuable intellectual property for its customers, often deals with huge amounts of data on multiple fronts simultaneously.
"Our goal is to help customers understand their potential and get them to the next stage so they can maintain their competitive advantage," said Khalid Eidoo, co-founder and chief technology officer at Crater Labs. "We call these moonshots—projects that seem almost insurmountable. By tackling these projects for our customers, they can remain focused on more immediate deliverables."
The projects Crater Labs undertakes for its clients run the gamut. For example, one project might deal with multi-gigabyte video files, while another, running at the same time, might require working with very small files of a different type. This called for a storage system that could handle any type of file and allow researchers to perform any type of operation, without performance bottlenecks.
When Eidoo and Alexei Gavriline, Crater Labs' president, founded the company in 2018, they originally chose a storage infrastructure that consisted of several large storage arrays directly attached through multiple controllers. Within a few months, it became obvious that they had gone down the wrong path. Researchers experienced both controller and disk failures, especially when business ramped up more quickly than expected.
So the team switched gears, moving to a network-attached storage (NAS) solution based on traditional hard drives. That, too, couldn't keep up with growing demand.
"When we started out, we only had a few terabytes of data across our client base, but then a few clients came on board that had dozens of terabytes of data," Eidoo said. "The NAS solution just couldn't scale when we tried to handle multiple large projects. We found ourselves constantly having to tune our file systems to get reasonable performance."
At one point, Eidoo said, many of the company's highly paid researchers spent days just waiting for storage availability, even though the team had plenty of GPU and CPU availability.
It was back to the drawing board for Crater Labs. This time, the company developed a complex matrix of the capabilities it would need. The evaluation matrix didn't specify any particular type of storage; Crater Labs was open to any type of storage that fit the criteria.
On the performance front, the company was looking for a storage infrastructure that wouldn't require file system-level tuning. It also wanted something that would allow for multiple concurrent reads and writes without taking any type of performance hit.
Because of Crater Labs' fast growth trajectory and the amount of data it often deals with, "we needed the ability to scale and not think about it," Eidoo said. "We needed to make sure that our storage solution would always be able to stay ahead of that curve."
Other important considerations revolved around ease of use, the ability to work in hybrid cloud environments, low administration overhead and solid support.
After evaluating products from five vendors, Crater Labs settled on FlashBlade, a flash-based file and object storage platform from Pure Storage. It hit all of Crater Labs' hot buttons, from zero-touch scalability to linear performance.
Supercomputer-Level Processing
The new storage infrastructure has allowed Crater Labs to be much more efficient, take on more work and even save its clients some serious money. For example, the company can now run several large projects in parallel, something that was difficult or impossible to do before.
"We have some really big projects involving things like large-scale video analysis and bias detection, and we couldn't run those projects simultaneously because of storage bottlenecks," Eidoo said. "Now we have dozens of projects of that magnitude running in parallel, and we see no performance drop-off to our GPUs or CPUs as a result of bottlenecks from the storage."
It's also simpler to use. "Creating a snapshot on that file system is literally like three clicks, and you know you have reliable snapshots being produced on a regular basis. So you don't have to worry about data integrity at all," he said.
Plus, the new setup makes it easier to push data residing on FlashBlade that won't be used for a while to AWS Glacier, freeing up more space on FlashBlade. "That's something we can do now with two lines of code," Eidoo said. "Before that, it was a whole IT DevOps ordeal."
The new storage infrastructure also has enabled the company to build solutions that were previously too expensive, both for Crater Labs and its customers. For example, when Crater Labs takes on an AI project that involves hundreds of terabytes of data residing in the cloud, it is cost-prohibitive to download everything from the customer's S3 bucket or work directly with it in the cloud. To solve this problem, Crater Labs uses FlashBlade in an object pool setup where it acts as a nearline cache.
"Because we can communicate with FlashBlade over an S3 protocol as well as S3 in its native protocol, we can transparently cache information from S3 into our FlashBlade, perform all of that data preprocessing and training on those segments of data, and then pass those back up to the client's S3 bucket when finished," Eidoo explained.
With this method, customers aren't incurring tens of thousands of dollars per month in data transfer costs.
"The real beauty for us is that our researchers don't have to change their code to be able to point to different locations of files," Eidoo said. "As far as they are concerned, there are just two URLs. One is to the fast storage, which is FlashBlade, and the other is to S3. From our clients' perspective, that's great because when we deploy a model and provide them with a pipeline, they can either work completely in the cloud or implement their own FlashBlade-based solution as a part of it. Then you can play with how much data you keep in the cloud versus on-premises and look at optimizing cost, completely transparent from your research or development team."
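The caching pattern Eidoo describes can be sketched roughly as follows. This is a minimal illustration, not Crater Labs' actual code: the `ObjectStore` interface, the `InMemoryStore` stand-in, and the `ReadThroughCache` class are all hypothetical names introduced here, and the in-memory stores merely play the roles of the two S3-compatible endpoints (the fast FlashBlade tier and the client's cloud bucket).

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol


class ObjectStore(Protocol):
    """Hypothetical minimal interface shared by both storage tiers."""

    def has(self, key: str) -> bool: ...
    def get(self, key: str) -> bytes: ...
    def put(self, key: str, data: bytes) -> None: ...


@dataclass
class InMemoryStore:
    """Stand-in for an S3-compatible endpoint (illustration only)."""

    name: str
    objects: Dict[str, bytes] = field(default_factory=dict)
    reads: int = 0  # counts get() calls, so remote traffic can be observed

    def has(self, key: str) -> bool:
        return key in self.objects

    def get(self, key: str) -> bytes:
        self.reads += 1
        return self.objects[key]

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data


class ReadThroughCache:
    """Serve objects from the fast tier, pulling from the remote tier on a miss."""

    def __init__(self, fast: ObjectStore, remote: ObjectStore) -> None:
        self.fast = fast
        self.remote = remote

    def fetch(self, key: str) -> bytes:
        # On a miss, copy the object from the remote bucket into the fast tier once.
        if not self.fast.has(key):
            self.fast.put(key, self.remote.get(key))
        return self.fast.get(key)

    def publish(self, key: str, data: bytes) -> None:
        # Write results to the fast tier and push them back to the remote bucket.
        self.fast.put(key, data)
        self.remote.put(key, data)


if __name__ == "__main__":
    cloud = InMemoryStore("cloud", {"frames/0001.bin": b"raw"})
    local = InMemoryStore("fast")
    cache = ReadThroughCache(local, cloud)
    cache.fetch("frames/0001.bin")  # first call reads from the cloud tier
    cache.fetch("frames/0001.bin")  # second call is served from the fast tier
    print("remote reads:", cloud.reads)
```

In a real deployment, each store would presumably be an S3 client pointed at a different endpoint URL (for instance, via the `endpoint_url` parameter of a boto3 client), which is what would let research code stay the same whichever tier it reads from.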
Now that FlashBlade has proved itself, Eidoo hopes to expand the business.
"There are a few areas of projects that we have always wanted to do that we couldn't even fathom doing before," he said. "We're hoping to get into genomic computing, helping build really massive scheduling simulations across supply chains. Normally, these were the kinds of distributed computing applications that required supercomputers, but I actually see a pathway now given how FlashBlade works within our environment."