Doing Data Science in the Cloud with Domino


As a practicing data scientist and big data journalist, I often find myself down in the trenches in pursuit of new trends, products, and services. Earlier this week I attended a local machine learning meetup and came away with a real gem. The presenter mentioned in passing a new cloud service called “Domino,” and I rushed back to my office to learn more. I wasn’t disappointed.

Domino Data Lab is a new cloud service that allows you to run R, Python, and MATLAB code in the cloud. With the tagline “Data Analysis, Accelerated,” the company also offers automatic version control and collaboration features for data, code, and results.

Domino lets you move your data analysis to the cloud without making any changes to your code or configuring any servers. This makes it easy to get long-running, memory-intensive jobs off your computer, and it puts nearly unlimited computational resources at your fingertips. By doing your runs on Domino you also get version control for your data, code, and results, so you can easily roll back to past results and share your work with others. Domino runs your code on high-end, fully managed hardware. The company handles configuration, machine management, data transfer, and security.

“We want to empower data scientists, to unleash higher quality analysis and a faster pace of work. Today, we think too many data scientists are held back by the quality and ease-of-use of the tools available (e.g., there are lots of people who can do quality analytical work but aren’t technical enough to set up their own infrastructure),” said Domino founder Nick Elprin. Before starting Domino, Nick spent seven years designing and building analytics software at a large hedge fund.

Installation is quick and simple, requiring only client-side software that’s compatible with Mac OS X, Windows, and Linux. Pricing includes an introductory free account, as well as monthly accounts ranging from $9 to $499 depending on your needs.

Use Case Example

Alex Bond is a post-doctoral research fellow based in Canada. For his latest personal project in avian ecology, Alex is examining millions of bird records for spatial and temporal trends. With only his personal computer at his disposal, Alex uses Domino to run R scripts essential to his research. Once running, Domino adds horsepower to Alex’s research. He can run scripts from anywhere, without being tethered to university computing resources and without limiting himself to questions answerable with 8 GB of RAM.

“Once you get the feel for it,” he says, “Domino makes it easy to upload code and get results. My project would have been impossible without Domino. A run that crashed my local machine took only four hours with Domino.”

Private Clouds

Domino Enterprise can be installed inside your private network, on top of your own compute resources. This allows you to keep all data inside your network and to leverage your existing hardware, while still taking advantage of Domino’s functionality for job distribution, version control, and collaboration.

