Interview: Stephen Goldberg, CEO of HarperDB

I recently caught up with Stephen Goldberg, CEO of HarperDB, to discuss how his company was founded to deliver a simple solution that any developer, at any skill level, could use without sacrificing scale or performance. He also comments on how the HarperDB database solution is being used for IoT project development, app development, and enterprise data warehouses. HarperDB was founded in 2017 and is headquartered in Denver; its founding team has spent many years working in enterprise architecture, software integration, software development, and software sales. Stephen has previously founded two startups, and most recently was CTO at Phizzle, Inc., managing product, engineering, product marketing, and support. He has worked at companies large and small, including Red Hat, Inc., where he led Infrastructure for the Global Support Services division. Stephen has four pending patents and has spoken at both Salesforce.com’s Dreamforce and SAP’s Sapphire.

insideBIGDATA: Tell me a little about HarperDB. When was the company created and how does it address current challenges, such as footprint, cost, and actionable insights, in the database industry?

Stephen Goldberg: When my co-founders and I realized the database industry was so fragmented with overly specialized solutions, we knew there had to be a different, better way. We were frustrated that to achieve high-scale transactional writes alongside real-time advanced analytical capabilities, like entity extraction and sentiment analysis, we had to use several complex systems and an enormous amount of compute. We were in Palo Alto for a project when we began to brainstorm the concept of a single solution that could handle the transactional volumes of a NoSQL data store but also have the reporting capabilities of a traditional RDBMS. We spent that night developing the idea: a centralized datastore, something simple and modern, with a footprint small enough that even a junior developer could spin it up in minutes. HarperDB, an enterprise-class database written in Node.js and capable of running on the edge, was founded one year later in 2017.

As I mentioned, IoT lacks a database solution that is simple, secure and scalable, and covers the entire data value chain from ingestion on the edge to actionability in the cloud. Organizations know their data is valuable, but even after building out complex and costly infrastructure to manage that data, they’re finding it’s still not actionable. HarperDB’s database utilizes an HTAP (Hybrid Transactional/Analytical Processing) model, powered by a patent pending data storage algorithm that ingests both unstructured and structured data into a fully indexed, single model data store. Both NoSQL and SQL capabilities are provided natively in real-time, and there is no increase in the storage footprint. This means that developers can focus on development while interacting with their data in real-time, without having to worry about overly complex data models, configuration, or database management.
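
To make the idea of a fully indexed, single-model store more concrete, here is a minimal Python sketch. It is not HarperDB's patent-pending algorithm, which the interview does not describe; the class `ToyStore`, its methods, and the sensor records are hypothetical names invented purely for illustration. The sketch only shows the general pattern: records are written without a declared schema, every attribute is indexed at write time, and the same data can then be read by key (document-store style) or by an attribute filter (roughly what a SQL `WHERE` clause does).

```python
from collections import defaultdict


class ToyStore:
    """Illustrative toy, not HarperDB: indexes every attribute at write time."""

    def __init__(self):
        self.records = {}  # record id -> full record
        # attribute name -> attribute value -> set of record ids
        self.index = defaultdict(lambda: defaultdict(set))

    def insert(self, record_id, record):
        """Schemaless write. Assumes each id is inserted once and values are hashable."""
        self.records[record_id] = record
        for attribute, value in record.items():
            self.index[attribute][value].add(record_id)

    def get(self, record_id):
        """Key lookup, as in a document store."""
        return self.records[record_id]

    def where(self, attribute, value):
        """Equality filter answered from the index, without scanning all records."""
        ids = self.index.get(attribute, {}).get(value, set())
        return [self.records[i] for i in sorted(ids)]


store = ToyStore()
store.insert(1, {"sensor": "pump-a", "status": "ok", "temp_c": 41})
store.insert(2, {"sensor": "pump-b", "status": "alert", "temp_c": 87})
store.insert(3, {"sensor": "pump-c", "status": "alert"})  # attributes may differ per record

alerts = store.where("status", "alert")
print([r["sensor"] for r in alerts])
```

A real engine would also have to handle updates, deletes, range queries, and durable storage, none of which this toy attempts.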

insideBIGDATA: You recently released a new edition of your solution called HarperDB Studio. What prompted the creation of this tool, and how has it changed the way businesses interact with and analyze their data?

Stephen Goldberg: We’re really excited about the release of HarperDB Studio for several reasons. Gaining actionable insights into real time data is becoming increasingly complicated for developers and end users alike, and so it’s more important now for the entire business to have access to data reporting and visualization. To meet the demands of today’s workforce, we released HarperDB Studio as an open source tool that provides an easy-to-use graphical interface for both data scientists and administrative users to access data insights for faster, better decision-making. HarperDB Studio reduces cost and complexity and makes real time analytics achievable and accessible.

DBAs now have access to advanced reporting and analytics capabilities, including sharing across the entire company. For example, users can provide real-time graphs or link to a chart that shows live streaming data of what’s occurring on the edge, without any coding. While HarperDB itself is designed for developers, DBAs need a way to interact with the platform that doesn’t require code, and this tool fulfills that for management, administration, and analytical needs. It’s the final step in shortening the data value chain and makes it possible to achieve real-time insights into big data at scale.

insideBIGDATA: What are some of the key features that set HarperDB Studio apart from business intelligence dashboards such as Qlik or Tableau? How are these features used by businesses looking to acquire more control of their database?

Stephen Goldberg: While not made to replace BI tools such as Qlik or Tableau, HarperDB Studio is more of an ad-hoc tool for real-time processing and visibility directly on the edge, usable by any member of the business organization. This allows DBAs to regain control of their database, make decisions at a moment’s notice, and communicate those actions effectively across the business through our graphical user interface. When designing HarperDB Studio, we targeted key features that include:

  • Search & Graphing Capability: Quickly turn SQL or filter searches into live graphs and charts which can be shared with business end users through web links.
  • Schema Management: Visualize the dynamic HarperDB schema and manage schemas, tables, and attributes. Developers, DBAs, and data scientists can make sense of unstructured data in real time, gaining clear insight into their data value chain.
  • Security Management: Manage users, roles, and access. Control field level security.
  • Advanced Log Management: Access HarperDB logs for easy management and searching without code.

insideBIGDATA: You mentioned that HarperDB Studio is an open source project. Can you talk about why this was so important and the benefits an open source solution provides?

Stephen Goldberg: We are big believers in the power of Open Source; I am a former Red Hatter. That said, we felt the need to ensure the viability of HarperDB as a company and to continue building great products for the community long into the future. It was the right decision to make the core product proprietary. However, much of the promise of HarperDB and of simplifying the data value chain will be achieved within HarperDB Studio. We feel that the Studio will ultimately set the direction for HarperDB as a company. As a result, it was an important step for us to offer the many benefits an Open Source project brings, including cost efficiency, agility, transparency, and community, to name a few. Open source grants full visibility into the code base, with ongoing community discussions shedding light on how to adapt, improve, and innovate the technology. By fostering creative development through open source code, our community of users is able to continuously introduce new innovations and to advance and direct the vision of HarperDB. We have already actively engaged a community of developers who are beginning to take over the Studio. HarperDB will remain an active contributor; however, we are already transitioning the leadership of the project to the community.

insideBIGDATA: Why is it important to have more visibility and control over edge computing environments? Do you think we’ll continue to see similar solutions being released in the database industry?

Stephen Goldberg: As the world continues to become more connected with the onset of IoT devices, the volume of data rapidly multiplies. While many businesses are attempting to collect and utilize this data, research from Forrester indicates only 1% of IoT data collected is ever used. Currently, the true promise of IoT, real-time actionability of IoT sensor feedback, is not in place. Allowing visibility and control over edge computing environments unlocks the potential of the 99% of IoT data that goes unused and creates opportunities for DevOps, DBAs, and data scientists alike to tackle big data challenges. With the ability to visualize the HarperDB schema, manage tables and attributes, and quickly turn SQL or filter searches into live graphs and charts, you’re enabling database transparency across the business and ensuring every member is empowered to make actionable decisions. It’s likely we will continue to see these capabilities in future solutions as data analytics moves from the cloud to the intelligent edge and as more members of the business become involved with Big Data collection and analysis.

 
