The Downside of Converting Full-Text PDFs to XML for Text Mining

To get the best results from text mining projects, researchers need access to full-text articles. However, when researchers obtain full-text articles through company subscriptions or document delivery, the documents are often provided as PDFs, a suboptimal format for use with text mining software. The burden is then on researchers to convert the PDFs to XML. But that can be inefficient and costly. Read on as Michael Iarrobino, Product Manager at Copyright Clearance Center, explains the pitfalls of converting full-text PDFs to XML for text mining.

Key Challenges for Commercial Text Miners

Researchers use text mining tools to extract and interpret facts, assertions, and relationships from vast amounts of published information. Mining accelerates the research process. However, despite the many benefits of text mining, researchers face a number of obstacles before they even get a chance to run queries against the bigger body of literature. Read on as Michael Iarrobino, Product Manager at Copyright Clearance Center, explains the key challenges for commercial text miners.

The Exponential Growth of Data

This is the first entry in an insideBIGDATA series that explores the intelligent use of big data on an industrial scale. This series, compiled in a complete Guide, also covers the changing data landscape and realizing a scalable data lake, as well as offerings from HPE for big data analytics. The first entry is focused on the recent exponential growth of data.

The Advantages of Mining Full-Text Articles over Abstracts

Because abstracts are easy to access, many researchers use them to identify a collection of articles for text mining. But while abstracts provide some valuable information, there are major advantages to mining full-text articles instead. Read on as Michael Iarrobino, Product Manager at Copyright Clearance Center, explains the advantages of mining full-text articles over abstracts.

IoT Analytics – Part 6

This is the sixth and final article in a series focusing on a technology that is rising in importance to enterprise use of big data – IoT Analytics, or the analytical component of the Internet-of-Things. In this segment, we’ll provide a series of “best practices” and “lessons learned” for what companies are seeking from deploying IoT analytics.

insideBIGDATA Guide to Deep Learning and Artificial Intelligence

The insideBIGDATA Guide to Deep Learning & Artificial Intelligence is a useful new resource directed toward enterprise thought leaders who wish to gain strategic insights into this exciting area of technology. In this guide, we take a high-level view of AI and deep learning in terms of how they are being used and what technological advances have made them possible. We also explain the difference between AI, machine learning and deep learning, and examine the intersection of AI and HPC. We present the results of a recent insideBIGDATA survey, “insideHPC / insideBIGDATA AI/Deep Learning Survey 2016,” to see how well these new technologies are being received. Finally, we take a look at a number of high-profile use case examples showing the effective use of AI in a variety of problem domains.

Data from Uber Movement Means a Bright Future for Cities

In this contributed article, Harry Glaser, CEO of Periscope Data, discusses Uber's newly announced knock-out platform, “Movement,” which will offer access to the company's traffic-flow data in scores of cities where it operates. The platform is intended for city planners and researchers looking to improve mobility.

From Small to Big Data, Adopting the Advanced Analytics Mindset

In this special technology white paper, From Small to Big Data, Adopting the Advanced Analytics Mindset, you’ll learn how to help data teams — analysts, scientists, and managers — to collaborate on data projects. One of the key success factors for these teams is to allow analysts to work on Big Data as easily as they do on smaller data with Excel, as well as to help them find new use cases specific to the data available and the tools at hand.

Advanced Analytics, The Modern Marketer’s Best Friend

In this whitepaper, you’ll learn how advanced analytics has the potential to transform the way segmentation for marketing purposes is accomplished. It starts with a look at traditional segmentation methods and then moves on to exploring how advanced analytics (model-based segmentation) can change the game. Then you’ll explore a few marketing and analytics use cases in various industries. Lastly, you’ll examine the methodologies needed to implement model-based segmentation in the real world.

The 5 Key Challenges to Building a Successful Data Science Lab & Data Team

In this special technology white paper, The 5 Key Challenges to Building a Successful Data Science Lab & Data Team, you’ll learn how a Data Lab establishes an effort to answer business needs by making sense of raw information. Data labs are intended to create critical mass within the organization that enables them to reach the level of innovation required for new data-driven products.