Matthew Cook - Software Money Pit Blog
Trends & Technologies

Big Data: Correlations, Not Cause-and-Effect

February 18, 2016 by Matt Cook

Image by Marcos Gasparutti, CC license

In their recently published book, “Big Data: A Revolution That Will Transform How We Live, Work, and Think,” Viktor Mayer-Schonberger and Kenneth Cukier say that big data will provide a lot of information that can be used to establish correlations, not necessarily precise cause and effect.

But that might be good enough to extract the value you need from big data.

Three examples from their book:

  1. Walmart discovered that Pop-Tart sales spiked when storms were in the forecast. The same correlation held for flashlights, but selling more flashlights made sense; selling more Pop-Tarts didn't.
  2. Doctors in Canada now head off severe fevers in premature infants because of a link between a period of unusually stable vital signs and a severe fever 24 hours later.
  3. Credit scores can be used to predict which people need to be reminded to take a prescription medicine.

Why did the people involved in the above examples compare such different sets of data? One possible reason: because they could, relatively quickly and at low cost, thanks to superfast data processing and cheap memory. If you could mash together all kinds of data in large volumes, and do so relatively cheaply, why wouldn't you keep doing it until you found some correlations that looked interesting?

You can begin experimenting with Big Data, a process I endorse. You need three basic components (a minimal sketch follows this list):

  1. A way to get the data, whether out of your transaction systems or from external sources, and into a database.
  2. Superfast data processing (a database with enormous amounts of RAM and massively parallel processing). This can be had on a software-as-a-service basis from Amazon and other vendors.
  3. Analytics tools that present the data in the visual form you want. Vendors include Oracle, Teradata, Tableau, Information Builders, Qlikview, Hyperion, and many others.
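If you want to see the smallest possible version of this pipeline, here is a minimal sketch of step 3, assuming a hypothetical extract of daily sales joined with an external weather feed (the file and column names are illustrative, not real):

```python
import pandas as pd

# Hypothetical extract: one row per product per store per day, with sales data
# from the transaction system joined to a weather feed for the store's region.
df = pd.read_csv("daily_sales_with_weather.csv", parse_dates=["date"])

# Pairwise correlations across the numeric columns: units sold, price,
# storm_forecast (0/1), temperature, and so on.
corr = df.corr(numeric_only=True)

# Surface the variables that move most strongly with storm forecasts,
# Pop-Tarts-and-flashlights style. Correlation, not cause and effect.
print(corr["storm_forecast"].sort_values(ascending=False).head(10))
```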

Correlations are usually easier to spot visually. And visualization is where the market seems to be going, at least in terms of hype and vendor offerings. New insights are always welcome, so we shall see what sells and what doesn’t.

Gartner's assessment seems about right to me at this point: big data is both 1) currently in the phase Gartner calls the "trough of disillusionment," and 2) promising enough that its use in BI will grow sharply.

Trends & Technologies

What Is In-Memory Computing?

June 5, 2015 by Matt Cook

Image: Memory Bus by ARendle, CC license.

In-memory computing, usually paired with massively parallel processing, is built on two things: 1) huge amounts of RAM; and 2) huge amounts of processing power.

In-memory computing is another technology leapfrogging the traditional data warehouse. An in-memory architecture uses data that is in the main memory (also known as Random Access Memory, or RAM) of a computer, rather than data on a hard disk.

Data retrieval from a disk is the slowest part of any analytical query, because the software has to “find and fetch” the data you want, and queries accessing very large amounts of data just can’t be done in a feasible amount of time.

You’ve probably already experienced this. I work with people who launch some SAP queries that take an hour or more to run. These people would like to query even larger amounts of data but don’t even bother trying because they know SAP might just stop in midstream or take so long that the information isn’t worth the effort.

An in-memory setup eliminates "find and fetch" because the data isn't stored on a disk at all; it sits in the application's main memory, ready for selection and use in your query.
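To make the point concrete, here is a rough sketch that uses SQLite purely as a stand-in, running the same query against a table stored in a file on disk and against the same table held in main memory (":memory:"). Real in-memory platforms operate at vastly larger scale, but the principle is the same: no find and fetch.

```python
import random
import sqlite3
import time

def build(conn):
    """Create and populate a one-million-row demo table."""
    conn.execute("DROP TABLE IF EXISTS sales")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    rows = [(random.choice("NESW"), random.random() * 100) for _ in range(1_000_000)]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

for target in ("on_disk.db", ":memory:"):
    conn = sqlite3.connect(target)
    build(conn)
    start = time.perf_counter()
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region").fetchall()
    print(target, round(time.perf_counter() - start, 3), "seconds")
    conn.close()
```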

It also means that the way you collect, sort, analyze, chart, use and interpret data should change dramatically – from a fixed and limited process to a more natural and iterative process. The in-memory technology makes it possible to gather information in a way that is a lot like your normal thought process.

Your brain is like an in-memory computer. To make a decision, you first start with the information you have in your head. Then you gather what is missing, using the web, asking questions, reading the newspaper. Your brain immediately processes each new piece of information and sometimes in seconds you’ve made your decision.

This new paradigm – massive data storage connected to super fast computing power – will change what we ask for. No longer will we ask for a report on sales by customer, by date, by region, by product. Instead we will want every single piece of data related to any sale of anything to anyone, say, for the past two years–every single invoice, credit, return, price, discount, the person who sold it, the commission paid on it, the color of the product, the shipment date, delivery data, invoice payment amount, date of payment – everything. This will become the expectation in all areas of an enterprise.
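As a purely illustrative contrast (the table and column names are invented), the difference is between the old aggregate report and the detail-level pull:

```python
# The old request: a pre-aggregated report.
aggregate_report = """
    SELECT customer, region, product, SUM(net_amount)
    FROM sales
    WHERE invoice_date >= '2013-01-01'
    GROUP BY customer, region, product
"""

# The new expectation: every detail row, to slice however the question demands.
everything = """
    SELECT *          -- every invoice, credit, return, discount, commission,
    FROM sales        -- ship date, delivery data, payment amount and date...
    WHERE invoice_date >= '2013-01-01'
"""
```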

Amazon Web Services (AWS) is one place to secure this type of environment. The cost of 20 to 40 terabytes of storage is about the same as the monthly rent on a Manhattan apartment.

Trends & Technologies

Internet of Things: Three Practical Uses

June 2, 2015 by Matt Cook

Yes, your new fridge can be an internet-enabled Thing, and you can text it to check the beer supply, possibly avoiding a stop on the way home (although, is it possible to have too much beer?)

Alas, which of life's many hardships will technology eliminate next? The smart fridge is cool, but it's about as necessary as a lawn ornament (no offense to lawn ornament fans).

What about the breakthrough, make-the-world-a-better-place uses for IoT?

In business, I see three promising areas:

Inventory: Good, cheap, durable sensors attached to inventory could cut losses and improve accuracy. RFID isn't good enough in many cases, although that is changing: Airbus uses RFID tags to track thousands of airplane seats and life vests, and a major Japanese clothing retailer has applied RFID tags to everything in its stores, including inventory, hangers, and merchandising displays.

Retail: Already some stores are starting to use sensors to detect when inventory on the shelf is low. If the trend continues and accuracy is good, this could be a revolution in retail inventory tracking, which is currently done by scanning UPC codes. As the cost of sensors drops, more and more (lower-value) products can be included in this type of solution. Some hotel mini-bars now sense when items are consumed, eliminating the need to count, write down, and key in how many drinks and snacks a hotel guest had.

Machinery diagnostics: For complex production lines that are difficult to keep running at top performance for long periods, IoT sensors could continually measure and transmit machine parameters, output, speed, consistency of cycles, and other variables to create a visual record of performance that could then be correlated with unplanned downtime; cause and effect could more easily be determined and machine performance improved.
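As a hedged sketch of that last idea, assuming hypothetical extracts of sensor readings and downtime events, the correlation step could be as simple as:

```python
import pandas as pd

# Hypothetical files: one row per sensor reading, one row per unplanned stop.
readings = pd.read_csv("machine_sensor_log.csv", parse_dates=["timestamp"])
downtime = pd.read_csv("unplanned_downtime.csv", parse_dates=["timestamp"])

# Flag the readings taken in the hour before each unplanned stop.
readings["hour"] = readings["timestamp"].dt.floor("h")
stop_hours = set(downtime["timestamp"].dt.floor("h") - pd.Timedelta(hours=1))
readings["before_stop"] = readings["hour"].isin(stop_hours).astype(int)

# Which measured variables (speed, temperature, cycle time, ...) move together
# with an imminent stop? Again: correlation first, cause and effect later.
corr = readings.corr(numeric_only=True)["before_stop"].drop("before_stop")
print(corr.abs().sort_values(ascending=False).head(10))
```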

PINC Solutions, Inc. markets connected sensors and software for managing truck fleets at plants and distribution centers. It is a straightforward, practical application of IoT: trucks carry RFID tags that uniquely identify them, and software links each truck to delivery numbers, dock doors, destinations, and other information on a giant virtual whiteboard.

The benefits here are easy to understand: measure & reduce wait times at pickup and delivery points, reduce idling and searching in a yard full of trucks for the one you need, and provide real-time on-the-road status and ETA.
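This is not PINC's actual data model; it is just a minimal sketch of the kind of record such a virtual whiteboard might keep for each RFID-tagged truck, which is enough to compute the wait-time metric mentioned above:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class YardTruck:
    """One row on the virtual whiteboard (illustrative fields only)."""
    rfid_tag: str                                   # unique ID read at the gate
    carrier: str
    delivery_numbers: list[str] = field(default_factory=list)
    dock_door: Optional[str] = None
    destination: Optional[str] = None
    arrived_at: Optional[datetime] = None
    departed_at: Optional[datetime] = None

    def wait_minutes(self, now: datetime) -> Optional[float]:
        """Minutes spent in the yard since arrival."""
        if self.arrived_at is None:
            return None
        end = self.departed_at or now
        return (end - self.arrived_at).total_seconds() / 60
```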

For more on this topic, check out this IoT primer published by Goldman Sachs.

For more articles like this, visit my site at softwaremoneypit.com.

Trends & Technologies

Big Data 101

May 10, 2015 by Matt Cook

Image: “Data Center.” by Stan Wlechers, CC license

So what is Big Data, particularly Big Data analytics? Why all the hype?

Big Data is what it implies: tons of data. We’re talking millions or billions of rows here – way too much for standard query tools accessing data on a disk.

What would constitute “tons” of data? Every bottle of “spring,” “purified” or “mineral” water that was scanned at a grocery store checkout during the month of July 2011; the brand, the price, the size, the name and location of the store, and the day of the week it was bought. That’s six pieces of data, multiplied by the estimated 3.3 billion bottles of water sold monthly in the United States.
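The back-of-the-envelope arithmetic, using the article's own estimates:

```python
# Rough sizing only; both figures are estimates from the text, not measurements.
bottles_per_month = 3_300_000_000   # bottled water sold monthly in the U.S.
fields_per_scan = 6                 # brand, price, size, store name, location, weekday
print(f"{bottles_per_month * fields_per_scan:,} data points per month")
# -> 19,800,000,000
```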

Big Data analytics is the process of extracting meaning from all that data.

The analysis of big data is made possible by two developments:

1) The continuation of Moore's law: computer speed and memory have multiplied exponentially, making it possible to process huge amounts of data in memory rather than retrieving it from disk storage; and

2) "Distributed" computing frameworks such as Hadoop, which make it possible to process large amounts of data on many servers at once.
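Here is a toy sketch of the second idea, using local worker processes as a stand-in for a cluster; Hadoop (and successors like Spark) apply the same split-process-combine pattern across many machines:

```python
from collections import Counter
from multiprocessing import Pool

def count_brands(chunk):
    """The 'map' step: tally brands within one slice of the scan data."""
    return Counter(record["brand"] for record in chunk)

if __name__ == "__main__":
    # Hypothetical scan records, split into chunks as a stand-in for HDFS blocks.
    records = [{"brand": b} for b in ("Aqua", "Pure", "Spring", "Aqua")] * 250_000
    chunks = [records[i::8] for i in range(8)]

    with Pool(processes=8) as pool:
        partials = pool.map(count_brands, chunks)   # process the pieces in parallel

    totals = sum(partials, Counter())               # the 'reduce' step
    print(totals.most_common(3))
```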

The hype you read about Big Data may be justified. Big data does have potential and should not be ignored. With the right software, a virtual picture of the data can be painted with more detail than ever before. Think of it as a photograph, illustration or sketch – with every additional line of clarification or sharpening of detail, the picture comes more into focus.

Michael Malone, writing in The Wall Street Journal, says that some really big things might be possible with big data:

“It could mean capturing every step in the path of every shopper in a store over the course of a year, or monitoring every vital sign of a patient every second for the course of his illness….Big data offers measuring precision in science, business, medicine and almost every other sector never before possible.”

But should your enterprise pursue Big Data analytics? It may already have. If your company processes millions of transactions or has millions of customers, you have a lot of data to begin with.

You need three things to enable Big Data analytics (a minimal ETL sketch follows this list):

  1. A way to get the data, whether out of your transaction systems or from external sources, and into a database. Typically this is done with ETL (Extract, Transform, and Load) software tools such as Informatica. Jobs are set up and the data is pulled every hour, day, etc., put into a file, and either pushed or pulled into a storage environment.
  2. Superfast data processing. Today, an in-memory database (a database with enormous amounts of RAM and massively parallel processing) can be acquired and used on a software-as-a-service basis from Amazon Web Services at a very reasonable cost.
  3. User interface analytics tools that present the data in the visual form you prefer. Vendors include Oracle, Teradata, Tableau, Information Builders, Qlikview, Hyperion, and many others. The market here is moving toward data visualization via low-cost, software-as-a-service tools that let you aggregate disparate sources of data (internal and external systems, social media, and public sources such as weather and demographic statistics).
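A minimal sketch of item 1, assuming a sales table in a transaction database and a separate analytics store as the target; the connection strings, table, and column names here are hypothetical:

```python
import pandas as pd
from sqlalchemy import create_engine

source = create_engine("postgresql://user:pass@erp-host/erp")        # transaction system
target = create_engine("postgresql://user:pass@warehouse-host/dw")   # analytics store

# Extract: pull yesterday's transactions from the source system.
df = pd.read_sql("SELECT * FROM sales WHERE invoice_date = CURRENT_DATE - 1", source)

# Transform: fix types and derive the fields analysts actually ask for.
df["invoice_date"] = pd.to_datetime(df["invoice_date"])
df["net_amount"] = df["gross_amount"] - df["discount"]

# Load: append to the table the analytics tools point at.
df.to_sql("sales_fact", target, if_exists="append", index=False)
```

A scheduler (the hourly or daily "jobs" mentioned above) simply reruns a script like this; dedicated ETL tools such as Informatica add connectors, monitoring, and error handling on top of the same pattern.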

© 2017 Copyright Matthew David Cook // All rights reserved