Matthew Cook - Software Money Pit Blog
Trends & Technologies

An Intro to Analytics Vendors

June 20, 2016 by Matt Cook

Image by David Bleasdale, CC license

Analytics is one of the top buzzwords in business software today. Analytics software is often marketed as a tool for business intelligence, data mining or insights. It’s the crystal ball software: tell me things I don’t already know, and show me ah-hahs or other exciting revelations that, if acted on, will increase sales, cut costs or produce some other benefit.

The essential elements for analytics are:

1) A design for your "stack," which is just a term for layers: usually a data layer at the bottom, a translation layer above it, and some kind of user-interface layer on top. The translation and user-interface layers are usually provided by the analytics vendor; you provide a place for data storage.

2) A way to send the data to your data storage automatically, usually referred to as "ETL": extract, transform, and load. SnapLogic and Informatica are two vendors that offer these tools.

3) Some way to "harmonize" the data, which means defining each data element and how it will be used in analytics: "Sales" will mean such and such, "Gross Margin" will be defined as … (a rough sketch of this mapping follows below).

These three components can be on-premise in your building or in a cloud hosted by a vendor.
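
To make the "harmonize" step concrete, here is a minimal sketch in Python. The source systems, field names, and definitions below are hypothetical; the point is that every incoming feed gets mapped to one agreed set of canonical names before it lands in the data layer.

    # Hypothetical field mappings: each source system labels the same
    # concepts differently.
    FIELD_MAP = {
        "erp":   {"net_sls": "sales", "gm_pct": "gross_margin"},
        "ecomm": {"revenue": "sales", "margin": "gross_margin"},
    }

    def harmonize(record, source):
        """Rename source-specific fields to the canonical analytics names."""
        mapping = FIELD_MAP[source]
        return {mapping.get(field, field): value for field, value in record.items()}

    # Two records that mean the same thing but arrive with different labels:
    print(harmonize({"net_sls": 1200.0, "gm_pct": 0.31}, "erp"))
    print(harmonize({"revenue": 840.0, "margin": 0.27}, "ecomm"))
    # Both now use "sales" and "gross_margin", so downstream reports agree.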

SAS, based in North Carolina, has long been the pioneer in this space, and now many business software firms claim to provide "robust analytics." The problem: what constitutes "analytics"? Canned reports are not analytics. So you'll need to shop this category knowing that the most serious applications will probably come from firms dedicated to analytics.

International Data Corporation (IDC) reports that the business analytics software market is projected to grow at a 9.8% annual rate through 2016. IDC describes the market as dominated by giants Oracle, SAP and IBM, with SAS, Teradata, Informatica and MicroStrategy rounding out the top 10 in terms of sales revenue. Although the top 10 account for 70% of the market, IDC reports that "there is a large and competitive market that represents the remaining 30%…hundreds of ISVs (Independent Software Vendors) worldwide operate in the 12 segments of the business analytics market…some provide a single tool or application, others offer software that spans multiple market segments."

Here are some other interesting analytics or business intelligence (BI) products: QlikTech provides easy-to-develop dashboards with graphical representations as well as tabular and exportable reports. Its QlikView software is an "in-memory" application, which means it stores data from multiple sources in RAM, allowing the user to see multiple views of the data, filtered and sorted according to different criteria.
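
Qlik's engine is proprietary, but the in-memory idea itself is easy to demonstrate. Here is a rough pandas sketch (with invented sales data) of how one dataset held in RAM can be instantly re-filtered, re-sorted, and re-pivoted into different views, with no further trips to a database:

    import pandas as pd

    # Toy dataset held entirely in RAM; a real tool would load this once
    # from several source systems.
    df = pd.DataFrame({
        "region":  ["EU", "US", "US", "EU", "US"],
        "product": ["A", "A", "B", "B", "A"],
        "sales":   [120, 200, 90, 150, 310],
    })

    # Three different "views" of the same in-memory data, no disk access:
    by_region = df.groupby("region")["sales"].sum()
    top_us    = df[df["region"] == "US"].sort_values("sales", ascending=False)
    by_both   = df.pivot_table(index="region", columns="product",
                               values="sales", aggfunc="sum")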

Information Builders (IB) is a software company classified by advisory firm Gartner as a leader in BI applications. IB’s main application, WebFocus, is a flexible, user-friendly tool that is popular with sales teams because salespeople use it while visiting customers to enhance their selling messages with facts and visual interpretations of data.

WebFocus has a “natural language” search capability, making it useful to monitor and analyze social media.
Birst, named by Gartner as a challenger in the BI space, is a cloud-based (SaaS) application that offers “self-service BI,” deployment to mobile devices, adaptive connectors to many different types of data sources, in-memory analytics, drill-down capabilities, and data visualization. The Birst tool also has a data management layer, allowing users to link data, create relationships and indexes, and load data into a data store.  Tableau is another similar vendor.

It’s useful to start small and experiment with analytics.  People in your organization with good quantitative skills and imagination can experiment with tools, usually at very low cost.  Soon you will see some interesting results and will want to do more…but make sure to put in place some rules about what constitutes sanctioned and official “analytics” in your organization, to prevent uncontrolled proliferation of un-validated information.

Trends & Technologies

What Is In-Memory Computing?

June 5, 2015 by Matt Cook

Image: Memory Bus by ARendle, CC license.

In-memory computing, usually paired with massively parallel processing, comes down to two things: 1) huge amounts of RAM; and 2) huge amounts of processing power.

In-memory computing is another technology leapfrogging the traditional data warehouse. An in-memory architecture uses data that is in the main memory (also known as Random Access Memory, or RAM) of a computer, rather than data on a hard disk.

Data retrieval from a disk is the slowest part of any analytical query: the software has to "find and fetch" the data you want, and queries that touch very large amounts of data just can't finish in a feasible amount of time.

You’ve probably already experienced this. I work with people who launch some SAP queries that take an hour or more to run. These people would like to query even larger amounts of data but don’t even bother trying because they know SAP might just stop in midstream or take so long that the information isn’t worth the effort.

An in-memory setup eliminates "find and fetch" because the data isn't stored on a disk at all; it sits in the application's main memory, ready for selection and use in your query.
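
You can feel the difference with Python's built-in sqlite3 module, which supports both a disk-backed database and a pure ":memory:" one. A caveat: at this toy scale the operating system's file cache narrows the gap, but at data-warehouse scale it cannot, so treat the sketch as showing the shape of the comparison rather than exact timings.

    import os
    import sqlite3
    import tempfile
    import time

    def build_and_query(conn):
        """Load a million rows, then time one aggregate query."""
        conn.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)",
                         ((i, i * 0.5) for i in range(1_000_000)))
        conn.commit()
        start = time.perf_counter()
        conn.execute("SELECT SUM(amount) FROM sales WHERE amount > 100").fetchone()
        return time.perf_counter() - start

    disk = sqlite3.connect(os.path.join(tempfile.mkdtemp(), "sales.db"))
    ram  = sqlite3.connect(":memory:")   # the whole table lives in RAM

    print("disk query:", build_and_query(disk))
    print("ram query: ", build_and_query(ram))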

In-memory computing also means that the way you collect, sort, analyze, chart, use and interpret data should change dramatically – from a fixed and limited process to a more natural and iterative one. The technology makes it possible to gather information in a way that is a lot like your normal thought process.

Your brain is like an in-memory computer. To make a decision, you first start with the information you have in your head. Then you gather what is missing, using the web, asking questions, reading the newspaper. Your brain immediately processes each new piece of information and sometimes in seconds you’ve made your decision.

This new paradigm – massive data storage connected to super fast computing power – will change what we ask for. No longer will we ask for a report on sales by customer, by date, by region, by product. Instead we will want every single piece of data related to any sale of anything to anyone, say, for the past two years–every single invoice, credit, return, price, discount, the person who sold it, the commission paid on it, the color of the product, the shipment date, delivery data, invoice payment amount, date of payment – everything. This will become the expectation in all areas of an enterprise.

Amazon Web Services (AWS) is one place to secure this type of environment. The cost of 20 to 40 terabytes of storage is about the same as the monthly rent of a Manhattan apartment.

Trends & Technologies

Big Data 101

May 10, 2015 by Matt Cook

Image: “Data Center.” by Stan Wlechers, CC license

So what is Big Data, particularly Big Data analytics? Why all the hype?

Big Data is what it implies: tons of data. We’re talking millions or billions of rows here – way too much for standard query tools accessing data on a disk.

What would constitute "tons" of data? Every bottle of "spring," "purified" or "mineral" water that was scanned at a grocery store checkout during the month of July 2011: the brand, the price, the size, the name and location of the store, and the day of the week it was bought. That's six pieces of data multiplied by the estimated 3.3 billion bottles of water sold monthly in the United States, or roughly 20 billion data values in a single month.

Big Data analytics is the process of extracting meaning from all that data.

The analysis of big data is made possible by two developments:

1) The continuation of Moore’s law; that is, computer speed and memory have multiplied exponentially. This has enabled the processing of huge amounts of data without retrieving that data from disk storage; and

2) "Distributed" computing frameworks such as Hadoop, which have made it possible to process large amounts of data on multiple servers at once (the map-reduce pattern behind them is sketched below).
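
Hadoop's map-reduce pattern fits in a few lines of Python. This toy version runs on one machine, with invented scan lines, but the mapper/reducer logic is exactly the kind of work Hadoop spreads across many servers:

    from collections import defaultdict

    lines = [
        "store42,bottled_water,1.29",
        "store7,bottled_water,0.99",
        "store42,bottled_water,1.29",
    ]

    # Map: turn each raw line into (key, value) pairs.
    def mapper(line):
        store, product, price = line.split(",")
        yield store, float(price)

    # Shuffle: group the emitted values by key.
    groups = defaultdict(list)
    for line in lines:
        for store, price in mapper(line):
            groups[store].append(price)

    # Reduce: aggregate each group independently (hence parallelizable).
    totals = {store: sum(prices) for store, prices in groups.items()}
    print(totals)  # {'store42': 2.58, 'store7': 0.99}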

The hype you read about Big Data may be justified. Big data does have potential and should not be ignored. With the right software, a virtual picture of the data can be painted with more detail than ever before. Think of it as a photograph, illustration or sketch – with every additional line of clarification or sharpening of detail, the picture comes more into focus.

Michael Malone, writing in The Wall Street Journal, says that some really big things might be possible with big data:

“It could mean capturing every step in the path of every shopper in a store over the course of a year, or monitoring every vital sign of a patient every second for the course of his illness….Big data offers measuring precision in science, business, medicine and almost every other sector never before possible.”

But should your enterprise pursue Big Data analytics? It may already have. If your company processes millions of transactions or has millions of customers, you have a lot of data to begin with.

You need three things to enable Big Data analytics:

  1. A way to get the data, whether out of your transaction systems or from external sources, and into a database. Typically this is done with ETL (Extract, Transform, and Load) software tools such as Informatica. Jobs are set up and the data is pulled every hour, day, etc., put into a file, and either pushed or pulled into a storage environment (a rough sketch follows this list).
  2. Superfast data processing. Today, an in-memory database (a database with enormous amounts of RAM and massively parallel processing) can be acquired and used on a software-as-a-service basis from Amazon Web Services at a very reasonable cost.
  3. User-interface analytics tools that present the data in the visual form you prefer. Vendors include Oracle, Teradata, Tableau, Information Builders, QlikView, Hyperion, and many others. The market here is moving toward data visualization via low-cost, software-as-a-service tools that let you aggregate disparate sources of data (internal and external systems, social media, and public sources like weather and demographic statistics).
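
As a rough sketch of item 1, here is a chunked extract-and-load in Python with pandas. The file and column names are hypothetical, but the pattern (pull the data in manageable slices, transform each slice, append it to a store) is the one ETL tools automate on a schedule:

    import sqlite3
    import pandas as pd

    store = sqlite3.connect("analytics_store.db")

    # Hypothetical daily extract with millions of rows; stream it in slices
    # instead of loading the whole file into memory at once.
    for chunk in pd.read_csv("daily_scans.csv", chunksize=100_000):
        chunk["scan_date"] = pd.to_datetime(chunk["scan_date"])       # transform
        chunk.to_sql("scans", store, if_exists="append", index=False)  # load
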
Trends & Technologies

Tell Me Again Why I Should Care About Hyperscale Computing?

May 2, 2015 by Matt Cook

Photo: “Trails in the Sand,” Dubai, by Kamal Kestell, CC license

If “Humanscale” computing is managing bags of sand, “Hyperscale” computing is managing each individual grain of sand in every bag.

“Hyperscale” computing (HC) is the processing of data, messages or transactions on a scale orders of magnitude larger than traditional computing.  HC is becoming a need for many businesses.  Why?

Consider a company that sells bottled water. Its main business used to be selling truckloads full of cases of water to big grocery chains. It has 25 different products, or Stock Keeping Units (SKUs). The big grocery chains then distributed cases of water to their stores, which numbered 20,000. The data requirements for the water company's computers were manageable, even as the company grew rapidly.

Now, the company wants to analyze the performance of its products on store shelves by measuring things like velocity (how fast the product turns), price compared to competing products, and out-of-stocks. Its customers, the big grocery chains, are offering to supply data from their systems on every scan of every product in every store, because they too want to improve the performance of products on the shelf.

In one month during the summer, about 3.5 billion bottles of water are sold. A data file from just one big grocery chain runs to 3 million lines. How and where will you process this data? Traditional databases will be too slow. You will need superfast databases that hold data in memory and spread the work across many servers: in-memory, massively parallel computing. This is an example of hyperscale computing.
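
To see why spreading the work matters, here is a toy Python sketch that splits scan rows across worker processes and merges the partial results. At three million rows a single machine still copes; the point is the divide, aggregate, and merge shape that massively parallel databases apply across whole clusters. The data is generated, not real scan data.

    from collections import Counter
    from multiprocessing import Pool

    def count_units(chunk):
        """Partial aggregate for one slice of the rows."""
        totals = Counter()
        for store_id, units in chunk:
            totals[store_id] += units
        return totals

    if __name__ == "__main__":
        # Stand-in for a 3-million-line scan file: (store_id, units_sold) rows.
        rows = [(i % 20_000, 1) for i in range(3_000_000)]
        chunks = [rows[i::8] for i in range(8)]       # split the work 8 ways
        with Pool(processes=8) as pool:
            partials = pool.map(count_units, chunks)  # aggregate in parallel
        totals = sum(partials, Counter())             # merge the partial results
        print(totals.most_common(3))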

Other examples where you would need HC: selling direct to consumers through their smartphones, where you might have to process millions of transactions during, say, the Christmas holiday season; gathering machine data every second to monitor a machine's performance (a General Electric turbofan jet engine generates 5,000 data points per second, which amounts to 30 terabytes every 30 minutes); and managing millions of product-attribute combinations.

The computing tools for hyperscale will not be found in your ERP system. Trying to engineer your existing systems to handle hyperscale data and transactions will be a costly failure. But tools are available on the market today, many of them in cloud applications and application hosting providers.

Cloud application and hosting vendors usually have much larger data processing capabilities, including automatic failover and redundant servers. You can take advantage of this capacity: a leading application hosting provider can give you 30 terabytes of storage and a massively parallel computing environment for less than the monthly rent of a New York City apartment.

My advice:

  • Identify areas of your business that are significantly under-scaled, or where you have large gaps in business needs compared to processing capability;
  • Pick one and design a pilot project (many vendors are willing to do this with you at very low cost);
  • Measure results and benefits, and if beneficial, expand the solution to other parts of your business.

It's probably not OK to ignore this trend. Even if you don't need HC today, think about the future and where commerce is going. If you don't gain the capability for hyperscale computing, one or more of your competitors probably will.

© 2017 Copyright Matthew David Cook // All rights reserved