Splunk Your Magic Data, Froggy!

We doubt that anyone under age 40 will recognize the play on words in our title.  That aside, business software developer Splunk, Inc. recently debuted the latest version of its data analytics software.  Splunk 4.3 creates a data pipeline that begins with raw machine data, processes it through “event analysis,” and ultimately displays the results to users.

Input
The first segment of the four-step data pipeline, called Input, acquires raw data in digital form from sources such as websites, applications, servers, customer feedback forms, log files, network feeds and the infrastructure devices that power business operations.  The data is split into 64K blocks, and each block is annotated with metadata keys that describe where it came from (host, source, and source type) along with internal values such as the character encoding of the data stream.  These blocks are then handed to the next segment of the pipeline.
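
As a rough illustration of the Input step, the Python sketch below splits a raw byte stream into 64K blocks and tags each block with a set of keys.  It is not Splunk's code: the names `Block` and `read_blocks`, the sample file path, and the particular keys are all assumptions made for the example.

```python
import io
from dataclasses import dataclass
from typing import BinaryIO, Dict, Iterator

BLOCK_SIZE = 64 * 1024  # 64K, the block size mentioned in the article


@dataclass
class Block:
    """One chunk of raw data plus the keys that describe its source (hypothetical type)."""
    data: bytes
    metadata: Dict[str, str]


def read_blocks(stream: BinaryIO, metadata: Dict[str, str]) -> Iterator[Block]:
    """Yield blocks of at most BLOCK_SIZE bytes, each tagged with the same keys."""
    while True:
        chunk = stream.read(BLOCK_SIZE)
        if not chunk:
            break
        # Copy the keys so that later stages can annotate one block in isolation.
        yield Block(data=chunk, metadata=dict(metadata))


# Simulated input: two full blocks and a 100-byte remainder.
raw = io.BytesIO(b"x" * (BLOCK_SIZE * 2 + 100))
blocks = list(read_blocks(raw, {"source": "/var/log/app.log", "encoding": "utf-8"}))
print([len(b.data) for b in blocks])  # [65536, 65536, 100]
```

Note that a block boundary can fall in the middle of a line, which is one reason line breaking is left to the next segment.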

Parsing
The second data pipeline segment is Parsing, which is the first of two major Splunk predictive data modeling steps.  Parsing is where data is examined, analyzed, and put through the initial stages of event processing.  An event is the smallest unit (in effect, the DNA) of the Splunk data analytics procedure.  Event processing manipulates and compresses the data in preparation for indexing.  The blocks of data are further subdivided into individual lines.  Then important fields are extracted, and multiline events (picture the equivalent of a word-wrapped line in a table or spreadsheet) are assembled from those lines and pared down to the usable portions of the datasets.
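
The line-breaking and multiline handling described above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the timestamp format, the rule that a line without a leading timestamp continues the previous event, and the function name `parse_events` are ours, not Splunk's.

```python
import re
from datetime import datetime
from typing import Any, Dict, List

# Assumption: every new event starts with a "YYYY-MM-DD HH:MM:SS" timestamp.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(.*)$")


def parse_events(text: str) -> List[Dict[str, Any]]:
    """Split text into lines and merge continuation lines into multiline events."""
    events: List[Dict[str, Any]] = []
    for line in text.splitlines():
        match = TIMESTAMP.match(line)
        if match:
            # A leading timestamp marks the start of a new event.
            events.append({
                "time": datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"),
                "raw": match.group(2),
            })
        elif events:
            # No timestamp: treat the line as part of the previous event.
            events[-1]["raw"] += "\n" + line
    return events


sample = (
    "2012-01-10 09:15:02 ERROR payment failed\n"
    "  Traceback (most recent call last):\n"
    "    File \"pay.py\", line 12\n"
    "2012-01-10 09:15:03 INFO retry scheduled\n"
)
events = parse_events(sample)
print(len(events))  # 2
```

The four input lines collapse into two events, with the traceback lines folded into the first one.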

Indexing
The main analytics preparation segment is Indexing, where the parsed events from the previous segment are written to the search index on disk.
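
To give a feel for what writing events to a search index means, here is a toy inverted index in Python that maps each term to the positions of the events containing it and saves the result to disk.  Splunk's actual on-disk format is not described in this article and is certainly more elaborate; `build_index`, the JSON file, and the sample events are assumptions made for illustration.

```python
import json
import re
import tempfile
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def build_index(events: List[str]) -> Dict[str, List[int]]:
    """Map each lowercased term to the positions of the events that contain it."""
    index = defaultdict(list)
    for position, event in enumerate(events):
        # set() ensures each event is listed at most once per term.
        for term in set(re.findall(r"\w+", event.lower())):
            index[term].append(position)
    return dict(index)


events = [
    "ERROR payment failed for user alice",
    "INFO retry scheduled for user alice",
    "ERROR disk full on host web01",
]
index = build_index(events)

# Persist the index so a later search does not have to rescan the raw events.
index_path = Path(tempfile.mkdtemp()) / "index.json"
index_path.write_text(json.dumps(index))

loaded = json.loads(index_path.read_text())
print(loaded["error"])                        # [0, 2]
print([events[i] for i in loaded["alice"]])   # the two events mentioning alice
```

The point of the design is that a search for a term becomes a dictionary lookup rather than a scan of every stored event.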

Search
According to Splunk literature, the search function “manages all aspects of how the user sees and uses the indexed data, including interactive and scheduled searches, reports and charts, dashboards, and alerts. As part of its search function, Splunk stores user-created knowledge objects, such as saved searches, event types, views, and field extractions.”

Improved Features
Among many other innovations, Splunk 4.3 includes:

  • The charts and timelines no longer use Flash-based displays, so Splunk can be used no matter what the user’s device is: computer, iPad, iPhone, or any device with an installed web browser.
  • User dashboards can be defined and edited entirely through the Splunk User Interface, and they can be repositioned simply by dragging and dropping them.  The upgraded UI can accommodate ten times as many simultaneous users as the previous version.
  • The Visual Panel Editor allows the changing of chart types and various chart properties without having to edit the XML.
  • Any user, whether analyst or CEO, can take any data of interest and turn it into compelling tables and visualizations.
  • Integrated real-time and historical search results.  When a real-time search is executed in the new version, historical data is back-filled into the display for comparison with the real-time data as it streams in.  This provides the historical perspective needed in many real-time analytics cases, giving a previous baseline against which to reference the real-time data.
  • The Data Input Previewer takes the uncertainty out of indexing file-based data by showing the data that is about to be indexed and previewing how the event segmenting and timestamp extraction will be handled.  Users can see a temporary display of the results before committing to a full-blown indexing strategy.
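
The back-fill behavior described in the list above (historical results shown first, then live events as they arrive) can be pictured with a small Python sketch.  The function `backfilled_stream`, the `(epoch seconds, message)` event shape, and the fixed look-back window are assumptions for illustration and say nothing about how Splunk implements the feature.

```python
from typing import Iterable, Iterator, Tuple

Event = Tuple[int, str]  # (epoch seconds, message); hypothetical event shape


def backfilled_stream(
    history: Iterable[Event], live: Iterable[Event], window: int, now: int
) -> Iterator[Event]:
    """Yield stored events from the last `window` seconds, then the live events."""
    for event in history:
        if now - window <= event[0] <= now:
            yield event
    # Once the baseline is displayed, pass live events through as they arrive.
    yield from live


history = [(100, "old"), (950, "login"), (990, "logout")]
live = [(1001, "login")]
shown = list(backfilled_stream(history, live, window=60, now=1000))
print(shown)  # [(950, 'login'), (990, 'logout'), (1001, 'login')]
```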

And, finally, Splunk offers potential customers a limited trial version (60 days, with indexing capped at 500 MB) that can be downloaded for free from its website.