Advances in technology, mergers and acquisitions, consolidation, and regulatory compliance requirements have led to significant growth in data sources and volumes, resulting in a business environment more complicated and dynamic than ever before. Existing big data technologies let organizations significantly improve the ROI (Return on Investment) of their current data warehouse environment.
Data warehouse modernization helps maximize the value of data | Source: tech-dynamics
Building and profiling a stable data architecture that consolidates data sources is a growing challenge for any organization. Organizations need a plan for adapting quickly to change if they are to implement and manage an enterprise data warehouse successfully.
Traditional data warehouses are built on technologies and architectures that are 15-20 years old and were never designed to handle the volume, variety, and velocity of today’s data-centric applications. No matter how old or sophisticated an organization’s DW (data warehouse) is, or what its surrounding environment looks like, it probably needs to be modernized in one or more ways. That is because DWs, and the requirements they serve, continue to evolve to support business requirements and modern technologies. Many users must get involved in realigning the DW environment with new business requirements and new technology challenges. Once realigned, a DW needs a policy for continuous modernization.
Data warehouse modernization takes many forms, from server upgrades and data model tweaks, to adding new platforms to the EDWE (Extended Data Warehouse Environment), to replacing the primary DW platform. It may include using capabilities hitherto untapped, such as in-memory databases, in-database analytics, real-time functions, and data federation or virtualization. Analytics, data integration, and reporting are also modernizing, and the DW is under further pressure to provision data in ways that support modern end-user practices such as advanced analytics, data preparation, self-service data access, and visualization. The arrival of big data has made that provisioning more business-critical and much harder. Notably, modernization also affects users’ skills, staffing, and team structure.
The leading drivers behind DW modernization, according to the survey, include realigning the DW with fresh business goals, expanding DW scale for big data, enabling new analytics applications, and adopting new devices or data types and their associated practices. The main beneficiaries of modernization are analytics, business management, and real-time operations, and the foremost obstacles involve difficulties with design, funding, governance, staffing, and platforms, among others.
Use Case: Data Warehouse Modernization
DWM (Data Warehouse Modernization) builds on an organization’s existing data warehouse infrastructure, taking advantage of big data technologies to augment its value. It is designed to maximize the value of the data warehouse environment, not to replace it. DW modernization arises from two fundamental requirements: the need to leverage a variety of information to gain new business insights, and the need to optimize the warehouse infrastructure.
Leveraging an assortment of information –
Relying solely on the data warehouse forces organizations to discard valuable information. Organizations would like to analyze multi-structured information, but the warehouse is not built for it. They also demand lower latency: they need data in minutes or hours, not weeks or months. Moreover, organizations require query access to that data.
Optimization of the warehouse infrastructure –
Warehouse data volumes are reaching immense levels, putting tremendous stress on the data warehouse. The data warehouse itself may not be expensive, but when organizations try to store and analyze everything in that environment, performance suffers and costs rise.
Presently, there are three types of data warehouse modernization, as shown in the figure above.
1. Pre-processing hub or landing zone –
This is used when a Hadoop capability is needed as a staging area for data before deciding what data should move to the data warehouse. By using InfoSphere Streams, organizations can process and analyze streaming data in real time to determine what should be stored, without having to store it first in either the warehouse or Hadoop. InfoSphere Data Explorer can be used for early exploration, to determine which data to move for deeper analytics or to cheaper storage. In some cases data does not need to be stored at all; processing and acting on information as it arrives can reduce storage in the warehouse. Furthermore, data can be cleansed and transformed before it is loaded into the warehouse.
IBM InfoSphere | Source: IBM
This gives organizations the ability to perform analytics that might previously have been done in the data warehouse by using stream computing analytics on data in motion, thus optimizing the warehouse and enabling new types of analysis. Ad hoc analysis in Hadoop applies to any combination of structured, unstructured, or enterprise data, allowing for deeper analytics than is usually possible. Furthermore, streaming data can be filtered down to the high-value subset of interest, which can then be stored in InfoSphere BigInsights or the data warehouse.
3. Query-able data store –
In this approach, aged or seldom-accessed data can be offloaded from the warehouse and application databases to Hadoop using data integration software and tools, which helps optimize the warehouse for size and performance. Organizations can store cold, low-touch data in low-cost storage within InfoSphere BigInsights yet still reach it with query or BI tools. This removes the need to move it back into the warehouse on an ongoing basis, providing an active archive. InfoSphere Data Explorer can be used to view and navigate all of the data stored in InfoSphere BigInsights.
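The landing-zone filter and the query-able archive described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the InfoSphere APIs: the `Event` record, its field names, the `prefilter` and `split_hot_cold` helpers, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical record shape; the field names are illustrative only."""
    customer_id: int
    amount: float
    last_accessed: date


def prefilter(events: Iterable[Event], min_amount: float) -> Iterator[Event]:
    """Landing-zone step: keep only high-value events while data is in motion."""
    for event in events:
        if event.amount >= min_amount:
            yield event


def split_hot_cold(
    rows: Iterable[Event], today: date, max_age_days: int
) -> Tuple[List[Event], List[Event]]:
    """Archive step: rows not touched within max_age_days are 'cold'."""
    cutoff = today - timedelta(days=max_age_days)
    hot: List[Event] = []
    cold: List[Event] = []
    for row in rows:
        (hot if row.last_accessed >= cutoff else cold).append(row)
    return hot, cold


# Assumed sample data standing in for an incoming stream.
today = date(2024, 1, 1)
stream = [
    Event(1, 5.0, date(2023, 12, 30)),
    Event(2, 250.0, date(2023, 12, 15)),
    Event(3, 900.0, date(2022, 6, 1)),
    Event(4, 120.0, date(2021, 3, 10)),
]

kept = list(prefilter(stream, min_amount=100.0))            # never stores event 1
hot, cold = split_hot_cold(kept, today, max_age_days=365)   # warehouse vs. archive

print("kept:", [e.customer_id for e in kept])
print("hot: ", [e.customer_id for e in hot])
print("cold:", [e.customer_id for e in cold])
```

In a real deployment the hot rows would stay in the warehouse and the cold rows would land in low-cost storage that remains queryable; the sketch only shows the routing decision.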
The Evolution of the DW System Architecture
The growth of the multi-platform DWE reflects the evolution of DW system architecture, and changes at the system architecture level are a widespread form of DW modernization. At one end, these are simple upgrades and patches for hardware and software servers or tools; at the other, organizations are adding new data platforms and analytics appliances to the extended DWE to accommodate vast data volumes, new data types, and new analytics processing workloads.
The types of platforms being added to the DWE include those centered on appliances, advanced analytics, columnar storage, event processing, and Hadoop. These platforms complement the DW rather than replace it. Every user organization and its DW is a unique case, and so is every modernization program. Even so, a handful of common situations, drivers, and outcomes have emerged. The usual scenarios range from hardware and software server upgrades to the periodic addition of new data subjects, sources, tables, and dimensions. As data types and data velocities diversify more aggressively, modernization increasingly involves users diversifying their software portfolios to include tools and platforms built for big data from new sources. As portfolios expand, a sizable number of data warehouses are evolving into complex, hybrid multi-platform DWEs (Data Warehouse Environments). Even though it is surrounded by complementary systems and tools, the traditional data warehouse remains the core of the modern DWE. At the far end, a handful of organizations are decommissioning existing DW platforms and replacing them with newer ones optimized for today’s needs in analytics, big data, high performance, cost control, and real-time operation. Irrespective of the modernization approach in play, all of them require significant changes to the logical layer and systems architecture of the extended DWE.
DW professionals have many opportunities within the average data warehouse to introduce or expand the use of newer technologies, such as in-database analytics, in-memory processing, multiplatform federated queries, MPP (Massively Parallel Processing), and Hadoop. Even the best systems can be streamlined by adopting agile, logical, lean, and virtual methods, or by moving to modern team structures such as the knowledge center or center of excellence.
Outside the DW, several disciplines have their own modern advances that require support from a more modern DW. For example, BI (Business Intelligence) is itself being modernized, and it needs the DW to provision data for modern BI practices such as data exploration, visualization, and self-service. As another example, modern business practices call for bigger, fresher, and newer data so that the business can compete on analytics, extract actionable business value from new big data, and monitor operations in real time.
Modern Data Stack | Source: xenonstack.com/
Top Reasons Why Modernizing Data Warehouse Is Extremely Important
Modernization is all about extending existing DW infrastructure and leveraging big data technologies to enhance its capabilities. Traditional architectures are not designed to deal with the 3Vs (Volume, Variety, and Velocity) of today’s data-centric corporate world, and they require endless hardware and service investments just to gain minimal performance benefits. These architectures result in overloaded, expensive data warehouses that need 3-6 months to add a new data source. With the arrival of big data, businesses can benefit from modern technologies. Here are the top reasons why modernizing your data warehouse is extremely important:
1. Advanced Analytics –
The analytics age is here, and many businesses have invested heavily in building OLAP (Online Analytical Processing) applications and reporting. However, these businesses are now shifting quickly toward newer forms of analytics, such as predictive and prescriptive models, that leverage the strength of big data.
2. Speed –
RDBMSs (Relational Database Management Systems) are designed for OLTP (Online Transaction Processing), operating on a single record at a time. Traditional DWs are built on OLTP platforms. To improve the performance of these OLTP databases and support the large queries a DW runs, organizations and RDBMS vendors have resorted to techniques such as aggregate tables, materialized views, data partitioning policies, and indexes. Nonetheless, data warehousing operations need access to a vast quantity of records to perform even simple analytics. DW implementations often run short of the staff, processing power, and storage required to maintain this approach. The approach also cannot deliver an environment for real-time analytics. As a result, organizations have begun to appreciate the significance of “bringing time-critical situational awareness to data,” which can only be done with real-time analytics that bring analysis closer to real-time business operations.
3. Scalability –
Typically, DWs tend to grow quickly in size, triggering costly issues with scalability and performance. With big data technologies, organizations can take full advantage of commodity hardware to build a flexible, data-centric solution.
4. Productivity –
Traditional ADLC (Application Development Life-Cycle) processes of requirements gathering, modeling, and development take many months. Organizations have adopted agile development methods that deliver frequently in data warehousing, BI, and analytics. Furthermore, DW modernization makes it possible to draw on data such as e-mail, social, and mobile and to pinpoint new metrics that may be better at predicting behavior. The new metrics can easily be integrated into an organization’s existing BI queries, analyses, dashboards, and reports, all of which increases productivity and leads to data recovery, profiling, and data visualization. It further increases the organization’s ability to massage and parse unstructured data (for example, log files and text files), discover predictive measures in that data, and swiftly feed it into the existing DW.
5. Costs –
Modernization does not automatically mean a comprehensive refit of a data warehouse. The approach identifies and eliminates only those existing investments that are not generating ROI. DW modernization not only improves an organization’s ability to increase speeds and feeds in the data environment, it also provides a significant opportunity to optimize overall costs in areas such as upgrades and storage.
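The aggregate-table technique mentioned under Speed above can be illustrated with Python's built-in `sqlite3` module. This is a hedged sketch, not a recommendation of SQLite as a warehouse platform: the `sales` and `sales_by_region` tables, their columns, and the sample rows are all assumptions made for the example. SQLite has no materialized views, so a summary table built with `CREATE TABLE ... AS SELECT` stands in for one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Detail (fact) table: one row per sale. Schema and data are illustrative.
conn.execute("CREATE TABLE sales (region TEXT, sale_day TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("north", "2024-01-01", 100.0),
        ("north", "2024-01-02", 50.0),
        ("south", "2024-01-01", 75.0),
        ("south", "2024-01-02", 25.0),
    ],
)

# Aggregate table: computed once at load time so that reports
# do not have to rescan every detail row.
conn.execute(
    "CREATE TABLE sales_by_region AS "
    "SELECT region, SUM(amount) AS total, COUNT(*) AS n_sales "
    "FROM sales GROUP BY region"
)
conn.execute("CREATE INDEX idx_sales_by_region ON sales_by_region (region)")


def region_total_from_detail(region: str):
    """Scans the detail rows for the region on every call."""
    row = conn.execute(
        "SELECT SUM(amount) FROM sales WHERE region = ?", (region,)
    ).fetchone()
    return row[0]


def region_total_from_aggregate(region: str):
    """Reads the single pre-computed row for the region."""
    row = conn.execute(
        "SELECT total FROM sales_by_region WHERE region = ?", (region,)
    ).fetchone()
    return row[0] if row else None


print(region_total_from_detail("north"), region_total_from_aggregate("north"))
```

Both functions return the same figure, but the summary table goes stale as soon as new detail rows arrive and must be rebuilt or incrementally maintained. That upkeep is part of the staffing, processing, and storage burden the Speed paragraph describes.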
Leading Drivers for Data Warehouse Modernization
The average DW professional is working hard to meet the requirements posed by several leading drivers of data warehouse modernization simultaneously. These drivers, identified and grouped into eight broad areas, are discussed below.
Business concerns –
According to a survey, DW-to-business alignment is the foremost driver for modernization. Most drivers that data warehouse professionals experience in DW modernization are technical in nature. However, the most urgent driver is the need to realign the DW so that it supports business goals (39% of respondents), followed by the need to run the business on analytics and numbers (29%). Other business drivers include cost reduction (19%), data privacy and security (16%), regulatory compliance (14%), and competitive pressure (13%).
Performance and technical scale –
The second most common driver for DW modernization, cited by 37%, is the need for greater scale and speed and for more capacity for growing data, analyses, reports, users, and so on. This comes as no surprise, because DW professionals have been upgrading their technology stack for years just to stay ahead of capacity. Yet the arrival of big data in recent years, the democratization of BI, and the rapid increase in programs for innovative analytics have greatly intensified this driver. Other performance and scale concerns that need to be addressed include increasing data volumes (31%), warehouse technology performance (23%), and optimization for multiple, diverse workloads (14%).
Need for modern analytics –
Based on survey results, close to the top of the priority list is the increasing need for modern analytics practices other than OLAP, such as graph, mining, and statistics (35%). Despite new applications of advanced analytics, many organizations continue to modernize their established reporting (31%) and OLAP (12%). Note also that modern analytics complement rather than replace standard reporting and OLAP; each delivers distinct guidance and insights, so the modern organization needs them all.
Leveraging new data-driven advantages –
Recently, open source projects, vendors, and consulting companies have brought new ways and tools for leveraging data for organizational advantage. Many users see the business value and are eager to adopt modern practices for data exploration, prep, and profiling (27%), enterprise data hub practices such as the data lake or data vault (20%), analytical DW (14%), and data virtualization (12%). Likewise, many data management professionals are adopting modern agile development practices because they facilitate nimble business practices (23%).
Enabling real-time fresh-data operations –
Data-driven methodologies that enable real-time business operations based on fresh data (26%) are by now well established; they include operational BI, management dashboards, and performance management. Most BI-driven organizations already have programs in place for these, though the programs need modernization to fetch and deliver real-time data faster and to give dashboards modern features such as self-service data access, data prep, and data visualization. In a related area, a few users are working with the DW to embed its data in daily processes (16%), usually in near real-time.
Problem fixing –
Data warehouses are like most other IT systems: as they age, their design and enabling technologies become obsolete or simply no longer relevant to an evolving organization. As a result, some DW modernization is driven purely by problems with the current design or architecture (24%) or with the current core DW platform (16%).
New big data –
Most DW professionals, and related personnel in BI, data integration, and analytics, have worked primarily with data that is relational or otherwise structured. Their skills and tool portfolios, tuned to relational data and technologies such as SQL, are now being seriously challenged by the diversification of data types and formats, specifically non-relational, unstructured, and social data (20%), and by the diversification of data sources, such as GPS, sensors, and machines (15%). The arrival of streaming data (12%) is a special case that brings both together. For organizations living through these forms of new big data, its atypical formats and sources are pushing DW professionals to update their skills, tool portfolios, and platforms.
New data platforms –
Several users are moving to Hadoop implementation and integration (18%), along with other forms of NoSQL implementation and integration (7%), largely because older data platforms are not always suited to new big data or to extreme volumes of traditional enterprise data. For some organizations, cloud or SaaS adoption (11%) provides a data platform that is elastically scalable at low cost.
Benefits of Modernizing a DW and Related Programs
Five areas in which DW modernization offers benefits, according to TDWI’s Best Practices Report for Q2 2016, are as follows.
1. Analytics –
Topping the chart, the most commonly cited benefit area is analytics in general, including exploration and visualization (53%). To a lesser degree, users also see benefits for specific analytics applications, for example fraud detection (15%), customer base segmentation (12%), risk mitigation and management (risk quantification, 11%), understanding consumer behavior as observed in clickstreams (10%), and understanding business change (10%).
2. Business –
Respondents ranked business activities high among the potential benefits of modernization, from decision-making (52%) to operating efficiency (34%). A smaller number of respondents feel that modernization could address new business requirements (28%), enhance competitive advantage (28%), and reinvigorate both business and technology processes (10%).
3. Real-time –
A recurring theme throughout the survey is how modern platforms, tools, and features are key to enabling frequent report and analysis cycles that operate in near real-time (37%).
4. Methods and practices –
Respondents also see a need for modern methods and best practices that could improve the agility of solution delivery (33%), DW management and maintenance (20%), and the automation of the design, deployment, and operation of the data warehouse (12%).
5. Funding and costs –
Some respondents feel that modernization can help with achieving a return on investment in big data (16%), monetizing data assets (12%), and containing the costs of the DW environment (7%).
What modernization involves depends on the kind you undertake. It can be a considerable amount of work and consume a sizable amount of resources. Moreover, modernization is risky when it is not well planned or well supported, and a few DW professionals consider most forms of modernization a distraction from the day-to-day, meat-and-potatoes work that must be done. Even so, according to most respondents, DW modernization is the opportunity it is hyped up to be.
____________________
Jose Richard P. Archival has been writing about the telecommunications industry for several years, specializing in electronics and DAS (Distributed Antenna Systems). On the side, he also writes about travel and the home and office industry. He graduated with a Bachelor of Science in Electronics and Communications Engineering and holds a master’s in business administration. He is also enthusiastic about traveling and the outdoors.